kubernetes-troubleshooting
π―Skillfrom nik-kale/sre-skills
Helps SREs diagnose and resolve Kubernetes cluster issues through comprehensive troubleshooting techniques and diagnostic commands.
Part of
nik-kale/sre-skills(5 items)
Installation
npx skills add nik-kale/sre-skillsnpx skills add nik-kale/sre-skills/incident-responseMore from this repository4
Generates standardized, actionable operational runbooks and incident response playbooks with executable steps and best practices.
Systematically evaluates service readiness for production by assessing reliability, security, observability, and operational requirements through a comprehensive checklist.
Systematically investigates and resolves production incidents through structured triage, data gathering, impact assessment, and root cause analysis.
Automates infrastructure observability setup by configuring monitoring, logging, and tracing tools for comprehensive system visibility.