🎯

observability-analyzer

🎯Skill

from adaptationio/skrillz

VibeIndex|
What it does

Analyzes Claude Code observability data to generate insights on performance, costs, errors, tool usage, sessions, conversations, and subagents through advanced metrics and log querying.

observability-analyzer

Installation

Add MarketplaceAdd marketplace to Claude Code
/plugin marketplace add adaptationio/Skrillz
Install PluginInstall plugin from marketplace
/plugin install skrillz@adaptationio-Skrillz
Claude CodeAdd plugin in Claude Code
/plugin enable skrillz@adaptationio-Skrillz
Add MarketplaceAdd marketplace to Claude Code
/plugin marketplace add /path/to/skrillz
Install PluginInstall plugin from marketplace
/plugin install skrillz@local

+ 4 more commands

πŸ“– Extracted from docs: adaptationio/skrillz
3
-
Last UpdatedJan 16, 2026

Skill Details

SKILL.md

Query and analyze Claude Code observability data (metrics, logs, traces). Use when analyzing performance, costs, errors, tool usage, sessions, conversations, or subagents.

Overview

# Observability Analyzer

Query Claude Code telemetry and generate insights from metrics, logs, and traces. Works with both default OTEL telemetry and enhanced hook-based telemetry.

Data Sources

| Source | Job Name | Contains |

|--------|----------|----------|

| Default OTEL | claude_code | API metrics, token usage, costs |

| Enhanced Hooks | claude_code_enhanced | Sessions, conversations, tools, subagents |

Operations

`query-metrics <promql>`

Execute PromQL query against Prometheus.

```bash

query-metrics 'sum(claude_code_token_usage)[7d]'

```

`query-logs <logql>`

Execute LogQL query against Loki.

```bash

query-logs '{job="claude_code_enhanced", event_type="tool_call"} | json' --since 24h

```

`analyze-errors`

Detect and group error patterns from enhanced telemetry.

```logql

{job="claude_code_enhanced", event_type="tool_result", status="error"} | json

```

Output: Error types, frequencies, affected tools, recommendations.

`analyze-performance`

Identify slow operations and response sizes.

```logql

{job="claude_code_enhanced", event_type="tool_result"} | json | response_length > 50000

```

Output: Large responses, estimated token costs, slow patterns.

`analyze-costs`

Calculate token usage from content size estimates.

```logql

sum by (repo) (sum_over_time({job="claude_code_enhanced", event_type="context_utilization"} | json | unwrap estimated_session_tokens [24h]))

```

Output: Token estimates by repo, session costs, projections.

`analyze-tools`

Tool usage statistics and sequences.

```logql

sum by (tool) (count_over_time({job="claude_code_enhanced", event_type="tool_call"} | json [24h]))

```

Output: Call frequency, success rates, tool sequences, common patterns.

`analyze-sessions`

Session lifecycle and duration analytics.

```logql

{job="claude_code_enhanced", event_type="session_end"} | json

```

Output: Session durations, turn counts, tools per session, termination reasons.

`analyze-conversations`

Conversation and prompt analytics.

```logql

sum by (pattern) (count_over_time({job="claude_code_enhanced", event_type="user_prompt"} | json [24h]))

```

Output: Prompt patterns (question/debugging/creation/ultrathink), turn distribution.

`analyze-subagents`

Subagent/Task tool usage.

```logql

{job="claude_code_enhanced", event_type="tool_call", tool="Task"} | json

```

Output: Subagent types used, completion rates, parallel execution patterns.

`analyze-skills`

Skill invocation analytics.

```logql

sum by (skill_name) (count_over_time({job="claude_code_enhanced", event_type="skill_usage"} | json [24h]))

```

Output: Most used skills, skill usage by repo, trends.

`analyze-context`

Context window utilization.

```logql

{job="claude_code_enhanced", event_type="context_utilization"} | json | context_percentage > 50

```

Output: High utilization sessions, compaction events, token efficiency.

`analyze-repos`

Repository/project activity.

```logql

sum by (repo, tool) (count_over_time({job="claude_code_enhanced", event_type="tool_call"} | json [24h]))

```

Output: Activity per repo, tool usage by project, branch patterns.

`generate-report`

Comprehensive analysis report (all dimensions).

Output: Markdown report with errors, performance, costs, sessions, conversations, tools.

Key Queries

Enhanced Telemetry (Loki)

```logql

# All events (last hour)

{job="claude_code_enhanced"} | json

# Session analytics

{job="claude_code_enhanced", event_type="session_end"} | json | duration_seconds > 300

# Tool errors

{job="claude_code_enhanced", event_type="tool_result", status="error"} | json

# High context usage

{job="claude_code_enhanced", event_type="context_utilization"} | json | context_percentage > 75

# Subagent spawns

{job="claude_code_enhanced", event_type="tool_call", tool="Task"} | json

# Skill invocations

{job="claude_code_enhanced", event_type="skill_usage"} | json

# Prompt patterns

{job="claude_code_enhanced", event_type="user_prompt"} | json | pattern="ultrathink"

# Tool sequences

{job="claude_code_enhanced", event_type="tool_call"} | json | line_format "{{.tool_name}} β†’ {{.previous_tool}}"

# Context compaction

{job="claude_code_enhanced", event_type="context_compact"} | json

# Permission requests

{job="claude_code_enhanced", event_type="permission_request"} | json

```

Default OTEL (Prometheus)

```promql

# Total token usage (7 days)

sum(increase(claude_code_token_usage[7d]))

# Error rate by tool

sum by (tool_name) (rate(claude_code_tool_result{status="failure"}[1h]))

# P95 tool latency

histogram_quantile(0.95, claude_code_tool_duration_bucket)

# Daily costs

sum(increase(claude_code_cost_usage[24h]))

```

Event Types Reference

| Event Type | Description | Key Fields |

|------------|-------------|------------|

| session_start | Session initialization | source, permission_mode |

| session_end | Session termination | duration_seconds, turn_count, tools_used |

| user_prompt | User message submitted | pattern, prompt_length, estimated_tokens |

| tool_call | Tool invocation | tool_name, tool_details, sequence_position |

| tool_result | Tool completion | status, response_length, is_error |

| skill_usage | Skill invoked | skill_name |

| context_utilization | Token estimate | estimated_session_tokens, context_percentage |

| context_compact | Compaction event | trigger (manual/auto) |

| subagent_complete | Task agent finished | total_subagents |

| permission_request | Permission dialog | notification_type |

| notification | System notification | notification_type |

Grafana Dashboards

  • Claude Code Overview - High-level metrics
  • Tool Performance - Tool latencies and success rates
  • Cost Analysis - Token usage and costs
  • Error Tracking - Error patterns and trends
  • Session Analytics - Session-level insights
  • Enhanced Analytics - Model/skill/context/repo tracking
  • Deep Analytics - Comprehensive conversation and tool analysis

Access: http://localhost:3000 (admin/admin)

Scripts

  • scripts/query-prometheus.sh - PromQL query helper
  • scripts/query-loki.sh - LogQL query helper
  • scripts/analyze-errors.sh - Error analysis automation
  • scripts/analyze-sessions.sh - Session analytics
  • scripts/generate-report.sh - Full analysis report