🎯

skill-validator

🎯Skill

from panaversity/agentfactory

VibeIndex|
What it does

Validates skills comprehensively across 9 quality categories, scoring structure, content, interaction, documentation, and technical robustness to provide actionable improvement recommendations.

πŸ“¦

Part of

panaversity/agentfactory(23 items)

skill-validator

Installation

πŸ“‹ No install commands found in docs. Showing default command. Check GitHub for actual instructions.
Quick InstallInstall with npx
npx skills add panaversity/agentfactory --skill skill-validator
2Installs
-
AddedFeb 4, 2026

Skill Details

SKILL.md

|

Overview

# Skill Validator

Validate any skill against production-level quality criteria.

Before Implementation

| Source | Gather |

|--------|--------|

| Skill Directory | SKILL.md, references/, scripts/, assets/ |

| Skill Type | Builder, Guide, Automation, Analyzer, or Validator |

| Conversation | Validation purpose (audit, improvement, review) |

What This Skill Does NOT Do

  • Test skills in production environments
  • Automatically fix identified issues
  • Validate skill runtime behavior (only structure/content)
  • Replace human judgment on domain accuracy

Validation Workflow

Phase 1: Gather Context

  1. Read the skill's SKILL.md completely
  2. Identify skill type from frontmatter description:

- Builder skill (creates artifacts)

- Guide skill (provides instructions)

- Automation skill (executes workflows)

- Analyzer skill (extracts insights)

- Validator skill (enforces quality)

- Hybrid skill (combination of above)

  1. Read all reference files in references/ directory
  2. Check for assets/scripts directories
  3. Note frontmatter fields (name, description, allowed-tools, model)

Phase 2: Apply Criteria

Evaluate against 9 criteria categories. Each criterion scores 0-3:

  • 0: Missing/Absent
  • 1: Present but inadequate
  • 2: Adequate implementation
  • 3: Excellent implementation

---

Criteria Categories

1. Structure & Anatomy (Weight: 12%)

| Criterion | What to Check |

|-----------|---------------|

| SKILL.md exists | Root file present |

| Line count | <500 lines (context is precious) |

| Frontmatter complete | name and description present in YAML |

| Name constraints | 1-64 chars; lowercase alphanumeric + hyphens; no consecutive hyphens; can't start/end with hyphen; must match directory name |

| Description format | [What] + [When] format; ≀1024 chars |

| Description style | Third-person: "This skill should be used when..." |

| No extraneous files | No README.md, CHANGELOG.md, LICENSE in skill dir |

| Progressive disclosure | Details in references/, not bloated SKILL.md |

| Asset organization | Templates in assets/, scripts in scripts/ |

| Large file guidance | If references >10k words, grep patterns in SKILL.md |

Fail condition: Missing SKILL.md or >800 lines = automatic fail

2. Content Quality (Weight: 15%)

| Criterion | What to Check |

|-----------|---------------|

| Conciseness | No verbose explanations, context is public good |

| Imperative form | Instructions use "Do X" not "You should do X" |

| Appropriate freedom | Constraints where needed, flexibility where safe |

| Scope clarity | Clear what skill does AND does not do |

| No hallucination risk | No instructions that encourage making up info |

| Output specification | Clear expected outputs defined |

3. User Interaction (Weight: 12%)

| Criterion | What to Check |

|-----------|---------------|

| Clarification triggers | Asks questions before acting on ambiguity |

| Required vs optional | Distinguishes must-know from nice-to-know |

| Graceful handling | What to do when user doesn't answer |

| No over-asking | Doesn't ask obvious or inferrable questions |

| Question pacing | Avoids too many questions in single message |

| Context awareness | Uses available context before asking |

Key pattern to look for:

```markdown

Required Clarifications

  1. Question about X
  2. Question about Y

Optional Clarifications

  1. Question about Z (if relevant)

Note: Avoid asking too many questions in a single message.

```

4. Documentation & References (Weight: 10%)

| Criterion | What to Check |

|-----------|---------------|

| Source URLs | Official documentation links provided |

| Reference files | Complex details in references/ not main file |

| Fetch guidance | Instructions to fetch docs for unlisted patterns |

| Version awareness | Notes about checking for latest patterns |

| Example coverage | Good/bad examples for key patterns |

Key pattern to look for:

```markdown

| Resource | URL | Use For |

|----------|-----|---------|

| Official Docs | https://... | Complex cases |

```

5. Domain Standards (Weight: 10%)

| Criterion | What to Check |

|-----------|---------------|

| Best practices | Follows domain conventions (e.g., WCAG, OWASP) |

| Enforcement mechanism | Checklists, validation steps, must-verify items |

| Anti-patterns | Lists what NOT to do |

| Quality gates | Output checklist before delivery |

Key pattern to look for:

```markdown

Must Follow

  • [ ] Requirement 1
  • [ ] Requirement 2

Must Avoid

  • Antipattern 1
  • Antipattern 2

```

6. Technical Robustness (Weight: 8%)

| Criterion | What to Check |

|-----------|---------------|

| Error handling | Guidance for failure scenarios |

| Security considerations | Input validation, secrets handling if relevant |

| Dependencies | External tools/APIs documented |

| Edge cases | Common edge cases addressed |

| Testability | Can outputs be verified? |

7. Maintainability (Weight: 8%)

| Criterion | What to Check |

|-----------|---------------|

| Modularity | References are self-contained topics |

| Update path | Easy to update when standards change |

| No hardcoded values | Uses placeholders/variables where appropriate |

| Clear organization | Logical section ordering |

8. Zero-Shot Implementation (Weight: 12%)

Skills should enable single-interaction implementation with embedded expertise.

| Criterion | What to Check |

|-----------|---------------|

| Before Implementation section | Context gathering guidance present |

| Codebase context | Guidance to scan existing structure/patterns |

| Conversation context | Uses discussed requirements/decisions |

| Embedded expertise | Domain knowledge in references/, not runtime discovery |

| User-only questions | Only asks for USER requirements, not domain knowledge |

Key pattern to look for:

```markdown

Before Implementation

Gather context to ensure successful implementation:

| Source | Gather |

|--------|--------|

| Codebase | Existing structure, patterns, conventions |

| Conversation | User's specific requirements |

| Skill References | Domain patterns from references/ |

| User Guidelines | Project-specific conventions |

```

Red flag: Skill instructs to "research" or "discover" domain knowledge at runtime instead of embedding it.

9. Reusability (Weight: 13%)

Skills should handle variations, not single requirements.

| Criterion | What to Check |

|-----------|---------------|

| Handles variations | Not hardcoded to single use case |

| Variable elements | Clarifications capture what VARIES |

| Constant patterns | Domain best practices encoded as constants |

| Not requirement-specific | Avoids hardcoded data, tools, configs |

| Abstraction level | Appropriate generalization for domain |

Good example:

```markdown

"Create visualizations - adaptable to data shape, chart type, library"

```

Bad example (too specific):

```markdown

"Create bar chart with sales data using Recharts"

```

Key check: Does the skill work for multiple use cases within its domain?

---

Type-Specific Validation

After scoring general criteria, verify type-specific requirements:

| Type | Must Have |

|------|-----------|

| Builder | Clarifications, Output Spec, Domain Standards, Output Checklist |

| Guide | Workflow Steps, Examples (Good/Bad), Official Docs links |

| Automation | Scripts in scripts/, Dependencies, Error Handling, I/O Spec |

| Analyzer | Analysis Scope, Evaluation Criteria, Output Format, Synthesis |

| Validator | Quality Criteria, Scoring Rubric, Thresholds, Remediation |

Scoring: Deduct 10 points if type-specific requirements missing for identified type.

---

Scoring Guide

Category Scores

Calculate each category score:

```

Category Score = (Sum of criterion scores) / (Max possible) * 100

```

Overall Score

```

Overall = Ξ£(Category Score Γ— Weight)

```

Rating Thresholds

| Score | Rating | Meaning |

|-------|--------|---------|

| 90-100 | Production | Ready for wide use |

| 75-89 | Good | Minor improvements needed |

| 60-74 | Adequate | Functional but needs work |

| 40-59 | Developing | Significant gaps |

| 0-39 | Incomplete | Major rework required |

---

Output Format

Generate validation report:

```markdown

# Skill Validation Report: [skill-name]

Rating: [Production/Good/Adequate/Developing/Incomplete]

Overall Score: [X]/100

Summary

[2-3 sentence assessment]

Category Scores

| Category | Score | Weight | Weighted |

|----------|-------|--------|----------|

| Structure & Anatomy | X/100 | 12% | X |

| Content Quality | X/100 | 15% | X |

| User Interaction | X/100 | 12% | X |

| Documentation | X/100 | 10% | X |

| Domain Standards | X/100 | 10% | X |

| Technical Robustness | X/100 | 8% | X |

| Maintainability | X/100 | 8% | X |

| Zero-Shot Implementation | X/100 | 12% | X |

| Reusability | X/100 | 13% | X |

| Type-Specific Deduction | -X | - | -X |

Critical Issues (if any)

  • [Issue requiring immediate fix]

Improvement Recommendations

  1. High Priority: [Specific action]
  2. Medium Priority: [Specific action]
  3. Low Priority: [Specific action]

Strengths

  • [What skill does well]

```

---

Quick Validation Checklist

For rapid assessment, check these critical items:

Structure & Frontmatter

  • [ ] SKILL.md <500 lines
  • [ ] Frontmatter: name (≀64 chars, lowercase, hyphens) + description (≀1024 chars)
  • [ ] Description uses third-person style ("This skill should be used when...")
  • [ ] No README.md/CHANGELOG.md in skill directory

Content & Interaction

  • [ ] Has clarification questions (Required vs Optional)
  • [ ] Has output specification
  • [ ] Has official documentation links

Zero-Shot & Reusability

  • [ ] Has "Before Implementation" section (context gathering)
  • [ ] Domain expertise embedded in references/ (not runtime discovery)
  • [ ] Handles variations (not requirement-specific)

Type-Specific (check based on skill type)

  • [ ] Builder: Clarifications + Output Spec + Standards + Checklist
  • [ ] Guide: Workflow + Examples + Docs
  • [ ] Automation: Scripts + Dependencies + Error Handling
  • [ ] Analyzer: Scope + Criteria + Output Format
  • [ ] Validator: Criteria + Scoring + Thresholds + Remediation

If 10+ checked: Likely Production (90+)

If 7-9 checked: Likely Good (75-89)

If 5-6 checked: Likely Adequate (60-74)

If <5 checked: Needs significant work

---

Reference Files

| File | When to Read |

|------|--------------|

| references/detailed-criteria.md | Deep evaluation of specific criterion |

| references/scoring-examples.md | Example validations for calibration |

| references/improvement-patterns.md | Common fixes for common issues |

---

Usage Examples

Validate a skill

```

Validate the chatgpt-widget-creator skill against production criteria

```

Quick audit

```

Quick validation check on mcp-builder skill

```

Focused review

```

Check if skill-creator skill has proper user interaction patterns

```

More from this repository10

🎯
content-evaluation-framework🎯Skill

Evaluates educational content systematically using a 6-category weighted rubric, scoring technical accuracy, pedagogical effectiveness, and constitutional compliance.

🎯
content-refiner🎯Skill

Refines content that failed Gate 4 by precisely trimming verbosity, strengthening lesson connections, and ensuring targeted improvements based on specific diagnostic criteria.

🎯
canonical-format-checker🎯Skill

Checks and validates content formats against canonical sources to prevent inconsistent pattern implementations across platform documentation.

🎯
assessment-architect🎯Skill

Generates comprehensive skill assessments by dynamically creating evaluation frameworks, rubrics, and scoring mechanisms for educational and professional contexts.

🎯
concept-scaffolding🎯Skill

Designs progressive learning sequences by breaking complex concepts into manageable steps, managing cognitive load, and validating understanding across different learning tiers.

🎯
summary-generator🎯Skill

Generates concise, Socratic-style lesson summaries by extracting core concepts, mental models, patterns, and AI collaboration insights from educational markdown files.

🎯
chapter-evaluator🎯Skill

Evaluates educational chapters by analyzing chapters chapters through aable student and teacher perspectives, generating structured ratings,, identifying content gaps, providing and prioritableized...

🎯
pptx🎯Skill

Generates, edits, and analyzes PowerPoint presentations with precise content control and narrative coherence.

🎯
docx🎯Skill

Generates, edits, and analyzes Microsoft Word documents (.docx) with advanced capabilities like tracked changes, comments, and text extraction.

🎯
technical-clarity🎯Skill

Refines technical writing by analyzing readability, reducing jargon, and ensuring content is comprehensible across different learner proficiency levels.