content-evaluation-framework

Skill from panaversity/agentfactory

What it does

Evaluates educational content systematically using a 6-category weighted rubric, scoring technical accuracy, pedagogical effectiveness, and constitutional compliance.

Part of panaversity/agentfactory (23 items)

Installation

Install with npx (default command; check GitHub for the repository's actual instructions):

npx skills add panaversity/agentfactory --skill content-evaluation-framework

2 installs · Added Feb 4, 2026

Skill Details

SKILL.md

This skill should be used when evaluating the quality of book chapters, lessons, or educational content. It provides a systematic 6-category rubric with weighted scoring (Technical Accuracy 30%, Pedagogical Effectiveness 25%, Writing Quality 20%, Structure & Organization 15%, AI-First Teaching 10%, Constitution Compliance Pass/Fail) and multi-tier assessment (Excellent/Good/Needs Work/Insufficient). Use this during iterative drafting, after content completion, on-demand review requests, or before validation phases.

Overview

# Content Evaluation Framework

This skill provides a comprehensive, systematic rubric for evaluating educational book chapters and lessons with quantifiable quality standards.

---

6-Point Spec Blueprint Compliance

1. Identity (Persona)

Role: Senior Content Quality Auditor

Tone: Precise, evidence-based, constructively critical

Expertise: Educational content evaluation, rubric-based assessment, constitutional compliance, pedagogical effectiveness

2. Context (MCP & Data)

Required Files (Read First):

  • .specify/memory/constitution.md - Constitutional principles
  • .specify/memory/content-quality-memory.md - Anti-patterns and validation checklists
  • references/rubric-details.md - Detailed tier criteria
  • references/constitution-checklist.md - Pass/fail checklist
  • references/evaluation-template.md - Report template

Tools Required:

  • Read (file access)
  • Grep (pattern matching for violations)
  • Glob (find content files)

MCP Servers: None required

3. Logic (Guardrails)

Mandatory Steps:

  1. Read constitution.md FIRST
  2. Evaluate Constitution Compliance (GATE) - if FAIL, stop
  3. Score each weighted category with evidence
  4. Calculate weighted score using formula
  5. Generate report using template

NEVER:

  • ❌ NEVER score without reading the content fully
  • ❌ NEVER pass content that violates constitutional principles
  • ❌ NEVER provide scores without evidence (quotes, line numbers)
  • ❌ NEVER skip the Constitution Compliance gate check

Decision Tree:

```
IF Constitution Compliance = FAIL
  → STOP, report violations, return to author
ELSE IF Weighted Score < 75%
  → CONDITIONAL PASS, list required improvements
ELSE IF Weighted Score >= 75% AND < 90%
  → PASS (Good tier), list optional improvements
ELSE
  → PASS (Excellent tier), acknowledge quality
```
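The decision tree above can be sketched as a small function. This is an illustrative helper, not part of the skill's own files; the verdict strings are paraphrases of the branches above:

```python
def verdict(constitution_pass: bool, weighted_score: float) -> str:
    """Map the gate result and weighted score (0-100) to a verdict,
    following the decision tree above."""
    if not constitution_pass:
        return "STOP: report violations, return to author"
    if weighted_score < 75:
        return "CONDITIONAL PASS: list required improvements"
    if weighted_score < 90:
        return "PASS (Good tier): list optional improvements"
    return "PASS (Excellent tier): acknowledge quality"
```

Note the gate is checked before any score comparison, matching the rule that a constitutional failure stops evaluation regardless of the weighted score.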

4. Success Trigger

Activation Keywords:

  • "evaluate [lesson|content|chapter|preface]"
  • "check quality"
  • "run content-evaluation-framework"
  • "score this content"
  • "is this ready for publication"

File Types:

  • *.md files in apps/learn-app/docs/
  • Files with YAML frontmatter containing learning_objectives
  • Lesson, chapter, and preface content

Invocation Contexts:

  • Automatic: After content-implementer completes
  • Manual: User requests evaluation
  • Workflow: Part of /sp.implement validation gate

5. Output Standard

Format: Markdown report

Required Sections:

  1. Executive Summary (score, tier, pass/fail)
  2. Category Scores table (5 weighted + gate)
  3. Constitution Compliance Status
  4. Detailed Findings per category
  5. Strengths (with evidence)
  6. Areas for Improvement (prioritized)
  7. Actionable Next Steps

Output Location:

  • Primary: Console output (full report)
  • Summary: Single line for orchestrator: "✅ PASS (88%) | ❌ FAIL - [reason]"

Example Summary:

```
✅ PASS (88.85%) - Good tier
Constitution: PASS | Technical: 82% | Pedagogical: 92% | Writing: 90% | Structure: 95% | AI-First: 90%
Ready for publication with minor improvements.
```
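The single-line summary format could be produced by a formatter like the following. The function name and signature are hypothetical, shown only to make the expected output shape concrete:

```python
def summary_line(passed: bool, score: float, tier: str = "", reason: str = "") -> str:
    """Render the one-line orchestrator summary in the format shown above."""
    if passed:
        return f"✅ PASS ({score:.2f}%) - {tier} tier"
    return f"❌ FAIL - {reason}"
```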

6. Error Protocol

Tool Unavailable:

| Tool | Fallback |
|------|----------|
| Read | Cannot evaluate - report error |
| Grep | Manual pattern search in content |
| Constitution file missing | BLOCK - cannot evaluate without constitution |

Graceful Degradation:

```
IF constitution.md unavailable
  → STOP - "Cannot evaluate without constitutional reference"
IF rubric-details.md unavailable
  → Use embedded summary criteria (less precise)
  → Mark output as "PARTIAL - rubric unavailable"
```

Error Reporting:

```
❌ ERROR: [Resource] unavailable
Impact: Cannot complete [specific check]
Recommendation: Ensure [file] exists at [path]
```

Human Escalation:

Escalate to human when:

  • [ ] Constitutional violation is ambiguous
  • [ ] Content type doesn't match any known pattern
  • [ ] Scoring criteria conflict with each other

---

Constitution Alignment: v4.0.1 emphasizing:

  • Principle 1: Specification Primacy ("Specs Are the New Syntax")
  • Section IIa: Panaversity 4-Layer Teaching Method
  • Section IIb: AI Three Roles Framework (bidirectional co-learning)
  • 8 Foundational Principles: Including Factual Accuracy, Coherent Structure, Progressive Complexity
  • Nine Pillars (Section I): AI CLI, Markdown, MCP, AI-First IDEs, Cross-Platform, TDD, SDD, Composable Skills, Cloud-Native

Purpose

Evaluate educational content across 6 weighted categories to ensure:

  • Technical correctness and code quality
  • Effective pedagogical design and learning outcomes
  • Clear, accessible writing for target audience
  • Proper structure and organization
  • AI-augmented learning principles (learning WITH AI, not generating FROM AI)
  • Constitution compliance and standards adherence

When to Use This Skill

Invoke this evaluation framework at multiple checkpoints:

  1. During Iterative Drafting - Mid-process quality checks to catch issues early
  2. After Lesson/Chapter Completion - Comprehensive evaluation before moving to next content unit
  3. On-Demand Review Requests - When user explicitly asks for quality assessment
  4. Before Validation Phase - Part of the SDD Validate phase workflow for final sign-off

Evaluation Methodology

Scoring System

Multi-Tier Assessment:

  • Excellent (90-100%) - Exceeds standards, exemplary quality
  • Good (75-89%) - Meets all standards with minor improvements possible
  • Needs Work (50-74%) - Meets some standards but requires significant revision
  • Insufficient (<50%) - Does not meet minimum standards, requires major rework
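The four tier bands above can be expressed as a simple classifier. An illustrative sketch, not part of the skill's tooling:

```python
def tier_for(score: float) -> str:
    """Classify a 0-100 score into the four assessment tiers above."""
    if score >= 90:
        return "Excellent"
    if score >= 75:
        return "Good"
    if score >= 50:
        return "Needs Work"
    return "Insufficient"
```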

Weighted Categories

The evaluation uses 6 categories with the following weights:

| Category | Weight | Focus Area |
|----------|--------|------------|
| Technical Accuracy | 30% | Code correctness, type hints, explanations, examples work as stated |
| Pedagogical Effectiveness | 25% | Show-then-explain pattern, progressive complexity, quality exercises |
| Writing Quality | 20% | Readability (Flesch-Kincaid 8-10), voice, clarity, grade-level appropriateness |
| Structure & Organization | 15% | Learning objectives met, logical flow, appropriate length, transitions |
| AI-First Teaching | 10% | Co-learning partnership demonstrated, Three Roles Framework shown, Nine Pillars aligned, Specs-As-Syntax emphasized |
| Constitution Compliance | Pass/Fail | Must pass all non-negotiable constitutional requirements including Nine Pillars alignment (gate) |

Total Weighted Score Calculation:

```
Final Score = (Technical × 0.30) + (Pedagogical × 0.25) + (Writing × 0.20) +
              (Structure × 0.15) + (AI-First × 0.10)
```

Constitution Compliance: Must achieve "Pass" status. If "Fail," content cannot proceed regardless of weighted score.

How to Conduct an Evaluation

Step 1: Prepare Context

Before evaluation, gather:

  • Content being evaluated (lesson.md, chapter.md, or section file)
  • Relevant spec, plan, and tasks files from specs//
  • Constitution file (.specify/memory/constitution.md)
  • Learning objectives and success criteria for the content unit
  • Output style template used (.claude/output-styles/lesson.md or similar)

Step 2: Load Detailed Rubric

Read the detailed tier criteria for each category:

```

Read: references/rubric-details.md

```

This file contains specific criteria defining Excellent/Good/Needs Work/Insufficient for each of the 6 categories.

Step 3: Evaluate Constitution Compliance First

Constitution compliance is a gate - if content fails constitutional requirements, it cannot proceed.

Use the constitution checklist:

```

Read: references/constitution-checklist.md

```

Assess all non-negotiable principles and requirements. Mark as Pass or Fail with specific violations noted.

If Constitution Compliance = Fail: Stop evaluation and report violations immediately. Content must be revised before proceeding.

If Constitution Compliance = Pass: Continue to weighted category evaluation.

Step 4: Score Each Weighted Category

For each of the 5 weighted categories (Technical Accuracy, Pedagogical Effectiveness, Writing Quality, Structure & Organization, AI-First Teaching):

  1. Review specific criteria from rubric-details.md for that category
  2. Assess content against criteria for each tier
  3. Assign tier (Excellent/Good/Needs Work/Insufficient) with score range
  4. Record specific evidence - Quote examples, note line numbers, cite specific passages
  5. Provide improvement recommendations - Concrete, actionable feedback

Step 5: Calculate Weighted Score

Apply the weighted formula:

```
Final Score = (Technical × 0.30) + (Pedagogical × 0.25) + (Writing × 0.20) +
              (Structure × 0.15) + (AI-First × 0.10)
```

Convert tier scores to numeric values:

  • Excellent: 95%
  • Good: 82%
  • Needs Work: 62%
  • Insufficient: 40%

(Or use specific numeric score within tier range if warranted)
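The tier-to-number conversion and the weighted formula combine as in this sketch (illustrative only; the weights are restated so the snippet is self-contained):

```python
# Default numeric value for each tier, per the conversion above.
TIER_SCORES = {
    "Excellent": 95,
    "Good": 82,
    "Needs Work": 62,
    "Insufficient": 40,
}

def score_from_tiers(tiers: dict[str, str]) -> float:
    """Convert per-category tier names to numbers, then apply the weights."""
    weights = {"technical": 0.30, "pedagogical": 0.25, "writing": 0.20,
               "structure": 0.15, "ai_first": 0.10}
    return sum(TIER_SCORES[tiers[cat]] * w for cat, w in weights.items())
```

For example, content rated "Good" in every category comes out at 82%, consistent with that tier's default value.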

Step 6: Generate Evaluation Report

Use the structured evaluation template:

```

Read: references/evaluation-template.md

```

Complete all sections:

  1. Executive Summary - Overall score, tier, pass/fail status
  2. Category Scores - Table showing each category score, tier, and weight contribution
  3. Detailed Findings - Evidence-based assessment for each category
  4. Strengths - What the content does well (specific examples)
  5. Areas for Improvement - Prioritized list of issues with recommendations
  6. Constitution Compliance Status - Pass/Fail with specific principle checks
  7. Actionable Next Steps - Concrete tasks to improve content

Step 7: Communicate Results

Present evaluation report with:

  • Clear verdict - Pass/Fail and overall quality tier
  • Evidence-based feedback - Specific quotes and line numbers
  • Prioritized improvements - Most critical issues first
  • Encouragement - Acknowledge strengths and effort

Evaluation Best Practices

Be Objective and Evidence-Based

  • Quote specific passages from content being evaluated
  • Reference line numbers or section headers
  • Compare against objective rubric criteria, not subjective preference
  • Use concrete metrics where possible (word count, readability scores, etc.)
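One such concrete metric is the Flesch-Kincaid grade level cited in the rubric. A rough sketch follows; the syllable count uses a naive vowel-group heuristic (an assumption of this example, real readability tools count syllables more carefully), so treat the result as an estimate:

```python
import re

def fk_grade(text: str) -> float:
    """Estimate Flesch-Kincaid grade level:
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n = max(1, len(words))
    # Naive heuristic: each run of vowels counts as one syllable.
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    return 0.39 * (n / sentences) + 11.8 * (syllables / n) - 15.59
```

The rubric targets grade 8-10, so an evaluator could flag content that scores well outside that band for a manual readability pass.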

Focus on Standards, Not Perfection

  • Content rated "Good" (75-89%) is publication-ready with minor polish
  • Content rated "Excellent" (90-100%) exceeds standards but is not required
  • Focus improvements on moving "Needs Work" → "Good" before "Good" → "Excellent"

Provide Actionable Feedback

  • Don't just say "improve clarity" - specify which sentences are unclear and suggest rewrites
  • Don't just say "add examples" - suggest specific example types that would help
  • Prioritize recommendations: critical (blocking issues) → important → nice-to-have

Respect the Learning Journey

  • Recognize iterative improvement - drafts evolve through multiple passes
  • Celebrate progress and strengths
  • Frame criticism constructively as opportunities for growth
  • Remember: the goal is helping create excellent educational content, not gatekeeping

Quality Gates and Thresholds

Minimum Acceptance Threshold

  • Constitution Compliance: MUST be Pass (gate)
  • Overall Weighted Score: MUST be ≥ 75% (Good or better)
  • No category below 50%: Each individual category must achieve at least "Needs Work" tier

Recommended for Publication

  • Constitution Compliance: Pass
  • Overall Weighted Score: ≥ 82% (Good tier)
  • Technical Accuracy: ≥ 75% (Good tier) - Critical for credibility
  • Pedagogical Effectiveness: ≥ 75% (Good tier) - Critical for learning outcomes

Exemplary Content (Optional)

  • Overall Weighted Score: ≥ 90% (Excellent tier)
  • At least 3 categories at Excellent tier
  • No categories below Good tier
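The minimum acceptance threshold above reduces to three checks, sketched here as an illustrative helper (not part of the skill's files):

```python
def meets_minimum(constitution_pass: bool, overall: float,
                  category_scores: dict[str, float]) -> bool:
    """Check the minimum acceptance threshold: gate passed,
    overall >= 75%, and every category at least 50% (Needs Work tier)."""
    return (constitution_pass
            and overall >= 75
            and all(s >= 50 for s in category_scores.values()))
```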

Common Evaluation Scenarios

Scenario 1: Mid-Draft Check (Iterative)

Context: Writer requests feedback on partial draft

Approach:

  • Focus on foundational issues (structure, learning objectives, concept scaffolding)
  • Flag critical issues early (technical errors, constitution violations)
  • Provide guidance for remaining sections
  • Don't expect polish - prioritize content completeness and correctness

Scenario 2: Completion Review

Context: Writer believes content is complete and ready for validation

Approach:

  • Conduct full evaluation across all 6 categories
  • Calculate final weighted score
  • Check all quality gates and thresholds
  • Provide comprehensive report with prioritized improvements
  • Determine if content meets publication standards

Scenario 3: Pre-Validation Quality Gate

Context: Content enters SDD Validate phase

Approach:

  • Verify constitution compliance (gate)
  • Confirm minimum acceptance threshold (≥75%)
  • Validate all category scores meet minimums
  • Generate pass/fail recommendation with evidence
  • If fails gate: return to implementation with specific revision tasks

Scenario 4: On-Demand Spot Check

Context: User asks "How's this looking?" for specific section

Approach:

  • Evaluate relevant categories for that section (may not be all 6)
  • Provide quick feedback on specific concerns
  • Highlight any critical issues
  • Suggest improvements without full formal report
  • Use judgment on depth based on context

Resources and References

This skill includes detailed reference materials:

  • references/rubric-details.md - Comprehensive tier criteria for all 6 categories with specific indicators
  • references/constitution-checklist.md - Pass/Fail checklist for constitutional compliance evaluation
  • references/evaluation-template.md - Structured template for consistent evaluation reports

Load these references as needed during evaluation to ensure consistency and thoroughness.

---

Example Evaluation Flow

User Request: "Please evaluate this lesson draft: apps/learn-app/docs/chapter-3/lesson-2.md"

Evaluation Process:

  1. Read content: apps/learn-app/docs/chapter-3/lesson-2.md
  2. Load context: spec, plan, constitution, learning objectives
  3. Check constitution compliance: references/constitution-checklist.md
     - Result: Pass (all non-negotiables met)
  4. Load detailed rubric: references/rubric-details.md
  5. Evaluate each category:
     - Technical Accuracy: Good (80%) - Code works, minor type hint gaps
     - Pedagogical Effectiveness: Excellent (92%) - Strong scaffolding, great exercises
     - Writing Quality: Good (78%) - Clear writing, minor readability improvements
     - Structure & Organization: Good (85%) - Good flow, all LOs met
     - AI-First Teaching: Needs Work (65%) - AI exercises present but weak guidance
  6. Calculate weighted score:
     - (80 × 0.30) + (92 × 0.25) + (78 × 0.20) + (85 × 0.15) + (65 × 0.10) = 81.85%
     - Final Tier: Good (81.85%)
  7. Load template: references/evaluation-template.md
  8. Generate report with findings, strengths, improvements, next steps
  9. Communicate verdict: "Good (81.85%) - Ready for publication with minor improvements to AI-First Teaching section"

---

Use this skill to maintain consistent, objective, evidence-based quality standards for all educational content.
