
project-genome

🎯Skill

from horace4444/extend-my-claude-code

What it does

Generates a comprehensive YAML project map that provides instant codebase understanding, mapping architecture, file structure, documentation, and key resources in under 5k tokens.


Installation


python3 .claude/skills/project-genome/scripts/update_genome.py


python3 .claude/skills/project-genome/scripts/update_genome.py --validate


Added Feb 4, 2026


Skill Details

SKILL.md

"Run FIRST when starting work on any codebase. Creates PROJECT-GENOME.yaml with complete project map: architecture, file structure, key file paths, and authoritative documentation locations. One 4K token read replaces hours of exploratory searching. Refresh with /project-genome after major changes."

Overview

# Project Genome

Generate and maintain a comprehensive (<5k tokens) YAML bootstrap file that gives AI agents instant codebase understanding, including AI-analyzed documentation mapping.

CRITICAL: Pre-Read Protocol

Before reading PROJECT-GENOME.yaml, ALWAYS execute this skill first.

This ensures:

- Genome is fresh (skills_map current, file_structure accurate)
- Documentation map is current (new docs discovered, stale docs flagged)
- CLAUDE.md properly references genome at top
- New skills are discovered and added to skills_map

Pre-Read Checklist (Execute Every Time)

```bash

# 1. Update genome with latest changes

python3 .claude/skills/project-genome/scripts/update_genome.py

# 2. Validate genome is under token budget

python3 .claude/skills/project-genome/scripts/update_genome.py --validate

```

Self-Verification: CLAUDE.md Integration

After updating, verify CLAUDE.md contains:

- Line 3: a `> Bootstrap: Read [PROJECT-GENOME.yaml]...` reference
- Key Rules section: a rule about refreshing the genome before reading it
- Skills table: the project-genome skill listed with its trigger

If any of these are missing, auto-fix by reading CLAUDE.md and adding the required sections.

Core Concept

PROJECT-GENOME.yaml is a seed file, not a full system. It provides:

- Instant project orientation (purpose, stack, structure)
- Semantic navigation (modules, key functions, dependencies)
- Documentation map with AI-scored importance (authoritative vs ephemeral)
- Agent-specific hints for efficient exploration
- Links to deeper resources (not duplicated content)

When to Use

| Action | Trigger |
|--------|---------|
| Generate | New project setup, /init, "create genome" |
| Update | Major refactor, new modules, architecture changes |
| Read | Start of any coding session (automatic) |
| Validate | Before commits affecting structure |
| Review Docs | --review-docs to classify discovered documentation |

---

Documentation Map Feature

The documentation_map section tracks all markdown documentation in the repo, distinguishing between authoritative (user-confirmed important) and ephemeral (temporary plans, working notes).

Why This Matters

AI agents frequently generate temporary documentation:

- Implementation plans (`*_PLAN.md`)
- Debugging notes (`debugging-*.md`)
- Session-specific scratch files

These should NOT be treated as authoritative project documentation. The documentation map:

- Auto-discovers all markdown files
- AI-analyzes each for importance signals
- Auto-skips low-quality/ephemeral docs
- Preserves user-confirmed authoritative docs across updates

Documentation Map Structure

```yaml

documentation_map:
  # User-confirmed authoritative docs (PRESERVED across updates)
  authoritative:
    system_architecture:
      - path: "docs/ARCHITECTURE.md"
        purpose: "High-level system design and component interactions"
        last_verified: "2026-01-22"
    api_reference:
      - path: "backend/API_ENDPOINTS.md"
        purpose: "REST API documentation with schemas"
    component_guides:
      - path: "backend/CLAUDE.md"
        purpose: "Backend development patterns"

  # Auto-discovered docs (REFRESHED on each update)
  discovered:
    recent_plans:
      - path: "docs/QA_PLAN_20260122.md"
        importance_score: 0.45
        category: "implementation_plan"
    archived:
      directory: "docs/archive/"
      count: 12

  # Docs needing user review (cleared after --review-docs)
  pending_review:
    - path: "docs/NEW_FEATURE_SPEC.md"
      importance_score: 0.78
      suggested_category: "system_architecture"
      ai_reasoning: "Well-structured spec with diagrams. Covers new subsystem."

  # Validation state
  _meta:
    last_scan: "2026-01-22T14:30:00Z"
    total_docs_scanned: 47
    auto_skipped: 23
    missing_authoritative: []

```
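The source does not say how `_meta.missing_authoritative` is populated. One plausible reading is that it lists authoritative paths that no longer exist on disk. The sketch below works under that assumption; the function name `find_missing_authoritative` is hypothetical and is not taken from `update_genome.py`.

```python
from pathlib import Path


def find_missing_authoritative(documentation_map: dict, repo_root: Path) -> list[str]:
    """Return authoritative doc paths that are not files under repo_root.

    Assumption: documentation_map has the shape shown above, i.e.
    authoritative -> category -> list of entries with a "path" key.
    """
    missing = []
    for entries in documentation_map.get("authoritative", {}).values():
        for entry in entries:
            if not (repo_root / entry["path"]).is_file():
                missing.append(entry["path"])
    return sorted(missing)
```

A non-empty result would be a reasonable cue to ask the user whether a doc was moved, renamed, or retired before the next update overwrites the map.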

---

AI Documentation Analysis

When this skill runs, the agent analyzes discovered markdown files to determine importance.

Analysis Process

For each discovered .md file (read first 3000 chars):

Evaluate Quality Signals (30% weight)

- Clear H1/H2 structure

- Contains code blocks, diagrams, or tables

- References specific files/functions in codebase

- Professional/authoritative tone

Evaluate Freshness Signals (25% weight)

- Modified within last 30 days

- References files that still exist

- No "TODO", "DRAFT", "WIP" markers in title

- Current tech stack mentioned

Evaluate Scope Signals (25% weight)

- Covers entire system/module vs single task

- Located in structured docs directory

- Has "Architecture", "Guide", "Reference" in name

Evaluate Deprecation Signals (20% weight)

- Located in /archive/ directory

- Contains "deprecated", "outdated", "old" language

- References removed features/files

- Date in filename older than 30 days (e.g., plan-20251201.md)

Importance Score Calculation

```

importance_score = (quality * 0.30) + (freshness * 0.25) + (scope * 0.25) + ((1 - deprecation) * 0.20)

```
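As a minimal sketch, the weighted sum can be written in Python as follows. The weights come from the section above; the `Signals` dataclass and the range check are illustrative assumptions, not part of the skill's script.

```python
from dataclasses import dataclass

# Weights from the four signal groups above; they sum to 1.0.
WEIGHTS = {"quality": 0.30, "freshness": 0.25, "scope": 0.25, "deprecation": 0.20}


@dataclass
class Signals:
    """Per-document signal scores, each expected in the range 0.0-1.0."""
    quality: float
    freshness: float
    scope: float
    deprecation: float


def importance_score(s: Signals) -> float:
    """Weighted sum; deprecation is inverted so a high value lowers the score."""
    for name in WEIGHTS:
        value = getattr(s, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return (
        s.quality * WEIGHTS["quality"]
        + s.freshness * WEIGHTS["freshness"]
        + s.scope * WEIGHTS["scope"]
        + (1.0 - s.deprecation) * WEIGHTS["deprecation"]
    )


# Signal values for a strong architecture doc with almost no deprecation signals.
score = importance_score(Signals(quality=0.95, freshness=0.90, scope=0.90, deprecation=0.05))
print(f"{score:.3f}")
```

Because deprecation enters as `1 - deprecation`, a document with no deprecation signals earns the full 0.20 from that term.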

Auto-Skip Criteria (importance_score < 0.35)

Automatically skip (don't prompt user) for docs matching:

- Located in /archive/, /old/, /deprecated/ directories
- Filename contains date older than 60 days
- Title contains "DRAFT", "WIP", "TODO", "SCRATCH", "NOTES" (informal)
- Less than 500 bytes (stub files)
- Filename pattern: `*-debug-*.md`, `*-test-*.md`, `debugging-*.md`
- Content starts with "# Notes" or "# Scratch"
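The criteria above could be checked before any scoring happens. The following is a rough sketch, not the skill's actual implementation: the function name, the `YYYYMMDD` date format, the treatment of the first line as the title, and the reading of the filename patterns as globs are all assumptions.

```python
import re
from datetime import date, timedelta
from fnmatch import fnmatch
from pathlib import PurePosixPath

SKIP_DIRS = {"archive", "old", "deprecated"}
TITLE_MARKERS = ("DRAFT", "WIP", "TODO", "SCRATCH", "NOTES")
# Assumed glob forms of the filename patterns listed above.
FILENAME_PATTERNS = ("*-debug-*.md", "*-test-*.md", "debugging-*.md")
# Assumed date format: an 8-digit YYYYMMDD stamp somewhere in the filename.
DATE_IN_NAME = re.compile(r"(20\d{2})(\d{2})(\d{2})")


def auto_skip_reason(path: str, size_bytes: int, content: str, today: date):
    """Return a short reason if the doc should be auto-skipped, else None."""
    p = PurePosixPath(path)
    if SKIP_DIRS & {part.lower() for part in p.parts[:-1]}:
        return "located in an archive/old/deprecated directory"
    match = DATE_IN_NAME.search(p.name)
    if match:
        try:
            stamped = date(*map(int, match.groups()))
        except ValueError:
            stamped = None  # digits that do not form a valid calendar date
        if stamped and today - stamped > timedelta(days=60):
            return "filename date older than 60 days"
    if size_bytes < 500:
        return "stub file under 500 bytes"
    if any(fnmatch(p.name.lower(), pattern) for pattern in FILENAME_PATTERNS):
        return "filename matches a debug/test pattern"
    first_line = content.lstrip().splitlines()[0] if content.strip() else ""
    if first_line.startswith(("# Notes", "# Scratch")):
        return "content starts with a notes/scratch heading"
    title = first_line.lstrip("# ").upper()
    if any(re.search(rf"\b{marker}\b", title) for marker in TITLE_MARKERS):
        return "title contains a draft/WIP marker"
    return None
```

Returning a reason string rather than a bare boolean makes it easy to record why a doc was skipped.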

Category Assignment

| Score Range | Suggested Category |
|-------------|-------------------|
| >= 0.85 | system_architecture or api_reference (based on content) |
| 0.70 - 0.84 | component_guide or testing |
| 0.50 - 0.69 | implementation_plan |
| 0.35 - 0.49 | working_notes (ephemeral, not authoritative) |
| < 0.35 | Auto-skip (don't include in pending_review) |
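The score bands in the table can be expressed as a simple threshold lookup. This is a sketch under one assumption: each band is treated as "greater than or equal to its lower bound", so a value such as 0.845 falls into the 0.70 band rather than between rows. The function name is hypothetical.

```python
def suggest_category(score: float) -> str:
    """Map an importance score to the band label from the table above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    if score >= 0.85:
        # The choice between the two is content-based and left to the agent.
        return "system_architecture or api_reference"
    if score >= 0.70:
        return "component_guide or testing"
    if score >= 0.50:
        return "implementation_plan"
    if score >= 0.35:
        return "working_notes"
    return "auto_skip"


for value in (0.92, 0.78, 0.45, 0.22):
    print(value, "->", suggest_category(value))
```

Checking thresholds from highest to lowest keeps the bands contiguous without having to spell out upper bounds.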

---

Execution Modes

Mode 1: Standard Update (Default)

```bash

python3 .claude/skills/project-genome/scripts/update_genome.py

```

What happens:

1. Script discovers all markdown files
2. Script outputs docs_pending_analysis.json
3. Agent reads each pending doc (first 3000 chars)
4. Agent calculates importance_score for each
5. Agent updates genome with documentation_map

Agent instructions for this mode:

After running the script, if docs_pending_analysis.json exists:

```

1. Read docs_pending_analysis.json
2. For each doc with needs_analysis=true:
   a. Read the file (first 3000 chars)
   b. Evaluate: quality, freshness, scope, deprecation signals
   c. Calculate importance_score (0.0-1.0)
   d. Determine suggested_category
   e. Write 1-2 sentence reasoning
3. Update PROJECT-GENOME.yaml:
   - Preserve existing authoritative section
   - Update discovered section with scored docs
   - Add high-score docs (>=0.50) to pending_review
   - Auto-skip low-score docs (<0.35)
4. Delete docs_pending_analysis.json
5. Report summary to user

```
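The first two steps (read the pending file, then read the first 3000 characters of each flagged doc) might look like the sketch below. The schema of docs_pending_analysis.json is not documented here, so this assumes a JSON array of objects with `path` and `needs_analysis` keys. Only the `needs_analysis` field name appears in the instructions above; the rest is hypothetical.

```python
import json
from pathlib import Path

MAX_CHARS = 3000  # excerpt length named in the instructions above


def load_pending_excerpts(pending_file: Path, repo_root: Path) -> dict[str, str]:
    """Return {doc path: first MAX_CHARS characters} for docs needing analysis.

    Assumed file shape: [{"path": "docs/X.md", "needs_analysis": true}, ...]
    """
    entries = json.loads(pending_file.read_text(encoding="utf-8"))
    excerpts = {}
    for entry in entries:
        if not entry.get("needs_analysis"):
            continue
        doc_path = repo_root / entry["path"]
        if not doc_path.is_file():
            continue  # doc was removed after discovery; nothing to analyze
        with doc_path.open(encoding="utf-8", errors="replace") as handle:
            excerpts[entry["path"]] = handle.read(MAX_CHARS)
    return excerpts
```

Reading a bounded excerpt keeps the per-document cost predictable even when a discovered file is very large.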

Mode 2: Documentation Review

```bash

python3 .claude/skills/project-genome/scripts/update_genome.py --review-docs

```

What happens:

1. Script reads existing genome
2. Script outputs docs in pending_review for user confirmation
3. Agent presents each doc to user with AI analysis
4. User confirms or skips each doc
5. Agent moves confirmed docs to authoritative section

Agent instructions for this mode:

Present each pending doc to user:

```

For docs with importance_score >= 0.85 (RECOMMENDED):

"⭐ RECOMMENDED: {path}

AI Score: {score} | Suggested: {category}

{ai_reasoning}

Promote to authoritative? [Y/n]: "

(Default YES - just press Enter to confirm)

For docs with score 0.50-0.84:

"{path}

AI Score: {score} | Suggested: {category}

{ai_reasoning}

Promote to authoritative? [y/n/skip]: "

For docs with score 0.35-0.49:

"(Low score - likely ephemeral)

{path} - Score: {score}

{ai_reasoning}

[Auto-skipping - press Enter to continue, or 'p' to promote anyway]: "

```
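A small sketch of how the three prompt tiers and their defaults could be encoded. The prompt text follows the templates above; the function names are hypothetical. Treating an empty answer in the middle tier as "do not promote" is an assumption, since the template gives no default there.

```python
def review_prompt(path: str, score: float, category: str, reasoning: str) -> str:
    """Build the review prompt for one pending doc, chosen by score tier."""
    if score >= 0.85:
        return (
            f"⭐ RECOMMENDED: {path}\n"
            f"AI Score: {score:.2f} | Suggested: {category}\n"
            f"{reasoning}\n"
            "Promote to authoritative? [Y/n]: "
        )
    if score >= 0.50:
        return (
            f"{path}\n"
            f"AI Score: {score:.2f} | Suggested: {category}\n"
            f"{reasoning}\n"
            "Promote to authoritative? [y/n/skip]: "
        )
    return (
        "(Low score - likely ephemeral)\n"
        f"{path} - Score: {score:.2f}\n"
        f"{reasoning}\n"
        "[Auto-skipping - press Enter to continue, or 'p' to promote anyway]: "
    )


def should_promote(answer: str, score: float) -> bool:
    """Interpret the user's reply using the default for the doc's tier."""
    reply = answer.strip().lower()
    if score >= 0.85:
        return reply in ("", "y", "yes")  # Enter defaults to yes
    if score >= 0.50:
        return reply in ("y", "yes")      # assumption: Enter does not promote
    return reply == "p"                   # low tier: only an explicit 'p' promotes
```

Keeping the default-handling in one function means the "Enter means yes" rule for recommended docs cannot drift from the prompt that advertises it.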

When user confirms a doc:

```

"Purpose (1 line) [{suggested_purpose}]: "

(User can press Enter to accept suggestion or type custom)

```

Mode 3: Bootstrap (No Existing Genome)

When PROJECT-GENOME.yaml doesn't exist:

1. Run full discovery
2. ALL docs go to pending_review (nothing is authoritative yet)
3. Inform user: "No existing genome. Run --review-docs to classify documentation."

---

Genome Structure (Complete YAML)

```yaml

project_name: "Project Name"
last_updated: "2026-01-22T06:30:00Z"

purpose:
  summary: "Brief: Business goal, key features, users. <100 words."

tech_stack: ["React", "Node.js", "PostgreSQL"]

repo_info:
  branches: {main: "Production", dev: "Development"}

file_structure:
  tree: |
    project-root/
    ├── src/     # Core logic
    ├── docs/    # Documentation
    └── tests/   # Test suites
  total_files: 42

architecture:
  overview: "High-level C4 context summary"
  patterns: ["MVC", "Event-driven"]
  diagram: |
    graph TD
      A[User] --> B[App]
      B --> C[API]

semantic_map:
  modules:
    auth: {path: "src/auth", files: 5}
    payments: {path: "src/payments", files: 3}
  flows: {}

navigation_hints:
  - "Payment logic: src/services/payments"
  - "DB schema: docs/schema.sql"
  - "Skills: .claude/skills/"

skills_map:
  skill-name:
    description: "What this skill does..."
    trigger: "/skill-name"

# NEW: Documentation map with AI analysis
documentation_map:
  authoritative:
    system_architecture: []
    api_reference: []
    component_guides: []
    testing: []
  discovered:
    recent_plans: []
    archived: {directory: "", count: 0}
  pending_review: []
  _meta:
    last_scan: ""
    total_docs_scanned: 0
    auto_skipped: 0
    missing_authoritative: []

recent_changes: "Auto-generated from last 5 git commits"

```

---

Token Budget Guidelines

| Section | Target (tokens) | Notes |
|---------|-----------------|-------|
| purpose | 100-200 | Detailed summary with key features |
| file_structure | 300-600 | Top 3 levels, include key subdirectories |
| architecture | 200-400 | C4 context + key patterns, include diagram |
| semantic_map | 400-800 | Major modules, key functions |
| navigation_hints | 100-200 | 5-10 actionable prompts with file paths |
| skills_map | 200-400 | All skills with descriptions |
| documentation_map | 400-600 | Authoritative docs with purposes |
| Total | <5000 | Leave headroom for YAML syntax |
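A rough budget check against the upper bounds in the table might look like this. How `--validate` actually counts tokens is not stated here, so the sketch uses the common "about four characters per token" heuristic as a stand-in. The function names and the heuristic itself are assumptions.

```python
# Upper bounds from the target column above (tokens per section).
SECTION_LIMITS = {
    "purpose": 200,
    "file_structure": 600,
    "architecture": 400,
    "semantic_map": 800,
    "navigation_hints": 200,
    "skills_map": 400,
    "documentation_map": 600,
}
TOTAL_LIMIT = 5000


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token, rounded up."""
    return (len(text) + 3) // 4


def over_budget(sections: dict[str, str]) -> list[str]:
    """Return the names of sections (or "total") whose estimate exceeds the limit."""
    problems = [
        name
        for name, text in sections.items()
        if name in SECTION_LIMITS and estimate_tokens(text) > SECTION_LIMITS[name]
    ]
    # The total target is strictly under 5000, so 5000 itself counts as over.
    if sum(estimate_tokens(text) for text in sections.values()) >= TOTAL_LIMIT:
        problems.append("total")
    return problems
```

A character-based estimate will drift from a real tokenizer's count, which is one more reason to leave the headroom the table recommends.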

---

Anti-Patterns

- Duplicating README - Genome is seed, not docs
- Full code snippets - Use function names, not implementations
- Listing all files - Top-level structure only
- ADR content - Link to docs/, don't inline
- Updating every commit - Major changes only
- Including ephemeral docs in authoritative - Only user-confirmed docs
- Keeping stale pending_review - Clear after each review session

---

Example AI Analysis Output

When analyzing monorepo-docs/system-docs/MESSAGE_HANDLING_ARCHITECTURE.md:

```yaml

path: "monorepo-docs/system-docs/MESSAGE_HANDLING_ARCHITECTURE.md"
importance_score: 0.92
suggested_category: "system_architecture"
ai_reasoning: |
  High-quality architecture doc. Clear H1/H2 structure with Mermaid diagrams.
  Covers critical realtime messaging subsystem. Updated 2026-01-21.
  References active code: realtime-sync.ts, [threadId].tsx.
  Located in structured system-docs directory. No deprecation signals.
signals:
  quality: 0.95
  freshness: 0.90
  scope: 0.90
  deprecation: 0.05

```

When analyzing monorepo-docs/debugging-carpet-issue.md:

```yaml

path: "monorepo-docs/debugging-carpet-issue.md"
importance_score: 0.22
suggested_category: "auto_skip"
ai_reasoning: |
  Debugging notes from a specific session. Informal structure.
  Contains "debugging" in filename. Likely ephemeral working doc.
  Not suitable for authoritative documentation.
signals:
  quality: 0.30
  freshness: 0.40
  scope: 0.10
  deprecation: 0.20
auto_skip: true
skip_reason: "Filename pattern matches debugging-*.md"

```

---

Integration with CLAUDE.md

After running this skill, CLAUDE.md should reference key authoritative docs:

```markdown

# Project Name

> Bootstrap: Read [PROJECT-GENOME.yaml](PROJECT-GENOME.yaml) first.

## Key Documentation

| Category | Authoritative Docs |
|----------|-------------------|
| Architecture | system-docs/OVERVIEW.md, MESSAGE_HANDLING.md |
| API | BACKEND_API_COMPLETE.md |
| Components | backend/CLAUDE.md, mobile-app/CLAUDE.md |

See documentation_map in PROJECT-GENOME.yaml for full list.

```

More from this repository (8)

🎯

watermark-removal🎯Skill

Intelligently removes watermarks from any image using ML-based inpainting, auto-detection, and multiple removal methods across various sources.

🎯

google-image-creator🎯Skill

Generates high-quality images using Google's Imagen and Gemini AI models with automatic pricing and cost tracking.

🎯

claude-agent-builder-typescript🎯Skill

Builds production-ready Claude AI agents in TypeScript using the @anthropic-ai/claude-agent-sdk, enabling custom tool development, multi-agent workflows, and enterprise-grade automation.

🎯

image-converter🎯Skill

Converts, resizes, compresses, and optimizes images across multiple formats with high-quality Python Pillow processing.

🎯

web-design-guidelines🎯Skill

Reviews UI code against web design best practices, checking accessibility, UX, and interface guidelines for web projects.

🎯

skill-creator🎯Skill

A comprehensive guide for creating skills that extend Claude's capabilities with specialized knowledge and workflows.

🎯

ai-api-integrations🎯Skill

Seamlessly integrates AI models, databases, and authentication for building scalable, production-ready AI applications with best-practice model selection and API management.

🎯

vercel-react-best-practices🎯Skill

Implements best practices and optimized configurations for React projects deployed on Vercel, enhancing performance and development workflow.