end-to-end-orchestrator

Skill from adaptationio/skrillz

Part of adaptationio/skrillz (191 items)

Installation

Add marketplace to Claude Code:
/plugin marketplace add adaptationio/Skrillz

Install plugin from marketplace:
/plugin install skrillz@adaptationio-Skrillz

Enable plugin in Claude Code:
/plugin enable skrillz@adaptationio-Skrillz

Local install: add marketplace from a local path:
/plugin marketplace add /path/to/skrillz

Install plugin from local marketplace:
/plugin install skrillz@local


📖 Extracted from docs: adaptationio/skrillz

Last Updated: Jan 16, 2026

Skill Details

SKILL.md

Complete development workflow orchestrator coordinating all multi-ai skills (research → planning → implementation → testing → verification) with quality gates, failure recovery, and state management. Single-command complete workflows from objective to production-ready code. Use when implementing complete features requiring the full pipeline, coordinating multiple skills automatically, or executing production-grade development cycles end-to-end.


# End-to-End Orchestrator

Overview

end-to-end-orchestrator provides single-command complete development workflows, coordinating all 5 multi-ai skills from research through production deployment.

Purpose: Transform "I want feature X" into production-ready code through automated skill coordination

Pattern: Workflow-based (5-stage pipeline with quality gates)

Key Innovation: Automatic orchestration of research → planning → implementation → testing → verification with failure recovery and quality gates

The Complete Pipeline:

```

Input: Feature description
  ↓
1. Research (multi-ai-research) [optional]
  ↓ [Quality Gate: Research complete]
2. Planning (multi-ai-planning)
  ↓ [Quality Gate: Plan ≥90/100]
3. Implementation (multi-ai-implementation)
  ↓ [Quality Gate: Tests pass, coverage ≥80%]
4. Testing (multi-ai-testing)
  ↓ [Quality Gate: Coverage ≥95%, verified]
5. Verification (multi-ai-verification)
  ↓ [Quality Gate: Score ≥90/100, all layers pass]
Output: Production-ready code

```
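
The pipeline above reduces to a simple control loop: run a stage, check its gate, retry on failure, and escalate when retries are exhausted. A minimal TypeScript sketch of that control flow (the stage names and thresholds come from the pipeline above; `runStage`, `escalate`, the result shape, and the research retry count are assumptions for illustration):

```typescript
type StageName = 'research' | 'planning' | 'implementation' | 'testing' | 'verification';

interface StageResult {
  score: number;
  passed: boolean;
}

// Hypothetical hooks: runStage invokes the matching multi-ai-* skill,
// escalate hands off to a human with full context.
declare function runStage(stage: StageName, objective: string): Promise<StageResult>;
declare function escalate(stage: StageName, result: StageResult): Promise<void>;

// Gate thresholds from the pipeline above (the implementation gate is a coverage percentage)
const gates: Record<StageName, { threshold: number; maxRetries: number }> = {
  research:       { threshold: 95, maxRetries: 1 },
  planning:       { threshold: 90, maxRetries: 2 },
  implementation: { threshold: 80, maxRetries: 3 },
  testing:        { threshold: 90, maxRetries: 2 },
  verification:   { threshold: 90, maxRetries: 2 },
};

async function runPipeline(objective: string): Promise<void> {
  for (const stage of Object.keys(gates) as StageName[]) {
    let result = await runStage(stage, objective);
    let retries = 0;
    while (!result.passed && retries < gates[stage].maxRetries) {
      retries++;
      result = await runStage(stage, objective); // retry the stage
    }
    if (!result.passed) {
      await escalate(stage, result); // retries exhausted → human review
      return;
    }
  }
  // All gates passed → production-ready code
}
```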

---

When to Use

Use end-to-end-orchestrator when:

  • Implementing complete features (not quick fixes)
  • Want automated workflow (not manual skill chaining)
  • Production-quality required (all gates must pass)
  • Time optimization important (parallel where possible)
  • Need failure recovery (automatic retry/rollback)

When NOT to Use:

  • Quick fixes (<30 minutes)
  • Exploratory work (uncertain requirements)
  • Manual control preferred (step through each phase)

---

Prerequisites

Required

  • All 5 multi-ai skills installed:

- multi-ai-research

- multi-ai-planning

- multi-ai-implementation

- multi-ai-testing

- multi-ai-verification

Optional

  • agent-memory-system (for learning from past work)
  • hooks-manager (for automation)
  • Gemini CLI, Codex CLI (for tri-AI research)
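
Before starting a pipeline it is worth confirming the five required skills are actually installed. A minimal sketch, assuming project-level skills live under `.claude/skills/` (adjust the path to wherever your skills or plugin are installed):

```typescript
import { existsSync } from 'node:fs';
import { join } from 'node:path';

const REQUIRED_SKILLS = [
  'multi-ai-research',
  'multi-ai-planning',
  'multi-ai-implementation',
  'multi-ai-testing',
  'multi-ai-verification',
];

// Assumed install location; plugin-managed skills may live elsewhere
const SKILLS_DIR = '.claude/skills';

const missing = REQUIRED_SKILLS.filter(
  (skill) => !existsSync(join(SKILLS_DIR, skill, 'SKILL.md'))
);

if (missing.length > 0) {
  console.error(`Missing required skills: ${missing.join(', ')}`);
  process.exit(1);
}
```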

---

Complete Workflow

Stage 1: Research (Optional)

Purpose: Ground implementation in proven patterns

Process:

  1. Determine if Research Needed:

```typescript
// Check if objective is familiar
const similarWork = await recallMemory({
  type: 'episodic',
  query: objective
});

if (similarWork.length === 0) {
  // Unfamiliar domain → research needed
  needsResearch = true;
} else {
  // Familiar → can skip research, use past learnings
  needsResearch = false;
}
```

  2. Execute Research (if needed):

```

Use multi-ai-research for "[domain] implementation patterns and best practices"

```

What It Provides:

- Claude research: Official docs, codebase patterns

- Gemini research: Web best practices, latest trends

- Codex research: GitHub patterns, code examples

- Quality: ≥95/100 with 100% citations

  3. Quality Gate: Research Complete:

```markdown
✅ Research findings documented
✅ Patterns identified (minimum 2)
✅ Best practices extracted (minimum 3)
✅ Quality score ≥95/100
```

If Fail: Research incomplete → retry research OR proceed without (user decides)

Outputs:

  • Research findings (.analysis/ANALYSIS_FINAL.md)
  • Patterns and best practices
  • Implementation recommendations

Time: 30-60 minutes (can skip if familiar domain)

Next: Proceed to Stage 2

---

Stage 2: Planning

Purpose: Create agent-executable plan with quality ≥90/100

Process:

  1. Load Research Context (if research done):

```typescript
let context = "";

if (researchDone) {
  context = await readFile('.analysis/ANALYSIS_FINAL.md');
}
```

  2. Invoke Planning:

```

Use multi-ai-planning to create plan for [objective]

${context ? 'Research findings available in: .analysis/ANALYSIS_FINAL.md' : ''}

Create comprehensive plan following 6-step workflow.

```

What It Does:

- Analyzes objective

- Hierarchical decomposition (8-15 tasks)

- Maps dependencies, identifies parallel

- Plans verification for all tasks

- Scores quality (0-100)

  3. Quality Gate: Plan Approved:

```markdown
✅ Plan created
✅ Quality score ≥90/100
✅ All tasks have verification
✅ Dependencies mapped
✅ No circular dependencies
```

If Fail (score <90):

- Review gap analysis

- Apply recommended fixes

- Re-verify

- Retry up to 2 times

- If still <90: Escalate to human review

  4. Save Plan to Shared State:

```bash

# Save for next stage

cp plans/[plan-id]/plan.json .multi-ai-context/plan.json

```
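
Later stages read specific fields from `plan.json`: the implementation stage reads `objective`, `tasks`, and `metadata.estimated_total_hours`; parallel execution reads `parallel_groups`; task verification reads `verification.success_criteria`. A rough TypeScript sketch of the shape those reads assume (any field not referenced elsewhere in this document is illustrative):

```typescript
interface Plan {
  objective: string;
  metadata: {
    estimated_total_hours: number;
  };
  tasks: PlanTask[];
  parallel_groups: ParallelGroup[];
}

interface PlanTask {
  id: string;                  // e.g. "2.2.2"
  description: string;
  verification: {
    success_criteria: string[];
  };
}

interface ParallelGroup {
  group_id: string;            // e.g. "pg2"
  tasks: string[];             // ids of tasks that can run concurrently
}
```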

Outputs:

  • plan.json (machine-readable)
  • PLAN.md (human-readable)
  • COORDINATION.md (execution guide)
  • Quality ≥90/100

Time: 1.5-3 hours

Next: Proceed to Stage 3

---

Stage 3: Implementation

Purpose: Execute plan with TDD, produce working code

Process:

  1. Load Plan:

```typescript
const plan = JSON.parse(readFile('.multi-ai-context/plan.json'));

console.log(`📋 Loaded plan: ${plan.objective}`);
console.log(`   Tasks: ${plan.tasks.length}`);
console.log(`   Estimated: ${plan.metadata.estimated_total_hours} hours`);
```

  2. Invoke Implementation:

```

Use multi-ai-implementation following plan in .multi-ai-context/plan.json

Execute all 6 steps:

1. Explore & gather context

2. Plan architecture (plan already created, refine as needed)

3. Implement incrementally with TDD

4. Coordinate multi-agent (if parallel tasks)

5. Integration & E2E testing

6. Quality verification before commit

Success criteria from plan.

```

What It Does:

- Explores codebase (progressive disclosure)

- Implements incrementally (<200 lines per commit)

- Test-driven development (tests first)

- Multi-agent coordination for parallel tasks

- Continuous testing during implementation

- Doom loop prevention (max 3 retries)

  3. Quality Gate: Implementation Complete:

```markdown
✅ All plan tasks implemented
✅ All tests passing
✅ Coverage ≥80% (gate), ideally ≥95%
✅ No regressions
✅ Doom loop avoided (< max retries)
```

If Fail:

- Identify failing task

- Retry with different approach

- If 3 failures: Escalate to human

- Save state for recovery

  4. Save Implementation State:

```bash
# Save for next stage
echo '{
  "status": "implemented",
  "files_changed": [...],
  "tests_run": 95,
  "tests_passed": 95,
  "coverage": 87,
  "commits": ["abc123", "def456"]
}' > .multi-ai-context/implementation-status.json
```

Outputs:

  • Working code
  • Tests passing
  • Coverage ≥80%
  • Commits created

Time: 3-10 hours (varies by complexity)

Next: Proceed to Stage 4

---

Stage 4: Testing (Independent Verification)

Purpose: Verify tests are comprehensive and prevent gaming

Process:

  1. Load Implementation Context:

```typescript
const implStatus = JSON.parse(
  readFile('.multi-ai-context/implementation-status.json')
);

console.log(`🧪 Testing implementation:`);
console.log(`   Files changed: ${implStatus.files_changed.length}`);
console.log(`   Current coverage: ${implStatus.coverage}%`);
```

  2. Invoke Independent Testing:

```

Use multi-ai-testing independent verification workflow

Verify:

- Tests in: tests/

- Code in: src/

- Specifications in: .multi-ai-context/plan.json

Workflows to execute:

1. Test quality verification (independent agent)

2. Coverage validation (≥95% target)

3. Edge case discovery (AI-powered)

4. Multi-agent ensemble scoring (if critical feature)

Score test quality (0-100).

```

What It Does:

- Independent verification (separate agent from impl)

- Checks tests match specifications (not just what code does)

- Generates additional edge case tests

- Multi-agent ensemble for quality scoring

- Prevents overfitting

  3. Quality Gate: Testing Verified:

```markdown
✅ Test quality score ≥90/100
✅ Coverage ≥95% (target achieved)
✅ Independent verification passed
✅ No test gaming detected
✅ Edge cases covered
```

If Fail:

- Review test quality issues

- Generate additional tests

- Re-verify

- Max 2 retries, then escalate

  4. Save Testing State:

```bash
echo '{
  "status": "tested",
  "test_quality_score": 92,
  "coverage": 96,
  "tests_total": 112,
  "edge_cases": 23,
  "gaming_detected": false
}' > .multi-ai-context/testing-status.json
```

Outputs:

  • Test quality ≥90/100
  • Coverage ≥95%
  • Independent verification passed

Time: 1-3 hours

Next: Proceed to Stage 5

---

Stage 5: Verification (Multi-Layer QA)

Purpose: Final quality assurance before production

Process:

  1. Load All Context:

```typescript
const plan = JSON.parse(readFile('.multi-ai-context/plan.json'));
const implStatus = JSON.parse(readFile('.multi-ai-context/implementation-status.json'));
const testStatus = JSON.parse(readFile('.multi-ai-context/testing-status.json'));

console.log(`🔍 Final verification:`);
console.log(`   Objective: ${plan.objective}`);
console.log(`   Implementation: ${implStatus.status}`);
console.log(`   Testing: ${testStatus.coverage}% coverage`);
```

  2. Invoke Multi-Layer Verification:

```

Use multi-ai-verification for complete quality check

Verify:

- Code: src/

- Tests: tests/

- Plan: .multi-ai-context/plan.json

Execute all 5 layers:

1. Rules-based (linting, types, schema, SAST)

2. Functional (tests, coverage, examples)

3. Visual (if UI: screenshots, a11y)

4. Integration (E2E, API compatibility)

5. Quality scoring (LLM-as-judge, 0-100)

All 5 quality gates must pass.

```

What It Does:

- Runs all 5 verification layers

- Each layer is independent

- LLM-as-judge for holistic assessment

- Agent-as-a-Judge can execute tools to verify claims

- Multi-agent ensemble for critical features

  3. Quality Gate: Production Ready:

```markdown
✅ Layer 1 (Rules): PASS
✅ Layer 2 (Functional): PASS, coverage 96%
✅ Layer 3 (Visual): PASS or SKIPPED
✅ Layer 4 (Integration): PASS
✅ Layer 5 (Quality): 92/100 ≥90 ✅

ALL GATES PASSED → PRODUCTION APPROVED
```

If Fail:

- Review gap analysis from failed layer

- Apply recommended fixes

- Re-verify from failed layer (not all 5)

- Max 2 retries per layer

- If still failing: Escalate to human
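
Because the layers are independent, a failed run can resume from the layer that failed rather than repeating all five. A minimal sketch of that resume logic (layer names taken from the verification prompt above; `runLayer` is a hypothetical hook):

```typescript
const LAYERS = ['rules', 'functional', 'visual', 'integration', 'quality'] as const;
type Layer = typeof LAYERS[number];

// Hypothetical: runs a single verification layer and reports pass/fail
declare function runLayer(layer: Layer): Promise<{ passed: boolean }>;

// Re-verify starting from the layer that failed, not from layer 1
async function reverifyFrom(failedLayer: Layer): Promise<boolean> {
  const start = LAYERS.indexOf(failedLayer);
  for (const layer of LAYERS.slice(start)) {
    const result = await runLayer(layer);
    if (!result.passed) return false; // caller retries (max 2 per layer) or escalates
  }
  return true;
}
```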

  4. Generate Final Report:

```markdown

# Feature Implementation Complete

Objective: [from plan]

## Pipeline Execution Summary

### Stage 1: Research

- Status: ✅ Complete

- Quality: 97/100

- Time: 52 minutes

### Stage 2: Planning

- Status: ✅ Complete

- Quality: 94/100

- Tasks: 23

- Time: 1.8 hours

### Stage 3: Implementation

- Status: ✅ Complete

- Files changed: 15

- Lines added: 847

- Commits: 12

- Time: 6.2 hours

### Stage 4: Testing

- Status: ✅ Complete

- Test quality: 92/100

- Coverage: 96%

- Tests: 112

- Time: 1.5 hours

### Stage 5: Verification

- Status: ✅ Complete

- Quality score: 92/100

- All layers: PASS

- Time: 1.2 hours

## Final Metrics

- Total Time: 11.3 hours

- Quality: 92/100

- Coverage: 96%

- Status: ✅ PRODUCTION READY

## Commits

- abc123: feat: Add database schema

- def456: feat: Implement OAuth integration

- [... 10 more ...]

## Next Steps

- Create PR for team review

- Deploy to staging

- Production release

```

  5. Save to Memory (if agent-memory-system available):

```typescript
await storeMemory({
  type: 'episodic',
  event: {
    description: `Complete implementation: ${objective}`,
    outcomes: {
      total_time: 11.3,
      quality_score: 92,
      test_coverage: 96,
      stages_completed: 5
    },
    learnings: extractedDuringPipeline
  }
});
```

Outputs:

  • Production-ready code
  • Comprehensive final report
  • Commits created
  • PR ready (if requested)
  • Memory saved for future learning

Time: 30-90 minutes

Result: ✅ PRODUCTION READY

---

Failure Recovery

Failure Handling at Each Stage

Stage Fails → Recovery Strategy:

Research Fails:

  • Retry with different sources
  • Skip research (use memory if available)
  • Escalate to human if critical gap

Planning Fails (score <90):

  • Review gap analysis
  • Apply fixes automatically if possible
  • Retry planning (max 2 attempts)
  • Escalate if still <90

Implementation Fails:

  • Identify failing task
  • Automatic rollback to last checkpoint
  • Retry with alternative approach
  • Doom loop prevention (max 3 retries)
  • Escalate with full error context

Testing Fails (coverage <80% or quality <90):

  • Generate additional tests for gaps
  • Retry verification
  • Max 2 retries
  • Escalate with coverage report

Verification Fails (score <90 or layer fails):

  • Apply auto-fixes for Layer 1-2 issues
  • Manual fixes needed for Layer 3-5
  • Re-verify from failed layer (not all 5)
  • Max 2 retries per layer
  • Escalate with quality report
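
The implementation-stage strategy above relies on checkpoints and automatic rollback. A minimal sketch of checkpoint-and-rollback using git, with doom loop prevention via a retry cap (tag naming and the retry wrapper are illustrative, not the skill's prescribed mechanism):

```typescript
import { execSync } from 'node:child_process';

// Create a named checkpoint before a risky attempt
function checkpoint(name: string): void {
  execSync(`git add -A && git commit --allow-empty -m "checkpoint: ${name}"`);
  execSync(`git tag -f ${name}`);
}

// Roll the working tree back to a previous checkpoint
function rollback(name: string): void {
  execSync(`git reset --hard ${name}`);
}

// Retry a task with rollback between attempts; caller escalates after maxRetries
async function retryWithRollback(
  taskId: string,
  attemptTask: () => Promise<boolean>,
  maxRetries = 3
): Promise<boolean> {
  const tag = `checkpoint-${taskId}`;
  checkpoint(tag);
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    if (await attemptTask()) return true;
    rollback(tag); // undo the failed attempt before retrying
  }
  return false;    // doom loop reached → escalate with full error context
}
```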

---

Escalation Protocol

When to Escalate to Human:

  1. Any stage fails 3 times (doom loop)
  2. Planning quality <80 after 2 retries
  3. Implementation doom loop detected
  4. Verification score <80 after 2 retries
  5. Budget exceeded (if cost tracking enabled)
  6. Circular dependency detected
  7. Irrecoverable error (file system, permissions)

Escalation Format:

```markdown

# ⚠️ ESCALATION REQUIRED

Stage: Implementation (Stage 3)

Failure: Doom loop detected (3 failed attempts)

Context

  • Objective: Implement user authentication
  • Failing Task: 2.2.2 Token generation
  • Error: Tests fail with "undefined userId" repeatedly

Attempts Made

  1. Attempt 1: Added userId to payload → Same error
  2. Attempt 2: Changed payload structure → Same error
  3. Attempt 3: Different JWT library → Same error

Root Cause Analysis

  • Tests expect user.id but implementation uses user.userId
  • Mismatch in data model between test and implementation
  • Auto-fix failed 3 times

Recommended Actions

  1. Review test specifications vs. implementation
  2. Align data model (user.id vs. user.userId)
  3. Manual intervention required

State Saved

  • Checkpoint: checkpoint-003 (before attempts)
  • Rollback available: git checkout checkpoint-003
  • Continue after fix: Resume from Task 2.2.2

```

---

Parallel Execution Optimization

Identifying Parallel Opportunities

From Plan:

```typescript
const plan = JSON.parse(readFile('.multi-ai-context/plan.json'));

// Plan identifies parallel groups
const parallelGroups = plan.parallel_groups;

// Example:
// Group 1: Tasks 2.1, 2.2, 2.3 (independent)
// Can execute in parallel
```

Executing Parallel Tasks

Pattern:

```typescript
// Stage 3: Implementation with parallel tasks
const parallelGroup = plan.parallel_groups.find(g => g.group_id === 'pg2');

// Spawn parallel implementation agents (shown here as a subagent/Task-style call)
const results = await Promise.all(
  parallelGroup.tasks.map(taskId => {
    const task = plan.tasks.find(t => t.id === taskId);
    return Task({
      description: `Implement ${task.description}`,
      prompt: `Implement task ${task.id}: ${task.description}

Specifications from plan:
${JSON.stringify(task, null, 2)}

Success criteria:
${task.verification.success_criteria.join('\n')}

Write implementation and tests.
Report completion status.`
    });
  })
);

// Verify all parallel tasks completed
const allSucceeded = results.every(r => r.status === 'complete');

if (allSucceeded) {
  // Proceed to integration
} else {
  // Handle failures
}
```

Time Savings: 20-40% faster than sequential execution

---

State Management

Cross-Skill State Sharing

Shared Context Directory: .multi-ai-context/

Standard Files:

```
.multi-ai-context/
├── research-findings.json        # From multi-ai-research
├── plan.json                     # From multi-ai-planning
├── implementation-status.json    # From multi-ai-implementation
├── testing-status.json           # From multi-ai-testing
├── verification-report.json      # From multi-ai-verification
├── pipeline-state.json           # Orchestrator state
└── failure-history.json          # For doom loop detection
```
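
A minimal sketch of how the orchestrator might update the shared state between stages and use `failure-history.json` for doom loop detection (the file shapes beyond those shown in this document are assumptions):

```typescript
import { existsSync, readFileSync, writeFileSync } from 'node:fs';

const STATE = '.multi-ai-context/pipeline-state.json';
const FAILURES = '.multi-ai-context/failure-history.json';

// Mark a stage complete in the shared pipeline state
function completeStage(stage: number, quality: number): void {
  const state = JSON.parse(readFileSync(STATE, 'utf8'));
  const entry = state.stages.find((s: any) => s.stage === stage);
  entry.status = 'complete';
  entry.quality = quality;
  state.current_stage = stage + 1;
  writeFileSync(STATE, JSON.stringify(state, null, 2));
}

// Record a failure; returns true when the doom loop threshold is reached
function recordFailure(stage: number, taskId: string, maxRetries = 3): boolean {
  const history = existsSync(FAILURES)
    ? JSON.parse(readFileSync(FAILURES, 'utf8'))
    : {};
  const key = `${stage}:${taskId}`;
  history[key] = (history[key] ?? 0) + 1;
  writeFileSync(FAILURES, JSON.stringify(history, null, 2));
  return history[key] >= maxRetries; // true → escalate to human
}
```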

Benefits:

  • Skills don't duplicate work
  • Later stages read earlier outputs
  • Failure recovery knows full state
  • Memory can be saved from shared state

---

Progress Tracking

Real-Time Progress:

```json
{
  "pipeline_id": "pipeline_20250126_1200",
  "objective": "Implement user authentication",
  "started_at": "2025-01-26T12:00:00Z",
  "current_stage": 3,
  "stages": [
    {
      "stage": 1,
      "name": "Research",
      "status": "complete",
      "duration_minutes": 52,
      "quality": 97
    },
    {
      "stage": 2,
      "name": "Planning",
      "status": "complete",
      "duration_minutes": 108,
      "quality": 94
    },
    {
      "stage": 3,
      "name": "Implementation",
      "status": "in_progress",
      "started_at": "2025-01-26T13:48:00Z",
      "tasks_total": 23,
      "tasks_complete": 15,
      "tasks_remaining": 8,
      "percent_complete": 65
    },
    {
      "stage": 4,
      "name": "Testing",
      "status": "pending"
    },
    {
      "stage": 5,
      "name": "Verification",
      "status": "pending"
    }
  ],
  "estimated_completion": "2025-01-26T20:00:00Z",
  "quality_target": 90,
  "current_quality_estimate": 92
}
```

Query Progress:

```bash

# Check current status

cat .multi-ai-context/pipeline-state.json | jq '.current_stage, .stages[2].percent_complete'

# Output: Stage 3, 65% complete

```

---

Workflow Modes

Standard Mode (Full Pipeline)

All 5 Stages:

```
Research → Planning → Implementation → Testing → Verification
```

Time: 8-20 hours

Quality: Maximum (all gates, ≥90)

Use For: Production features, complex implementations

---

Fast Mode (Skip Research)

4 Stages (familiar domains):

```
Planning → Implementation → Testing → Verification
```

Time: 6-15 hours

Quality: High (all gates except research)

Use For: Familiar domains, time-sensitive features

---

Quick Mode (Essential Gates Only)

Implementation + Basic Verification:

```
Planning → Implementation → Testing (basic) → Verification (Layers 1-2 only)
```

Time: 3-8 hours

Quality: Good (essential gates only)

Use For: Internal tools, prototypes
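
The three modes differ only in which stages and how many verification layers run. A hypothetical TypeScript shape capturing that difference (the skill's actual invocation interface is not specified here):

```typescript
type Mode = 'standard' | 'fast' | 'quick';

interface ModeConfig {
  stages: number[];           // which of the 5 stages run
  verificationLayers: number; // 5 = all layers, 2 = rules + functional only
}

const MODES: Record<Mode, ModeConfig> = {
  standard: { stages: [1, 2, 3, 4, 5], verificationLayers: 5 },
  fast:     { stages: [2, 3, 4, 5],    verificationLayers: 5 },
  quick:    { stages: [2, 3, 4, 5],    verificationLayers: 2 },
};
```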

---

Best Practices

1. Always Run Planning Stage

Even for "simple" features - planning quality ≥90 prevents issues

2. Use Memory to Skip Research

If similar work done before, recall patterns instead of researching

3. Monitor Progress

Check .multi-ai-context/pipeline-state.json to track progress

4. Trust the Quality Gates

If gate fails, there's a real issue - don't skip fixes

5. Save State Frequently

Each stage completion saves state (enables recovery)

6. Review Final Report

Complete understanding of what was built and quality achieved

---

Integration Points

With All 5 Multi-AI Skills

Coordinates:

  1. multi-ai-research (Stage 1)
  2. multi-ai-planning (Stage 2)
  3. multi-ai-implementation (Stage 3)
  4. multi-ai-testing (Stage 4)
  5. multi-ai-verification (Stage 5)

Provides:

  • Automatic skill invocation
  • Quality gate enforcement
  • Failure recovery
  • State management
  • Progress tracking
  • Final reporting

---

With agent-memory-system

Before Pipeline:

  • Recall similar past work
  • Load learned patterns
  • Skip research if memory sufficient

After Pipeline:

  • Save complete episode to memory
  • Extract learnings
  • Update procedural patterns
  • Improve estimation accuracy

---

With hooks-manager

Session Hooks:

  • SessionStart: Load pipeline state
  • SessionEnd: Save pipeline progress
  • PostToolUse: Track stage completions

Notification Hooks:

  • Send telemetry on stage completions
  • Alert on gate failures
  • Track quality scores

---

Quick Reference

The 5-Stage Pipeline

| Stage | Skill | Time | Quality Gate | Output |
|-------|-------|------|--------------|--------|
| 1 | multi-ai-research | 30-60m | ≥95/100 | Research findings |
| 2 | multi-ai-planning | 1.5-3h | ≥90/100 | Executable plan |
| 3 | multi-ai-implementation | 3-10h | Tests pass, ≥80% cov | Working code |
| 4 | multi-ai-testing | 1-3h | ≥95% cov, quality ≥90 | Verified tests |
| 5 | multi-ai-verification | 1-3h | ≥90/100, all layers | Production ready |

Total: 8-20 hours → Production-ready feature

Workflow Modes

| Mode | Stages | Time | Quality | Use For |
|------|--------|------|---------|---------|
| Standard | All 5 | 8-20h | Maximum | Production features |
| Fast | 2-5 (skip research) | 6-15h | High | Familiar domains |
| Quick | 2,3,4,5 (basic) | 3-8h | Good | Internal tools |

Quality Gates

  • Research: ≥95/100, patterns identified
  • Planning: ≥90/100, all tasks verifiable
  • Implementation: Tests pass, coverage ≥80%
  • Testing: Quality ≥90/100, coverage ≥95%
  • Verification: ≥90/100, all 5 layers pass

---

end-to-end-orchestrator provides complete automation from feature description to production-ready code, coordinating all 5 multi-ai skills with quality gates, failure recovery, and state management - delivering enterprise-grade development workflows in a single command.

For examples, see examples/. For failure recovery, see the Failure Recovery section above.