
# Goal-Seeking Agent Pattern Skill

## 1. What Are Goal-Seeking Agents?

Goal-seeking agents are autonomous AI agents that execute multi-phase objectives by:

  1. Understanding High-Level Goals: Accept natural language objectives without explicit step-by-step instructions
  2. Planning Execution: Break goals into phases with dependencies and success criteria
  3. Autonomous Execution: Make decisions and adapt behavior based on intermediate results
  4. Self-Assessment: Evaluate progress against success criteria and adjust approach
  5. Resilient Operation: Handle failures gracefully and explore alternative solutions

### Core Characteristics

**Autonomy**: Agents decide HOW to achieve goals, not just follow prescriptive steps

**Adaptability**: Adjust strategy based on runtime conditions and intermediate results

**Goal-Oriented**: Focus on outcomes (what to achieve) rather than procedures (how to achieve them)

**Multi-Phase**: Complex objectives decomposed into manageable phases with dependencies

**Self-Monitoring**: Track progress, detect failures, and course-correct autonomously

### Distinction from Traditional Agents

| Traditional Agent | Goal-Seeking Agent |
| ----------------------------- | ----------------------------- |
| Follows fixed workflow | Adapts workflow to context |
| Prescriptive steps | Outcome-oriented objectives |
| Human intervention on failure | Autonomous recovery attempts |
| Single-phase execution | Multi-phase with dependencies |
| Rigid decision tree | Dynamic strategy adjustment |

### When Goal-Seeking Makes Sense

Goal-seeking agents excel when:

  • Problem space is large: Many possible paths to success
  • Context varies: Runtime conditions affect optimal approach
  • Failures are expected: Need autonomous recovery without human intervention
  • Objectives are clear: Success criteria well-defined but path is flexible
  • Multi-step complexity: Requires coordination across phases with dependencies

### When to Avoid Goal-Seeking

Use traditional agents or scripts when:

  • Single deterministic path: Only one way to achieve goal
  • Latency-critical: Need fastest possible execution (no decision overhead)
  • Safety-critical: Human verification required at each step
  • Simple workflow: Complexity of goal-seeking exceeds benefit
  • Audit requirements: Need deterministic, reproducible execution

## 2. When to Use This Pattern

### Problem Indicators

Use goal-seeking agents when you observe these patterns:

#### Pattern 1: Workflow Variability

Indicators:

  • Same objective requires different approaches based on context
  • Manual decisions needed at multiple points
  • "It depends" answers when mapping workflow

Example: Release workflow that varies by:

  • Environment (staging vs production)
  • Change type (hotfix vs feature)
  • Current system state (healthy vs degraded)

Solution: Goal-seeking agent evaluates context and adapts workflow

#### Pattern 2: Multi-Phase Complexity

Indicators:

  • Objective requires 3-5+ distinct phases
  • Phases have dependencies (output of phase N feeds phase N+1)
  • Parallel execution opportunities exist
  • Success criteria differ per phase

Example: Data pipeline with phases:

  1. Data collection (multiple sources, parallel)
  2. Transformation (depends on collection results)
  3. Validation (depends on transformation output)
  4. Publishing (conditional on validation pass)

Solution: Goal-seeking agent orchestrates phases, handles dependencies

#### Pattern 3: Autonomous Recovery Needed

Indicators:

  • Failures are expected and recoverable
  • Multiple retry/fallback strategies exist
  • Human intervention is expensive or slow
  • Can verify success programmatically

Example: CI diagnostic workflow:

  • Test failures (retry with different approach)
  • Environment issues (reconfigure and retry)
  • Dependency conflicts (resolve and rerun)

Solution: Goal-seeking agent tries strategies until success or escalation

#### Pattern 4: Adaptive Decision Making

Indicators:

  • Need to evaluate trade-offs at runtime
  • Multiple valid solutions with different characteristics
  • Optimization objectives (speed vs quality vs cost)
  • Context-dependent best practices

Example: Fix agent pattern matching:

  • QUICK mode for obvious issues
  • DIAGNOSTIC mode for unclear problems
  • COMPREHENSIVE mode for complex solutions

Solution: Goal-seeking agent selects strategy based on problem analysis

#### Pattern 5: Domain Expertise Required

Indicators:

  • Requires specialized knowledge to execute
  • Multiple domain-specific tools/approaches
  • Best practices vary by domain
  • Coordination of specialized sub-agents

Example: AKS SRE automation:

  • Azure-specific operations (ARM, CLI)
  • Kubernetes expertise (kubectl, YAML)
  • Networking knowledge (CNI, ingress)
  • Security practices (RBAC, Key Vault)

Solution: Goal-seeking agent with domain expertise coordinates specialized actions

### Decision Framework

Use this 5-question framework to evaluate goal-seeking applicability:

#### Question 1: Is the objective well-defined but path flexible?

YES if:

  • Clear success criteria exist
  • Multiple valid approaches
  • Runtime context affects optimal path

NO if:

  • Only one correct approach
  • Path is deterministic
  • Success criteria ambiguous

Example YES: "Ensure AKS cluster is production-ready" (many paths, clear criteria)

Example NO: "Run specific kubectl command" (one path, prescriptive)

#### Question 2: Are there multiple phases with dependencies?

YES if:

  • Objective naturally decomposes into 3-5+ phases
  • Phase outputs feed subsequent phases
  • Some phases can execute in parallel
  • Failures in one phase affect downstream phases

NO if:

  • Single-phase execution sufficient
  • No inter-phase dependencies
  • Purely sequential with no branching

Example YES: Data pipeline (collect → transform → validate → publish)

Example NO: Format code with ruff (single atomic operation)

#### Question 3: Is autonomous recovery valuable?

YES if:

  • Failures are common and expected
  • Multiple recovery strategies exist
  • Human intervention is expensive/slow
  • Can verify success automatically

NO if:

  • Failures are rare edge cases
  • Manual investigation always required
  • Safety-critical (human verification needed)
  • Cannot verify success programmatically

Example YES: CI diagnostic workflow (try multiple fix strategies)

Example NO: Deploy to production (human approval required)

#### Question 4: Does context significantly affect approach?

YES if:

  • Environment differences change strategy
  • Current system state affects decisions
  • Trade-offs vary by situation (speed vs quality vs cost)
  • Domain-specific best practices apply

NO if:

  • Same approach works for all contexts
  • No environmental dependencies
  • No trade-off decisions needed

Example YES: Fix agent (quick vs diagnostic vs comprehensive based on issue)

Example NO: Generate UUID (context-independent)

#### Question 5: Is the complexity justified?

YES if:

  • Problem is repeated frequently (2+ times/week)
  • Manual execution takes 30+ minutes
  • High value from automation
  • Maintenance cost is acceptable

NO if:

  • One-off or rare problem
  • Quick manual execution (< 5 minutes)
  • Simple script suffices
  • Maintenance cost exceeds benefit

Example YES: CI failure diagnosis (frequent, time-consuming, high value)

Example NO: One-time data migration (rare, script sufficient)

### Decision Matrix

| Framework Result | Recommendation |
| ----------- | -------------------------------- |
| All 5 YES | Use Goal-Seeking Agent |
| 4 YES, 1 NO | Probably use Goal-Seeking Agent |
| 3 YES, 2 NO | Consider simpler agent or hybrid |
| 2 YES, 3 NO | Traditional agent likely better |
| 0-1 YES | Script or simple automation |
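
This matrix can be applied mechanically: count the YES answers and map the count to a recommendation row. A minimal sketch (the function and answer labels are illustrative, not part of amplihack):

```python
def recommend_approach(answers: dict[str, bool]) -> str:
    """Map the five framework answers (Questions 1-5) to the
    recommendation row in the decision matrix above."""
    yes_count = sum(answers.values())
    recommendations = {
        5: "Use goal-seeking agent",
        4: "Probably use goal-seeking agent",
        3: "Consider simpler agent or hybrid",
        2: "Traditional agent likely better",
    }
    return recommendations.get(yes_count, "Script or simple automation")


# Example: Question 4 answered NO, everything else YES
print(recommend_approach({
    "path_flexible": True, "multi_phase": True, "autonomous_recovery": True,
    "context_sensitive": False, "complexity_justified": True,
}))  # -> "Probably use goal-seeking agent"
```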

## 3. Architecture Pattern

### Component Architecture

Goal-seeking agents have four core components:

```python
import uuid
from dataclasses import dataclass
from typing import Any


# Component 1: Goal Definition
@dataclass
class GoalDefinition:
    """Structured representation of objective"""
    raw_prompt: str              # Natural language goal
    goal: str                    # Extracted primary objective
    domain: str                  # Problem domain (security, data, automation, etc.)
    constraints: list[str]       # Technical/operational constraints
    success_criteria: list[str]  # How to verify success
    complexity: str              # simple, moderate, complex
    context: dict[str, Any]      # Additional metadata


# Component 2: Execution Plan
@dataclass
class ExecutionPlan:
    """Multi-phase plan with dependencies"""
    goal_id: uuid.UUID
    phases: list["PlanPhase"]
    total_estimated_duration: str
    required_skills: list[str]
    parallel_opportunities: list[list[str]]  # Phases that can run in parallel
    risk_factors: list[str]


# Component 3: Plan Phase
@dataclass
class PlanPhase:
    """Individual phase in execution plan"""
    name: str
    description: str
    required_capabilities: list[str]
    estimated_duration: str
    dependencies: list[str]        # Names of prerequisite phases
    parallel_safe: bool            # Can execute in parallel
    success_indicators: list[str]  # How to verify phase completion


# Component 4: Skill Definition
@dataclass
class SkillDefinition:
    """Capability needed for execution"""
    name: str
    description: str
    capabilities: list[str]
    implementation_type: str  # "native" or "delegated"
    delegation_target: str    # Agent to delegate to
```

### Execution Flow

```
┌──────────────────────────────────────────────────────────┐
│ 1. GOAL ANALYSIS                                         │
│    Input:   Natural language objective                   │
│    Process: Extract goal, domain, constraints, criteria  │
│    Output:  GoalDefinition                               │
│    [PromptAnalyzer.analyze_text(prompt)]                 │
└──────────────────────────────────────────────────────────┘
                             ↓
┌──────────────────────────────────────────────────────────┐
│ 2. PLANNING                                              │
│    Input:   GoalDefinition                               │
│    Process: Decompose into phases, identify dependencies │
│    Output:  ExecutionPlan                                │
│    [ObjectivePlanner.generate_plan(goal_definition)]     │
└──────────────────────────────────────────────────────────┘
                             ↓
┌──────────────────────────────────────────────────────────┐
│ 3. SKILL SYNTHESIS                                       │
│    Input:   ExecutionPlan                                │
│    Process: Map capabilities to skills, identify agents  │
│    Output:  list[SkillDefinition]                        │
│    [SkillSynthesizer.synthesize(execution_plan)]         │
└──────────────────────────────────────────────────────────┘
                             ↓
┌──────────────────────────────────────────────────────────┐
│ 4. AGENT ASSEMBLY                                        │
│    Input:   GoalDefinition, ExecutionPlan, Skills        │
│    Process: Combine into executable bundle               │
│    Output:  GoalAgentBundle                              │
│    [AgentAssembler.assemble(goal, plan, skills)]         │
└──────────────────────────────────────────────────────────┘
                             ↓
┌──────────────────────────────────────────────────────────┐
│ 5. EXECUTION (Auto-Mode)                                 │
│    Input:   GoalAgentBundle                              │
│    Process: Execute phases, monitor progress, adapt      │
│    Output:  Success or escalation                        │
│    [Auto-mode with initial_prompt from bundle]           │
└──────────────────────────────────────────────────────────┘
```

### Phase Dependency Management

Phases can have three relationship types:

Sequential Dependency: Phase B depends on Phase A completion

```
Phase A → Phase B → Phase C
```

Parallel Execution: Phases can run concurrently

```
Phase A ──┬─→ Phase B ──┬─→ Phase D
          └─→ Phase C ──┘
```

Conditional Branching: Phase selection based on results

```
Phase A → [Decision] ─┬─→ Phase B (success path)
                      └─→ Phase C (recovery path)
```
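
Given the `dependencies` and `parallel_safe` fields on `PlanPhase` (see the component definitions above), all three relationship types reduce to a layering problem: run every phase whose dependencies are already satisfied, then repeat. A minimal sketch, assuming phases are identified by name:

```python
def execution_waves(phases: list["PlanPhase"]) -> list[list[str]]:
    """Group phases into waves: each wave contains phases whose
    dependencies were all completed in earlier waves, so phases
    within a wave may run concurrently (if parallel_safe)."""
    remaining = {p.name: set(p.dependencies) for p in phases}
    waves: list[list[str]] = []
    done: set[str] = set()
    while remaining:
        ready = [name for name, deps in remaining.items() if deps <= done]
        if not ready:
            raise ValueError("Cyclic or unsatisfiable phase dependencies")
        waves.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return waves
```

Conditional branching is then a runtime concern: after a wave completes, the agent inspects its results and decides which of the remaining phases still apply.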

### State Management

Goal-seeking agents maintain state across phases:

```python
from datetime import timedelta
from typing import Any


class AgentState:
    """Runtime state for goal-seeking agent"""
    current_phase: str
    completed_phases: list[str]
    phase_results: dict[str, Any]    # Output from each phase
    failures: list["FailureRecord"]  # Track what didn't work
    retry_count: int
    total_duration: timedelta
    context: dict                    # Shared context across phases
```

### Error Handling

Three error recovery strategies:

Retry with Backoff: Same approach, exponential delay

```python
from time import sleep

for attempt in range(MAX_RETRIES):
    try:
        result = execute_phase(phase)
        break
    except RetryableError:
        # Exponential backoff: INITIAL_DELAY * 1, 2, 4, ...
        wait_time = INITIAL_DELAY * (2 ** attempt)
        sleep(wait_time)
```

Alternative Strategy: Different approach to same goal

```python
for strategy in STRATEGIES:
    try:
        result = execute_phase(phase, strategy)
        break
    except StrategyFailedError:
        continue  # Try next strategy
else:
    escalate_to_human("All strategies exhausted")
```

Graceful Degradation: Accept partial success

```python
try:
    result = execute_phase_optimal(phase)
except OptimalFailedError:
    result = execute_phase_fallback(phase)  # Lower quality but works
```

## 4. Integration with goal_agent_generator

The `goal_agent_generator` module provides the implementation for goal-seeking agents. Here's how to integrate:

### Core API

```python
from amplihack.goal_agent_generator import (
    PromptAnalyzer,
    ObjectivePlanner,
    SkillSynthesizer,
    AgentAssembler,
    GoalAgentPackager,
)

# Step 1: Analyze natural language goal
analyzer = PromptAnalyzer()
goal_definition = analyzer.analyze_text("""
Automate AKS cluster production readiness verification.
Check security, networking, monitoring, and compliance.
Generate report with actionable recommendations.
""")

# Step 2: Generate execution plan
planner = ObjectivePlanner()
execution_plan = planner.generate_plan(goal_definition)

# Step 3: Synthesize required skills
synthesizer = SkillSynthesizer()
skills = synthesizer.synthesize(execution_plan)

# Step 4: Assemble complete agent
assembler = AgentAssembler()
agent_bundle = assembler.assemble(
    goal_definition=goal_definition,
    execution_plan=execution_plan,
    skills=skills,
    bundle_name="aks-readiness-checker",
)

# Step 5: Package for deployment
packager = GoalAgentPackager()
packager.package(
    bundle=agent_bundle,
    output_dir=".claude/agents/goal-driven/aks-readiness-checker",
)
```

### CLI Integration

```bash
# Generate agent from prompt file
amplihack goal-agent-generator create \
  --prompt ./prompts/aks-readiness.md \
  --output .claude/agents/goal-driven/aks-readiness-checker

# Generate agent from inline prompt
amplihack goal-agent-generator create \
  --inline "Automate CI failure diagnosis and fix iteration" \
  --output .claude/agents/goal-driven/ci-fixer

# List generated agents
amplihack goal-agent-generator list

# Test agent execution
amplihack goal-agent-generator test \
  --agent-path .claude/agents/goal-driven/ci-fixer \
  --dry-run
```

### PromptAnalyzer Details

Extracts structured information from natural language:

```python
from pathlib import Path

from amplihack.goal_agent_generator import PromptAnalyzer

analyzer = PromptAnalyzer()

# From file
goal_def = analyzer.analyze(Path("./prompts/my-goal.md"))

# From text
goal_def = analyzer.analyze_text("Deploy and monitor microservices to AKS")

# GoalDefinition contains:
print(goal_def.goal)              # "Deploy and monitor microservices to AKS"
print(goal_def.domain)            # "deployment"
print(goal_def.constraints)       # ["Zero downtime", "Rollback capability"]
print(goal_def.success_criteria)  # ["All pods running", "Metrics visible"]
print(goal_def.complexity)        # "moderate"
print(goal_def.context)           # {"priority": "high", "scale": "medium"}
```

Domain classification:

  • data-processing: Data transformation, analysis, ETL
  • security-analysis: Vulnerability scanning, audits
  • automation: Workflow automation, scheduling
  • testing: Test generation, validation
  • deployment: Release, publishing, distribution
  • monitoring: Observability, alerting
  • integration: API connections, webhooks
  • reporting: Dashboards, metrics, summaries

Complexity determination:

  • simple: Single-phase, < 50 words, basic operations
  • moderate: 2-4 phases, 50-150 words, some coordination
  • complex: 5+ phases, > 150 words, sophisticated orchestration
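
A plausible sketch of how these thresholds combine (the actual `PromptAnalyzer` heuristics may differ):

```python
def estimate_complexity(prompt: str, phase_count: int) -> str:
    """Classify goal complexity from prompt length and expected
    phase count, using the thresholds listed above."""
    words = len(prompt.split())
    if phase_count >= 5 or words > 150:
        return "complex"
    if phase_count >= 2 or words >= 50:
        return "moderate"
    return "simple"
```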

### ObjectivePlanner Details

Generates multi-phase execution plans:

```python
from amplihack.goal_agent_generator import ObjectivePlanner

planner = ObjectivePlanner()
plan = planner.generate_plan(goal_definition)

# ExecutionPlan contains:
for i, phase in enumerate(plan.phases, 1):
    print(f"Phase {i}: {phase.name}")
    print(f"  Description: {phase.description}")
    print(f"  Duration: {phase.estimated_duration}")
    print(f"  Capabilities: {', '.join(phase.required_capabilities)}")
    print(f"  Dependencies: {', '.join(phase.dependencies)}")
    print(f"  Parallel Safe: {phase.parallel_safe}")
    print(f"  Success Indicators: {phase.success_indicators}")

print(f"\nTotal Duration: {plan.total_estimated_duration}")
print(f"Required Skills: {', '.join(plan.required_skills)}")
print(f"Parallel Opportunities: {plan.parallel_opportunities}")
print(f"Risk Factors: {plan.risk_factors}")
```

Phase templates by domain:

  • data-processing: Collection → Transformation → Analysis → Reporting
  • security-analysis: Reconnaissance → Vulnerability Detection → Risk Assessment → Reporting
  • automation: Setup → Workflow Design → Execution → Validation
  • testing: Test Planning → Implementation → Execution → Results Analysis
  • deployment: Pre-deployment → Deployment → Verification → Post-deployment
  • monitoring: Setup Monitors → Data Collection → Analysis → Alerting
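
One way to encode these templates is a simple domain-to-phases table (illustrative; the module's internal representation may differ):

```python
PHASE_TEMPLATES: dict[str, list[str]] = {
    "data-processing": ["Collection", "Transformation", "Analysis", "Reporting"],
    "security-analysis": ["Reconnaissance", "Vulnerability Detection",
                          "Risk Assessment", "Reporting"],
    "automation": ["Setup", "Workflow Design", "Execution", "Validation"],
    "testing": ["Test Planning", "Implementation", "Execution", "Results Analysis"],
    "deployment": ["Pre-deployment", "Deployment", "Verification", "Post-deployment"],
    "monitoring": ["Setup Monitors", "Data Collection", "Analysis", "Alerting"],
}


def template_for(domain: str) -> list[str]:
    # Fallback for unlisted domains is an assumption, not documented behavior.
    return PHASE_TEMPLATES.get(domain, ["Setup", "Execution", "Validation"])
```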

### SkillSynthesizer Details

Maps capabilities to skills:

```python
from amplihack.goal_agent_generator import SkillSynthesizer

synthesizer = SkillSynthesizer()
skills = synthesizer.synthesize(execution_plan)

# list[SkillDefinition]
for skill in skills:
    print(f"Skill: {skill.name}")
    print(f"  Description: {skill.description}")
    print(f"  Capabilities: {', '.join(skill.capabilities)}")
    print(f"  Type: {skill.implementation_type}")
    if skill.implementation_type == "delegated":
        print(f"  Delegates to: {skill.delegation_target}")
```

Capability mapping:

  • `data-*` → data-processor skill
  • `security-*`, `vulnerability-*` → security-analyzer skill
  • `test-*` → tester skill
  • `deploy-*` → deployer skill
  • `monitor-*`, `alert-*` → monitor skill
  • `report-*`, `document-*` → documenter skill
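
These wildcard rules amount to prefix matching on capability names. A minimal sketch (rule table and helper are illustrative):

```python
CAPABILITY_RULES: list[tuple[tuple[str, ...], str]] = [
    (("data-",), "data-processor"),
    (("security-", "vulnerability-"), "security-analyzer"),
    (("test-",), "tester"),
    (("deploy-",), "deployer"),
    (("monitor-", "alert-"), "monitor"),
    (("report-", "document-"), "documenter"),
]


def skill_for(capability: str) -> str | None:
    """Return the skill that covers a capability, or None if no
    rule matches (a gap the synthesizer should flag)."""
    for prefixes, skill in CAPABILITY_RULES:
        if capability.startswith(prefixes):  # str.startswith accepts a tuple
            return skill
    return None
```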

### AgentAssembler Details

Combines components into executable bundle:

```python
from amplihack.goal_agent_generator import AgentAssembler

assembler = AgentAssembler()
bundle = assembler.assemble(
    goal_definition=goal_definition,
    execution_plan=execution_plan,
    skills=skills,
    bundle_name="custom-agent",  # Optional, auto-generated if omitted
)

# GoalAgentBundle contains:
print(bundle.id)                # UUID
print(bundle.name)              # "custom-agent" or auto-generated
print(bundle.version)           # "1.0.0"
print(bundle.status)            # "ready"
print(bundle.auto_mode_config)  # Configuration for auto-mode execution
print(bundle.metadata)          # Domain, complexity, skills, etc.

# Auto-mode configuration
config = bundle.auto_mode_config
print(config["max_turns"])         # Based on complexity
print(config["initial_prompt"])    # Generated execution prompt
print(config["success_criteria"])  # From goal definition
print(config["constraints"])       # From goal definition
```

Auto-mode configuration:

  • max_turns: 5 (simple), 10 (moderate), 15 (complex), +20% per extra phase
  • initial_prompt: Full markdown prompt with goal, plan, success criteria
  • working_dir: Current directory
  • sdk: "claude" (default)
  • ui_mode: False (headless by default)
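
The turn budget can be computed directly from those numbers. A sketch; the 3-phase cutoff for "extra" phases is inferred from the Section 10 example (moderate complexity, 4 phases → 12 turns), not stated explicitly:

```python
def compute_max_turns(complexity: str, phase_count: int) -> int:
    """Base turn budget per complexity level, plus 20% per phase
    beyond three (cutoff inferred, see lead-in)."""
    base = {"simple": 5, "moderate": 10, "complex": 15}[complexity]
    extra_phases = max(0, phase_count - 3)
    return round(base * (1 + 0.2 * extra_phases))


print(compute_max_turns("moderate", 4))  # -> 12, matching Section 10
```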

### GoalAgentPackager Details

Packages bundle for deployment:

```python
from pathlib import Path

from amplihack.goal_agent_generator import GoalAgentPackager

packager = GoalAgentPackager()
packager.package(
    bundle=agent_bundle,
    output_dir=Path(".claude/agents/goal-driven/my-agent"),
)

# Creates:
# .claude/agents/goal-driven/my-agent/
# ├── agent.md       # Agent definition
# ├── prompt.md      # Initial prompt
# ├── metadata.json  # Bundle metadata
# ├── plan.yaml      # Execution plan
# └── skills.yaml    # Required skills
```

## 5. Recent Amplihack Examples

Real goal-seeking agents from the amplihack project:

### Example 1: AKS SRE Automation (Issue #1293)

Problem: Manual AKS cluster operations are time-consuming and error-prone

Goal-Seeking Solution:

```python
# Goal: Automate AKS production readiness verification
goal = """
Verify AKS cluster production readiness:
  - Security: RBAC, network policies, Key Vault integration
  - Networking: Ingress, DNS, load balancers
  - Monitoring: Container Insights, alerts, dashboards
  - Compliance: Azure Policy, resource quotas

Generate actionable report with recommendations.
"""

# Agent decomposes into phases:
# 1. Security Audit (parallel): RBAC check, network policies, Key Vault
# 2. Networking Validation (parallel): Ingress test, DNS resolution, LB health
# 3. Monitoring Verification (parallel): Metrics, logs, alerts configured
# 4. Compliance Check (depends on 1-3): Azure Policy, quotas, best practices
# 5. Report Generation (depends on 4): Markdown report with findings

# Agent adapts based on findings:
# - If security issues found: Suggest fixes, offer to apply
# - If monitoring missing: Generate alert templates
# - If compliance violations: List remediation steps
```

Key Characteristics:

  • Autonomous: Checks multiple systems without step-by-step instructions
  • Adaptive: Investigation depth varies by findings
  • Multi-Phase: Parallel security/networking/monitoring, sequential reporting
  • Domain Expert: Azure + Kubernetes knowledge embedded
  • Self-Assessing: Validates each check, aggregates results

Implementation:

```python
# Located in: .claude/agents/amplihack/specialized/azure-kubernetes-expert.md
# Uses knowledge base: .claude/data/azure_aks_expert/

# Integrates with goal_agent_generator:
from amplihack.goal_agent_generator import (
    PromptAnalyzer, ObjectivePlanner, AgentAssembler
)

analyzer = PromptAnalyzer()
goal_def = analyzer.analyze_text(goal)

planner = ObjectivePlanner()
plan = planner.generate_plan(goal_def)  # Generates 5-phase plan

# Domain-specific customization:
plan.phases[0].required_capabilities = [
    "rbac-audit", "network-policy-check", "key-vault-integration"
]
```

Lessons Learned:

  • Domain expertise critical for complex infrastructure
  • Parallel execution significantly reduces total time
  • Actionable recommendations increase agent value
  • Comprehensive knowledge base (Q&A format) enables autonomous decisions

### Example 2: CI Diagnostic Workflow

Problem: CI failures require manual diagnosis and fix iteration

Goal-Seeking Solution:

```python
# Goal: Diagnose CI failure and iterate fixes until success
goal = """
CI pipeline failing after push.
Diagnose failures, apply fixes, push updates, monitor CI.
Iterate until all checks pass.
Stop at mergeable state without auto-merging.
"""

# Agent decomposes into phases:
# 1. CI Status Monitoring: Check current CI state
# 2. Failure Diagnosis: Analyze logs, compare environments
# 3. Fix Application: Apply fixes based on failure patterns
# 4. Push and Wait: Commit fixes, push, wait for CI re-run
# 5. Success Verification: Confirm all checks pass

# Iterative loop:
# Phases 2-4 repeat until success or max iterations (5)
```

Key Characteristics:

  • Iterative: Repeats fix cycle until success
  • Autonomous Recovery: Tries multiple fix strategies
  • State Management: Tracks attempted fixes, avoids repeating failures
  • Pattern Matching: Recognizes common CI failure types
  • Escalation: Reports to user after max iterations

Implementation:

```python
# Located in: .claude/agents/amplihack/specialized/ci-diagnostic-workflow.md

# Fix iteration loop:
MAX_ITERATIONS = 5
iteration = 0

while iteration < MAX_ITERATIONS:
    status = check_ci_status()
    if status["conclusion"] == "success":
        break

    # Diagnose failures
    failures = analyze_ci_logs(status)

    # Apply pattern-matched fixes
    for failure in failures:
        if "test" in failure["type"]:
            fix_test_failure(failure)
        elif "lint" in failure["type"]:
            fix_lint_failure(failure)
        elif "type" in failure["type"]:
            fix_type_failure(failure)

    # Commit and push
    git_commit_and_push(f"fix: CI iteration {iteration + 1}")

    # Wait for CI re-run
    wait_for_ci_completion()

    iteration += 1

if iteration >= MAX_ITERATIONS:
    escalate_to_user("CI still failing after 5 iterations")
```

Lessons Learned:

  • Iteration limits prevent infinite loops
  • Pattern matching (test/lint/type) enables targeted fixes
  • Smart waiting (exponential backoff) reduces wait time
  • Never auto-merge: human approval always required
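
The "smart waiting" lesson can be sketched as polling with exponential backoff. Everything here is illustrative: `gh run list` is one plausible way to read CI status, and the timing constants are assumptions:

```python
import json
import subprocess
import time


def wait_for_ci_completion(max_wait_seconds: int = 1800) -> dict:
    """Poll the latest CI run with exponential backoff until it
    finishes, capping the delay so polling stays responsive."""
    delay, waited = 30, 0
    while waited < max_wait_seconds:
        proc = subprocess.run(
            ["gh", "run", "list", "--limit", "1", "--json", "status,conclusion"],
            capture_output=True, text=True, check=True,
        )
        run = json.loads(proc.stdout)[0]
        if run["status"] == "completed":
            return run
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, 300)  # Back off: 30s, 60s, ..., cap at 5 min
    raise TimeoutError("CI did not complete within the wait budget")
```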

### Example 3: Pre-Commit Diagnostic Workflow

Problem: Pre-commit hooks fail with unclear errors

Goal-Seeking Solution:

```python
# Goal: Fix pre-commit hook failures before commit
goal = """
Pre-commit hooks failing.
Diagnose issues (formatting, linting, type checking).
Apply fixes locally, re-run hooks.
Ensure all hooks pass before allowing commit.
"""

# Agent decomposes into phases:
# 1. Hook Failure Analysis: Identify which hooks failed
# 2. Environment Check: Compare local vs pre-commit versions
# 3. Targeted Fixes: Apply fixes per hook type
# 4. Hook Re-run: Validate fixes, iterate if needed
# 5. Commit Readiness: Confirm all hooks pass
```

Key Characteristics:

  • Pre-Push Focus: Fixes issues before pushing to CI
  • Tool Version Management: Ensures local matches pre-commit config
  • Hook-Specific Fixes: Tailored approach per hook type
  • Fast Iteration: No wait for CI, immediate feedback

Implementation:

```python
import subprocess

# Located in: .claude/agents/amplihack/specialized/pre-commit-diagnostic.md

# Hook failure patterns:
HOOK_FIXES = {
    "ruff": lambda: subprocess.run(["ruff", "check", "--fix", "."]),
    "black": lambda: subprocess.run(["black", "."]),
    "mypy": lambda: add_type_ignores(),
    "trailing-whitespace": lambda: subprocess.run(
        ["pre-commit", "run", "trailing-whitespace", "--all-files"]
    ),
}

# Execution:
failed_hooks = detect_failed_hooks()
for hook in failed_hooks:
    if hook in HOOK_FIXES:
        HOOK_FIXES[hook]()
    else:
        generic_fix(hook)

# Re-run to verify
rerun_result = subprocess.run(["pre-commit", "run", "--all-files"])
if rerun_result.returncode == 0:
    print("All hooks passing, ready to commit!")
```

Lessons Learned:

  • Pre-commit fixes are faster than CI iteration
  • Tool version mismatches are a common culprit
  • Automated fixes for 80% of cases
  • Remaining 20% escalate with clear diagnostics

### Example 4: Fix-Agent Pattern Matching

Problem: Different issues require different fix approaches

Goal-Seeking Solution:

```python
# Goal: Select optimal fix strategy based on problem context
goal = """
Analyze issue and select fix mode:
  - QUICK: Obvious fixes (< 5 min)
  - DIAGNOSTIC: Unclear root cause (investigation)
  - COMPREHENSIVE: Complex issues (full workflow)
"""

# Agent decomposes into phases:
# 1. Issue Analysis: Classify problem type and complexity
# 2. Mode Selection: Choose QUICK/DIAGNOSTIC/COMPREHENSIVE
# 3. Fix Execution: Apply mode-appropriate strategy
# 4. Validation: Verify fix resolves issue
```

Key Characteristics:

  • Context-Aware: Selects strategy based on problem analysis
  • Multi-Mode: Three fix modes for different complexity levels
  • Pattern Recognition: Learns from past fixes
  • Adaptive: Escalates complexity if initial mode fails

Implementation:

```python
# Located in: .claude/agents/amplihack/specialized/fix-agent.md

# Mode selection logic:
def select_fix_mode(issue: Issue) -> FixMode:
    if issue.is_obvious() and issue.scope == "single-file":
        return FixMode.QUICK
    elif issue.root_cause_unclear():
        return FixMode.DIAGNOSTIC
    elif issue.is_complex() or issue.requires_architecture_change():
        return FixMode.COMPREHENSIVE
    else:
        return FixMode.DIAGNOSTIC  # Default to investigation

# Pattern frequency (from real usage):
FIX_PATTERNS = {
    "import": 0.15,   # Import errors (15%)
    "config": 0.12,   # Configuration issues (12%)
    "test": 0.18,     # Test failures (18%)
    "ci": 0.20,       # CI/CD problems (20%)
    "quality": 0.25,  # Code quality (linting, types) (25%)
    "logic": 0.10,    # Logic errors (10%)
}

# Template-based fixes for common patterns:
if issue.pattern == "import":
    apply_template("import-fix-template", issue)
elif issue.pattern == "config":
    apply_template("config-fix-template", issue)
# ... etc
```

Lessons Learned:

  • Pattern matching enables template-based fixes (80% coverage)
  • Mode selection reduces over-engineering (right-sized approach)
  • Diagnostic mode critical for unclear issues (root cause analysis)
  • Usage data informs template priorities

## 6. Design Checklist

Use this checklist when designing goal-seeking agents:

### Goal Definition

  • [ ] Objective is clear and well-defined
  • [ ] Success criteria are measurable and verifiable
  • [ ] Constraints are explicit (time, resources, safety)
  • [ ] Domain is identified (impacts phase templates)
  • [ ] Complexity is estimated (simple/moderate/complex)

### Phase Design

  • [ ] Decomposed into 3-5 phases (not too granular, not too coarse)
  • [ ] Phase dependencies are explicit
  • [ ] Parallel execution opportunities identified
  • [ ] Each phase has clear success indicators
  • [ ] Phase durations are estimated

### Skill Mapping

  • [ ] Required capabilities identified per phase
  • [ ] Skills mapped to existing agents or tools
  • [ ] Delegation targets specified
  • [ ] No missing capabilities

### Error Handling

  • [ ] Retry strategies defined (max attempts, backoff)
  • [ ] Alternative strategies identified
  • [ ] Escalation criteria clear (when to ask for help)
  • [ ] Graceful degradation options (fallback approaches)

### State Management

  • [ ] State tracked across phases
  • [ ] Phase results stored for downstream use
  • [ ] Failure history maintained
  • [ ] Context shared appropriately

### Testing

  • [ ] Success scenarios tested
  • [ ] Failure recovery tested
  • [ ] Edge cases identified
  • [ ] Performance validated (duration, resource usage)

### Documentation

  • [ ] Goal clearly documented
  • [ ] Phase descriptions complete
  • [ ] Usage examples provided
  • [ ] Integration points specified

### Philosophy Compliance

  • [ ] Ruthless simplicity (no unnecessary complexity)
  • [ ] Single responsibility per phase
  • [ ] No over-engineering (right-sized solution)
  • [ ] Regeneratable (clear specifications)

## 7. Agent SDK Integration (Future)

When the Agent SDK Skill is integrated, goal-seeking agents can leverage:

### Enhanced Autonomy

```python
# Agent SDK provides enhanced context management
from claude_agent_sdk import AgentContext, Tool


class GoalSeekingAgent:
    def __init__(self, context: AgentContext):
        self.context = context
        self.state = {}

    async def execute_phase(self, phase: PlanPhase):
        # SDK provides tools, memory, delegation
        tools = self.context.get_tools(phase.required_capabilities)
        memory = self.context.get_memory()

        # Execute with SDK support
        result = await phase.execute(tools, memory)

        # Store in context for downstream phases
        self.context.store_result(phase.name, result)
```

### Tool Discovery

```python
# SDK enables dynamic tool discovery
available_tools = context.discover_tools(capability="data-processing")

# Select optimal tool for task
tool = context.select_tool(
    capability="data-transformation",
    criteria={"performance": "high", "accuracy": "required"},
)
```

### Memory Management

```python
# SDK provides persistent memory across sessions
context.memory.store("deployment-history", deployment_record)
previous = context.memory.retrieve("deployment-history")

# Enables learning from past executions
if previous and previous.failed:
    # Avoid previous failure strategy
    strategy = select_alternative_strategy(previous.failure_reason)
```

### Agent Delegation

```python
# SDK simplifies agent-to-agent delegation
result = await context.delegate(
    agent="security-analyzer",
    task="audit-rbac-policies",
    input={"cluster": cluster_name},
)

# Parallel delegation
results = await context.delegate_parallel([
    ("security-analyzer", "audit-rbac-policies"),
    ("network-analyzer", "validate-ingress"),
    ("monitoring-validator", "check-metrics"),
])
```

### Observability

```python
# SDK provides built-in tracing and metrics
with context.trace("data-transformation"):
    result = transform_data(input_data)

context.metrics.record("transformation-duration", duration)
context.metrics.record("transformation-accuracy", accuracy)
```

### Integration Example

```python
from claude_agent_sdk import AgentContext, create_agent
from amplihack.goal_agent_generator import GoalAgentBundle


# Create SDK-enabled goal-seeking agent
def create_goal_agent(bundle: GoalAgentBundle) -> Agent:
    context = AgentContext(
        name=bundle.name,
        version=bundle.version,
        capabilities=bundle.metadata["required_capabilities"],
    )

    # Register phases as agent tasks
    for phase in bundle.execution_plan.phases:
        context.register_task(
            name=phase.name,
            capabilities=phase.required_capabilities,
            executor=create_phase_executor(phase),
        )

    # Create agent with SDK
    agent = create_agent(context)
    return agent


# Usage: execute the goal
agent = create_goal_agent(agent_bundle)
result = await agent.execute(bundle.auto_mode_config["initial_prompt"])
```

## 8. Trade-Off Analysis

### Goal-Seeking vs Traditional Agents

| Dimension | Goal-Seeking Agent | Traditional Agent |
| -------------------- | ------------------------------------- | ------------------------- |
| Flexibility | High - adapts to context | Low - fixed workflow |
| Development Time | Moderate - define goals & phases | Low - script steps |
| Execution Time | Higher - decision overhead | Lower - direct execution |
| Maintenance | Lower - self-adapting | Higher - manual updates |
| Debuggability | Harder - dynamic behavior | Easier - predictable flow |
| Reusability | High - same agent, different contexts | Low - context-specific |
| Failure Handling | Autonomous recovery | Manual intervention |
| Complexity | Higher - multi-phase coordination | Lower - linear execution |

### When to Choose Each

Choose Goal-Seeking when:

  • Problem space is large with many valid approaches
  • Context varies significantly across executions
  • Autonomous recovery is valuable
  • Reusability across contexts is important
  • Development time investment is justified

Choose Traditional when:

  • Single deterministic path exists
  • Performance is critical (low latency required)
  • Simplicity is paramount
  • One-off or rare execution
  • Debugging and auditability are critical

### Cost-Benefit Analysis

Goal-Seeking Costs:

  • Higher development time (define goals, phases, capabilities)
  • Increased execution time (decision overhead)
  • More complex testing (dynamic behavior)
  • Harder debugging (non-deterministic paths)

Goal-Seeking Benefits:

  • Autonomous operation (less human intervention)
  • Adaptive to context (works in varied conditions)
  • Reusable across problems (same agent, different goals)
  • Self-recovering (handles failures gracefully)

Break-Even Point: Goal-seeking justified when problem is:

  • Repeated 2+ times per week, OR
  • Takes 30+ minutes manual execution, OR
  • Requires expert knowledge hard to document, OR
  • High value from autonomous recovery

## 9. When to Escalate

Goal-seeking agents should escalate to humans when:

### Hard Limits Reached

Max Iterations Exceeded:

```python
if iteration_count >= MAX_ITERATIONS:
    escalate(
        reason="Reached maximum iterations without success",
        context={
            "iterations": iteration_count,
            "attempted_strategies": attempted_strategies,
            "last_error": last_error,
        },
    )
```

Timeout Exceeded:

```python
if elapsed_time > MAX_DURATION:
    escalate(
        reason="Execution time exceeded limit",
        context={
            "elapsed": elapsed_time,
            "max_allowed": MAX_DURATION,
            "completed_phases": completed_phases,
        },
    )
```

### Safety Boundaries

Destructive Operations:

```python
if operation.is_destructive() and not operation.has_approval():
    escalate(
        reason="Destructive operation requires human approval",
        operation=operation.description,
        impact=operation.estimate_impact(),
    )
```

Production Changes:

```python
if target_environment == "production":
    escalate(
        reason="Production deployments require human verification",
        changes=proposed_changes,
        rollback_plan=rollback_strategy,
    )
```

### Uncertainty Detection

Low Confidence:

```python
if decision_confidence < CONFIDENCE_THRESHOLD:
    escalate(
        reason="Confidence below threshold for autonomous decision",
        decision=decision_description,
        confidence=decision_confidence,
        alternatives=alternative_options,
    )
```

Conflicting Strategies:

```python
if len(viable_strategies) > 1 and not clear_winner:
    escalate(
        reason="Multiple viable strategies, need human judgment",
        strategies=viable_strategies,
        trade_offs=strategy_trade_offs,
    )
```

### Unexpected Conditions

Unrecognized Errors:

```python
if error_type not in KNOWN_ERROR_PATTERNS:
    escalate(
        reason="Encountered unknown error pattern",
        error=error_details,
        context=execution_context,
        recommendation="Manual investigation required",
    )
```

Environment Mismatch:

```python
if detected_environment != expected_environment:
    escalate(
        reason="Environment mismatch detected",
        expected=expected_environment,
        detected=detected_environment,
        risk="Potential for incorrect behavior",
    )
```

### Escalation Best Practices

Provide Context:

  • What was attempted
  • What failed and why
  • What alternatives were considered
  • Current system state

Suggest Actions:

  • Recommend next steps
  • Provide diagnostic commands
  • Offer manual intervention points
  • Suggest rollback if needed

Enable Recovery:

  • Save execution state
  • Document failures
  • Provide resume capability
  • Offer manual override

Example Escalation:

```python
escalate(
    reason="CI failure diagnosis unsuccessful after 5 iterations",
    context={
        "iterations": 5,
        "attempted_fixes": [
            "Import path corrections (iteration 1)",
            "Type annotation fixes (iteration 2)",
            "Test environment setup (iteration 3)",
            "Dependency version pins (iteration 4)",
            "Mock configuration (iteration 5)",
        ],
        "persistent_failures": [
            "test_integration.py::test_api_connection - Timeout",
            "test_models.py::test_validation - Assertion error",
        ],
        "system_state": "2 of 25 tests still failing",
        "ci_logs": "https://github.com/.../actions/runs/123456",
    },
    recommendations=[
        "Review test_api_connection timeout - may need increased timeout or mock",
        "Examine test_validation assertion - data structure may have changed",
        "Consider running tests locally with same environment as CI",
        "Check if recent changes affected integration test setup",
    ],
    next_steps={
        "manual_investigation": "Run failing tests locally with verbose output",
        "rollback_option": "git revert HEAD~5 if fixes made things worse",
        "resume_point": "Fix failures and run /amplihack:ci-diagnostic to resume",
    },
)
```

## 10. Example Workflow

Complete example: Building a goal-seeking agent for data pipeline automation

### Step 1: Define Goal

```markdown
# Goal: Automate Multi-Source Data Pipeline

## Objective

Collect data from multiple sources (S3, database, API), transform to common schema, validate quality, publish to data warehouse.

## Success Criteria

- All sources successfully ingested
- Data transformed to target schema
- Quality checks pass (completeness, accuracy)
- Data published to warehouse
- Pipeline completes within 30 minutes

## Constraints

- Must handle source unavailability gracefully
- No data loss (failed records logged)
- Idempotent (safe to re-run)
- Resource limits: 8GB RAM, 4 CPU cores

## Context

- Daily execution (automated schedule)
- Priority: High (blocking downstream analytics)
- Scale: Medium (100K-1M records per source)
```

### Step 2: Analyze with PromptAnalyzer

```python
from amplihack.goal_agent_generator import PromptAnalyzer

analyzer = PromptAnalyzer()
goal_definition = analyzer.analyze_text(goal_text)

# Result:
# goal_definition.goal = "Automate Multi-Source Data Pipeline"
# goal_definition.domain = "data-processing"
# goal_definition.complexity = "moderate"
# goal_definition.constraints = [
#     "Must handle source unavailability gracefully",
#     "No data loss (failed records logged)",
#     "Idempotent (safe to re-run)",
#     "Resource limits: 8GB RAM, 4 CPU cores",
# ]
# goal_definition.success_criteria = [
#     "All sources successfully ingested",
#     "Data transformed to target schema",
#     "Quality checks pass (completeness, accuracy)",
#     "Data published to warehouse",
#     "Pipeline completes within 30 minutes",
# ]
```

### Step 3: Generate Plan with ObjectivePlanner

```python
from amplihack.goal_agent_generator import ObjectivePlanner

planner = ObjectivePlanner()
execution_plan = planner.generate_plan(goal_definition)

# Result: 4-phase plan
#
# Phase 1: Data Collection (parallel)
# - Collect from S3 (parallel-safe)
# - Collect from database (parallel-safe)
# - Collect from API (parallel-safe)
# Duration: 15 minutes
# Success: All sources attempted, failures logged
#
# Phase 2: Data Transformation (depends on Phase 1)
# - Parse raw data
# - Transform to common schema
# - Handle missing fields
# Duration: 15 minutes
# Success: All records transformed or logged as failed
#
# Phase 3: Quality Validation (depends on Phase 2)
# - Completeness check
# - Accuracy validation
# - Consistency verification
# Duration: 5 minutes
# Success: Quality thresholds met
#
# Phase 4: Data Publishing (depends on Phase 3)
# - Load to warehouse
# - Update metadata
# - Generate report
# Duration: 10 minutes
# Success: Data in warehouse, report generated
```

### Step 4: Synthesize Skills

```python
from amplihack.goal_agent_generator import SkillSynthesizer

synthesizer = SkillSynthesizer()
skills = synthesizer.synthesize(execution_plan)

# Result: 3 skills
#
# Skill 1: data-collector
# Capabilities: ["s3-read", "database-query", "api-fetch"]
# Implementation: "native" (built-in)
#
# Skill 2: data-transformer
# Capabilities: ["parsing", "schema-mapping", "validation"]
# Implementation: "native" (built-in)
#
# Skill 3: data-publisher
# Capabilities: ["warehouse-load", "metadata-update", "reporting"]
# Implementation: "delegated" (delegates to warehouse tool)
```

### Step 5: Assemble Agent

```python
from amplihack.goal_agent_generator import AgentAssembler

assembler = AgentAssembler()
agent_bundle = assembler.assemble(
    goal_definition=goal_definition,
    execution_plan=execution_plan,
    skills=skills,
    bundle_name="multi-source-data-pipeline",
)

# Result: GoalAgentBundle
# - Name: multi-source-data-pipeline
# - Max turns: 12 (moderate complexity, 4 phases)
# - Initial prompt: Full execution plan with phases
# - Status: "ready"
```

### Step 6: Package Agent

```python
from pathlib import Path

from amplihack.goal_agent_generator import GoalAgentPackager

packager = GoalAgentPackager()
packager.package(
    bundle=agent_bundle,
    output_dir=Path(".claude/agents/goal-driven/multi-source-data-pipeline"),
)

# Creates agent package:
# .claude/agents/goal-driven/multi-source-data-pipeline/
# ├── agent.md       # Agent definition
# ├── prompt.md      # Execution prompt
# ├── metadata.json  # Bundle metadata
# ├── plan.yaml      # Execution plan (4 phases)
# └── skills.yaml    # 3 required skills
```

### Step 7: Execute Agent (Auto-Mode)

```bash
# Execute via CLI
amplihack goal-agent-generator execute \
  --agent-path .claude/agents/goal-driven/multi-source-data-pipeline \
  --auto-mode \
  --max-turns 12

# Or programmatically:
```

```python
from claude_code import execute_auto_mode

result = execute_auto_mode(
    initial_prompt=agent_bundle.auto_mode_config["initial_prompt"],
    max_turns=agent_bundle.auto_mode_config["max_turns"],
    working_dir=agent_bundle.auto_mode_config["working_dir"],
)
```

### Step 8: Monitor Execution

Agent executes autonomously:

```
Phase 1: Data Collection [In Progress]
├── S3 Collection: ✓ COMPLETED (50K records, 5 minutes)
├── Database Collection: ✓ COMPLETED (75K records, 8 minutes)
└── API Collection: ✗ FAILED (timeout, retrying...)
    └── Retry 1: ✓ COMPLETED (25K records, 4 minutes)

Phase 1: ✓ COMPLETED (150K records total, 3 sources, 17 minutes)

Phase 2: Data Transformation [In Progress]
├── Parsing: ✓ COMPLETED (150K records parsed)
├── Schema Mapping: ✓ COMPLETED (148K records mapped, 2K failed)
└── Missing Fields: ✓ COMPLETED (defaults applied)

Phase 2: ✓ COMPLETED (148K records ready, 2K logged as failed, 12 minutes)

Phase 3: Quality Validation [In Progress]
├── Completeness: ✓ PASS (98.7% complete, threshold 95%)
├── Accuracy: ✓ PASS (99.2% accurate, threshold 98%)
└── Consistency: ✓ PASS (100% consistent)

Phase 3: ✓ COMPLETED (All checks passed, 4 minutes)

Phase 4: Data Publishing [In Progress]
├── Warehouse Load: ✓ COMPLETED (148K records loaded)
├── Metadata Update: ✓ COMPLETED (pipeline_run_id: 12345)
└── Report Generation: ✓ COMPLETED (report.html)

Phase 4: ✓ COMPLETED (Data published, 8 minutes)

Total Execution: ✓ SUCCESS (41 minutes, all success criteria met)
```

### Step 9: Review Results

```markdown
# Pipeline Execution Report

## Summary

- Status: SUCCESS
- Duration: 41 minutes (estimated: 30 minutes)
- Records Processed: 150K ingested, 148K published
- Success Rate: 98.7%

## Phase Results

### Phase 1: Data Collection

- S3: 50K records (5 min)
- Database: 75K records (8 min)
- API: 25K records (4 min, 1 retry)

### Phase 2: Data Transformation

- Successfully transformed: 148K records
- Failed transformations: 2K records (logged to failed_records.log)
- Failure reasons: Schema mismatch (1.5K), Invalid data (500)

### Phase 3: Quality Validation

- Completeness: 98.7% ✓
- Accuracy: 99.2% ✓
- Consistency: 100% ✓

### Phase 4: Data Publishing

- Warehouse load: Success
- Pipeline run ID: 12345
- Report: report.html

## Issues Encountered

1. API timeout (Phase 1): Resolved with retry
2. 2K transformation failures: Logged for manual review

## Recommendations

1. Investigate schema mismatches in API data
2. Add validation for API data format
3. Consider increasing timeout for API calls
```

### Step 10: Iteration (If Needed)

If pipeline fails, agent adapts:

```python
# Example: API source completely unavailable
if phase1_result["api"]["status"] == "unavailable":
    # Agent adapts: continues with partial data
    log_warning("API source unavailable, continuing with S3 + database")
    proceed_to_phase2_with_partial_data()

    # Report notes partial data
    add_to_report("Data incomplete: API source unavailable")

# Example: Quality validation fails
if phase3_result["completeness"] < THRESHOLD:
    # Agent tries recovery: fetch missing data
    missing_records = identify_missing_records()
    retry_collection_for_missing(missing_records)
    rerun_transformation()
    rerun_validation()

    # If still fails after retry, escalate
    if still_below_threshold:
        escalate("Quality threshold not met after retry")
```

## 11. Related Patterns

Goal-seeking agents relate to and integrate with other patterns:

### Debate Pattern (Multi-Agent Decision Making)

When to Combine:

  • Goal-seeking agent faces complex decision with trade-offs
  • Multiple valid approaches exist
  • Need consensus from different perspectives

Example:

```python
# Goal-seeking agent reaches decision point
if len(viable_strategies) > 1:
    # Invoke debate pattern
    result = invoke_debate(
        question="Which data transformation approach?",
        perspectives=["performance", "accuracy", "simplicity"],
        context=current_state,
    )

    # Use debate result to select strategy
    selected_strategy = result.consensus
```

### N-Version Pattern (Redundant Implementation)

When to Combine:

  • Goal-seeking agent executing critical phase
  • Error cost is high
  • Multiple independent implementations possible

Example:

```python
# Critical security validation phase
if phase.is_critical():
    # Generate N versions
    results = generate_n_versions(
        phase=phase,
        n=3,
        independent=True,
    )

    # Use voting or comparison to select result
    validated_result = compare_and_validate(results)
```

### Cascade Pattern (Fallback Strategies)

When to Combine:

  • Goal-seeking agent has preferred approach but needs fallbacks
  • Quality/performance trade-offs exist
  • Graceful degradation desired

Example:

```python
# Data transformation with fallback
try:
    # Optimal: ML-based transformation
    result = ml_transform(data)
except MLModelUnavailable:
    try:
        # Pragmatic: Rule-based transformation
        result = rule_based_transform(data)
    except RuleEngineError:
        # Minimal: Manual templates
        result = template_transform(data)
```

### Investigation Workflow (Knowledge Discovery)

When to Combine:

  • Goal requires understanding existing system
  • Need to discover architecture or patterns
  • Knowledge excavation before execution

Example:

```python
# Before automating deployment, understand current system
if goal.requires_system_knowledge():
    # Run investigation workflow
    investigation = run_investigation_workflow(
        scope="deployment pipeline",
        depth="comprehensive",
    )

    # Use findings to inform goal-seeking execution
    adapt_plan_based_on_investigation(investigation.findings)
```

### Document-Driven Development (Specification First)

When to Combine:

  • Goal-seeking agent generates or modifies code
  • Clear specifications prevent drift
  • Documentation is single source of truth

Example:

```python
# Goal: Implement new feature
if goal.involves_code_changes():
    # DDD Phase 1: Generate specifications
    specs = generate_specifications(goal)

    # DDD Phase 2: Review and approve specs
    await human_review(specs)

    # Goal-seeking agent implements from specs
    implementation = execute_from_specifications(specs)
```

### Pre-Commit / CI Diagnostic (Quality Gates)

When to Combine:

  • Goal-seeking agent makes code changes
  • Need to ensure quality before commit/push
  • Automated validation and fixes

Example:

```python
# After goal-seeking agent generates code
if changes_made:
    # Run pre-commit diagnostic
    pre_commit_result = run_pre_commit_diagnostic()
    if pre_commit_result.has_failures():
        # Agent fixes issues
        apply_pre_commit_fixes(pre_commit_result.failures)

# After push, run CI diagnostic
ci_result = run_ci_diagnostic_workflow()
if ci_result.has_failures():
    # Agent iterates fixes
    iterate_ci_fixes_until_pass(ci_result)
```

## 12. Quality Standards

Goal-seeking agents must meet these quality standards:

### Correctness

Success Criteria Verification:

  • [ ] Agent verifies all success criteria before completion
  • [ ] Intermediate phase results validated
  • [ ] No silent failures (all errors logged and handled)

Testing Coverage:

  • [ ] Happy path tested (all success criteria met)
  • [ ] Failure scenarios tested (phase failures, retries)
  • [ ] Edge cases identified and tested
  • [ ] Integration with real systems validated

### Resilience

Error Handling:

  • [ ] Retry logic with exponential backoff
  • [ ] Alternative strategies for common failures
  • [ ] Graceful degradation when optimal path unavailable
  • [ ] Clear escalation criteria

State Management:

  • [ ] State persisted across phase boundaries
  • [ ] Resume capability after failures
  • [ ] Idempotent execution (safe to re-run)
  • [ ] Cleanup on abort
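
The resume-capability and idempotency items above imply a checkpoint of some kind. A minimal sketch (checkpoint path and schema are illustrative):

```python
import json
from pathlib import Path

CHECKPOINT = Path(".agent_state.json")  # Illustrative location


def save_checkpoint(completed_phases: list[str], phase_results: dict) -> None:
    """Persist progress after each phase so a re-run can skip work."""
    CHECKPOINT.write_text(json.dumps(
        {"completed": completed_phases, "results": phase_results}
    ))


def load_checkpoint() -> dict:
    if CHECKPOINT.exists():
        return json.loads(CHECKPOINT.read_text())
    return {"completed": [], "results": {}}


# On re-run, skip phases already done (idempotent execution):
state = load_checkpoint()
# pending = [p for p in plan.phases if p.name not in state["completed"]]
```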

### Performance

Efficiency:

  • [ ] Phases execute in parallel when possible
  • [ ] No unnecessary work (skip completed phases on retry)
  • [ ] Resource usage within limits (memory, CPU, time)
  • [ ] Timeout limits enforced

Latency:

  • [ ] Decision overhead acceptable for use case
  • [ ] No blocking waits (async where possible)
  • [ ] Progress reported (no black box periods)

### Observability

Logging:

  • [ ] Phase transitions logged
  • [ ] Decisions logged with reasoning
  • [ ] Errors logged with context
  • [ ] Results logged with metrics

Metrics:

  • [ ] Duration per phase tracked
  • [ ] Success/failure rates tracked
  • [ ] Resource usage monitored
  • [ ] Quality metrics reported

Tracing:

  • [ ] Execution flow traceable
  • [ ] Correlations across phases maintained
  • [ ] Debugging information sufficient

### Usability

Documentation:

  • [ ] Goal clearly stated
  • [ ] Success criteria documented
  • [ ] Usage examples provided
  • [ ] Integration guide complete

User Experience:

  • [ ] Clear progress reporting
  • [ ] Actionable error messages
  • [ ] Human-readable outputs
  • [ ] Easy to invoke and monitor

### Philosophy Compliance

Ruthless Simplicity:

  • [ ] No unnecessary phases or complexity
  • [ ] Simplest approach that works
  • [ ] No premature optimization

Single Responsibility:

  • [ ] Each phase has one clear job
  • [ ] No overlapping responsibilities
  • [ ] Clean phase boundaries

Modularity:

  • [ ] Skills are reusable across agents
  • [ ] Phases are independent
  • [ ] Clear interfaces (inputs/outputs)

Regeneratable:

  • [ ] Can be rebuilt from specifications
  • [ ] No hardcoded magic values
  • [ ] Configuration externalized

## 13. Getting Started

### Quick Start: Build Your First Goal-Seeking Agent

#### Step 1: Install amplihack (if not already)

```bash
pip install amplihack
```

#### Step 2: Write a goal prompt

```bash
cat > my-goal.md << 'EOF'
# Goal: Automated Security Audit

Check application for common security issues:

- SQL injection vulnerabilities
- XSS vulnerabilities
- Insecure dependencies
- Missing security headers

Generate report with severity levels and remediation steps.
EOF
```

#### Step 3: Generate agent

```bash
amplihack goal-agent-generator create \
  --prompt my-goal.md \
  --output .claude/agents/goal-driven/security-auditor
```

#### Step 4: Review generated plan

```bash
cat .claude/agents/goal-driven/security-auditor/plan.yaml
```

#### Step 5: Execute agent

```bash
amplihack goal-agent-generator execute \
  --agent-path .claude/agents/goal-driven/security-auditor \
  --auto-mode
```

### Common Use Cases

#### Use Case 1: Workflow Automation

```bash
# Create release automation agent
echo "Automate release workflow: tag, build, test, deploy to staging" | \
  amplihack goal-agent-generator create --inline --output .claude/agents/goal-driven/release-automator
```

#### Use Case 2: Data Pipeline

```bash
# Create ETL pipeline agent
echo "Extract from sources, transform to schema, validate quality, load to warehouse" | \
  amplihack goal-agent-generator create --inline --output .claude/agents/goal-driven/etl-pipeline
```

#### Use Case 3: Diagnostic Workflow

```bash
# Create performance diagnostic agent
echo "Diagnose application performance issues, identify bottlenecks, suggest optimizations" | \
  amplihack goal-agent-generator create --inline --output .claude/agents/goal-driven/perf-diagnostic
```

### Learning Resources

Documentation:

  • Review examples in ~/.amplihack/.claude/skills/goal-seeking-agent-pattern/examples/
  • Read real agent implementations in ~/.amplihack/.claude/agents/amplihack/specialized/
  • Check integration guide in ~/.amplihack/.claude/skills/goal-seeking-agent-pattern/templates/integration_guide.md

Practice:

  1. Start simple: Build single-phase agent (e.g., file formatter)
  2. Add complexity: Build multi-phase agent (e.g., test generator + runner)
  3. Add autonomy: Build agent with error recovery (e.g., CI fixer)
  4. Build production: Build full goal-seeking agent (e.g., deployment pipeline)

Get Help:

  • Review decision framework (Section 2)
  • Check design checklist (Section 6)
  • Study real examples (Section 5)
  • Ask architect agent for guidance

### Next Steps

After building your first goal-seeking agent:

  1. Test thoroughly: Cover success, failure, and edge cases
  2. Monitor in production: Track metrics, logs, failures
  3. Iterate: Refine based on real usage
  4. Document learnings: Update DISCOVERIES.md with insights
  5. Share patterns: Add successful approaches to PATTERNS.md

Success Indicators:

  • Agent completes goal autonomously 80%+ of time
  • Failures escalate with clear context
  • Execution time is acceptable
  • Users trust agent to run autonomously

---

Remember: Goal-seeking agents should be ruthlessly simple, focused on clear objectives, and adaptive to context. Start simple, add complexity only when justified, and always verify against success criteria.