# Guidance: Constrained LLM Generation

Control LLM output with regex and grammars, guarantee valid JSON/XML/code generation, enforce structured formats, and build multi-step workflows with Guidance, Microsoft Research's constrained generation framework.

## When to Use This Skill

Use Guidance when you need to:

  • Control LLM output syntax with regex or grammars
  • Guarantee valid JSON/XML/code generation
  • Reduce latency vs traditional prompting approaches
  • Enforce structured formats (dates, emails, IDs, etc.)
  • Build multi-step workflows with Pythonic control flow
  • Prevent invalid outputs through grammatical constraints

GitHub Stars: 18,000+ | From: Microsoft Research

## Installation

```bash
# Base installation
pip install guidance

# With specific backends
pip install guidance[transformers]   # Hugging Face models
pip install guidance[llama_cpp]      # llama.cpp models
```

## Quick Start

### Basic Example: Structured Generation

```python
from guidance import models, gen

# Load model (supports OpenAI, Transformers, llama.cpp)
lm = models.OpenAI("gpt-4")

# Generate with constraints
result = lm + "The capital of France is " + gen("capital", max_tokens=5)

print(result["capital"])  # "Paris"
```

### With Anthropic Claude

```python
from guidance import models, gen, system, user, assistant

# Configure Claude
lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Use context managers for chat format
with system():
    lm += "You are a helpful assistant."

with user():
    lm += "What is the capital of France?"

with assistant():
    lm += gen(max_tokens=20)
```

## Core Concepts

### 1. Context Managers

Guidance uses Pythonic context managers for chat-style interactions.

```python
from guidance import models, system, user, assistant, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# System message
with system():
    lm += "You are a JSON generation expert."

# User message
with user():
    lm += "Generate a person object with name and age."

# Assistant response
with assistant():
    lm += gen("response", max_tokens=100)

print(lm["response"])
```

Benefits:

  • Natural chat flow
  • Clear role separation
  • Easy to read and maintain

### 2. Constrained Generation

Guidance ensures outputs match specified patterns using regex or grammars.

#### Regex Constraints

```python
from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Constrain to valid email format
lm += "Email: " + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Constrain to date format (YYYY-MM-DD)
lm += "Date: " + gen("date", regex=r"\d{4}-\d{2}-\d{2}")

# Constrain to phone number format
lm += "Phone: " + gen("phone", regex=r"\d{3}-\d{3}-\d{4}")

print(lm["email"])  # Guaranteed valid email format
print(lm["date"])   # Guaranteed YYYY-MM-DD format
```

How it works:

  • The regex is compiled into a token-level grammar
  • Invalid tokens are masked out at each decoding step
  • The model can only produce outputs that match the pattern (see the toy sketch below)
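
To make this concrete, here is a toy sketch of the filtering step (not Guidance's actual internals). It uses the third-party regex package, whose `partial=True` flag reports whether a string could still grow into a full match:

```python
# Toy illustration of regex-constrained decoding (not Guidance internals).
# Requires the third-party `regex` package: pip install regex
import regex

def allowed_tokens(generated: str, pattern: str, vocab: list[str]) -> list[str]:
    """Keep only vocabulary tokens that could still extend to a full match."""
    allowed = []
    for tok in vocab:
        # partial=True also accepts prefixes of a possible match
        if regex.fullmatch(pattern, generated + tok, partial=True) is not None:
            allowed.append(tok)
    return allowed

vocab = ["2024", "-", "01", "Paris", "-15"]
date = r"\d{4}-\d{2}-\d{2}"
print(allowed_tokens("", date, vocab))      # ['2024', '01'] (either could start a date)
print(allowed_tokens("2024", date, vocab))  # ['-', '-15']
```

The real system applies this filter to the model's logits before sampling each token, so an invalid output can never be produced.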

#### Selection Constraints

```python
from guidance import models, select

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Constrain to specific choices
lm += "Sentiment: " + select(["positive", "negative", "neutral"], name="sentiment")

# Multiple-choice selection
lm += "\nBest answer: " + select(
    ["A) Paris", "B) London", "C) Berlin", "D) Madrid"],
    name="answer",
)

print(lm["sentiment"])  # One of: positive, negative, neutral
print(lm["answer"])     # One of the four listed options, e.g. "A) Paris"
```

### 3. Token Healing

Guidance automatically "heals" token boundaries between prompt and generation.

Problem: Tokenization creates unnatural boundaries.

```python
# Without token healing
prompt = "The capital of France is "
# Last prompt token ends with a space
# First generated token might be " Par" (with its own leading space)
# Result: "The capital of France is  Paris" (double space!)
```

Solution: Guidance backs up one token and regenerates.

```python
from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Token healing enabled by default
lm += "The capital of France is " + gen("capital", max_tokens=5)
# Result: "The capital of France is Paris" (correct spacing)
```

Benefits:

  • Natural text boundaries
  • No awkward spacing issues
  • Better model performance (sees natural token sequences)
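
You can observe the boundary problem directly with an off-the-shelf tokenizer. A quick check, assuming the transformers package is installed (GPT-2 shown purely for illustration):

```python
# Shows why a trailing space in the prompt creates an awkward token boundary.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

# The trailing space becomes its own token at the end of the prompt...
print(tok.tokenize("The capital of France is "))  # [..., 'Ġis', 'Ġ']

# ...while "Paris" is normally encoded *with* its leading space attached.
print(tok.tokenize(" Paris"))                     # ['ĠParis']
```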

### 4. Grammar-Based Generation

Define complex structures using context-free grammars. In Guidance, grammars are composed from stateless building blocks (string literals, gen, select) rather than written as strings. A simplified JSON example (the original snippet was garbled in extraction; this is a reconstructed sketch with illustrative field constraints):

```python
from guidance import models, gen, guidance

# A simplified JSON "grammar": fixed syntax interleaved with
# constrained generation for each field
@guidance(stateless=True)
def person_json(lm):
    lm += '{\n'
    lm += '  "name": "' + gen("name", regex=r"[A-Za-z ]+") + '",\n'
    lm += '  "age": ' + gen("age", regex=r"[0-9]+") + ',\n'
    lm += '  "email": "' + gen("email", regex=r'[^"]+') + '"\n'
    lm += '}'
    return lm

# Generate valid JSON
lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm += person_json()

print(str(lm))  # Guaranteed valid JSON structure
```

Use cases:

  • Complex structured outputs
  • Nested data structures
  • Programming language syntax
  • Domain-specific languages
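
As a sketch of the last two use cases, the same composition style can encode a tiny expression language (assuming the stateless-composition API shown above):

```python
# Sketch: a tiny arithmetic-expression "grammar" from guidance primitives.
from guidance import guidance, gen, select

@guidance(stateless=True)
def arith_expr(lm):
    # <number> <operator> <number>, e.g. "12 + 7"
    lm += gen(regex=r"[0-9]+", max_tokens=4)
    lm += " "
    lm += select(["+", "-", "*", "/"])
    lm += " "
    lm += gen(regex=r"[0-9]+", max_tokens=4)
    return lm

# Usage: lm += "Expression: " + arith_expr()
```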

### 5. Guidance Functions

Create reusable generation patterns with the @guidance decorator.

```python
from guidance import guidance, gen, models

@guidance
def generate_person(lm):
    """Generate a person with name and age."""
    lm += "Name: " + gen("name", max_tokens=20, stop="\n")
    lm += "\nAge: " + gen("age", regex=r"[0-9]+", max_tokens=3)
    return lm

# Use the function (lm is passed implicitly when the call is added)
lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm += generate_person()

print(lm["name"])
print(lm["age"])
```

Stateful Functions:

```python
from guidance import guidance, gen, select

@guidance(stateless=False)
def react_agent(lm, question, tools, max_rounds=5):
    """ReAct agent with tool use."""
    lm += f"Question: {question}\n\n"

    for i in range(max_rounds):
        # Thought
        lm += f"Thought {i+1}: " + gen("thought", stop="\n")

        # Action
        lm += "\nAction: " + select(list(tools.keys()), name="action")

        # Execute tool
        tool_result = tools[lm["action"]]()
        lm += f"\nObservation: {tool_result}\n\n"

        # Check if done
        lm += "Done? " + select(["Yes", "No"], name="done")
        if lm["done"] == "Yes":
            break

    # Final answer
    lm += "\nFinal Answer: " + gen("answer", max_tokens=100)
    return lm
```
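
A hypothetical invocation, with zero-argument tools to match how the function calls them (tool names invented for illustration):

```python
from guidance import models

# Hypothetical tools; each takes no arguments, matching tools[...]() above
tools = {
    "get_time": lambda: "12:00",
    "get_weather": lambda: "Sunny, 22 C",
}

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm += react_agent("What is the weather right now?", tools)
print(lm["answer"])
```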

## Backend Configuration

Note: token-level constraints (regex, select, grammars) need access to the model's next-token probabilities, so they are fully supported on local backends; remote API backends may support only a subset of constraint features.

### Anthropic Claude

```python
from guidance import models

lm = models.Anthropic(
    model="claude-sonnet-4-5-20250929",
    api_key="your-api-key",  # Or set the ANTHROPIC_API_KEY env var
)
```

### OpenAI

```python
from guidance import models

lm = models.OpenAI(
    model="gpt-4o-mini",
    api_key="your-api-key",  # Or set the OPENAI_API_KEY env var
)
```

### Local Models (Transformers)

```python
from guidance.models import Transformers

lm = Transformers(
    "microsoft/Phi-4-mini-instruct",
    device="cuda",  # Or "cpu"
)
```

### Local Models (llama.cpp)

```python
from guidance.models import LlamaCpp

lm = LlamaCpp(
    model_path="/path/to/model.gguf",
    n_ctx=4096,
    n_gpu_layers=35,
)
```

## Common Patterns

### Pattern 1: JSON Generation

```python
from guidance import models, gen, system, user, assistant

lm = models.Anthropic("claude-sonnet-4-5-20250929")

with system():
    lm += "You generate valid JSON."

with user():
    lm += "Generate a user profile with name, age, and email."

with assistant():
    lm += '{\n'
    lm += '  "name": ' + gen("name", regex=r'"[A-Za-z ]+"', max_tokens=30) + ',\n'
    lm += '  "age": ' + gen("age", regex=r"[0-9]+", max_tokens=3) + ',\n'
    lm += '  "email": ' + gen("email", regex=r'"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"', max_tokens=50) + '\n'
    lm += '}'

print(lm)  # Valid JSON guaranteed
```

### Pattern 2: Classification

```python
from guidance import models, gen, select

lm = models.Anthropic("claude-sonnet-4-5-20250929")

text = "This product is amazing! I love it."

lm += f"Text: {text}\n"
lm += "Sentiment: " + select(["positive", "negative", "neutral"], name="sentiment")
lm += "\nConfidence: " + gen("confidence", regex=r"[0-9]+", max_tokens=3) + "%"

print(f"Sentiment: {lm['sentiment']}")
print(f"Confidence: {lm['confidence']}%")
```

### Pattern 3: Multi-Step Reasoning

```python
from guidance import models, gen, guidance

@guidance
def chain_of_thought(lm, question):
    """Generate an answer with step-by-step reasoning."""
    lm += f"Question: {question}\n\n"

    # Generate multiple reasoning steps
    for i in range(3):
        lm += f"Step {i+1}: " + gen(f"step_{i+1}", stop="\n", max_tokens=100) + "\n"

    # Final answer
    lm += "\nTherefore, the answer is: " + gen("answer", max_tokens=50)
    return lm

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm += chain_of_thought("What is 15% of 200?")

print(lm["answer"])
```

### Pattern 4: ReAct Agent

```python
from guidance import models, gen, select, guidance

@guidance(stateless=False)
def react_agent(lm, question):
    """ReAct agent with tool use."""
    tools = {
        "calculator": lambda expr: eval(expr),  # Demo only: eval is unsafe on untrusted input
        "search": lambda query: f"Search results for: {query}",
    }

    lm += f"Question: {question}\n\n"

    for _ in range(5):
        # Thought
        lm += "Thought: " + gen("thought", stop="\n") + "\n"

        # Action selection
        lm += "Action: " + select(["calculator", "search", "answer"], name="action")

        if lm["action"] == "answer":
            lm += "\nFinal Answer: " + gen("answer", max_tokens=100)
            break

        # Action input
        lm += "\nAction Input: " + gen("action_input", stop="\n") + "\n"

        # Execute tool
        if lm["action"] in tools:
            result = tools[lm["action"]](lm["action_input"])
            lm += f"Observation: {result}\n\n"

    return lm

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm += react_agent("What is 25 * 4 + 10?")

print(lm["answer"])
```

### Pattern 5: Data Extraction

```python
from guidance import models, gen, guidance

@guidance
def extract_entities(lm, text):
    """Extract structured entities from text."""
    lm += f"Text: {text}\n\n"

    lm += "Person: " + gen("person", stop="\n", max_tokens=30) + "\n"
    lm += "Organization: " + gen("organization", stop="\n", max_tokens=30) + "\n"
    lm += "Date: " + gen("date", regex=r"\d{4}-\d{2}-\d{2}", max_tokens=10) + "\n"
    lm += "Location: " + gen("location", stop="\n", max_tokens=30) + "\n"
    return lm

text = "Tim Cook announced at Apple Park on 2024-09-15 in Cupertino."

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm += extract_entities(text)

print(f"Person: {lm['person']}")
print(f"Organization: {lm['organization']}")
print(f"Date: {lm['date']}")
print(f"Location: {lm['location']}")
```

## Best Practices

### 1. Use Regex for Format Validation

```python
# ✅ Good: Regex ensures a valid format
lm += "Email: " + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# ❌ Bad: Free generation may produce invalid emails
lm += "Email: " + gen("email", max_tokens=50)
```

### 2. Use select() for Fixed Categories

```python
# ✅ Good: Guaranteed valid category
lm += "Status: " + select(["pending", "approved", "rejected"], name="status")

# ❌ Bad: May generate typos or invalid values
lm += "Status: " + gen("status", max_tokens=20)
```

### 3. Leverage Token Healing

```python
# Token healing is enabled by default;
# no special action needed, just concatenate naturally
lm += "The capital is " + gen("capital")  # Boundary healed automatically
```

### 4. Use stop Sequences

```python
# ✅ Good: Stop at a newline for single-line outputs
lm += "Name: " + gen("name", stop="\n")

# ❌ Bad: May generate multiple lines
lm += "Name: " + gen("name", max_tokens=50)
```

### 5. Create Reusable Functions

```python
# ✅ Good: Reusable pattern
@guidance
def generate_person(lm):
    lm += "Name: " + gen("name", stop="\n")
    lm += "\nAge: " + gen("age", regex=r"[0-9]+")
    return lm

# Use it multiple times
lm += generate_person()
lm += "\n\n"
lm += generate_person()
```

### 6. Balance Constraints

```python
# ✅ Good: Reasonable constraints
lm += gen("name", regex=r"[A-Za-z ]+", max_tokens=30)

# ❌ Too strict: Only two outputs are ever possible; use select() instead
lm += gen("name", regex=r"(John|Jane)", max_tokens=10)
```

## Comparison to Alternatives

| Feature | Guidance | Instructor | Outlines | LMQL |
|---------|----------|------------|----------|------|
| Regex Constraints | ✅ Yes | ❌ No | ✅ Yes | ✅ Yes |
| Grammar Support | ✅ CFG | ❌ No | ✅ CFG | ✅ CFG |
| Pydantic Validation | ❌ No | ✅ Yes | ✅ Yes | ❌ No |
| Token Healing | ✅ Yes | ❌ No | ✅ Yes | ❌ No |
| Local Models | ✅ Yes | ⚠️ Limited | ✅ Yes | ✅ Yes |
| API Models | ✅ Yes | ✅ Yes | ⚠️ Limited | ✅ Yes |
| Pythonic Syntax | ✅ Yes | ✅ Yes | ✅ Yes | ❌ SQL-like |
| Learning Curve | Low | Low | Medium | High |

When to choose Guidance:

  • Need regex/grammar constraints
  • Want token healing
  • Building complex workflows with control flow
  • Using local models (Transformers, llama.cpp)
  • Prefer Pythonic syntax

When to choose alternatives:

  • Instructor: Need Pydantic validation with automatic retrying
  • Outlines: Need JSON schema validation
  • LMQL: Prefer declarative query syntax
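
For contrast, a hedged sketch of the Instructor style (API as documented by Instructor; exact names may vary by version):

```python
# Instructor: Pydantic-first structured output with validation and retries
import instructor
from openai import OpenAI
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

client = instructor.from_openai(OpenAI())
person = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Person,  # parsed and validated; retried on failure
    messages=[{"role": "user", "content": "Generate a person."}],
)
print(person.name, person.age)
```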

## Performance Characteristics

Latency Reduction:

  • 30-50% faster than traditional prompting for constrained outputs
  • Token healing reduces unnecessary regeneration
  • Grammar constraints prevent invalid token generation

Memory Usage:

  • Minimal overhead vs unconstrained generation
  • Grammar compilation cached after first use
  • Efficient token filtering at inference time

Token Efficiency:

  • Prevents wasted tokens on invalid outputs
  • No need for retry loops
  • Direct path to valid outputs
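
The savings come from skipping the parse-and-retry loop that unconstrained generation typically needs. A sketch of that fallback pattern, with a hypothetical `call_llm` stub:

```python
import json

def generate_json_with_retries(prompt, call_llm, max_retries=3):
    """The pattern constrained decoding replaces: generate, parse, retry."""
    for _ in range(max_retries):
        raw = call_llm(prompt)       # unconstrained completion (hypothetical stub)
        try:
            return json.loads(raw)   # may raise on malformed output
        except json.JSONDecodeError:
            continue                 # wasted tokens and an extra round trip
    raise ValueError("No valid JSON after retries")
```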

## Resources

  • Documentation: https://guidance.readthedocs.io
  • GitHub: https://github.com/guidance-ai/guidance (18k+ stars)
  • Notebooks: https://github.com/guidance-ai/guidance/tree/main/notebooks
  • Discord: Community support available

## See Also

  • references/constraints.md - Comprehensive regex and grammar patterns
  • references/backends.md - Backend-specific configuration
  • references/examples.md - Production-ready examples