rag-skills
Skill from llama-farm/llamafarm
Implements robust RAG document processing and retrieval using LlamaIndex, ChromaDB, and Celery for efficient, scalable AI document workflows.
Installation
npx skills add https://github.com/llama-farm/llamafarm --skill rag-skills
Skill Details
RAG-specific best practices for LlamaIndex, ChromaDB, and Celery workers. Covers ingestion, retrieval, embeddings, and performance.
Overview
# RAG Skills for LlamaFarm
Framework-specific patterns and code review checklists for the RAG component.
Extends: [python-skills](../python-skills/SKILL.md) - All Python best practices apply here.
## Component Overview
| Aspect | Technology | Version |
|--------|------------|---------|
| Python | Python | 3.11+ |
| Document Processing | LlamaIndex | 0.13+ |
| Vector Storage | ChromaDB | 1.0+ |
| Task Queue | Celery | 5.5+ |
| Embeddings | Universal/Ollama/OpenAI | Multiple |
## Directory Structure
```
rag/
├── api.py                   # Search and database APIs
├── celery_app.py            # Celery configuration
├── main.py                  # Entry point
├── core/
│   ├── base.py              # Document, Component, Pipeline ABCs
│   ├── factories.py         # Component factories
│   ├── ingest_handler.py    # File ingestion with safety checks
│   ├── blob_processor.py    # Binary file processing
│   ├── settings.py          # Pydantic settings
│   └── logging.py           # RAGStructLogger
├── components/
│   ├── embedders/           # Embedding providers
│   ├── extractors/          # Metadata extractors
│   ├── parsers/             # Document parsers (LlamaIndex)
│   ├── retrievers/          # Retrieval strategies
│   └── stores/              # Vector stores (ChromaDB, FAISS)
├── tasks/                   # Celery tasks
│   ├── ingest_tasks.py      # File ingestion
│   ├── search_tasks.py      # Database search
│   ├── query_tasks.py       # Complex queries
│   ├── health_tasks.py      # Health checks
│   └── stats_tasks.py       # Statistics
└── utils/
    └── embedding_safety.py  # Circuit breaker, validation
```
## Quick Reference
| Topic | File | Key Points |
|-------|------|------------|
| LlamaIndex | [llamaindex.md](llamaindex.md) | Document parsing, chunking, node conversion |
| ChromaDB | [chromadb.md](chromadb.md) | Collections, embeddings, distance metrics |
| Celery | [celery.md](celery.md) | Task routing, error handling, worker config |
| Performance | [performance.md](performance.md) | Batching, caching, deduplication |
## Core Patterns
### Document Dataclass
```python
import uuid
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Document:
    content: str
    # default_factory gives every instance its own dict and its own UUID.
    metadata: dict[str, Any] = field(default_factory=dict)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    source: str | None = None
    embeddings: list[float] | None = None
```
### Component Abstract Base Class
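```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any

# Document, ProcessingResult, and RAGStructLogger are project-internal
# types; their imports are omitted from this excerpt.


class Component(ABC):
    def __init__(
        self,
        name: str | None = None,
        config: dict[str, Any] | None = None,
        project_dir: Path | None = None,
    ):
        # Fall back to the class name so log lines are always attributable.
        self.name = name or self.__class__.__name__
        self.config = config or {}
        self.logger = RAGStructLogger(__name__).bind(name=self.name)
        self.project_dir = project_dir

    @abstractmethod
    def process(self, documents: list[Document]) -> ProcessingResult:
        pass
```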
### Retrieval Strategy Pattern
```python
from abc import ABC, abstractmethod

# Component and RetrievalResult are project-internal types.


class RetrievalStrategy(Component, ABC):
    @abstractmethod
    def retrieve(
        self,
        query_embedding: list[float],
        vector_store,
        top_k: int = 5,
        **kwargs,
    ) -> RetrievalResult:
        pass

    @abstractmethod
    def supports_vector_store(self, vector_store_type: str) -> bool:
        pass
```
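To make the two abstract methods concrete, the following sketch implements a toy strategy over an in-memory dict. Everything beyond the two method signatures is an assumption: `InMemoryCosineStrategy`, the `min_score` keyword, the `"in_memory"` store type, and the `ids`/`scores` fields of `RetrievalResult` are hypothetical. The base class is simplified to plain `ABC` so the example runs on its own.

```python
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class RetrievalResult:
    # Hypothetical shape: parallel lists of ids and similarity scores.
    ids: list[str]
    scores: list[float]


class RetrievalStrategy(ABC):
    # Simplified: the real base class also inherits from Component.
    @abstractmethod
    def retrieve(self, query_embedding, vector_store, top_k=5, **kwargs):
        pass

    @abstractmethod
    def supports_vector_store(self, vector_store_type: str) -> bool:
        pass


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryCosineStrategy(RetrievalStrategy):
    """Hypothetical strategy; vector_store is a plain dict of id -> vector."""

    def retrieve(self, query_embedding, vector_store, top_k=5, **kwargs):
        min_score = kwargs.get("min_score", 0.0)
        scored = [
            (doc_id, cosine(query_embedding, vector))
            for doc_id, vector in vector_store.items()
        ]
        scored = [pair for pair in scored if pair[1] >= min_score]
        # Highest similarity first, then keep the top_k entries.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        top = scored[:top_k]
        return RetrievalResult(
            ids=[doc_id for doc_id, _ in top],
            scores=[score for _, score in top],
        )

    def supports_vector_store(self, vector_store_type: str) -> bool:
        return vector_store_type == "in_memory"


store = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
result = InMemoryCosineStrategy().retrieve([1.0, 0.0], store, top_k=2)
print(result.ids)  # ['a', 'c']
```

`supports_vector_store` lets a caller reject an incompatible strategy and store pairing before any query runs, rather than failing inside `retrieve`.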
### Embedder with Circuit Breaker
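```python
# Component, CircuitBreaker, and EmbedderUnavailableError are
# project-internal. The helper methods called below (check_circuit_breaker,
# record_success, record_failure, _call_embedding_api,
# get_embedding_dimension) are omitted from this excerpt.


class Embedder(Component):
    DEFAULT_FAILURE_THRESHOLD = 5
    DEFAULT_RESET_TIMEOUT = 60.0

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Component.__init__ guarantees self.config is a dict.
        self._circuit_breaker = CircuitBreaker(
            failure_threshold=self.config.get(
                "failure_threshold", self.DEFAULT_FAILURE_THRESHOLD
            ),
            reset_timeout=self.config.get(
                "reset_timeout", self.DEFAULT_RESET_TIMEOUT
            ),
        )
        self._fail_fast = self.config.get("fail_fast", True)

    def embed_text(self, text: str) -> list[float]:
        # Presumably raises while the circuit is open (method not shown).
        self.check_circuit_breaker()
        try:
            embedding = self._call_embedding_api(text)
            self.record_success()
            return embedding
        except Exception as e:
            self.record_failure(e)
            if self._fail_fast:
                raise EmbedderUnavailableError(str(e)) from e
            # Degraded mode: return a zero vector of the expected dimension.
            return [0.0] * self.get_embedding_dimension()
```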
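The `CircuitBreaker` class itself is not shown in this skill. The sketch below is one plausible implementation, not the code in `utils/embedding_safety.py`: only the `failure_threshold` and `reset_timeout` constructor arguments come from the example above, while the method names, the `clock` parameter, and `CircuitOpenError` are assumptions made for illustration.

```python
import time
from typing import Callable


class CircuitOpenError(RuntimeError):
    """Raised when calls are rejected because the circuit is open."""


class CircuitBreaker:
    # Hypothetical implementation; only the first two constructor
    # arguments appear in the Embedder example.
    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: float | None = None

    def check(self) -> None:
        """Raise if the circuit is open and the timeout has not elapsed."""
        if self._opened_at is None:
            return
        if self._clock() - self._opened_at >= self.reset_timeout:
            # Half-open: allow a trial call; one more failure reopens.
            self._opened_at = None
            self._failures = self.failure_threshold - 1
            return
        raise CircuitOpenError("embedding backend is temporarily disabled")

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


# Drive the breaker with a fake clock so the behaviour is deterministic.
now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])
breaker.record_failure()
breaker.check()           # still closed after one failure
breaker.record_failure()  # threshold reached, circuit opens

try:
    breaker.check()
    rejected = False
except CircuitOpenError:
    rejected = True

now[0] = 10.0
breaker.check()           # timeout elapsed, a trial call is allowed
print(rejected)           # True
```

Injecting the clock keeps the timeout logic testable without sleeping. With `fail_fast` disabled, the embedder above returns zero vectors instead of raising, so callers should treat all-zero embeddings as a degraded result rather than a real one.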
## Review Checklist Summary
When reviewing RAG code:
- LlamaIndex (Medium priority)
  - Proper chunking configuration
  - Metadata preservation during parsing
  - Error handling for unsupported formats
- ChromaDB (High priority)
  - Thread-safe client access
  - Proper distance metric selection
  - Metadata type compatibility
- Celery (High priority)
  - Task routing to correct queue
  - Error logging with context
  - Proper serialization
- Performance (Medium priority)
  - Batch processing for embeddings
  - Deduplication enabled
  - Appropriate caching
See individual topic files for detailed checklists with grep patterns.
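As a rough illustration of the Performance items, the sketch below deduplicates chunks by content hash and then groups them into fixed-size batches before embedding. The function names, the whitespace normalization, and the batch size are assumptions, not the project's actual implementation.

```python
import hashlib
from collections.abc import Iterator


def content_hash(text: str) -> str:
    """Stable fingerprint; whitespace is normalized before hashing."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(texts: list[str]) -> list[str]:
    """Drop repeated chunks, keeping the first occurrence in order."""
    seen: set[str] = set()
    unique: list[str] = []
    for text in texts:
        digest = content_hash(text)
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


def batched(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical chunks; "beta  " differs from "beta" only in whitespace.
chunks = ["alpha", "beta", "alpha", "gamma", "beta  ", "delta"]
unique_chunks = deduplicate(chunks)
batches = list(batched(unique_chunks, batch_size=3))
print(unique_chunks)  # ['alpha', 'beta', 'gamma', 'delta']
print(batches)        # [['alpha', 'beta', 'gamma'], ['delta']]
```

Deduplicating before batching means each embedding call carries only unique text, so repeated chunks cost nothing at the provider.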