rusty-page-indexer
# RustyPageIndex

RustyPageIndex is a high-performance Rust implementation of the PageIndex pattern. It transforms complex documents into hierarchical "Table-of-Contents" (TOC) trees for vectorless, reasoning-based RAG.
This project is inspired by [VectifyAI/PageIndex](https://github.com/VectifyAI/PageIndex) but has diverged significantly with multi-repo support, parallel processing, and a unified tree architecture.
Key Features
Performance
- Parallel Indexing: Uses Rayon for parallel file parsing (238 files in ~0.04s)
- Rust-Native Parsing: `pdf-extract` and `pulldown-cmark` for fast document processing
- Incremental Updates: Hash-based caching skips unchanged files
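The incremental-update idea can be illustrated with a minimal sketch. This is not the project's actual code: the `IndexCache` type, the `needs_reindex` method, and the use of the standard library's `DefaultHasher` are all assumptions made for illustration. The sketch stores one content hash per path and reports whether a file has to be parsed again.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical cache: file path -> hash of the content that was last indexed.
#[derive(Default)]
pub struct IndexCache {
    hashes: HashMap<String, u64>,
}

/// Hash file content with the std hasher (an illustrative choice, not
/// necessarily the hash function the real tool uses).
fn content_hash(content: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

impl IndexCache {
    /// Records the current hash and returns `true` when the file is new or
    /// its content changed, `false` when the stored hash matches and the
    /// file can be skipped.
    pub fn needs_reindex(&mut self, path: &str, content: &[u8]) -> bool {
        let new_hash = content_hash(content);
        match self.hashes.insert(path.to_string(), new_hash) {
            Some(old_hash) => old_hash != new_hash,
            None => true,
        }
    }
}
```

In this sketch, calling `needs_reindex` twice with identical content returns `true` and then `false`, so only the first call would trigger parsing. A real implementation would also have to persist the cache between runs.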
Multi-Repository Support
- Index multiple repos: Each indexed folder is tracked separately
- Query across all: Search spans all indexed repositories by default
- Manage indices: List, filter, and clean up indices easily
Unified Tree Architecture
- Folder → File → Section hierarchy preserves document structure
- Single tree per repo: Efficient storage and navigation
- Smart search: Auto-unwraps folder roots for better LLM context
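One way to picture the Folder → File → Section hierarchy is as a small recursive tree type. The sketch below is hypothetical: `Node`, `NodeKind`, `outline`, and `unwrap_folder_root` are illustrative names, and the unwrapping function shows only one plausible reading of "auto-unwraps folder roots", not the tool's confirmed behavior.

```rust
/// Kinds of node in a hypothetical Folder -> File -> Section tree.
#[derive(Debug, Clone, PartialEq)]
pub enum NodeKind {
    Folder,
    File,
    Section,
}

/// One tree node; a whole repository is a single `Folder` root.
#[derive(Debug, Clone)]
pub struct Node {
    pub kind: NodeKind,
    pub title: String,
    pub children: Vec<Node>,
}

impl Node {
    pub fn new(kind: NodeKind, title: &str, children: Vec<Node>) -> Self {
        Node { kind, title: title.to_string(), children }
    }

    /// Render the tree as an indented table-of-contents, one line per node.
    pub fn outline(&self) -> Vec<String> {
        let mut lines = Vec::new();
        self.collect(0, &mut lines);
        lines
    }

    fn collect(&self, depth: usize, lines: &mut Vec<String>) {
        lines.push(format!("{}{}", "  ".repeat(depth), self.title));
        for child in &self.children {
            child.collect(depth + 1, lines);
        }
    }
}

/// Assumed interpretation of "unwrapping" a folder root: hand the folder's
/// children to the search step instead of the folder node itself.
pub fn unwrap_folder_root(root: &Node) -> &[Node] {
    if root.kind == NodeKind::Folder {
        root.children.as_slice()
    } else {
        std::slice::from_ref(root)
    }
}
```

Keeping one tree per repository means a query can walk from the folder level down to individual sections without joining separate per-file indices.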
---
Divergence from Original PageIndex
| Feature | Original PageIndex | RustyPageIndex |
|---------|-------------------|----------------|
| Language | Python | Rust |
| Indexing | Per-file indices | Unified folder tree |
| Multi-repo | Not supported | Full support with list/clean |
| Parallelism | Sequential | Rayon parallel processing |
| Storage | Cloud-based (MCP) | Local filesystem |
| Tree Structure | Flat sections | Folder → File → Section hierarchy |
| Headerless Markdown | Empty tree | Auto-creates "Document" node |
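The headerless-Markdown row in the table can be sketched as a fallback rule: if a file yields no headings, emit a single "Document" section so the tree is never empty. The code below is an assumption-laden illustration. It scans ATX headings line by line with the standard library rather than using `pulldown-cmark`, and `Section` and `sections_or_document` are hypothetical names.

```rust
/// A section discovered in a Markdown file: heading level and title.
#[derive(Debug, PartialEq)]
pub struct Section {
    pub level: usize,
    pub title: String,
}

/// Collect ATX headings (`#` through `######`), skipping fenced code blocks.
/// If the file has no headings, return a single "Document" section instead
/// of an empty list.
pub fn sections_or_document(markdown: &str) -> Vec<Section> {
    let mut sections = Vec::new();
    let mut in_fence = false;
    for line in markdown.lines() {
        let trimmed = line.trim_start();
        if trimmed.starts_with("```") {
            // Toggle fence state so `#` comments inside code are ignored.
            in_fence = !in_fence;
            continue;
        }
        if in_fence {
            continue;
        }
        let level = trimmed.chars().take_while(|c| *c == '#').count();
        let rest = &trimmed[level..];
        if (1..=6).contains(&level) && rest.starts_with(' ') {
            sections.push(Section { level, title: rest.trim().to_string() });
        }
    }
    if sections.is_empty() {
        // Fallback for headerless files: one catch-all node.
        sections.push(Section { level: 1, title: "Document".to_string() });
    }
    sections
}
```

With this rule a plain-text note still produces one searchable node, whereas a heading-only parser would return nothing for it.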
---
Getting Started
Installation
One-liner Install (Unix/macOS):
```bash
curl -fsSL https://raw.githubusercontent.com/Algiras/rusty-pageindex/main/install.sh | bash
```
One-liner Install (Windows PowerShell):
```powershell
irm https://raw.githubusercontent.com/Algiras/rusty-pageindex/main/install.ps1 | iex
```
Via Cargo:
```bash
cargo install rusty-page-indexer
```
Use as an Agent Skill
```bash
npx skills add https://github.com/Algiras/rusty-pageindex --skill rusty-page-indexer
```
Authentication
```bash
# For OpenAI
rusty-page-indexer auth --api-key "your-key-here"
# For Ollama (local LLM)
rusty-page-indexer auth --api-key "ollama" --api-base "http://localhost:11434/v1" --model "llama3.2"
```
---
Usage
Indexing Documents
```bash
# Index a repository
rusty-page-indexer index ./my-project
# Index with LLM-generated summaries
rusty-page-indexer index ./my-project --enrich
# Force re-index (ignores cache)
rusty-page-indexer index ./my-project --force
# Preview what would be indexed
rusty-page-indexer index ./my-project --dry-run
```
Managing Multiple Repositories
```bash
# Index multiple repos
rusty-page-indexer index ./repo-a
rusty-page-indexer index ./repo-b
# List all indexed repositories
rusty-page-indexer list
# Example output:
# Indexed Repositories
# ────────────────────────────────────────
# repo-a (125.3 KB)
# /Users/you/projects/repo-a
# repo-b (89.7 KB)
# /Users/you/projects/repo-b
# ────────────────────────────────────────
# Total: 2 indices
```
Querying
```bash
# Search across ALL indexed repositories
rusty-page-indexer query "how does authentication work"
# Search within a specific repository
rusty-page-indexer query "kafka messaging" --path repo-a
```
Cleanup
```bash
# Remove a specific index
rusty-page-indexer clean repo-a
# Remove all indices
rusty-page-indexer clean --all
```
Status Information
```bash
rusty-page-indexer info
```
---
Model Compatibility
OpenAI Models (Remote)
| Model | Cost | Speed | Notes |
|-------|------|-------|-------|
| gpt-4o | $$$ | Fast | Best accuracy, recommended for complex queries |
| `gpt