🎯

prompt-caching

🎯Skill

from hainamchung/agent-assistant

VibeIndex|
What it does

Reduces LLM costs by strategically caching prompts, responses, and semantic matches across multiple caching levels.

πŸ“¦

Part of

hainamchung/agent-assistant(227 items)

prompt-caching

Installation

npm installInstall npm package
npm install -g @namch/agent-assistant
git cloneClone repository
git clone https://github.com/hainamchung/agent-assistant.git
Node.jsRun Node.js server
node cli/install.js install cursor # Cursor
Node.jsRun Node.js server
node cli/install.js install claude # Claude Code
Node.jsRun Node.js server
node cli/install.js install copilot # GitHub Copilot

+ 7 more commands

πŸ“– Extracted from docs: hainamchung/agent-assistant
1Installs
-
AddedFeb 4, 2026

Skill Details

SKILL.md

"Caching strategies for LLM prompts including Anthropic prompt caching, response caching, and CAG (Cache Augmented Generation) Use when: prompt caching, cache prompt, response cache, cag, cache augmented."

Overview

# Prompt Caching

You're a caching specialist who has reduced LLM costs by 90% through strategic caching.

You've implemented systems that cache at multiple levels: prompt prefixes, full responses,

and semantic similarity matches.

You understand that LLM caching is different from traditional cachingβ€”prompts have

prefixes that can be cached, responses vary with temperature, and semantic similarity

often matters more than exact match.

Your core principles:

  1. Cache at the right levelβ€”prefix, response, or both
  2. K

Capabilities

  • prompt-cache
  • response-cache
  • kv-cache
  • cag-patterns
  • cache-invalidation

Patterns

Anthropic Prompt Caching

Use Claude's native prompt caching for repeated prefixes

Response Caching

Cache full LLM responses for identical or similar queries

Cache Augmented Generation (CAG)

Pre-cache documents in prompt instead of RAG retrieval

Anti-Patterns

❌ Caching with High Temperature

❌ No Cache Invalidation

❌ Caching Everything

⚠️ Sharp Edges

| Issue | Severity | Solution |

|-------|----------|----------|

| Cache miss causes latency spike with additional overhead | high | // Optimize for cache misses, not just hits |

| Cached responses become incorrect over time | high | // Implement proper cache invalidation |

| Prompt caching doesn't work due to prefix changes | medium | // Structure prompts for optimal caching |

Related Skills

Works well with: context-window-management, rag-implementation, conversation-memory

More from this repository10

🎯
senior-devops🎯Skill

Skill

🎯
cpp-pro🎯Skill

Develops high-performance C++ applications with modern C++20/23 features, template metaprogramming, and zero-overhead systems design.

🎯
senior-architect🎯Skill

Designs scalable software architectures using modern tech stacks, generating architecture diagrams, analyzing dependencies, and providing system design recommendations.

🎯
senior-frontend🎯Skill

Generates, analyzes, and scaffolds modern frontend projects using ReactJS, NextJS, TypeScript, and Tailwind CSS with automated best practices.

🎯
spec-miner🎯Skill

Extracts and documents specifications from legacy or undocumented codebases by systematically analyzing code structure, data flows, and system behaviors.

🎯
docs-seeker🎯Skill

Searches and retrieves technical documentation by executing intelligent scripts across library sources, GitHub repos, and context7.com with automated query detection.

🎯
writing-plans🎯Skill

Generates comprehensive, step-by-step implementation plans for software features with precise file paths, test-driven development approach, and clear task granularity.

🎯
file path traversal testing🎯Skill

Tests and identifies potential file path traversal vulnerabilities in code by analyzing file path handling and input validation mechanisms.

🎯
nodejs-best-practices🎯Skill

Guides developers in making strategic Node.js architecture and framework decisions by providing context-aware selection principles and modern runtime considerations.

🎯
red-team-tactics🎯Skill

Simulates adversarial attack techniques across MITRE ATT&CK framework phases, mapping network vulnerabilities and demonstrating systematic compromise strategies.