llm-evaluation
Skill from phrazzld/claude-config
Automates comprehensive LLM testing through prompt evaluation, regression detection, security scanning, and CI/CD integration using Promptfoo.
Part of phrazzld/claude-config (176 items)
Installation
npx promptfoo@latest init
npx promptfoo@latest eval
npx promptfoo@latest view
npx promptfoo@latest redteam run
npx promptfoo@latest eval -c promptfooconfig.yaml -o results.json
Skill Details
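The last command above writes the evaluation results to results.json. For the CI/CD integration mentioned in the description, one option is a small gate that fails the build when any test case failed. The Python sketch below assumes the results file exposes a failure count at results.stats.failures. That layout and the helper names count_failures and exit_code are assumptions for illustration, not something this page documents, so check them against the file your Promptfoo version actually writes.

```python
import json


def count_failures(report: dict) -> int:
    """Return the number of failed test cases in a parsed results file.

    Assumed layout (hypothetical, verify locally):
        {"results": {"stats": {"successes": 3, "failures": 1}}}
    """
    try:
        return int(report["results"]["stats"]["failures"])
    except (KeyError, TypeError) as exc:
        # Refuse to pass silently when the file does not look as expected.
        raise ValueError("unexpected results layout") from exc


def exit_code(report: dict) -> int:
    """Map a parsed report to a process exit code: 0 = pass, 1 = fail."""
    return 1 if count_failures(report) > 0 else 0


# Small in-memory example standing in for the contents of results.json.
sample = json.loads('{"results": {"stats": {"successes": 3, "failures": 1}}}')
print(exit_code(sample))
```

In a pipeline this could run right after the eval step: load results.json with json.load, pass the parsed dict to exit_code, and exit the process with the returned value. Raising on an unexpected layout, instead of defaulting to zero failures, keeps a format change from turning into a silent pass.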
More from this repository (10)
pencil-to-code skill from phrazzld/claude-config
pencil-renderer skill from phrazzld/claude-config
Provides comprehensive web design guidelines and best practices for creating user-friendly, accessible, and visually appealing websites.
browser-extension-dev skill from phrazzld/claude-config
Enforces opinionated UI development constraints, delegating implementation to Kimi and ensuring accessible, performant frontend design.
Provides comprehensive React and Next.js performance optimization guidelines, offering 45 rules across 8 categories to improve application efficiency and reduce load times.
Systematically reviews code changes against comprehensive quality, design, security, and performance criteria in under 5 minutes.
stripe-local-dev skill from phrazzld/claude-config
llm-gateway-routing skill from phrazzld/claude-config
Channels expert personas like Carmack or Torvalds to ruthlessly critique code, design, and plans through their unique technical perspectives.