Showing 30 of 14311 results
Automated migration tool for projects built on essencium-frontend. Analyzes upstream changes, detects downstream customizations, and applies version-by-version migrations interactively.
from PolicyEngine/policyengine-claude
Measures AI estimation accuracy β how long AI thinks a task takes vs how long it actually takes. Tracks per-model, per-tool. Exports to OpenTelemetry.
EVA-01: Automated life testing platform with spec discussion, PRD generation, implementation, and observation commands
Experiment Design Engineer β Experiment design β A/B testing, statistical power, experiment tracking, causal inference
κ²°κ³Ό νμ§ μλ νκ° β Haiku LLM-as-judge 0~10μ (μ€ν)
Design eval harnesses β task schemas, metrics, dataset versioning, eval-as-code patterns.
Build automated regression suites β golden sets, threshold alerting, CI integration for model changes.
AI Operations Team β Evals: Eval harness design, benchmark suites, automated regression, human eval orchestration.
Control a real Chrome browser from your AI coding agent via the Ever CLI β navigate, snapshot, click, type, extract, screenshot, and record video.
Complete collection of agents, skills, hooks, commands, and rules evolved over 10+ months of intensive daily use
from affaan-m/everything-claude-code