model-evaluation-benchmark
model-evaluation-benchmark skill from rysweet/amplihack
Part of rysweet/amplihack (81 items)
Installation
python run_benchmarks.py --model {opus|sonnet} --tasks 1,2,3,4
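The command takes a model name (opus or sonnet) and a comma-separated list of task numbers. The source does not show how run_benchmarks.py reads these flags, so the following is only a minimal sketch of how a script with that interface might parse them. The function name parse_args, the use of argparse, and the conversion of --tasks into a list of integers are all assumptions for illustration, not the repository's actual code.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical parser mirroring the flags in the command above."""
    parser = argparse.ArgumentParser(prog="run_benchmarks.py")
    # Assumption: only the two model names shown in the command are accepted.
    parser.add_argument("--model", choices=["opus", "sonnet"], required=True)
    # Assumption: --tasks is a comma-separated list of integer task IDs.
    parser.add_argument(
        "--tasks",
        type=lambda s: [int(t) for t in s.split(",")],
        required=True,
    )
    return parser.parse_args(argv)


args = parse_args(["--model", "opus", "--tasks", "1,3"])
print(args.model, args.tasks)
```

Under these assumptions, a concrete invocation picks one model, for example: python run_benchmarks.py --model opus --tasks 1,3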
More from this repository (10)
Performs comprehensive cybersecurity analysis by evaluating events through threat modeling, risk assessment, and defensive frameworks to identify vulnerabilities and recommend mitigation strategies.
lawyer-analyst skill from rysweet/amplihack
philosopher-analyst skill from rysweet/amplihack
documentation-writing skill from rysweet/amplihack
psychologist-analyst skill from rysweet/amplihack
roadmap-strategist skill from rysweet/amplihack
computer-scientist-analyst skill from rysweet/amplihack
investigation-workflow skill from rysweet/amplihack
storytelling-synthesizer skill from rysweet/amplihack
journalist-analyst skill from rysweet/amplihack