4 results for tag "evaluating-llms-harness"
A large collection of Claude Code skill templates sponsored by Z.AI, providing ready-to-use development skill configurations across various domains.
An evaluating-llms-harness skill from AI Research Skills by Orchestra Research, which provides skills for academic research, paper analysis, and scientific workflows.
A skill from the Droid Tings collection for evaluating LLMs with evaluation harnesses. The collection is a comprehensive set of 375 skills and 155 custom droids covering AI/ML, development, scientific research, and more.