hugging-face-evaluation
🔌 Plugin: huggingface/skills
Official Hugging Face skills defining AI/ML tasks like dataset creation, model training, and evaluation. Interoperable with Claude Code, OpenAI Codex, Gemini CLI, and Cursor using the standardized Agent Skill format.
Overview
Hugging Face Evaluation is a plugin from Hugging Face's official skills repository that provides AI/ML task definitions for model evaluation. The skills follow the standardized Agent Skill format, so they work across coding agents including OpenAI Codex, Claude Code, Gemini CLI, and Cursor, and they can be installed in several ways depending on the agent.
Key Features
- Cross-Agent Interoperability - Compatible with Claude Code, OpenAI Codex, Gemini CLI, and Cursor, with additional integrations for Windsurf and Continue in development
- Standardized Skill Format - Follows the Agent Skill format with YAML frontmatter, SKILL.md files, and AGENTS.md fallback for agents that do not support skills natively
- Multiple Installation Methods - Install via Claude Code plugin marketplace, Codex AGENTS.md auto-detection, or Gemini CLI extensions with local or URL-based setup
- ML Task Coverage - Skills cover dataset creation, model training, evaluation, and other core Hugging Face Hub operations
- Gemini Extension Support - Includes gemini-extension.json for native integration with the Gemini CLI
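To make the skill format above more concrete, here is a minimal sketch of how a tool might read the YAML frontmatter at the top of a SKILL.md file. The sample file content, the field values, and the helper name split_frontmatter are all hypothetical and chosen for illustration; the parser handles only flat key: value pairs and is not how any particular agent actually loads skills.

```python
# Hypothetical SKILL.md content; the field values are illustrative only.
SAMPLE_SKILL_MD = """---
name: example-evaluation-skill
description: Illustrative description of what the skill does and when to use it.
---

# Example skill

Instructions for the agent go here.
"""


def split_frontmatter(text):
    """Split a SKILL.md-style document into (metadata dict, body string).

    Handles only flat `key: value` pairs, not full YAML.
    Returns ({}, text) when no complete frontmatter block is found.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing `---` delimiter of the frontmatter block.
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i
            break
    if end is None:
        return {}, text
    metadata = {}
    for line in lines[1:end]:
        if ":" not in line or line.lstrip().startswith("#"):
            continue  # skip blank lines, comments, and anything without a key
        key, _, value = line.partition(":")
        metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


metadata, body = split_frontmatter(SAMPLE_SKILL_MD)
print(metadata["name"])
print(body.splitlines()[0])
```

A real loader would more likely use a full YAML parser so that nested or multi-line values are handled; the hand-rolled version here only keeps the sketch dependency-free.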
Who is this for?
This skill is designed for ML engineers and data scientists who need AI-assisted guidance on evaluating machine learning models using Hugging Face tools and infrastructure. It is particularly useful for teams that work across multiple AI coding agents and want consistent evaluation workflows regardless of which tool they use.
Part of
huggingface-skills
Installation
/plugin marketplace add huggingface/skills
/plugin install hugging-face-evaluation@huggingface-skills
More from this repository
Agent Skills for AI/ML tasks including dataset creation, model training, evaluation, and research paper publishing on Hugging Face Hub
Skill
Trains and fine-tunes language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure, supporting SFT, DPO, GRPO, reward modeling, and GGUF conversion for local deployment.