4 results for tag "simpo-training"
A large collection of Claude Code skill templates sponsored by Z.AI, providing ready-to-use development skill configurations across various domains.
A SimPO training skill from the AI Research Engineering Skills Library, providing patterns for implementing Simple Preference Optimization training for language model alignment.
A skill for SimPO (Simple Preference Optimization) training, a reference-free alternative to DPO (Direct Preference Optimization) for aligning language models with human preferences.