verl-rl-training
Skill from orchestra-research/ai-research-skills
Trains and fine-tunes large language models with reinforcement learning using the verl (Volcano Engine Reinforcement Learning) library, supporting algorithms such as PPO and GRPO for robust policy optimization.
Part of orchestra-research/ai-research-skills (84 items)
Installation
npx @orchestra-research/ai-research-skills
npx @orchestra-research/ai-research-skills list     # View installed skills
npx @orchestra-research/ai-research-skills update   # Update installed skills
/plugin marketplace add orchestra-research/ai-research-skills
/plugin install fine-tuning@ai-research-skills      # Axolotl, LLaMA-Factory, PEFT, Unsloth
(plus 4 more commands)
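Once installed, the skill drives verl training runs from the command line. The sketch below is a minimal, illustrative launch of a GRPO run, assuming verl's hydra-style `main_ppo` entry point and the config keys shown in its quickstart documentation; the dataset paths and model name are placeholders, and exact key names may vary across verl versions.

```bash
# Minimal sketch of a verl RL training launch. Config keys follow verl's
# quickstart docs; verify them against the version you have installed.
python3 -m verl.trainer.main_ppo \
    algorithm.adv_estimator=grpo \
    data.train_files=$HOME/data/gsm8k/train.parquet \
    data.val_files=$HOME/data/gsm8k/test.parquet \
    data.train_batch_size=256 \
    actor_rollout_ref.model.path=Qwen/Qwen2.5-0.5B-Instruct \
    actor_rollout_ref.actor.optim.lr=1e-6 \
    trainer.n_gpus_per_node=8 \
    trainer.nnodes=1 \
    trainer.total_epochs=1
```

verl groups actor, rollout, and reference-model settings under a single `actor_rollout_ref` namespace, which lets one command configure generation and policy updates together across GPUs.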
More from this repository (10)
Streamlines AI research workflows by providing curated Claude skills for data analysis, literature review, experiment design, and research paper generation.
Assists AI researchers in drafting, structuring, and generating machine learning research papers with academic writing best practices and technical precision.
Streamlines distributed machine learning training using Ray, optimizing hyperparameter tuning and parallel model execution across compute clusters.
Streamlines distributed data processing and machine learning workflows using Ray's scalable data loading and transformation capabilities.
Streamlines parameter-efficient fine-tuning of large language models using Transformer Reinforcement Learning (TRL) techniques and best practices.
Enables distributed training of large AI models using PyTorch's Fully Sharded Data Parallel (FSDP) with advanced memory optimization and scaling techniques.
Enables remote neural network interpretation and analysis through advanced visualization, layer probing, and activation tracking techniques.
Accelerates AI model inference by drafting multiple candidate tokens and verifying them in parallel, reducing latency and improving generation speed.
Streamlines training and fine-tuning of Mixture of Experts (MoE) models with automated hyperparameter optimization and distributed learning strategies.
Efficiently fine-tune large language models using Parameter-Efficient Fine-Tuning (PEFT) techniques with minimal computational resources and memory overhead.