openrlhf-training
Skill from orchestra-research/ai-research-skills
Trains large language models using open-source reinforcement learning from human feedback (RLHF) techniques with advanced alignment and reward modeling.
Part of
orchestra-research/ai-research-skills (84 items)
Installation
npx @orchestra-research/ai-research-skills
npx @orchestra-research/ai-research-skills list # View installed skills
npx @orchestra-research/ai-research-skills update # Update installed skills
/plugin marketplace add orchestra-research/AI-research-SKILLs
/plugin install fine-tuning@ai-research-skills # Axolotl, LLaMA-Factory, PEFT, Unsloth
+ 4 more commands
More from this repository (10)
Streamlines AI research workflows by providing curated Claude skills for data analysis, literature review, experiment design, and research paper generation.
Assists AI researchers in drafting, structuring, and generating machine learning research papers with academic writing best practices and technical precision.
Streamlines distributed machine learning training using Ray, optimizing hyperparameter tuning and parallel model execution across compute clusters.
Streamlines distributed data processing and machine learning workflows using Ray's scalable data loading and transformation capabilities.
Streamlines machine learning experiment tracking, model versioning, and deployment management with comprehensive MLflow integration and best practices.
Automates multi-agent AI task workflows with dynamic role assignment, collaborative problem-solving, and intelligent task delegation across complex research scenarios.
Quantizes large language models using Activation-aware Weight Quantization (AWQ) to reduce model size and improve inference efficiency.
Enables remote neural network interpretation and analysis through advanced visualization, layer probing, and activation tracking techniques.
Streamlines fine-tuning and deployment of Llama language models with automated configuration, dataset processing, and model optimization workflows.
Provides structured, context-aware advice and recommendations for complex problem-solving, research workflows, and strategic decision-making.