training-llms-megatron
Skill from zechenzhangagi/ai-research-skills
Trains large language models with the Megatron-LM framework, using advanced parallelism (tensor, pipeline, and data parallelism) and optimization techniques for high-performance model development.
Part of zechenzhangagi/ai-research-skills (83 items)
Installation
npx @orchestra-research/ai-research-skills
npx @orchestra-research/ai-research-skills list    # View installed skills
npx @orchestra-research/ai-research-skills update  # Update installed skills
/plugin marketplace add orchestra-research/AI-research-SKILLs
/plugin install fine-tuning@ai-research-skills     # Axolotl, LLaMA-Factory, PEFT, Unsloth
(+ 4 more commands)
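Once the skill is installed, a typical Megatron-LM run combines tensor and pipeline parallelism in the launch command. The sketch below assumes the upstream NVIDIA/Megatron-LM repository and a single 8-GPU node; the model sizes, dataset paths, tokenizer files, and checkpoint directories are illustrative placeholders, not values shipped with this skill.

```bash
# Minimal sketch: GPT pretraining with Megatron-LM on one 8-GPU node.
# 2-way tensor parallelism x 2-way pipeline parallelism leaves 2-way data
# parallelism across the remaining GPUs (8 / (2 * 2) = 2).
torchrun --nproc_per_node=8 pretrain_gpt.py \
    --tensor-model-parallel-size 2 \
    --pipeline-model-parallel-size 2 \
    --num-layers 24 \
    --hidden-size 2048 \
    --num-attention-heads 16 \
    --seq-length 2048 \
    --max-position-embeddings 2048 \
    --micro-batch-size 4 \
    --global-batch-size 256 \
    --train-iters 100000 \
    --lr 3.0e-4 \
    --bf16 \
    --tokenizer-type GPT2BPETokenizer \
    --vocab-file /path/to/gpt2-vocab.json \
    --merge-file /path/to/gpt2-merges.txt \
    --data-path /path/to/my-corpus_text_document \
    --save /path/to/checkpoints \
    --load /path/to/checkpoints
```

Tensor parallelism shards each layer's weight matrices across GPUs, while pipeline parallelism assigns contiguous groups of layers to different GPUs; the product of the two sizes must evenly divide the total GPU count, with the remainder used for data parallelism.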
More from this repository (10 items)
ml-paper-writing
Performs efficient semantic vector search and similarity matching using Qdrant's vector database for AI-powered information retrieval and recommendation systems.
Orchestrates seamless multi-cloud deployment and management of AI workloads across different cloud providers using SkyPilot's infrastructure automation capabilities.
Enables AI agents to leverage LangChain's framework for building complex language model workflows and chaining together different AI components and tools.
Orchestrates multi-agent collaboration using CrewAI for complex research tasks, enabling specialized AI agents to work together systematically.
Enables AI agents to analyze and understand images by leveraging the BLIP-2 vision-language model for multimodal perception and reasoning tasks.
Generates and manages fine-tuning configurations and workflows for Llama language models, streamlining the process of customizing and training large language models.
Generates and manages autonomous AI agents using AutoGPT's framework for executing complex, multi-step research and problem-solving tasks.
Logs and tracks machine learning experiments, model performance, and hyperparameters using Weights & Biases platform for comprehensive AI research visualization.
Accelerates large language model inference by drafting candidate token sequences that the full model then verifies, reducing computational latency.