tokenization 🔌 Plugin

Orchestra-Research/AI-research-SKILLs

What it does

A tokenization skill from the AI Research Engineering Skills Library, which offers 83 skills across 20 categories covering model architecture, fine-tuning, inference, and other AI research areas.

Overview

Tokenization is a skill from Orchestra Research's AI Research Engineering Skills Library, which is presented as the most comprehensive open-source collection of AI research engineering skills for AI coding agents. It is one of 83 skills organized across 20 categories that enable coding agents to run AI research experiments end to end: preparing datasets, executing training pipelines, and deploying models.

Key Features

  • Tokenization Expertise - Part of the dedicated Tokenization category within the library, providing specialized knowledge for text tokenization tasks in AI/ML pipelines
  • 83 Skills Across 20 Categories - Comprehensive coverage spanning model architecture, fine-tuning, post-training, distributed training, optimization, inference, data processing, evaluation, safety, RAG, multimodal, and more
  • Research Agent Building Blocks - Skills serve as the engineering ability layer that enables coding agents to conduct AI research experiments end-to-end
  • NPM Distribution - Installable via npm as @orchestra-research/ai-research-skills for easy integration into existing workflows
  • Open Source & Community - MIT licensed with an active Slack community, designed for collaboration and extension by AI researchers

Who is this for?

This skill is designed for AI researchers and ML engineers who want their coding agents to handle tokenization tasks within research workflows. It is ideal for teams building AI research pipelines who need agents capable of preparing text data, configuring tokenizers, and integrating tokenization steps into larger training and inference workflows.
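For orientation, here is a minimal sketch of the kind of tokenization step such an agent might be asked to build. It is a toy byte-pair-encoding (BPE) trainer in plain Python and is not taken from the skill itself; the function names `merge_pair`, `train_bpe`, and `encode` are hypothetical, and a real pipeline would more likely rely on an established tokenizer library.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE)."""
    # Each word starts as a tuple of single characters, weighted by its frequency.
    words = Counter(tuple(w) for w in corpus)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs across the corpus.
        pairs = Counter()
        for symbols, freq in words.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken lexicographically
        # so the result is deterministic (an assumption of this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        merged_words = Counter()
        for symbols, freq in words.items():
            merged_words[merge_pair(symbols, best)] += freq
        words = merged_words
    return merges


def encode(word, merges):
    """Tokenize one word by applying the learned merges in training order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


corpus = ["low", "low", "low", "lower"]
merges = train_bpe(corpus, num_merges=2)
print(encode("lowest", merges))  # ['low', 'e', 's', 't']
```

The sketch works on whole words and single characters only; production tokenizers add byte-level fallback, pre-tokenization rules, and special tokens.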

🏪

Part of

orchestra-research-ai-research-skills

Installation

Step 1. Add marketplace in Claude Code:
/plugin marketplace add orchestra-research/AI-research-SKILLs
Step 2. Install plugin:
/plugin install tokenization@ai-research-skills
Last Updated: Feb 19, 2026

More from this repository (10)

orchestra-research-ai-research-skills 🏪 Marketplace

Streamlines AI research workflows by providing curated Claude skills for data analysis, literature review, experiment design, and research paper generation.

prompt-engineering 🔌 Plugin

Prompt-engineering category of the AI Research Engineering Skills library — 4 skills covering DSPy (declarative prompt programming), Instructor (Pydantic-validated structured outputs), Guidance (regex/grammar-constrained generation), and Outlines (FSM-based structured text).

emerging-techniques 🔌 Plugin

Emerging-techniques category of the AI Research Engineering Skills library — 6 skills covering Mixture-of-Experts training, Model Merging (TIES/DARE/SLERP via mergekit), Long Context (RoPE/YaRN/ALiBi), Speculative Decoding, Knowledge Distillation, and Model Pruning.

ml-paper-writing 🔌 Plugin

AI research skill for writing publication-ready ML papers for top conferences (NeurIPS, ICML, ICLR, ACL, AAAI, COLM) with LaTeX templates and citation verification.

multimodal 🔌 Plugin

A collection of 7 multimodal AI research skills covering CLIP, Whisper, LLaVA, BLIP-2, SAM, Stable Diffusion, and AudioCraft — part of Orchestra Research's 83 AI research engineering skills for coding agents.

safety-alignment 🔌 Plugin

A collection of 4 AI safety and alignment research skills covering Constitutional AI, LlamaGuard safety classifier, NeMo programmable guardrails with Colang, and Meta's Prompt Guard injection detector — part of Orchestra Research's AI research engineering skills.

ml-paper-writing 🎯 Skill

Assists AI researchers in drafting, structuring, and generating machine learning research papers with academic writing best practices and technical precision.

mlflow 🎯 Skill

MLflow experiment tracking and model management skill from Orchestra Research, part of the most comprehensive open-source AI research engineering skills library with 83 skills.

brainstorming-research-ideas 🎯 Skill

Structured ideation frameworks for discovering high-impact research directions with 10 complementary lenses (384 lines). Part of orchestra-research/ai-research-skills.

faiss 🎯 Skill

FAISS vector search skill from Orchestra Research for efficient similarity search and dense vector clustering in AI research workflows.