domain-ml

Skill from goooice/rust-skills

What it does

Enables efficient machine learning and AI development in Rust with optimized tensor operations, GPU acceleration, and model inference across frameworks.

Part of goooice/rust-skills (35 items)

Installation

- Add marketplace to Claude Code: `/plugin marketplace add ZhangHanDong/rust-skills`
- Install plugin from marketplace: `/plugin install rust-skills@rust-skills`
- Quick install with npx: `npx skills add ZhangHanDong/rust-skills`
- Clone the repository: `git clone https://github.com/ZhangHanDong/rust-skills.git`
📖 Extracted from docs: goooice/rust-skills · 4 installs · Added Feb 4, 2026

Skill Details

SKILL.md

"Use when building ML/AI apps in Rust. Keywords: machine learning, ML, AI, tensor, model, inference, neural network, deep learning, training, prediction, ndarray, tch-rs, burn, candle, ζœΊε™¨ε­¦δΉ  (machine learning), δΊΊε·₯智能 (artificial intelligence), ζ¨‘εž‹ζŽ¨η† (model inference)"

Overview

# Machine Learning Domain

> Layer 3: Domain Constraints

Domain Constraints → Design Implications

| Domain Rule | Design Constraint | Rust Implication |
|-------------|-------------------|------------------|
| Large data | Efficient memory | Zero-copy, streaming |
| GPU acceleration | CUDA/Metal support | candle, tch-rs |
| Model portability | Standard formats | ONNX |
| Batch processing | Throughput over latency | Batched inference |
| Numerical precision | Float handling | ndarray, careful f32/f64 |
| Reproducibility | Deterministic | Seeded random, versioning |
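The reproducibility row can be made concrete: a seeded generator replays the exact same "random" draws. In real projects you would seed a crate like `rand` (e.g. `StdRng::seed_from_u64`); this dependency-free sketch uses a hand-rolled xorshift generator purely to illustrate the principle.

```rust
/// Minimal xorshift64 generator: same seed -> same sequence.
/// A stand-in for a properly seeded RNG such as rand's StdRng.
struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // Avoid the all-zero state, which xorshift can never leave.
        Self { state: seed.max(1) }
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

/// Draw `n` pseudo-random f64 values in [0, 1) from a seeded generator.
fn seeded_samples(seed: u64, n: usize) -> Vec<f64> {
    let mut rng = XorShift64::new(seed);
    (0..n)
        .map(|_| (rng.next_u64() >> 11) as f64 / (1u64 << 53) as f64)
        .collect()
}

fn main() {
    // Identical seeds reproduce identical draws; different seeds diverge.
    assert_eq!(seeded_samples(42, 5), seeded_samples(42, 5));
    assert_ne!(seeded_samples(42, 5), seeded_samples(43, 5));
}
```

Pinning the seed (and versioning it alongside the model) is what turns "random" augmentation or initialization into a reproducible experiment.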

---

Critical Constraints

Memory Efficiency

```
RULE: Avoid copying large tensors
WHY: Memory bandwidth is bottleneck
RUST: References, views, in-place ops
```
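The same rule in miniature, using plain slices (ndarray's `ArrayView` and `slice()` are the real tools, but borrowed `&[f32]` views show the principle without any dependency):

```rust
/// Scale a tensor in place instead of allocating a new buffer.
fn scale_in_place(data: &mut [f32], factor: f32) {
    for x in data.iter_mut() {
        *x *= factor;
    }
}

/// Sum over a borrowed view (`&[f32]`): the underlying data is never copied.
fn view_sum(view: &[f32]) -> f32 {
    view.iter().sum()
}

fn main() {
    let mut tensor = vec![1.0_f32, 2.0, 3.0, 4.0];

    // Borrow a "row" as a view; the buffer is not cloned.
    let row = &tensor[0..2];
    assert_eq!(view_sum(row), 3.0);

    // Mutate in place; no second allocation is made.
    scale_in_place(&mut tensor, 2.0);
    assert_eq!(tensor, vec![2.0, 4.0, 6.0, 8.0]);
}
```

The anti-pattern is `tensor.clone()` or `.to_vec()` before every operation: each clone re-traverses memory that bandwidth-bound ML code cannot afford to touch twice.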

GPU Utilization

```
RULE: Batch operations for GPU efficiency
WHY: GPU overhead per kernel launch
RUST: Batch sizes, async data loading
```

Model Portability

```
RULE: Use standard model formats
WHY: Train in Python, deploy in Rust
RUST: ONNX via tract or candle
```

---

Trace Down ↓

From constraints to design (Layer 2):

```
"Need efficient data pipelines"
↓ m10-performance: Streaming, batching
↓ polars: Lazy evaluation

"Need GPU inference"
↓ m07-concurrency: Async data loading
↓ candle/tch-rs: CUDA backend

"Need model loading"
↓ m12-lifecycle: Lazy init, caching
↓ tract: ONNX runtime
```

---

Use Case → Framework

| Use Case | Recommended | Why |
|----------|-------------|-----|
| Inference only | tract (ONNX) | Lightweight, portable |
| Training + inference | candle, burn | Pure Rust, GPU |
| PyTorch models | tch-rs | Direct bindings |
| Data pipelines | polars | Fast, lazy eval |

Key Crates

| Purpose | Crate |
|---------|-------|
| Tensors | ndarray |
| ONNX inference | tract |
| ML framework | candle, burn |
| PyTorch bindings | tch-rs |
| Data processing | polars |
| Embeddings | fastembed |

Design Patterns

| Pattern | Purpose | Implementation |
|---------|---------|----------------|
| Model loading | Once, reuse | OnceLock |
| Batching | Throughput | Collect then process |
| Streaming | Large data | Iterator-based |
| GPU async | Parallelism | Data loading parallel to compute |
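The streaming row deserves a sketch: iterator adaptors let records flow through a preprocessing pipeline lazily, so the whole dataset is never materialized at once. The pipeline below (parse, then scale by 255 as if normalizing pixel values) is an illustrative assumption, not a fixed recipe:

```rust
/// Stream records through a preprocessing pipeline without materializing
/// the whole dataset: each item flows through filter/map lazily.
fn preprocess<'a>(
    raw: impl Iterator<Item = &'a str> + 'a,
) -> impl Iterator<Item = f32> + 'a {
    raw.filter_map(|line| line.trim().parse::<f32>().ok())
        .map(|x| x / 255.0) // e.g. scale pixel values into [0, 1]
}

fn main() {
    // Stand-in for lines read lazily from a large file.
    let raw = ["255", "0", "not-a-number", " 51 "];
    let processed: Vec<f32> = preprocess(raw.iter().copied()).collect();
    assert_eq!(processed, vec![1.0, 0.0, 0.2]);
}
```

Swapping the array for `BufReader::lines()` gives the same pipeline over a file of any size, with memory bounded by one record.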

Code Pattern: Inference Server

```rust
use std::sync::OnceLock;
use tract_onnx::prelude::*;

// Load the ONNX model once; every request reuses the same plan.
static MODEL: OnceLock<TypedSimplePlan<TypedModel>> = OnceLock::new();

fn get_model() -> &'static TypedSimplePlan<TypedModel> {
    MODEL.get_or_init(|| {
        tract_onnx::onnx()
            .model_for_path("model.onnx")
            .unwrap()
            .into_optimized()
            .unwrap()
            .into_runnable()
            .unwrap()
    })
}

async fn predict(input: Vec<f32>) -> anyhow::Result<Vec<f32>> {
    let model = get_model();
    // Shape the flat input as a (1, n) batch of one.
    let input = tract_ndarray::arr1(&input).into_shape((1, input.len()))?;
    let result = model.run(tvec!(input.into()))?;
    Ok(result[0].to_array_view::<f32>()?.iter().copied().collect())
}
```

Code Pattern: Batched Inference

```rust
// `model`, `stack_inputs`, and `unstack_outputs` are framework-specific;
// this sketches only the batching control flow.
async fn batch_predict(inputs: Vec<Vec<f32>>, batch_size: usize) -> Vec<Vec<f32>> {
    let mut results = Vec::with_capacity(inputs.len());
    for batch in inputs.chunks(batch_size) {
        // Stack inputs into a single batch tensor
        let batch_tensor = stack_inputs(batch);
        // Run inference on the whole batch at once
        let batch_output = model.run(batch_tensor).await;
        // Unstack per-item results
        results.extend(unstack_outputs(batch_output));
    }
    results
}
```
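`stack_inputs` and `unstack_outputs` are left abstract above because their real signatures depend on the framework. One plausible host-side shape, flattening equal-length rows into a contiguous row-major buffer plus its `(batch, features)` shape (names and layout are this sketch's assumptions):

```rust
/// Flatten a batch of equal-length rows into one contiguous row-major
/// buffer plus its (batch, features) shape -- the host-side analogue of
/// stacking tensors before a batched inference call.
fn stack_inputs(batch: &[Vec<f32>]) -> (Vec<f32>, (usize, usize)) {
    let features = batch.first().map_or(0, |row| row.len());
    let mut data = Vec::with_capacity(batch.len() * features);
    for row in batch {
        assert_eq!(row.len(), features, "ragged batch");
        data.extend_from_slice(row);
    }
    (data, (batch.len(), features))
}

/// Split a contiguous batched output back into per-item rows.
fn unstack_outputs(data: Vec<f32>, shape: (usize, usize)) -> Vec<Vec<f32>> {
    data.chunks(shape.1).map(|row| row.to_vec()).collect()
}

fn main() {
    let batch = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    let (flat, shape) = stack_inputs(&batch);
    assert_eq!(flat, vec![1.0, 2.0, 3.0, 4.0]);
    assert_eq!(shape, (2, 2));
    // Round-tripping recovers the original per-item rows.
    assert_eq!(unstack_outputs(flat, shape), batch);
}
```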

---

Common Mistakes

| Mistake | Domain Violation | Fix |
|---------|------------------|-----|
| Clone tensors | Memory waste | Use views |
| Single inference | GPU underutilized | Batch processing |
| Load model per request | Slow | Singleton pattern |
| Sync data loading | GPU idle | Async pipeline |

---

Trace to Layer 1

| Constraint | Layer 2 Pattern | Layer 1 Implementation |
|------------|-----------------|------------------------|
| Memory efficiency | Zero-copy | ndarray views |
| Model singleton | Lazy init | OnceLock |
| Batch processing | Chunked iteration | chunks() + parallel |
| GPU async | Concurrent loading | tokio::spawn + GPU |
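The "GPU async" row is the pipeline idea: prepare the next batch while the current one is computing. A dependency-free sketch using std threads and a bounded channel in place of tokio and a GPU backend (the summing "compute" is a placeholder for inference):

```rust
use std::sync::mpsc;
use std::thread;

/// Overlap "data loading" with "compute": the loader thread prepares the
/// next batch while the consumer is still crunching the previous one.
fn pipelined_sum(batches: Vec<Vec<u64>>) -> u64 {
    // Capacity 1 = prefetch exactly one batch ahead.
    let (tx, rx) = mpsc::sync_channel::<Vec<u64>>(1);

    let loader = thread::spawn(move || {
        for batch in batches {
            // Real code would read/decode/augment here.
            if tx.send(batch).is_err() {
                break;
            }
        }
    });

    let mut total = 0;
    for batch in rx {
        // "Compute" runs concurrently with the loader preparing the next batch.
        total += batch.iter().sum::<u64>();
    }
    loader.join().unwrap();
    total
}

fn main() {
    let batches = vec![vec![1, 2, 3], vec![4, 5], vec![6]];
    assert_eq!(pipelined_sum(batches), 21);
}
```

With tokio the structure is the same: a spawned loading task feeding a bounded channel, so the GPU never idles waiting on I/O.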

---

Related Skills

| When | See |
|------|-----|
| Performance | m10-performance |
| Lazy initialization | m12-lifecycle |
| Async patterns | m07-concurrency |
| Memory efficiency | m01-ownership |

More from this repository (10)

- m10-performance: Optimizes code performance by identifying bottlenecks, measuring impact, and guiding strategic improvements across algorithm, data structure, and memory efficiency.
- m14-mental-model: Applies the M14 mental model framework to enhance decision-making and strategic thinking through structured cognitive analysis.
- m04-zero-cost: Guides developers in choosing zero-cost abstractions by analyzing type system constraints and performance trade-offs in Rust generics and traits.
- meta-cognition-parallel: Performs parallel three-layer meta-cognitive analysis by forking subagents to simultaneously analyze language mechanics, design choices, and domain constraints, then synthesizing results.
- unsafe-checker: Identifies and reviews unsafe Rust code patterns, FFI risks, and potential memory unsafety in Rust projects.
- rust-skill-creator: Dynamically generates Claude skills for Rust crates, standard library modules, and documentation by extracting and processing technical details from specified URLs.
- coding-guidelines: Provides comprehensive Rust coding guidelines covering naming conventions, best practices, error handling, memory management, concurrency, and code style recommendations.
- rust-refactor-helper: Performs safe Rust refactoring by analyzing symbol references, checking conflicts, and applying changes across project files using LSP.
- m03-mutability: Diagnoses and guides resolution of Rust mutability and borrowing conflicts by analyzing ownership, mutation patterns, and thread-safety requirements.
- m05-type-driven: Explores and demonstrates type-driven development techniques in Rust, showcasing advanced type system features and pattern matching strategies.