4 results for tag "serving-llms-vllm"
A large collection of Claude Code skill templates sponsored by Z.AI, providing ready-to-use development skill configurations across various domains.
vLLM skill from Orchestra Research for deploying and serving large language models with high throughput and low latency.
Skill for serving large language models with vLLM, covering deployment configurations, batching strategies, and performance optimization for LLM inference.