devtu-create-tool

from mims-harvard/tooluniverse

What it does

Generates scientific tool classes and configurations for ToolUniverse framework, ensuring proper structure, validation, and automated wrapper creation.


Added Feb 4, 2026

Skill Details

SKILL.md

Create new scientific tools for ToolUniverse framework with proper structure, validation, and testing. Use when users need to add tools to ToolUniverse, implement new API integrations, create tool wrappers for scientific databases/services, expand ToolUniverse capabilities, or follow ToolUniverse contribution guidelines. Supports creating tool classes, JSON configurations, validation, error handling, and test examples.

Overview

# ToolUniverse Tool Creator

Create new scientific tools for the ToolUniverse framework following established best practices.

---

[Critical Knowledge](#critical-knowledge)
[Core Concepts](#core-concepts)
[Implementation Guide](#implementation-guide)
[Testing Strategy](#testing-strategy)
[Common Patterns](#common-patterns)
[Troubleshooting](#troubleshooting)
[Reference](#reference)

---

Critical Knowledge

Top 5 Mistakes (90% of Failures)

1. Missing default_config.py entry: tools silently won't load
2. Fake test_examples: tests fail, agents get bad examples
3. Single-level testing: misses registration bugs
4. Tool names > 55 chars: breaks MCP compatibility
5. Raising exceptions: run() should return error dicts instead

Tool Creator vs SDK User

| SDK User (Using) | Tool Creator (Building) |
|------------------|-------------------------|
| tu.tools.ToolName() | @register_tool() + JSON |
| Handle responses | Design schemas |
| One-level usage | Three-step registration |

---

Core Concepts

Two-Stage Architecture

```
Stage 1: Tool Class              Stage 2: Wrappers (Auto-Generated)
Python Implementation            From JSON Configs
         ↓                                ↓
@register_tool("MyTool")         MyAPI_list_items()
class MyTool(BaseTool):          MyAPI_search()
    def run(arguments):          MyAPI_get_details()
```

Key Points:

One class handles multiple operations
JSON defines individual tool wrappers
Users call wrappers, which route to class
Need BOTH for tools to work
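
The routing described in these points can be pictured with a toy, framework-free sketch. Everything in it (`TOOL_CLASSES`, `toy_register`, `DemoTool`, `make_wrapper`) is hypothetical and only stands in for ToolUniverse's real registry and generated wrappers, which this document does not show. The one idea taken from the text is that each JSON entry names a class, and several entries can share that class.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for a class registry; NOT ToolUniverse's real one.
TOOL_CLASSES: Dict[str, type] = {}


def toy_register(name: str) -> Callable[[type], type]:
    """Toy decorator that records a class under a name."""
    def decorator(cls: type) -> type:
        TOOL_CLASSES[name] = cls
        return cls
    return decorator


@toy_register("DemoTool")
class DemoTool:
    """One class that serves several operations."""

    def __init__(self, tool_config: Dict[str, Any]) -> None:
        self.tool_config = tool_config

    def run(self, arguments: Dict[str, Any]) -> Dict[str, Any]:
        operation = arguments.get("operation")
        if operation == "list_items":
            return {"status": "success", "data": ["a", "b"]}
        if operation == "search":
            return {"status": "success", "data": [arguments.get("query")]}
        return {"status": "error", "error": f"Unknown: {operation}"}


def make_wrapper(config: Dict[str, Any]) -> Callable[..., Dict[str, Any]]:
    """Build one callable per JSON-style entry that routes to the shared class."""
    tool = TOOL_CLASSES[config["class"]](config)

    def wrapper(**arguments: Any) -> Dict[str, Any]:
        return tool.run(arguments)

    wrapper.__name__ = config["name"]
    return wrapper


# Two JSON-style entries, both backed by the same class.
configs = [
    {"name": "Demo_list_items", "class": "DemoTool"},
    {"name": "Demo_search", "class": "DemoTool"},
]
wrappers = {c["name"]: make_wrapper(c) for c in configs}

print(wrappers["Demo_search"](operation="search", query="protein"))
```

Dropping either half breaks the call path: without the class there is nothing to route to, and without the config entry no wrapper is ever built.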

Three-Step Registration

Step 1: Class Registration

```python
from tooluniverse.tool import BaseTool
from tooluniverse.tool_utils import register_tool

@register_tool("MyAPITool")  # Decorator registers the class
class MyAPITool(BaseTool):
    pass
```

Step 2: Config Registration ⚠️ MOST COMMONLY MISSED

```python
# In src/tooluniverse/default_config.py
TOOLS_CONFIGS = {
    "my_category": os.path.join(current_dir, "data", "my_category_tools.json"),
}
```

Step 3: Wrapper Generation (Automatic)

```python
from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()  # Auto-generates wrappers in tools/
```

Verification Script:

```python
import sys
sys.path.insert(0, 'src')

# Step 1: Check class registered
from tooluniverse.tool_registry import get_tool_registry
import tooluniverse.your_tool_module  # Importing the module triggers the decorator

registry = get_tool_registry()
assert "YourToolClass" in registry, "❌ Step 1 FAILED"
print("✅ Step 1: Class registered")

# Step 2: Check config registered
from tooluniverse.default_config import TOOLS_CONFIGS

assert "your_category" in TOOLS_CONFIGS, "❌ Step 2 FAILED"
print("✅ Step 2: Config registered")

# Step 3: Check wrappers generated
from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()
assert hasattr(tu.tools, 'YourCategory_operation1'), "❌ Step 3 FAILED"
print("✅ Step 3: Wrappers generated")

print("✅ All steps complete!")
```

Standard Response Format

All tools must return:

```json
{
  "status": "success" | "error",
  "data": {...},       // On success
  "error": "message"   // On failure
}
```

Why: Consistent error handling, composability, user expectations
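
A small checker can make this contract testable. The sketch below is one possible reading of the format above, not part of ToolUniverse: `check_response` is a hypothetical helper, and the rule that `error` must be a string is an assumption drawn from the `"message"` placeholder.

```python
from typing import Any, List


def check_response(response: Any) -> List[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    if not isinstance(response, dict):
        return ["response is not a dict"]

    problems: List[str] = []
    status = response.get("status")
    if status not in ("success", "error"):
        problems.append(f"status must be 'success' or 'error', got {status!r}")
    if status == "success" and "data" not in response:
        problems.append("success response is missing 'data'")
    if status == "error" and not isinstance(response.get("error"), str):
        problems.append("error response needs a string 'error' message")
    return problems


print(check_response({"status": "success", "data": []}))
print(check_response({"status": "success", "results": []}))
```

Calling a helper like this from unit tests catches handlers that return their payload under a key other than `data`.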

---

Implementation Guide

File Structure

Required Files:

src/tooluniverse/my_api_tool.py - Implementation
src/tooluniverse/data/my_api_tools.json - Tool definitions
tests/unit/test_my_api_tool.py - Tests
examples/my_api_examples.py - Usage examples

Auto-Generated (don't create manually):

src/tooluniverse/tools/MyAPI_*.py - Wrappers

Pattern 1: Multi-Operation Tool (Recommended)

Python Class:

```python
from typing import Dict, Any
from tooluniverse.tool import BaseTool
from tooluniverse.tool_utils import register_tool
import requests


@register_tool("MyAPITool")
class MyAPITool(BaseTool):
    """Tool for MyAPI database."""

    BASE_URL = "https://api.example.com/v1"

    def __init__(self, tool_config):
        super().__init__(tool_config)
        self.parameter = tool_config.get("parameter", {})
        self.required = self.parameter.get("required", [])

    def run(self, arguments: Dict[str, Any]) -> Dict[str, Any]:
        """Route to operation handler."""
        operation = arguments.get("operation")
        if not operation:
            return {"status": "error", "error": "Missing: operation"}

        if operation == "list_items":
            return self._list_items(arguments)
        elif operation == "search":
            return self._search(arguments)
        else:
            return {"status": "error", "error": f"Unknown: {operation}"}

    def _list_items(self, arguments: Dict[str, Any]) -> Dict[str, Any]:
        """List items with pagination."""
        try:
            params = {}
            if "limit" in arguments:
                params["limit"] = arguments["limit"]

            response = requests.get(
                f"{self.BASE_URL}/items",
                params=params,
                timeout=30
            )
            response.raise_for_status()
            data = response.json()

            return {
                "status": "success",
                "data": data.get("items", []),
                "total": data.get("total", 0)
            }
        except requests.exceptions.Timeout:
            return {"status": "error", "error": "Timeout after 30s"}
        except requests.exceptions.HTTPError as e:
            return {"status": "error", "error": f"HTTP {e.response.status_code}"}
        except Exception as e:
            # Catch-all so run() never raises (covers JSON parsing errors too)
            return {"status": "error", "error": str(e)}

    def _search(self, arguments: Dict[str, Any]) -> Dict[str, Any]:
        """Search items by query."""
        query = arguments.get("query")
        if not query:
            return {"status": "error", "error": "Missing: query"}

        try:
            response = requests.get(
                f"{self.BASE_URL}/search",
                params={"q": query},
                timeout=30
            )
            response.raise_for_status()
            data = response.json()

            # Payload goes under "data" to match the standard response format
            return {
                "status": "success",
                "data": data.get("results", []),
                "count": data.get("count", 0)
            }
        except requests.exceptions.RequestException as e:
            return {"status": "error", "error": f"API failed: {str(e)}"}
        except ValueError as e:
            # response.json() can raise ValueError on a malformed body
            return {"status": "error", "error": f"Invalid JSON: {str(e)}"}
```

JSON Configuration:

```json
[
  {
    "name": "MyAPI_list_items",
    "class": "MyAPITool",
    "description": "List items from database with pagination. Returns item IDs and names. Supports filtering by status and type. Example: limit=10 returns first 10 items.",
    "parameter": {
      "type": "object",
      "required": ["operation"],
      "properties": {
        "operation": {
          "const": "list_items",
          "description": "Operation type (fixed)"
        },
        "limit": {
          "type": "integer",
          "description": "Max results (1-100)",
          "minimum": 1,
          "maximum": 100
        }
      }
    },
    "return": {
      "type": "object",
      "properties": {
        "status": {"type": "string", "enum": ["success", "error"]},
        "data": {"type": "array"},
        "total": {"type": "integer"},
        "error": {"type": "string"}
      },
      "required": ["status"]
    },
    "test_examples": [
      {
        "operation": "list_items",
        "limit": 10
      }
    ]
  }
]
```

Pattern 2: Async Polling (Job-Based APIs)

```python
import time

# Method of the tool class (e.g. MyAPITool); relies on the class's BASE_URL
def _submit_job(self, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Submit job and poll for results."""
    try:
        # Submit
        submit_response = requests.post(
            f"{self.BASE_URL}/jobs/submit",
            json={"data": arguments.get("data")},
            timeout=30
        )
        submit_response.raise_for_status()
        job_id = submit_response.json().get("job_id")
        if not job_id:
            return {"status": "error", "error": "No job_id in submit response"}

        # Poll
        for attempt in range(60):  # 60 polls x 2s sleep = about 2 min max
            status_response = requests.get(
                f"{self.BASE_URL}/jobs/{job_id}/status",
                timeout=30
            )
            status_response.raise_for_status()
            result = status_response.json()

            if result.get("status") == "completed":
                return {
                    "status": "success",
                    "data": result.get("results"),
                    "job_id": job_id
                }
            elif result.get("status") == "failed":
                return {
                    "status": "error",
                    "error": result.get("error"),
                    "job_id": job_id
                }

            time.sleep(2)  # Poll every 2s

        return {"status": "error", "error": "Timeout after 2 min", "job_id": job_id}
    except requests.exceptions.RequestException as e:
        return {"status": "error", "error": str(e)}
```

JSON Best Practices

Tool Naming (≤55 chars for MCP):

Template: {API}_{action}_{target}
✅ Good: FDA_get_drug_info (17 chars)
❌ Bad: FDA_get_detailed_drug_information_with_history (46 chars: still under the limit, but verbose names like this leave little headroom)
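
Counting characters by eye is error-prone, so a small check helps. The sketch below is illustrative only: `check_tool_name` is a hypothetical helper, and the regular expression is an assumed reading of the template (an API prefix followed by one or more underscore-separated parts), chosen so that short names such as `MyAPI_search` from this guide also pass.

```python
import re
from typing import List

MAX_TOOL_NAME_LENGTH = 55  # Limit stated in this guide for MCP compatibility

# Assumed shape: alphanumeric parts joined by underscores, at least two parts.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9]+(_[A-Za-z0-9]+)+$")


def check_tool_name(name: str) -> List[str]:
    """Return a list of problems with a tool name; empty means it passes."""
    problems: List[str] = []
    if len(name) > MAX_TOOL_NAME_LENGTH:
        problems.append(f"{len(name)} chars exceeds {MAX_TOOL_NAME_LENGTH}")
    if not NAME_PATTERN.match(name):
        problems.append("does not follow the API_action_target shape")
    return problems


for candidate in ("FDA_get_drug_info", "FDA_get_detailed_drug_information_with_history"):
    print(candidate, len(candidate), check_tool_name(candidate))
```

Running a check like this over every `name` in a JSON config before committing catches over-long names early.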

Description (150-250 chars, high-context):

```json
{
  "description": "Search database for items. Returns up to 100 results with scores. Supports wildcards (* ?) and Boolean operators (AND, OR, NOT). Example: 'protein AND membrane' finds membrane proteins."
}
```

Include: What it returns, data source, use case, input format, example

test_examples (MUST be real):

```json
{
  "test_examples": [
    {
      "operation": "search",
      "query": "protein",
      "limit": 10
    }
  ]
}
```

❌ Don't use: "id": "XXXXX", "placeholder": "example_123"

✅ Do use: Real, common terms (such as "protein" above) and real IDs from actual API documentation
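
Obvious placeholders can be flagged automatically. The following is a heuristic sketch, not an official ToolUniverse check: `find_placeholders` and the patterns are assumptions built around the two bad examples listed here, and a real ID can only be confirmed by calling the API.

```python
import re
from typing import Any, Dict, List

# Heuristic patterns for obviously fake keys or values; extend as needed.
PLACEHOLDER_PATTERNS = [
    re.compile(r"^X{3,}$", re.IGNORECASE),            # e.g. "XXXXX"
    re.compile(r"^example[_-]?\d*$", re.IGNORECASE),  # e.g. "example_123"
    re.compile(r"placeholder|todo", re.IGNORECASE),   # e.g. key "placeholder"
]


def looks_fake(text: str) -> bool:
    """True if the text matches any placeholder pattern."""
    return any(pattern.search(text) for pattern in PLACEHOLDER_PATTERNS)


def find_placeholders(tool_config: Dict[str, Any]) -> List[str]:
    """List 'key=value' pairs in test_examples that look like placeholders."""
    hits: List[str] = []
    for example in tool_config.get("test_examples", []):
        for key, value in example.items():
            if looks_fake(key) or (isinstance(value, str) and looks_fake(value)):
                hits.append(f"{key}={value!r}")
    return hits


good = {"test_examples": [{"operation": "search", "query": "protein", "limit": 10}]}
bad = {"test_examples": [{"id": "XXXXX"}, {"placeholder": "example_123"}]}
print(find_placeholders(good))
print(find_placeholders(bad))
```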

---

Testing Strategy

Two-Level Testing (MANDATORY)

Level 1: Direct Class Testing

```python
import json
from tooluniverse.your_tool_module import YourToolClass


def test_direct_class():
    """Test implementation logic."""
    with open("src/tooluniverse/data/your_tools.json") as f:
        tools = json.load(f)
    config = next(t for t in tools if t["name"] == "YourTool_operation1")

    tool = YourToolClass(config)
    result = tool.run({"operation": "operation1", "param": "value"})

    assert result["status"] == "success"
    assert "data" in result
```

Level 2: ToolUniverse Interface Testing

```python
import pytest
from tooluniverse import ToolUniverse


class TestYourTools:
    @pytest.fixture
    def tu(self):
        tu = ToolUniverse()
        tu.load_tools()  # CRITICAL
        return tu

    def test_tools_load(self, tu):
        """Verify registration."""
        assert hasattr(tu.tools, 'YourTool_operation1')

    def test_execution(self, tu):
        """Test via ToolUniverse (how users call it)."""
        result = tu.tools.YourTool_operation1(**{
            "operation": "operation1",
            "param": "value"
        })
        assert result["status"] == "success"

    def test_error_handling(self, tu):
        """Test missing params."""
        result = tu.tools.YourTool_operation1(**{
            "operation": "operation1"
            # Missing required param
        })
        assert result["status"] == "error"
```

Level 3: Real API Testing

```python
from tooluniverse import ToolUniverse


def test_real_api():
    """Verify actual API integration."""
    tu = ToolUniverse()
    tu.load_tools()

    result = tu.tools.YourTool_operation1(**{
        "operation": "operation1",
        "param": "real_value_from_docs"
    })

    if result["status"] == "success":
        assert "data" in result
        print("✅ Real API works")
    else:
        # Do not fail the test: the remote API may simply be down
        print(f"⚠️ API error (may be down): {result.get('error')}")
```

Why Each Level:

Level 1: Tests implementation, catches code bugs
Level 2: Tests registration, catches config bugs
Level 3: Tests integration, catches API issues

---

Common Patterns

Error Handling Checklist

✅ Always set timeout (30s recommended)

✅ Catch specific exceptions (Timeout, ConnectionError, HTTPError)

✅ Return error dicts, never raise in run()

✅ Include helpful context in error messages

✅ Handle JSON parsing errors

✅ Validate required parameters
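
One way to enforce the "return error dicts, never raise" item is a decorator around each operation handler. This is a minimal sketch under assumptions: `returns_error_dict` and `parse_payload` are hypothetical names, not ToolUniverse APIs, and the built-in `TimeoutError` stands in for whatever timeout exception the HTTP client in use raises.

```python
import functools
import json
from typing import Any, Callable, Dict


def returns_error_dict(func: Callable[..., Dict[str, Any]]) -> Callable[..., Dict[str, Any]]:
    """Convert any exception raised by a handler into a standard error dict."""
    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Dict[str, Any]:
        try:
            return func(*args, **kwargs)
        except TimeoutError:
            return {"status": "error", "error": f"{func.__name__}: timed out"}
        except json.JSONDecodeError as exc:
            return {"status": "error", "error": f"{func.__name__}: invalid JSON ({exc.msg})"}
        except Exception as exc:  # Last resort so the caller never sees a raise
            return {"status": "error", "error": f"{func.__name__}: {type(exc).__name__}: {exc}"}
    return wrapper


@returns_error_dict
def parse_payload(raw: str) -> Dict[str, Any]:
    """Demo handler: parse a JSON string and require an 'items' key."""
    payload = json.loads(raw)
    return {"status": "success", "data": payload["items"]}


print(parse_payload('{"items": [1, 2]}'))
print(parse_payload("not json"))
print(parse_payload("{}"))
```

Specific handlers come first so their messages stay helpful; the broad `except Exception` only exists to keep the no-raise guarantee.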

Dependency Management

Check package size FIRST:

```bash
# Counts declared dependencies as a quick proxy for install footprint
curl -s https://pypi.org/pypi/PACKAGE/json | python3 -c "
import json, sys
data = json.load(sys.stdin)
print(f'Dependencies: {len(data[\"info\"][\"requires_dist\"] or [])}')
"
```

Classification:

Core (<100MB, universal use) → [project.dependencies]
Optional (>100MB or niche) → [project.optional-dependencies]

In code:

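```python
def run(self, arguments):
    # Import lazily so the tool still loads when the optional package is absent
    try:
        import optional_package
    except ImportError:
        return {
            "status": "error",
            "error": "Install with: pip install optional_package"
        }
    # ... continue with the operation using optional_package
```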

Pagination Pattern

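```python
def _list_items(self, arguments):
    params = {}
    if "page" in arguments:
        params["page"] = arguments["page"]
    if "limit" in arguments:
        params["limit"] = arguments["limit"]

    try:
        response = requests.get(f"{self.BASE_URL}/items", params=params, timeout=30)
        response.raise_for_status()
        data = response.json()
    except (requests.exceptions.RequestException, ValueError) as e:
        # ValueError covers a response body that is not valid JSON
        return {"status": "error", "error": str(e)}

    return {
        "status": "success",
        "data": data.get("items", []),
        "page": data.get("page", 0),
        "total_pages": data.get("total_pages", 1),
        "total_items": data.get("total", 0)
    }
```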
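
A caller that needs every page can loop over responses of this shape. The sketch below assumes zero-based page numbers (suggested by the `page` default of 0 above) and uses a hypothetical `fetch_page` callable in place of a real tool call; `collect_all_pages` is an illustrative helper, not part of ToolUniverse.

```python
from typing import Any, Callable, Dict, List


def collect_all_pages(
    fetch_page: Callable[[int], Dict[str, Any]],
    max_pages: int = 50,
) -> Dict[str, Any]:
    """Fetch pages until total_pages is reached or max_pages is hit."""
    items: List[Any] = []
    page = 0  # Assumption: the API counts pages from 0
    while page < max_pages:
        result = fetch_page(page)
        if result.get("status") != "success":
            return result  # Propagate the error dict unchanged
        items.extend(result.get("data", []))
        page += 1
        if page >= result.get("total_pages", 1):
            break
    return {"status": "success", "data": items, "total_items": len(items)}


def fake_fetch(page: int) -> Dict[str, Any]:
    """Stand-in for a paginated tool call; serves two fixed pages."""
    pages = [["a", "b"], ["c"]]
    return {
        "status": "success",
        "data": pages[page],
        "page": page,
        "total_pages": len(pages),
    }


print(collect_all_pages(fake_fetch))
```

The `max_pages` cap is a safety valve: if it is reached first, the result holds only the pages fetched so far.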

---

Troubleshooting

Tool Doesn't Load (90% of Issues)

Symptoms: Tool count doesn't increase, no error, AttributeError when calling

Cause: Missing Step 2 of registration (default_config.py)

Solution:

```python
# Edit src/tooluniverse/default_config.py
TOOLS_CONFIGS = {
    # ... existing ...
    "your_category": os.path.join(current_dir, "data", "your_category_tools.json"),
}
```

Verify:

```bash
grep "your_category" src/tooluniverse/default_config.py
ls src/tooluniverse/tools/YourCategory_*.py
python3 -c "from tooluniverse import ToolUniverse; tu = ToolUniverse(); tu.load_tools(); print(hasattr(tu.tools, 'YourCategory_op1'))"
```

Tests Fail with Real APIs

Mock vs Real Testing:

Mocks test code structure
Real calls test API integration
Both needed for confidence

What Real Testing Catches:

Response structure differences
Parameter name mismatches
Unexpected pagination
Timeout issues
Data type surprises
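
For the mock half of that pairing, the standard library's `unittest.mock` is enough. In this sketch, `list_items` is a hypothetical handler that takes its HTTP function as a parameter so a test can swap in a mock; in real code that parameter would be something like `requests.get`. It checks code structure only and says nothing about the live API.

```python
from typing import Any, Callable, Dict
from unittest.mock import Mock


def list_items(http_get: Callable[..., Any],
               base_url: str = "https://api.example.com/v1") -> Dict[str, Any]:
    """Hypothetical handler; http_get is injected so tests can replace it."""
    try:
        response = http_get(f"{base_url}/items", timeout=30)
        response.raise_for_status()
        data = response.json()
    except Exception as exc:
        return {"status": "error", "error": str(exc)}
    return {"status": "success", "data": data.get("items", [])}


# Happy path: the mocked response returns a fixed JSON body.
fake_response = Mock()
fake_response.json.return_value = {"items": [{"id": 1}]}
fake_get = Mock(return_value=fake_response)
result = list_items(fake_get)
fake_get.assert_called_once_with("https://api.example.com/v1/items", timeout=30)

# Failure path: the mocked call raises, and the handler returns an error dict.
failing_get = Mock(side_effect=TimeoutError("Timeout after 30s"))
error_result = list_items(failing_get)

print(result)
print(error_result)
```

A mock like this cannot reveal that the real API names the field differently or paginates unexpectedly, which is why the real-call level is still needed.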

---

Reference

Complete Workflow

1. Create Python class with @register_tool
2. Create JSON config with realistic test_examples
3. Add to default_config.py ← CRITICAL
4. Generate wrappers: tu.load_tools()
5. Test Level 1 (direct class)
6. Test Level 2 (ToolUniverse interface)
7. Test Level 3 (real API calls)
8. Create examples file
9. Verify all 3 registration steps
10. Document in verification report

Quick Commands

```bash
# Validate JSON
python3 -m json.tool src/tooluniverse/data/your_tools.json

# Check Python syntax
python3 -m py_compile src/tooluniverse/your_tool.py

# Verify registration
grep "your_category" src/tooluniverse/default_config.py

# Generate wrappers
PYTHONPATH=src python3 -m tooluniverse.generate_tools --force

# List wrappers
ls src/tooluniverse/tools/YourCategory_*.py

# Run tests
pytest tests/unit/test_your_tool.py -v

# Count tools
python3 << 'EOF'
from tooluniverse import ToolUniverse
tu = ToolUniverse()
tu.load_tools()
print(f"Total: {len([t for t in dir(tu.tools) if 'YourCategory' in t])} tools")
EOF
```

Critical Reminders

⚠️ ALWAYS add to default_config.py (Step 2)

⚠️ NEVER raise exceptions in run()

⚠️ ALWAYS use real test_examples

⚠️ ALWAYS test both levels

⚠️ KEEP tool names ≤55 characters

⚠️ RETURN standard response format

⚠️ SET timeout on all HTTP requests

⚠️ VERIFY all 3 registration steps

Success Criteria

✅ All 3 registration steps verified

✅ Level 1 tests passing (direct class)

✅ Level 2 tests passing (ToolUniverse interface)

✅ Real API calls working (Level 3)

✅ Tool names ≤55 characters

✅ test_examples use real IDs

✅ Standard response format used

✅ Helpful error messages

✅ Examples file created

✅ No raised exceptions in run()

When all criteria met → Production Ready 🎉
