devtu-create-tool

from mims-harvard/tooluniverse

What it does

Generates scientific tool classes and configurations for the ToolUniverse framework, ensuring proper structure, validation, and automated wrapper creation.

Added Feb 4, 2026

Skill Details

SKILL.md

Create new scientific tools for the ToolUniverse framework with proper structure, validation, and testing. Use when users need to add tools to ToolUniverse, implement new API integrations, create tool wrappers for scientific databases/services, expand ToolUniverse capabilities, or follow ToolUniverse contribution guidelines. Supports creating tool classes, JSON configurations, validation, error handling, and test examples.

Overview

# ToolUniverse Tool Creator

Create new scientific tools for the ToolUniverse framework following established best practices.

---

Table of Contents

  1. [Critical Knowledge](#critical-knowledge)
  2. [Core Concepts](#core-concepts)
  3. [Implementation Guide](#implementation-guide)
  4. [Testing Strategy](#testing-strategy)
  5. [Common Patterns](#common-patterns)
  6. [Troubleshooting](#troubleshooting)
  7. [Reference](#reference)

---

Critical Knowledge

Top 5 Mistakes (90% of Failures)

  1. Missing default_config.py Entry - Tools silently won't load
  2. Fake test_examples - Tests fail, agents get bad examples
  3. Single-level Testing - Misses registration bugs
  4. Tool Names > 55 chars - Breaks MCP compatibility
  5. Raising Exceptions - Should return error dicts instead
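
Mistake 4 is easy to catch mechanically before any test runs. The following is a minimal sketch, assuming a config file is a list of objects with a `name` key as in the JSON examples in this guide; `find_long_names` and `MAX_NAME_LEN` are hypothetical names for illustration, not part of ToolUniverse.

```python
from typing import Any, Dict, List

MAX_NAME_LEN = 55  # MCP limit quoted in this guide


def find_long_names(
    tools: List[Dict[str, Any]], limit: int = MAX_NAME_LEN
) -> List[str]:
    """Return the names of tool definitions longer than `limit` characters."""
    return [t["name"] for t in tools if len(t.get("name", "")) > limit]


# Usage: pass the parsed contents of a tools JSON file.
sample_tools = [
    {"name": "FDA_get_drug_info"},
    {"name": "X" * 56},  # one character over the limit
]
too_long = find_long_names(sample_tools)
```

Running a check like this in a unit test or pre-commit hook keeps over-long names from reaching the wrapper generation step.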

Tool Creator vs SDK User

| SDK User (Using) | Tool Creator (Building) |
|------------------|-------------------------|
| tu.tools.ToolName() | @register_tool() + JSON |
| Handle responses | Design schemas |
| One-level usage | Three-step registration |

---

Core Concepts

Two-Stage Architecture

```
Stage 1: Tool Class              Stage 2: Wrappers (Auto-Generated)
Python Implementation            From JSON Configs
        ↓                                ↓
@register_tool("MyTool")         MyAPI_list_items()
class MyTool(BaseTool):          MyAPI_search()
    def run(arguments):          MyAPI_get_details()
```

Key Points:

  • One class handles multiple operations
  • JSON defines individual tool wrappers
  • Users call wrappers, which route to class
  • Need BOTH for tools to work
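
To make the routing concrete, here is a toy, self-contained sketch of the two stages. It does not use or reproduce ToolUniverse internals: `register`, `_REGISTRY`, and `build_wrappers` are hypothetical stand-ins, and the class returns canned data instead of calling an API.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-ins for illustration only; not ToolUniverse internals.
_REGISTRY: Dict[str, type] = {}


def register(name: str) -> Callable[[type], type]:
    """Toy decorator: record a class under a name (Stage 1)."""
    def decorator(cls: type) -> type:
        _REGISTRY[name] = cls
        return cls
    return decorator


@register("MyAPITool")
class MyAPITool:
    def __init__(self, tool_config: Dict[str, Any]):
        self.tool_config = tool_config

    def run(self, arguments: Dict[str, Any]) -> Dict[str, Any]:
        # One class, many operations: dispatch on the "operation" argument.
        operation = arguments.get("operation")
        if operation == "list_items":
            return {"status": "success", "data": ["a", "b"]}
        if operation == "search":
            return {"status": "success", "data": [arguments.get("query")]}
        return {"status": "error", "error": f"Unknown: {operation}"}


def build_wrappers(
    configs: List[Dict[str, Any]]
) -> Dict[str, Callable[..., Dict[str, Any]]]:
    """Toy Stage 2: one callable per JSON entry, each routing to its class."""
    wrappers: Dict[str, Callable[..., Dict[str, Any]]] = {}
    for config in configs:
        tool = _REGISTRY[config["class"]](config)
        # Bind `tool` through a default argument to avoid late binding.
        wrappers[config["name"]] = lambda _tool=tool, **kwargs: _tool.run(kwargs)
    return wrappers


configs = [
    {"name": "MyAPI_list_items", "class": "MyAPITool"},
    {"name": "MyAPI_search", "class": "MyAPITool"},
]
wrappers = build_wrappers(configs)
result = wrappers["MyAPI_search"](operation="search", query="protein")
```

If either half is missing in this sketch (no registered class, or no config entry), `build_wrappers` has nothing to bind, which mirrors why both the class and the JSON are needed.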

Three-Step Registration

Step 1: Class Registration

```python
@register_tool("MyAPITool")  # Decorator registers class
class MyAPITool(BaseTool):
    pass
```

Step 2: Config Registration ⚠️ MOST COMMONLY MISSED

```python
# In src/tooluniverse/default_config.py
TOOLS_CONFIGS = {
    "my_category": os.path.join(current_dir, "data", "my_category_tools.json"),
}
```

Step 3: Wrapper Generation (Automatic)

```python
from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()  # Auto-generates wrappers in tools/
```

Verification Script:

```python
import sys

sys.path.insert(0, 'src')

# Step 1: Check class registered
from tooluniverse.tool_registry import get_tool_registry
import tooluniverse.your_tool_module

registry = get_tool_registry()
assert "YourToolClass" in registry, "❌ Step 1 FAILED"
print("✅ Step 1: Class registered")

# Step 2: Check config registered
from tooluniverse.default_config import TOOLS_CONFIGS

assert "your_category" in TOOLS_CONFIGS, "❌ Step 2 FAILED"
print("✅ Step 2: Config registered")

# Step 3: Check wrappers generated
from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()
assert hasattr(tu.tools, 'YourCategory_operation1'), "❌ Step 3 FAILED"
print("✅ Step 3: Wrappers generated")

print("✅ All steps complete!")
```

Standard Response Format

All tools must return:

```json
{
  "status": "success" | "error",
  "data": {...},        // On success
  "error": "message"    // On failure
}
```

Why: Consistent error handling, composability, user expectations
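
One way to keep every handler on this format is to build responses through small helpers and check them in tests. This is a sketch under the assumption that extra top-level fields (such as `total`) are allowed alongside `status`, `data`, and `error`, as in the examples in this guide; `ok`, `fail`, and `is_standard_response` are hypothetical helper names, not ToolUniverse APIs.

```python
from typing import Any, Dict


def ok(data: Any, **extra: Any) -> Dict[str, Any]:
    """Build a success response; `extra` carries optional fields such as totals."""
    return {"status": "success", "data": data, **extra}


def fail(message: str, **extra: Any) -> Dict[str, Any]:
    """Build an error response with a human-readable message."""
    return {"status": "error", "error": message, **extra}


def is_standard_response(response: Any) -> bool:
    """Check the shape described above: status plus data or error."""
    if not isinstance(response, dict):
        return False
    if response.get("status") == "success":
        return "data" in response
    if response.get("status") == "error":
        return isinstance(response.get("error"), str)
    return False


listing = ok([{"id": 1}], total=1)
failure = fail("Missing: query")
```

A test suite can then assert `is_standard_response(result)` on every operation instead of re-checking individual keys.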

---

Implementation Guide

File Structure

Required Files:

  • src/tooluniverse/my_api_tool.py - Implementation
  • src/tooluniverse/data/my_api_tools.json - Tool definitions
  • tests/unit/test_my_api_tool.py - Tests
  • examples/my_api_examples.py - Usage examples

Auto-Generated (don't create manually):

  • src/tooluniverse/tools/MyAPI_*.py - Wrappers

Pattern 1: Multi-Operation Tool (Recommended)

Python Class:

```python
from typing import Dict, Any

import requests

from tooluniverse.tool import BaseTool
from tooluniverse.tool_utils import register_tool


@register_tool("MyAPITool")
class MyAPITool(BaseTool):
    """Tool for MyAPI database."""

    BASE_URL = "https://api.example.com/v1"

    def __init__(self, tool_config):
        super().__init__(tool_config)
        self.parameter = tool_config.get("parameter", {})
        self.required = self.parameter.get("required", [])

    def run(self, arguments: Dict[str, Any]) -> Dict[str, Any]:
        """Route to operation handler."""
        operation = arguments.get("operation")
        if not operation:
            return {"status": "error", "error": "Missing: operation"}

        if operation == "list_items":
            return self._list_items(arguments)
        elif operation == "search":
            return self._search(arguments)
        else:
            return {"status": "error", "error": f"Unknown: {operation}"}

    def _list_items(self, arguments: Dict[str, Any]) -> Dict[str, Any]:
        """List items with pagination."""
        try:
            params = {}
            if "limit" in arguments:
                params["limit"] = arguments["limit"]

            response = requests.get(
                f"{self.BASE_URL}/items",
                params=params,
                timeout=30
            )
            response.raise_for_status()
            data = response.json()

            return {
                "status": "success",
                "data": data.get("items", []),
                "total": data.get("total", 0)
            }
        except requests.exceptions.Timeout:
            return {"status": "error", "error": "Timeout after 30s"}
        except requests.exceptions.HTTPError as e:
            return {"status": "error", "error": f"HTTP {e.response.status_code}"}
        except Exception as e:
            return {"status": "error", "error": str(e)}

    def _search(self, arguments: Dict[str, Any]) -> Dict[str, Any]:
        """Search items by query."""
        query = arguments.get("query")
        if not query:
            return {"status": "error", "error": "Missing: query"}

        try:
            response = requests.get(
                f"{self.BASE_URL}/search",
                params={"q": query},
                timeout=30
            )
            response.raise_for_status()
            data = response.json()

            # Payload goes under "data" to match the standard response format
            return {
                "status": "success",
                "data": data.get("results", []),
                "count": data.get("count", 0)
            }
        except requests.exceptions.RequestException as e:
            return {"status": "error", "error": f"API failed: {str(e)}"}
```

JSON Configuration:

```json
[
  {
    "name": "MyAPI_list_items",
    "class": "MyAPITool",
    "description": "List items from database with pagination. Returns item IDs and names. Supports filtering by status and type. Example: limit=10 returns first 10 items.",
    "parameter": {
      "type": "object",
      "required": ["operation"],
      "properties": {
        "operation": {
          "const": "list_items",
          "description": "Operation type (fixed)"
        },
        "limit": {
          "type": "integer",
          "description": "Max results (1-100)",
          "minimum": 1,
          "maximum": 100
        }
      }
    },
    "return": {
      "type": "object",
      "properties": {
        "status": {"type": "string", "enum": ["success", "error"]},
        "data": {"type": "array"},
        "total": {"type": "integer"},
        "error": {"type": "string"}
      },
      "required": ["status"]
    },
    "test_examples": [
      {
        "operation": "list_items",
        "limit": 10
      }
    ]
  }
]
```

Pattern 2: Async Polling (Job-Based APIs)

```python
import time

def _submit_job(self, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Submit job and poll for results."""
    try:
        # Submit
        submit_response = requests.post(
            f"{self.BASE_URL}/jobs/submit",
            json={"data": arguments.get("data")},
            timeout=30
        )
        submit_response.raise_for_status()
        job_id = submit_response.json().get("job_id")

        # Poll
        for attempt in range(60):  # 2 min max
            status_response = requests.get(
                f"{self.BASE_URL}/jobs/{job_id}/status",
                timeout=30
            )
            status_response.raise_for_status()
            result = status_response.json()

            if result.get("status") == "completed":
                return {
                    "status": "success",
                    "data": result.get("results"),
                    "job_id": job_id
                }
            elif result.get("status") == "failed":
                return {
                    "status": "error",
                    "error": result.get("error"),
                    "job_id": job_id
                }

            time.sleep(2)  # Poll every 2s

        return {"status": "error", "error": "Timeout after 2 min"}
    except requests.exceptions.RequestException as e:
        return {"status": "error", "error": str(e)}
```

JSON Best Practices

Tool Naming (≤55 chars for MCP):

  • Template: {API}_{action}_{target}
  • ✅ Good: FDA_get_drug_info (17 chars)
  • ❌ Bad: FDA_get_detailed_drug_information_with_history (46 chars, needlessly verbose; anything over 55 chars breaks MCP)

Description (150-250 chars, high-context):

```json
{
  "description": "Search database for items. Returns up to 100 results with scores. Supports wildcards (* ?) and Boolean operators (AND, OR, NOT). Example: 'protein AND membrane' finds membrane proteins."
}
```

Include: What it returns, data source, use case, input format, example
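
The length range and the "include an example" advice can be checked automatically. The sketch below is a heuristic only, assuming the 150-250 character range stated above and treating the presence of the word "example" as a proxy for an included example; `description_issues` is a hypothetical helper name.

```python
from typing import List


def description_issues(
    description: str, min_len: int = 150, max_len: int = 250
) -> List[str]:
    """Return a list of problems found in a tool description (empty if none)."""
    issues: List[str] = []
    length = len(description)
    if length < min_len:
        issues.append(f"too short ({length} < {min_len})")
    elif length > max_len:
        issues.append(f"too long ({length} > {max_len})")
    # Crude proxy: a high-context description usually spells out an example.
    if "example" not in description.lower():
        issues.append("no example")
    return issues


short_report = description_issues("Search items.")
```

The function only flags candidates for review; whether a description names its data source and use case still needs a human read.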

test_examples (MUST be real):

```json
{
  "test_examples": [
    {
      "operation": "search",
      "query": "protein",  // ✅ Real, common term
      "limit": 10
    }
  ]
}
```

❌ Don't use: "id": "XXXXX", "placeholder": "example_123"

✅ Do use: Real IDs from actual API documentation
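
Obvious placeholders can be flagged before a reviewer ever sees them. This is a heuristic sketch, not a ToolUniverse feature: the patterns and the name `find_placeholders` are assumptions chosen to match the bad examples above, and they cannot prove that a value is a real ID.

```python
import re
from typing import Any, Dict, List

# Heuristic patterns only; extend them for your own data.
_PLACEHOLDER_PATTERNS = [
    re.compile(r"^X{3,}$", re.IGNORECASE),  # "XXXXX"
    re.compile(r"^(example|placeholder|test|dummy)[_-]?\d*$", re.IGNORECASE),
    re.compile(r"^<.*>$"),  # "<your_id>"
]


def find_placeholders(example: Dict[str, Any]) -> List[str]:
    """Return the keys of a test example whose string values look fake."""
    return [
        key
        for key, value in example.items()
        if isinstance(value, str)
        and any(pattern.match(value) for pattern in _PLACEHOLDER_PATTERNS)
    ]


good = find_placeholders({"operation": "search", "query": "protein", "limit": 10})
bad = find_placeholders({"id": "XXXXX", "placeholder": "example_123"})
```

A passing check does not replace running the example against the live API; it only removes the most common fake values early.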

---

Testing Strategy

Three-Level Testing (MANDATORY)

Level 1: Direct Class Testing

```python
import json

from tooluniverse.your_tool_module import YourToolClass

def test_direct_class():
    """Test implementation logic."""
    with open("src/tooluniverse/data/your_tools.json") as f:
        tools = json.load(f)

    config = next(t for t in tools if t["name"] == "YourTool_operation1")
    tool = YourToolClass(config)

    result = tool.run({"operation": "operation1", "param": "value"})

    assert result["status"] == "success"
    assert "data" in result
```

Level 2: ToolUniverse Interface Testing

```python
import pytest

from tooluniverse import ToolUniverse

class TestYourTools:
    @pytest.fixture
    def tu(self):
        tu = ToolUniverse()
        tu.load_tools()  # CRITICAL
        return tu

    def test_tools_load(self, tu):
        """Verify registration."""
        assert hasattr(tu.tools, 'YourTool_operation1')

    def test_execution(self, tu):
        """Test via ToolUniverse (how users call it)."""
        result = tu.tools.YourTool_operation1(**{
            "operation": "operation1",
            "param": "value"
        })
        assert result["status"] == "success"

    def test_error_handling(self, tu):
        """Test missing params."""
        result = tu.tools.YourTool_operation1(**{
            "operation": "operation1"
            # Missing required param
        })
        assert result["status"] == "error"
```

Level 3: Real API Testing

```python
from tooluniverse import ToolUniverse

def test_real_api():
    """Verify actual API integration."""
    tu = ToolUniverse()
    tu.load_tools()

    result = tu.tools.YourTool_operation1(**{
        "operation": "operation1",
        "param": "real_value_from_docs"
    })

    if result["status"] == "success":
        assert "data" in result
        print("✅ Real API works")
    else:
        print(f"⚠️ API error (may be down): {result['error']}")
```

Why All Three Levels:

  • Level 1: Tests implementation, catches code bugs
  • Level 2: Tests registration, catches config bugs
  • Level 3: Tests integration, catches API issues

---

Common Patterns

Error Handling Checklist

✅ Always set timeout (30s recommended)

✅ Catch specific exceptions (Timeout, ConnectionError, HTTPError)

✅ Return error dicts, never raise in run()

✅ Include helpful context in error messages

✅ Handle JSON parsing errors

✅ Validate required parameters
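
As a last line of defense behind the specific handlers, a decorator can guarantee that `run()` never raises. The sketch below is one possible approach, not something ToolUniverse provides: `returns_error_dicts` and `DemoTool` are hypothetical names, and the demo parses a JSON string instead of calling an API so that it runs standalone.

```python
import functools
import json
from typing import Any, Callable, Dict

Response = Dict[str, Any]


def returns_error_dicts(func: Callable[..., Response]) -> Callable[..., Response]:
    """Convert any uncaught exception into a standard error dict."""
    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Response:
        try:
            return func(*args, **kwargs)
        except json.JSONDecodeError as exc:
            # Specific handler first, so the message stays helpful.
            return {"status": "error", "error": f"Invalid JSON in response: {exc.msg}"}
        except Exception as exc:  # broad on purpose: run() must never raise
            return {"status": "error", "error": f"{type(exc).__name__}: {exc}"}
    return wrapper


class DemoTool:
    """Hypothetical tool used only to exercise the decorator."""

    @returns_error_dicts
    def run(self, arguments: Dict[str, Any]) -> Response:
        # Validate required parameters before doing any work.
        if "payload" not in arguments:
            return {"status": "error", "error": "Missing: payload"}
        return {"status": "success", "data": json.loads(arguments["payload"])}


tool = DemoTool()
parsed = tool.run({"payload": "[1, 2]"})
broken = tool.run({"payload": "{not json"})
```

The decorator complements the checklist rather than replacing it: specific `except` clauses inside each handler still produce the most useful messages.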

Dependency Management

Check package size FIRST:

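```bash
curl -s https://pypi.org/pypi/PACKAGE/json | python3 -c "
import json, sys
data = json.load(sys.stdin)
# Download size of the latest release files (installed size can be larger)
sizes = [f[\"size\"] for f in data[\"urls\"]]
print(f'Largest release file: {max(sizes, default=0) / 1e6:.1f} MB')
print(f'Dependencies: {len(data[\"info\"][\"requires_dist\"] or [])}')
"
```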

Classification:

  • Core (<100MB, universal use) → [project.dependencies]
  • Optional (>100MB or niche) → [project.optional-dependencies]

In code:

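```python
def _run_analysis(self, arguments):  # hypothetical handler name
    # Import lazily so the tool still loads when the optional package is absent
    try:
        import optional_package
    except ImportError:
        return {
            "status": "error",
            "error": "Install with: pip install optional_package"
        }
    # ... use optional_package here ...
```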

Pagination Pattern

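```python
def _list_items(self, arguments):
    # Forward only the pagination parameters the caller supplied
    params = {}
    if "page" in arguments:
        params["page"] = arguments["page"]
    if "limit" in arguments:
        params["limit"] = arguments["limit"]

    try:
        response = requests.get(
            f"{self.BASE_URL}/items", params=params, timeout=30
        )
        response.raise_for_status()
        data = response.json()
    except (requests.exceptions.RequestException, ValueError) as e:
        # ValueError covers JSON decoding failures
        return {"status": "error", "error": str(e)}

    return {
        "status": "success",
        "data": data.get("items", []),
        "page": data.get("page", 0),
        "total_pages": data.get("total_pages", 1),
        "total_items": data.get("total", 0)
    }
```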

---

Troubleshooting

Tool Doesn't Load (90% of Issues)

Symptoms: Tool count doesn't increase, no error, AttributeError when calling

Cause: Missing Step 2 of registration (default_config.py)

Solution:

```python
# Edit src/tooluniverse/default_config.py
TOOLS_CONFIGS = {
    # ... existing ...
    "your_category": os.path.join(current_dir, "data", "your_category_tools.json"),
}
```

Verify:

```bash
grep "your_category" src/tooluniverse/default_config.py
ls src/tooluniverse/tools/YourCategory_*.py
python3 -c "from tooluniverse import ToolUniverse; tu = ToolUniverse(); tu.load_tools(); print(hasattr(tu.tools, 'YourCategory_op1'))"
```

Tests Fail with Real APIs

Mock vs Real Testing:

  • Mocks test code structure
  • Real calls test API integration
  • Both needed for confidence

What Real Testing Catches:

  • Response structure differences
  • Parameter name mismatches
  • Unexpected pagination
  • Timeout issues
  • Data type surprises
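
For the mock side of that split, a test can swap the HTTP call for a stub and check the code path without network access. This is a minimal sketch using the standard library's `unittest.mock`; `ItemsTool` and its injectable `http_get` argument are hypothetical and chosen so the example runs without `requests` installed.

```python
from typing import Any, Callable, Dict
from unittest.mock import Mock


class ItemsTool:
    """Hypothetical tool whose HTTP getter is injectable so tests can replace it."""

    URL = "https://api.example.com/v1/items"

    def __init__(self, http_get: Callable[..., Any]):
        self.http_get = http_get

    def run(self, arguments: Dict[str, Any]) -> Dict[str, Any]:
        try:
            response = self.http_get(self.URL, timeout=30)
            response.raise_for_status()
            payload = response.json()
        except Exception as exc:
            return {"status": "error", "error": str(exc)}
        return {"status": "success", "data": payload.get("items", [])}


def make_fake_get(payload: Dict[str, Any]) -> Mock:
    """Build a stub that behaves like a successful HTTP GET."""
    response = Mock()
    response.raise_for_status.return_value = None
    response.json.return_value = payload
    return Mock(return_value=response)


# Happy path: the stub returns a canned payload.
fake_get = make_fake_get({"items": [{"id": 1}]})
result = ItemsTool(fake_get).run({})

# Failure path: the stub raises, and run() must still return an error dict.
failing_get = Mock(side_effect=TimeoutError("timed out"))
error_result = ItemsTool(failing_get).run({})
```

A test like this confirms structure and error handling only; the canned payload says nothing about what the real service returns, which is why the real-call tests remain necessary.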

---

Reference

Complete Workflow

  1. Create Python class with @register_tool
  2. Create JSON config with realistic test_examples
  3. Add to default_config.py ← CRITICAL
  4. Generate wrappers: tu.load_tools()
  5. Test Level 1 (direct class)
  6. Test Level 2 (ToolUniverse interface)
  7. Test Level 3 (real API calls)
  8. Create examples file
  9. Verify all 3 registration steps
  10. Document in verification report

Quick Commands

```bash
# Validate JSON
python3 -m json.tool src/tooluniverse/data/your_tools.json

# Check Python syntax
python3 -m py_compile src/tooluniverse/your_tool.py

# Verify registration
grep "your_category" src/tooluniverse/default_config.py

# Generate wrappers
PYTHONPATH=src python3 -m tooluniverse.generate_tools --force

# List wrappers
ls src/tooluniverse/tools/YourCategory_*.py

# Run tests
pytest tests/unit/test_your_tool.py -v

# Count tools
python3 << 'EOF'
from tooluniverse import ToolUniverse
tu = ToolUniverse()
tu.load_tools()
print(f"Total: {len([t for t in dir(tu.tools) if 'YourCategory' in t])} tools")
EOF
```

Critical Reminders

⚠️ ALWAYS add to default_config.py (Step 2)

⚠️ NEVER raise exceptions in run()

⚠️ ALWAYS use real test_examples

⚠️ ALWAYS test at all three levels

⚠️ KEEP tool names ≤55 characters

⚠️ RETURN standard response format

⚠️ SET timeout on all HTTP requests

⚠️ VERIFY all 3 registration steps

Success Criteria

✅ All 3 registration steps verified

✅ Level 1 tests passing (direct class)

✅ Level 2 tests passing (ToolUniverse interface)

✅ Real API calls working (Level 3)

✅ Tool names ≤55 characters

✅ test_examples use real IDs

✅ Standard response format used

✅ Helpful error messages

✅ Examples file created

✅ No raised exceptions in run()

When all criteria met → Production Ready 🎉
