tooluniverse-sdk

Skill

from mims-harvard/tooluniverse

What it does

Enables programmatic access to 1000+ scientific tools for building AI-powered research workflows, data analysis, and computational biology tasks.

Added Feb 4, 2026


Skill Details

SKILL.md

Build AI scientist systems with the ToolUniverse Python SDK. Use when users need to access 1000+ scientific tools through Python code, create scientific workflows, or perform drug discovery, protein analysis, genomics analysis, literature research, or any other computational biology task. Triggers include requests to use scientific tools programmatically, build research pipelines, analyze biological data, search literature, predict drug properties, or create AI-powered scientific workflows.

Overview

# ToolUniverse Python SDK

ToolUniverse provides programmatic access to 1000+ scientific tools through a unified interface. It implements the AI-Tool Interaction Protocol for building AI scientist systems that integrate ML models, databases, APIs, and scientific packages.

Installation

```bash

# Standard installation

pip install tooluniverse

# With optional features

pip install tooluniverse[embedding] # Embedding search (GPU)

pip install tooluniverse[ml] # ML model tools

pip install tooluniverse[all] # All features

```

Environment Setup

```bash

# Required for LLM-based tool search and hooks

export OPENAI_API_KEY="sk-..."

# Optional for higher rate limits

export NCBI_API_KEY="..."

```

Or use a .env file (requires the python-dotenv package):

```python

from dotenv import load_dotenv

load_dotenv()

```

Quick Start

```python
from tooluniverse import ToolUniverse

# 1. Initialize and load tools
tu = ToolUniverse()
tu.load_tools()  # Loads 1000+ tools (~5-10 seconds first time)

# 2. Find tools (three methods)

# Method A: Keyword (fast, no API key)
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "protein structure", "limit": 10}
})

# Method B: LLM (intelligent, requires OPENAI_API_KEY)
tools = tu.run({
    "name": "Tool_Finder_LLM",
    "arguments": {"description": "predict drug toxicity", "limit": 5}
})

# Method C: Embedding (semantic, requires GPU)
tools = tu.run({
    "name": "Tool_Finder",
    "arguments": {"description": "protein interactions", "limit": 10}
})

# 3. Execute tools (two ways)

# Dictionary API
result = tu.run({
    "name": "UniProt_get_entry_by_accession",
    "arguments": {"accession": "P05067"}
})

# Function API (recommended)
result = tu.tools.UniProt_get_entry_by_accession(accession="P05067")
```

Core Patterns

Pattern 1: Discovery → Execute

```python
# Find tools
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "ADMET prediction", "limit": 3}
})

# Check results structure
if isinstance(tools, dict) and 'tools' in tools:
    for tool in tools['tools']:
        print(f"{tool['name']}: {tool['description']}")

# Execute tool
result = tu.tools.ADMETAI_predict_admet(
    smiles="CC(C)Cc1ccc(cc1)C(C)C(O)=O"
)
```

Pattern 2: Batch Execution

```python

# Define calls

calls = [

{"name": "UniProt_get_entry_by_accession", "arguments": {"accession": "P05067"}},

{"name": "UniProt_get_entry_by_accession", "arguments": {"accession": "P12345"}},

{"name": "RCSB_PDB_get_structure_by_id", "arguments": {"pdb_id": "1ABC"}}

]

# Execute in parallel

results = tu.run_batch(calls)

```
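To keep track of which result belongs to which call, the two lists can be paired up after the batch finishes. The following is a minimal sketch, assuming `run_batch` returns one result per call in the same order as the input list (the text above does not confirm this). `pair_calls_with_results` is a hypothetical helper, not part of the SDK, and the results are stand-in data so the sketch runs without ToolUniverse installed.

```python
def pair_calls_with_results(calls, results):
    """Pair each call dict with its result.

    Assumes results come back in the same order as calls (an assumption,
    not confirmed by the SDK text); raises if the lengths differ so that a
    mismatch is noticed immediately.
    """
    if len(calls) != len(results):
        raise ValueError(f"expected {len(calls)} results, got {len(results)}")
    return [
        {"name": call["name"], "arguments": call["arguments"], "result": result}
        for call, result in zip(calls, results)
    ]


calls = [
    {"name": "UniProt_get_entry_by_accession", "arguments": {"accession": "P05067"}},
    {"name": "UniProt_get_entry_by_accession", "arguments": {"accession": "P12345"}},
]
fake_results = [{"accession": "P05067"}, None]  # placeholder results, not real output

paired = pair_calls_with_results(calls, fake_results)

# Arguments of every call that produced no result, e.g. for a later retry
failed = [p["arguments"] for p in paired if p["result"] is None]
```

Raising on a length mismatch is a deliberate choice here: silently zipping lists of different lengths would attach results to the wrong calls.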

Pattern 3: Scientific Workflow

```python
from tooluniverse import ToolUniverse


def drug_discovery_pipeline(disease_id):
    tu = ToolUniverse(use_cache=True)
    tu.load_tools()

    try:
        # Get targets
        targets = tu.tools.OpenTargets_get_associated_targets_by_disease_efoId(
            efoId=disease_id
        )

        # Get compounds (batch)
        compound_calls = [
            {"name": "ChEMBL_search_molecule_by_target",
             "arguments": {"target_id": t['id'], "limit": 10}}
            for t in targets['data'][:5]
        ]
        compounds = tu.run_batch(compound_calls)

        # Predict ADMET for the first few molecules of each target
        admet_results = []
        for comp_list in compounds:
            if comp_list and 'molecules' in comp_list:
                for mol in comp_list['molecules'][:3]:
                    admet = tu.tools.ADMETAI_predict_admet(
                        smiles=mol['smiles'],
                        use_cache=True
                    )
                    admet_results.append(admet)

        return {"targets": targets, "compounds": compounds, "admet": admet_results}
    finally:
        # Release resources even if a step above fails
        tu.close()
```
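The nested loops in the ADMET step can be separated from the prediction calls by first collecting the molecules to score. The sketch below assumes the result shape used in the pipeline above (each batch result is a dict with a `molecules` list whose items carry a `smiles` key); `collect_smiles` is a hypothetical helper, and the deduplication step is an addition not present in the original pipeline.

```python
def collect_smiles(batch_results, per_target=3):
    """Collect unique SMILES strings from a list of batch results.

    Assumes each usable result is a dict with a 'molecules' list of dicts
    that may carry a 'smiles' key. Order of first appearance is preserved.
    """
    seen = set()
    ordered = []
    for result in batch_results:
        if not isinstance(result, dict):
            continue  # skip None or other non-dict placeholders
        molecules = result.get("molecules") or []
        for mol in molecules[:per_target]:
            smiles = mol.get("smiles")
            if smiles and smiles not in seen:
                seen.add(smiles)
                ordered.append(smiles)
    return ordered


# Stand-in batch results; real responses may contain more fields
example_results = [
    {"molecules": [{"smiles": "CCO"}, {"smiles": "CCO"}, {"smiles": "c1ccccc1"}]},
    None,
    {"molecules": [{"smiles": "CCO"}, {"id": 1}]},
]
unique_smiles = collect_smiles(example_results)
```

Deduplicating before prediction avoids scoring the same compound twice when several targets share a ligand.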

Configuration

Caching

```python

# Enable globally

tu = ToolUniverse(use_cache=True)

tu.load_tools()

# Or per-call

result = tu.tools.ADMETAI_predict_admet(

smiles="...",

use_cache=True # Cache expensive predictions

)

# Manage cache

stats = tu.get_cache_stats()

tu.clear_cache()

```

Hooks (Auto-summarization)

```python
# Enable hooks for large outputs
tu = ToolUniverse(hooks_enabled=True)
tu.load_tools()

result = tu.tools.OpenTargets_get_target_gene_ontology_by_ensemblID(
    ensemblId="ENSG00000012048"
)

# Check if summarized
if isinstance(result, dict) and "summary" in result:
    print(f"Summarized: {result['summary']}")
```

Load Specific Categories

```python

# Faster loading

tu = ToolUniverse()

tu.load_tools(categories=["proteins", "drugs"])

```

Critical Things to Know

⚠️ Always Call load_tools()

```python

# ❌ Wrong - will fail

tu = ToolUniverse()

result = tu.tools.some_tool() # Error!

# ✅ Correct

tu = ToolUniverse()

tu.load_tools()

result = tu.tools.some_tool()

```

⚠️ Tool Finder Returns Nested Structure

```python
# ❌ Wrong
tools = tu.run({"name": "Tool_Finder_Keyword", "arguments": {"description": "protein"}})
for tool in tools:  # Error: tools is dict
    print(tool['name'])

# ✅ Correct
if isinstance(tools, dict) and 'tools' in tools:
    for tool in tools['tools']:
        print(tool['name'])
```
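The `isinstance` check above can be wrapped in a small normalizing function so that calling code always iterates over a list. This is a sketch only: `extract_tools` is a hypothetical helper, and accepting a bare list or a JSON string in addition to the dict shape shown above is an assumption added for robustness, not documented behavior.

```python
import json


def extract_tools(result):
    """Return the list of tool entries from a Tool Finder result.

    Handles the dict-with-'tools' shape shown above. Bare lists and JSON
    strings are also accepted (an assumption); anything else yields [].
    """
    if isinstance(result, str):
        try:
            result = json.loads(result)
        except json.JSONDecodeError:
            return []
    if isinstance(result, dict):
        tools = result.get("tools", [])
        return tools if isinstance(tools, list) else []
    if isinstance(result, list):
        return result
    return []


# Stand-in finder result
example = {"tools": [{"name": "UniProt_get_entry_by_accession"}]}
names = [tool["name"] for tool in extract_tools(example)]
```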

⚠️ Check Required Parameters

```python

# Check tool schema first

tool_info = tu.all_tool_dict["UniProt_get_entry_by_accession"]

required = tool_info['parameter'].get('required', [])

print(f"Required: {required}")

# Then call

result = tu.tools.UniProt_get_entry_by_accession(accession="P05067")

```
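The schema lookup above can be turned into a pre-flight check that reports problems before a call is made. The sketch assumes the layout used above, where `tool_info['parameter']` holds a `required` list and a `properties` dict; `check_arguments` and the example schema are hypothetical.

```python
def check_arguments(tool_info, arguments):
    """Compare call arguments against a tool schema.

    Assumes tool_info['parameter'] has 'required' (list of names) and
    'properties' (dict keyed by parameter name).
    Returns (missing, unknown) as sorted lists of parameter names.
    """
    schema = tool_info.get("parameter", {})
    required = set(schema.get("required", []))
    known = set(schema.get("properties", {}))
    supplied = set(arguments)
    missing = sorted(required - supplied)
    # Only flag unknown names when the schema actually lists properties
    unknown = sorted(supplied - known) if known else []
    return missing, unknown


# Hypothetical schema entry in the shape described above
example_info = {
    "parameter": {
        "properties": {"accession": {"type": "string"}},
        "required": ["accession"],
    }
}

# A misspelled argument shows up as both missing and unknown
missing, unknown = check_arguments(example_info, {"acession": "P05067"})
```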

⚠️ Cache Strategy

```python

# ✅ Cache: ML predictions, database queries (deterministic)

result = tu.tools.ADMETAI_predict_admet(smiles="...", use_cache=True)

# ❌ Don't cache: real-time data, time-sensitive results

result = tu.tools.get_latest_publications() # No cache

```

⚠️ Error Handling

```python
from tooluniverse.exceptions import ToolError, ToolUnavailableError

try:
    result = tu.tools.UniProt_get_entry_by_accession(accession="P05067")
except ToolUnavailableError as e:
    print(f"Tool unavailable: {e}")
except ToolError as e:
    print(f"Execution failed: {e}")
```
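When a failure is likely to be temporary, the call can be retried with a growing delay instead of only printing the error. The following is a generic sketch, not an SDK feature: `call_with_retry` is a hypothetical helper, and treating `ToolUnavailableError` as transient would be an assumption on the caller's part.

```python
import time


def call_with_retry(fn, *, attempts=3, base_delay=1.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call fn() up to `attempts` times with exponential backoff.

    `retry_on` is the tuple of exception types worth retrying. Any other
    exception propagates immediately. `sleep` is injectable for testing.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** (attempt - 1))


# Demonstration with a stand-in function that fails twice, then succeeds
state = {"calls": 0}


def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []
outcome = call_with_retry(flaky, attempts=3, base_delay=0.5,
                          retry_on=(ConnectionError,), sleep=delays.append)
```

With the SDK, `fn` could be a lambda wrapping the tool call, for example `lambda: tu.tools.UniProt_get_entry_by_accession(accession="P05067")`, with `retry_on=(ToolUnavailableError,)`.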

⚠️ Tool Names Are Case-Sensitive

```python

# ❌ Wrong

result = tu.tools.uniprot_get_entry_by_accession(accession="P05067")

# ✅ Correct

result = tu.tools.UniProt_get_entry_by_accession(accession="P05067")

```
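If tool names arrive from user input or from an LLM, their casing may be off. One way to recover is to look up the exact registered name case-insensitively before calling. This is a sketch under the assumption that the registered names are available as an iterable, such as the keys of `tu.all_tool_dict`; `resolve_tool_name` is a hypothetical helper.

```python
def resolve_tool_name(name, known_names):
    """Map a name to its exact-case registered form, ignoring case.

    Returns the name unchanged if it is already registered, None when
    nothing matches, and raises if several registered names differ only
    by case.
    """
    if name in known_names:
        return name
    matches = [known for known in known_names if known.lower() == name.lower()]
    if not matches:
        return None
    if len(matches) > 1:
        raise ValueError(f"ambiguous tool name {name!r}: {matches}")
    return matches[0]


# Stand-in registry; with the SDK this could be tu.all_tool_dict
registry = {"UniProt_get_entry_by_accession": {}, "ADMETAI_predict_admet": {}}
resolved = resolve_tool_name("uniprot_get_entry_by_accession", registry)
```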

Execution Options

```python

result = tu.tools.tool_name(

param="value",

use_cache=True, # Cache this call

validate=True, # Validate parameters (default)

stream_callback=None # Streaming output

)

```

Performance Tips

```python

# 1. Load specific categories

tu.load_tools(categories=["proteins"])

# 2. Use batch execution

results = tu.run_batch(calls)

# 3. Enable caching

tu = ToolUniverse(use_cache=True)

# 4. Disable validation (after testing)

result = tu.tools.tool_name(param="value", validate=False)

```

Troubleshooting

Tool Not Found

```python
# Search for tool
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "partial_name", "limit": 10}
})

# Check if exists
if "Tool_Name" in tu.all_tool_dict:
    print("Found!")
```
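When the membership check fails because of a typo, the standard library can suggest near matches from the registered names. The sketch below uses `difflib.get_close_matches`; `suggest_tool_names` is a hypothetical helper, and passing `tu.all_tool_dict` as the name source is an assumption based on its use above.

```python
import difflib


def suggest_tool_names(name, known_names, limit=5, cutoff=0.6):
    """Return registered names that look similar to `name`, best first.

    Comparison is case-insensitive; the original casing is returned.
    """
    by_lower = {}
    for known in known_names:
        by_lower.setdefault(known.lower(), known)
    close = difflib.get_close_matches(
        name.lower(), list(by_lower), n=limit, cutoff=cutoff
    )
    return [by_lower[key] for key in close]


# Stand-in registry; with the SDK this could be tu.all_tool_dict
registry = [
    "UniProt_get_entry_by_accession",
    "RCSB_PDB_get_structure_by_id",
    "ADMETAI_predict_admet",
]
suggestions = suggest_tool_names("UniProt_get_entry_by_acession", registry)
```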

API Key Issues

```python
import os

if not os.environ.get("OPENAI_API_KEY"):
    print("⚠️ OPENAI_API_KEY not set")
    print("Set: export OPENAI_API_KEY='sk-...'")
```
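The same check extends to several keys at once, which is convenient at the start of a script. A minimal sketch, assuming the key names from the Environment Setup section; `missing_env_keys` is a hypothetical helper, and treating an empty or whitespace-only value as missing is a design choice made here.

```python
import os


def missing_env_keys(names, env=None):
    """Return the names that are unset or blank in the environment.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


# Stand-in environment so the example does not depend on the real one
example_env = {"OPENAI_API_KEY": "sk-test", "NCBI_API_KEY": "  "}
missing = missing_env_keys(["OPENAI_API_KEY", "NCBI_API_KEY"], env=example_env)
```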

Validation Errors

```python
from tooluniverse.exceptions import ToolValidationError

try:
    result = tu.tools.some_tool(param="value")
except ToolValidationError as e:
    print(f"Validation failed: {e}")

    # Check schema to see which parameters the tool expects
    tool_info = tu.all_tool_dict["some_tool"]
    print(f"Required: {tool_info['parameter'].get('required', [])}")
    print(f"Properties: {list(tool_info['parameter']['properties'].keys())}")
```

Enable Debug Logging

```python

from tooluniverse.logging_config import set_log_level

set_log_level("DEBUG")

```

Tool Categories

| Category | Tools | Use Cases |
|----------|-------|-----------|
| Proteins | UniProt, RCSB PDB, AlphaFold | Protein analysis, structure |
| Drugs | DrugBank, ChEMBL, PubChem | Drug discovery, compounds |
| Genomics | Ensembl, NCBI Gene, gnomAD | Gene analysis, variants |
| Diseases | OpenTargets, ClinVar | Disease-target associations |
| Literature | PubMed, Europe PMC | Literature search |
| ML Models | ADMET-AI, AlphaFold | Predictions, modeling |
| Pathways | KEGG, Reactome | Pathway analysis |

Resources

Documentation: https://zitniklab.hms.harvard.edu/ToolUniverse/
Tool List: https://zitniklab.hms.harvard.edu/ToolUniverse/tools/tools_config_index.html
GitHub: https://github.com/mims-harvard/ToolUniverse
Examples: See examples/ directory in repository
Slack: https://join.slack.com/t/tooluniversehq/shared_invite/zt-3dic3eoio-5xxoJch7TLNibNQn5_AREQ

For detailed guides, see [REFERENCE.md](REFERENCE.md).
