1. Database Information (`kegg_info`)
Retrieve metadata and statistics about KEGG databases.
When to use: Understanding database structure, checking available data, getting release information.
Usage:
```python
from scripts.kegg_api import kegg_info
# Get pathway database info
info = kegg_info('pathway')
# Get organism-specific info
hsa_info = kegg_info('hsa') # Human genome
```
Common databases: kegg, pathway, module, brite, genes, genome, compound, glycan, reaction, enzyme, disease, drug
2. Listing Entries (`kegg_list`)
List entry identifiers and names from KEGG databases.
When to use: Getting all pathways for an organism, listing genes, retrieving compound catalogs.
Usage:
```python
from scripts.kegg_api import kegg_list
# List all reference pathways
pathways = kegg_list('pathway')
# List human-specific pathways
hsa_pathways = kegg_list('pathway', 'hsa')
# List specific genes (max 10)
genes = kegg_list('hsa:10458+hsa:10459')
```
Common organism codes: hsa (human), mmu (mouse), dme (fruit fly), sce (yeast), eco (E. coli)
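The KEGG list operation returns tab-delimited text, one entry per line. The sketch below turns such text into a dictionary. It assumes `kegg_list` returns the raw response as a string, which is not confirmed here. The helper name `parse_kegg_list` is hypothetical, and the sample string only illustrates the layout.

```python
from typing import Dict


def parse_kegg_list(text: str) -> Dict[str, str]:
    """Map each entry ID to its description from tab-delimited text."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first tab only; descriptions may contain spaces.
        entry_id, _, description = line.partition("\t")
        entries[entry_id.strip()] = description.strip()
    return entries


# Illustrative sample in the tab-delimited layout, not real API output.
sample = "hsa00010\tGlycolysis / Gluconeogenesis\nhsa05130\tExample pathway name\n"
parsed = parse_kegg_list(sample)
```

If the wrapper already returns parsed data, this step is unnecessary.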
3. Searching (`kegg_find`)
Search KEGG databases by keywords or molecular properties.
When to use: Finding genes by name/description, searching compounds by formula or mass, discovering entries by keywords.
Usage:
```python
from scripts.kegg_api import kegg_find
# Keyword search
results = kegg_find('genes', 'p53')
shiga_toxin = kegg_find('genes', 'shiga toxin')
# Chemical formula search (partial match, independent of atom order)
compounds = kegg_find('compound', 'C7H10N4O2', 'formula')
# Exact mass range search (use 'mol_weight' for molecular weight)
drugs = kegg_find('drug', '300-310', 'exact_mass')
```
Search options: formula (partial match, independent of atom order), exact_mass (single value or range), mol_weight (single value or range)
4. Retrieving Entries (`kegg_get`)
Get complete database entries or specific data formats.
When to use: Retrieving pathway details, getting gene/protein sequences, downloading pathway maps, accessing compound structures.
Usage:
```python
from scripts.kegg_api import kegg_get
# Get pathway entry
pathway = kegg_get('hsa00010') # Glycolysis pathway
# Get multiple entries (max 10)
genes = kegg_get(['hsa:10458', 'hsa:10459'])
# Get protein sequence (FASTA)
sequence = kegg_get('hsa:10458', 'aaseq')
# Get nucleotide sequence
nt_seq = kegg_get('hsa:10458', 'ntseq')
# Get compound structure
mol_file = kegg_get('cpd:C00002', 'mol') # ATP in MOL format
# Get pathway as JSON (single entry only)
pathway_json = kegg_get('hsa05130', 'json')
# Get pathway image (single entry only)
pathway_img = kegg_get('hsa05130', 'image')
```
Output formats: aaseq (protein FASTA), ntseq (nucleotide FASTA), mol (MOL format), kcf (KCF format), image (PNG), kgml (XML), json (pathway JSON)
Important: Image, KGML, and JSON formats allow only one entry at a time.
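Because a single request accepts at most 10 entries, longer ID lists need to be split before calling `kegg_get`. A minimal sketch of that batching follows. The helper name `chunked` and the generated gene IDs are hypothetical and serve only as illustration.

```python
from typing import Iterator, List, Sequence

MAX_ENTRIES = 10  # per-request limit stated above


def chunked(ids: Sequence[str], size: int = MAX_ENTRIES) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


# Hypothetical IDs, used only to show how the list is split.
gene_ids = [f"hsa:{n}" for n in range(1, 26)]
batches = list(chunked(gene_ids))
# Each batch could then be passed on, e.g. kegg_get(batch).
```

For the image, KGML, and JSON formats, call `chunked(ids, size=1)` or simply loop over the IDs one at a time.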
5. ID Conversion (`kegg_conv`)
Convert identifiers between KEGG and external databases.
When to use: Integrating KEGG data with other databases, mapping gene IDs, converting compound identifiers.
Usage:
```python
from scripts.kegg_api import kegg_conv
# Convert all human genes to NCBI Gene IDs
conversions = kegg_conv('ncbi-geneid', 'hsa')
# Convert specific gene
gene_id = kegg_conv('ncbi-geneid', 'hsa:10458')
# Convert to UniProt
uniprot_id = kegg_conv('uniprot', 'hsa:10458')
# Convert compounds to PubChem
pubchem_ids = kegg_conv('pubchem', 'compound')
# Reverse conversion (NCBI Gene ID to KEGG)
kegg_id = kegg_conv('hsa', 'ncbi-geneid')
```
Supported conversions: ncbi-geneid, ncbi-proteinid, uniprot, pubchem, chebi
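Conversion results come back as pairs of identifiers separated by a tab, each carrying a database prefix. The sketch below builds a lookup table from such text and strips the prefix from the target ID. It assumes `kegg_conv` returns the raw text and that the first column is the source ID. The name `parse_conv` is hypothetical.

```python
from typing import Dict


def parse_conv(text: str) -> Dict[str, str]:
    """Map source IDs to target IDs, dropping the target's database prefix."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # ignore blank or malformed lines
        source, target = (p.strip() for p in parts)
        # "ncbi-geneid:10458" -> "10458"; IDs without a prefix are kept as-is.
        mapping[source] = target.split(":", 1)[-1]
    return mapping


# Illustrative sample in the assumed two-column layout.
sample = "hsa:10458\tncbi-geneid:10458\nhsa:10459\tncbi-geneid:10459\n"
gene_map = parse_conv(sample)
```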
6. Cross-Referencing (`kegg_link`)
Find related entries within and between KEGG databases.
When to use: Finding pathways containing genes, getting genes in a pathway, mapping genes to KO groups, finding compounds in pathways.
Usage:
```python
from scripts.kegg_api import kegg_link
# Find pathways linked to human genes
pathways = kegg_link('pathway', 'hsa')
# Get genes in a specific pathway
genes = kegg_link('genes', 'hsa00010') # Glycolysis genes
# Find pathways containing a specific gene
gene_pathways = kegg_link('pathway', 'hsa:10458')
# Find compounds in a pathway
compounds = kegg_link('compound', 'hsa00010')
# Map genes to KO (orthology) groups
ko_groups = kegg_link('ko', 'hsa:10458')
```
Common links: genes ↔ pathway, pathway ↔ compound, pathway ↔ enzyme, genes ↔ ko (orthology)
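Link results are one-to-many: a single gene can appear in several pathways. A possible way to group the two-column output by source entry is sketched below, again assuming `kegg_link` returns raw tab-delimited text. The name `group_links` and the placeholder IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_links(text: str) -> Dict[str, List[str]]:
    """Collect every target ID under its source ID, preserving input order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # ignore blank or malformed lines
        source, target = (p.strip() for p in parts)
        grouped[source].append(target)
    return dict(grouped)


# Placeholder IDs that only illustrate the layout, not real KEGG links.
sample = "geneA\tpathX\ngeneA\tpathY\ngeneB\tpathX\n"
links = group_links(sample)
```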
7. Drug-Drug Interactions (`kegg_ddi`)
Check for drug-drug interactions.
When to use: Analyzing drug combinations, checking for contraindications, pharmacological research.
Usage:
```python
from scripts.kegg_api import kegg_ddi
# Check single drug
interactions = kegg_ddi('D00001')
# Check multiple drugs (max 10)
interactions = kegg_ddi(['D00001', 'D00002', 'D00003'])
```
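Validating the input before the call can catch oversized or malformed lists early. The sketch below enforces the 10-entry limit and checks that each ID looks like a KEGG drug entry (`D` followed by five digits). The helper `check_drug_ids` is hypothetical, and the pattern is an assumption that may be stricter than what `kegg_ddi` actually accepts.

```python
import re
from typing import List, Sequence

MAX_DRUGS = 10  # limit stated above
DRUG_ID = re.compile(r"^D\d{5}$")  # assumed ID shape, e.g. D00001


def check_drug_ids(ids: Sequence[str]) -> List[str]:
    """Return the IDs as a list, or raise ValueError if the input is invalid."""
    if len(ids) > MAX_DRUGS:
        raise ValueError(f"at most {MAX_DRUGS} drugs per request, got {len(ids)}")
    bad = [i for i in ids if not DRUG_ID.match(i)]
    if bad:
        raise ValueError(f"unexpected drug ID format: {bad}")
    return list(ids)


checked = check_drug_ids(["D00001", "D00002", "D00003"])
# The validated list could then be passed on, e.g. kegg_ddi(checked).
```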