bio-variant-calling-joint-calling
Skill from gptomics/bioskills
Performs joint variant calling across multiple samples to improve variant detection accuracy and identify shared genetic variations in population-level genomic datasets.
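This excerpt does not show the skill's own workflow, so the following is only a minimal sketch of what multi-sample calling can look like. It assumes bcftools (listed under CLI Tools in this README) is the caller. The function name `joint_call_commands` and the file names are hypothetical. The sketch only builds the two command lines and runs nothing.

```python
from typing import List, Sequence, Tuple


def joint_call_commands(
    reference: str, bams: Sequence[str], out_vcf: str
) -> Tuple[List[str], List[str]]:
    """Build the two bcftools commands for a multi-sample (joint) call.

    Nothing is executed here; the caller pipes the first command into the second.
    """
    if len(bams) < 2:
        raise ValueError("joint calling needs at least two BAM files")
    # mpileup computes genotype likelihoods for all samples together;
    # -Ou emits uncompressed BCF, which is the efficient format for piping.
    mpileup = ["bcftools", "mpileup", "-Ou", "-f", reference, *bams]
    # call -m: multiallelic caller, -v: variant sites only, -Oz: bgzipped VCF.
    call = ["bcftools", "call", "-m", "-v", "-Oz", "-o", out_vcf]
    return mpileup, call


mpileup_cmd, call_cmd = joint_call_commands(
    "ref.fa", ["sample1.bam", "sample2.bam"], "joint.vcf.gz"
)
print(" ".join(mpileup_cmd), "|", " ".join(call_cmd))
```

Passing all BAM files to a single `bcftools mpileup` invocation is what makes the call joint: every sample is genotyped at every candidate site, so the output is one multi-sample VCF. To execute the pair, connect the first command's standard output to the second command's standard input, for example with `subprocess.Popen`.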
Installation
npx skills add https://github.com/gptomics/bioskills --skill bio-variant-calling-joint-calling
Skill Details
Overview
# bioSkills
A collection of skills that guide AI coding agents (Claude Code, Codex, Gemini) through common bioinformatics tasks.
Project Goal
This repository provides AI agents with expert knowledge for bioinformatics workflows. Each skill contains code patterns, best practices, and examples that help agents generate correct, idiomatic code for common tasks.
Target users range from undergrads learning computational biology to PhD researchers processing large-scale data. The skills cover the full spectrum from basic sequence manipulation to advanced analyses like single-cell RNA-seq and population genetics.
Requirements
Python
- Python 3.9+
- biopython, pysam, cyvcf2, pybedtools, pyBigWig, scikit-allel, anndata, mygene
```bash
pip install biopython pysam cyvcf2 pybedtools pyBigWig scikit-allel anndata mygene
```
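After installing, a quick check can confirm that each package is importable. The sketch below is an illustration, not part of the repository, and the helper name `missing_packages` is hypothetical. It assumes the usual import names, which differ from the pip names for two packages: biopython imports as `Bio` and scikit-allel as `allel`.

```python
import importlib.util
from typing import Dict, List

# pip name -> import name (assumed; biopython and scikit-allel differ)
PACKAGES: Dict[str, str] = {
    "biopython": "Bio",
    "pysam": "pysam",
    "cyvcf2": "cyvcf2",
    "pybedtools": "pybedtools",
    "pyBigWig": "pyBigWig",
    "scikit-allel": "allel",
    "anndata": "anndata",
    "mygene": "mygene",
}


def missing_packages(packages: Dict[str, str] = PACKAGES) -> List[str]:
    """Return the pip names whose import module cannot be located."""
    return [
        pip_name
        for pip_name, module in packages.items()
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    print("missing:", ", ".join(missing_packages()) or "none")
```

`find_spec` locates each module without importing it, so the check stays fast even for heavy packages.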
R/Bioconductor
Required for differential expression, single-cell, pathway analysis, and methylation skills.
```r
if (!require('BiocManager', quietly = TRUE))
    install.packages('BiocManager')
BiocManager::install(c('DESeq2', 'edgeR', 'Seurat', 'clusterProfiler', 'methylKit'))
```
CLI Tools
```bash
# macOS
brew install samtools bcftools blast minimap2 bedtools
# Ubuntu/Debian
sudo apt install samtools bcftools ncbi-blast+ minimap2 bedtools
# conda
conda install -c bioconda samtools bcftools blast minimap2 bedtools \
    fastp kraken2 metaphlan sra-tools bwa-mem2 bowtie2 star hisat2 \
    manta delly cnvkit macs3 tobias
```
Installation
Claude Code
```bash
git clone https://github.com/your-username/bioSkills.git
cd bioSkills
./install-claude.sh # Install globally
./install-claude.sh --project /path/to/project # Or install to specific project
./install-claude.sh --list # List available skills
./install-claude.sh --validate # Validate all skills
./install-claude.sh --update # Only update changed skills
./install-claude.sh --uninstall # Remove all bio-* skills
```
Codex CLI
```bash
./install-codex.sh # Install globally
./install-codex.sh --project /path/to/project # Or install to specific project
./install-codex.sh --validate # Validate all skills
./install-codex.sh --update # Only update changed skills
./install-codex.sh --uninstall # Remove all bio-* skills
```
Gemini CLI
```bash
./install-gemini.sh # Install globally
./install-gemini.sh --project /path/to/project # Or install to specific project
./install-gemini.sh --validate # Validate all skills
./install-gemini.sh --update # Only update changed skills
./install-gemini.sh --uninstall # Remove all bio-* skills
```
Codex and Gemini installers convert to the Agent Skills standard (examples/ -> scripts/, usage-guide.md -> references/).
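The conversion described above can be pictured as a path mapping. This excerpt does not show the installer scripts, so the sketch below is an assumption about their effect, not their implementation. The function name `to_agent_skills_path` and the example file name are hypothetical.

```python
from pathlib import PurePosixPath


def to_agent_skills_path(rel_path: str) -> str:
    """Map a skill-relative path to its assumed Agent Skills location."""
    path = PurePosixPath(rel_path)
    parts = path.parts
    # Assumption: everything under examples/ moves to scripts/
    if parts and parts[0] == "examples":
        return str(PurePosixPath("scripts", *parts[1:]))
    # Assumption: the top-level usage guide moves into references/
    if len(parts) == 1 and path.name == "usage-guide.md":
        return str(PurePosixPath("references", "usage-guide.md"))
    # Any other file keeps its place
    return str(path)


for source in ("examples/joint_call.py", "usage-guide.md"):
    print(source, "->", to_agent_skills_path(source))
```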
Skill Categories
| Category | Skills | Primary Tools | Description |
|----------|--------|---------------|-------------|
| sequence-io | 9 | Bio.SeqIO | Read, write, convert FASTA/FASTQ/GenBank and 40+ formats |
| sequence-manipulation | 7 | Bio.Seq, Bio.SeqUtils | Transcription, translation, motif search, sequence properties |
| database-access | 10 | Bio.Entrez, BLAST+, SRA toolkit, UniProt API | NCBI/UniProt queries, SRA downloads, BLAST, homology searches |
| alignment-files | 9 | samtools, pysam | SAM/BAM/CRAM viewing, sorting, filtering, statistics, validation |
| variant-calling | 13 | bcftools, cyvcf2, Manta, Delly, VEP, SnpEff | VCF/BCF calling, SVs, filtering, annotation, clinical interpretation |
| alignment | 4 | Bio.Align, Bio.AlignIO | Pairwise and multiple sequence alignment, MSA statistics, alignment I/O |
| phylogenetics | 5 | Bio.Phylo, IQ-TREE2, RAxML-ng | Tree I/O, visualization, ML inference with model selection, ultrafast bootstrap |
| **differential-expr
More from this repository (10)
Analyzes microbiome composition and diversity by processing taxonomic abundance data, calculating ecological diversity indices, and generating statistical comparisons across different sample groups...
Generates reproducible microbiome data analysis workflows, automating sequence processing, taxonomic classification, diversity analysis, and statistical comparisons across microbiome samples using ...
Aligns long-read sequencing data (like Oxford Nanopore or PacBio reads) to a reference genome using specialized alignment algorithms optimized for high-error long-read technologies.
Calculates and reports comprehensive statistical metrics from Variant Call Format (VCF) files, including variant counts, allele frequencies, heterozygosity rates, and population genetic diversity i...
Analyzes spatial interactions and communication patterns between different cell types in tissue samples using spatial transcriptomics data, identifying potential intercellular signaling networks an...
Transforms gene expression data into biological pathway insights by mapping differentially expressed genes to molecular pathways and performing enrichment analysis.
Identifies and analyzes copy number variations (CNVs) in genomic data using GATK's CNV calling algorithms, generating comprehensive reports on genomic structural variations.
Guides AI agents through basic differential expression analysis using DESeq2 in R, providing code templates and best practices for processing RNA-seq count data and identifying statistically signif...
Generates pathway enrichment analysis visualizations by processing gene lists, performing statistical enrichment tests, and creating informative plots that highlight significant biological pathways...
Generates computational workflows for identifying protein-binding sites and transcription factor footprints from ATAC-seq genomic accessibility data using advanced computational analysis techniques.