Software | Version | Website | Citation | Category | How to run | When to use | Reference files or databases | |
---|---|---|---|---|---|---|---|---|
anvi'o | 7.1 | Community-led, integrated, reproducible multi-omics with anvi'o and Anvi'o: an advanced analysis and visualization platform for 'omics data | metagenomics, visualization, MAGs |
| When you're ready to dive into Metagenome Assembled Genomes (MAGs). | |||
Amazon Web Services Command Line Interface (AWS CLI) | 2.12.6 | unpublished | utility |
| When you want to get reference genomes from the Illumina iGenomes project: https://ewels.github.io/AWS-iGenomes/ | |||
1.5.3 | unpublished | NGS tools |
| |||||
bcftools | 1.18 | NGS tools | | |||||
2.20.0.422 | unpublished | NGS tools | bcl2fastq -R $rundirectory -o $outdirectory --sample-sheet $samplesheet.csv --no-lane-splitting | |||||
4.2.7 | unpublished | NGS tools | bcl-convert | |||||
2.31.0 | BEDTools: a flexible suite of utilities for comparing genomic features | NGS tools |
| Anytime you want to calculate genomic metrics from sequence data (e.g. coverage) | ||||
BLAST | 2.12.0 | sequence search |
| |||||
2.5.1 | Ultrafast and memory-efficient alignment of short DNA sequences to the human genome and Fast gapped-read alignment with Bowtie2 | read alignment |
| One of the best and most popular base-wise aligners. Even if you don't use it as your primary aligner, it is still used by many other software tools under the hood. | prebuilt bowtie2 indexes for many species are located in /data/reference_db in folders named by genus and species | |||
0.7.17-r1188 | Fast and accurate short read alignment with Burrows-Wheeler transform | read alignment |
| I don't use this directly, but it is used by other programs for alignment | ||||
CellxGene gateway | 0.3.11 | unpublished | single cell |
| Allows us to host a cellxgene instance that works with multiple datasets |
| ||
CellRanger | 7.1.0 | single cell |
| If you want to preprocess single cell genomic data from the 10x platform | ||||
CellRanger-arc | 2.0.2 | single cell |
| If you want to preprocess single cell genomic data from the 10x platform | ||||
1.1.3 | metagenomics, QA/QC |
| If you have a bacterial genome assembly and want to check the quality of the assembly |
| ||||
1.18.0 | Clust: automatic extraction of optimal co-expressed gene clusters from gene expression data | transcriptomics |
| If you have an RNAseq dataset with multiple timepoints and want to identify 'tight' modules of co-regulated genes across these timepoints. Clust also allows comparison of modules between datasets/experiments. | ||||
3.5.2 | deepTools2: A next Generation Web Server for Deep-Sequencing Data Analysis | visualization, NGS tools |
| |||||
2.1.8 | multiple sequence alignment |
| If you have a bunch of protein AA or translated DNA sequences that you want to align. | diamond formatted databases for UniRef90 and UniRef50 live in | ||||
Docker | 24.0.2, build cb74dfc | unpublished | containerized software |
| ||||
Dorado | 0.3.1+bb8c5ee | unpublished | nanopore, basecalling, GPU |
| ||||
0.23.4 | QA/QC |
| ||||||
0.12.1 | unpublished | QA/QC |
| The preferred choice for rapid quality control assessment of raw reads in a fastq file | ||||
0.15.3 | FastQ Screen: A tool for multi-genome mapping and quality control | decontamination, QA/QC |
| A simple tool for figuring out whether your fastq file has 'contaminating' reads from specific species. Uses bowtie2 under the hood. | ||||
Filezilla | 3.63.0 | unpublished | utility |
| ||||
0.2.1 | unpublished | QA/QC, nanopore |
| If you have Oxford Nanopore long read data and want to filter your raw data to remove reads based on length or quality | ||||
1.3.6 | Haplotype-based variant detection from short-read sequencing | variants, SNPs/INDELs |
| For variant calling | ||||
4.4.0.0 | variants |
| When you are working with SNPs/variants | |||||
0.7.0 | public data |
| A convenient wrapper around the fasterq-dump software that makes it easy to grab sequences from SRA, ENA, MG-RAST and iMicrobe | |||||
GTDB-TK | 2.1.1 | GTDB-Tk2: memory friendly classification with the genome taxonomy database and GTDB-Tk: A toolkit to classify genomes with the Genome Taxonomy Database | metagenomics, classification |
|
| |||
3.7 | Species-level functional profiling of metagenomes and metatranscriptomes | metagenomics, functional profiling | You have shotgun metagenomic data from a microbial community and want to understand functional content (e.g. bacterial metabolic pathways). Note that humann2 uses DIAMOND, MinPath and Bowtie2 under the hood. | |||||
htop | 3.2.2 | unpublished | utility |
| ||||
1.10 | Measurement of bacterial replication rates in microbial communities | metagenomics |
| |||||
2.3.1 | A fast, lock-free approach for efficient parallel counting of occurrences of k-mers | NGS tools |
| For rapid/efficient counting of kmers in DNA | ||||
0.50.1 | transcriptomics, read alignment |
| Our preferred choice for mapping RNA-seq raw reads to a reference transcriptome | prebuilt kallisto indexes for a few species in /data/reference_db/kallisto | ||||
0.27.3 | Near-optimal probabilistic RNA-seq quantification and Modular, efficient and constant-memory single-cell RNA-seq preprocessing | single cell |
| A great alternative to CellRanger for preprocessing single cell data from the 10x platform. | prebuilt kallisto indexes for a few species in /data/reference_db/kallisto | |||
0.12.0 | unpublished | decontamination |
| If you want to remove 'contaminating' reads from a fastq file. Uses bowtie2 under the hood | ||||
2.0.7-beta | Kraken: ultrafast metagenomic sequence classification using exact alignments and Improved metagenomic analysis with Kraken 2 | metagenomics, classification |
|
| ||||
Krakenuniq | 0.5.8 | KrakenUniq: confident and fast metagenomics classification using unique k-mer counts | metagenomics, classification |
| ||||
3.0.0a6 | Model-based Analysis of ChIP-Seq (MACS) and Improved peak-calling with MACS2 | Epigenetics |
| Anytime you have ATAC-seq or ChIP-seq data and want to identify 'peaks' or read pile-ups at specific positions in the genome | ||||
marker_alignments | 0.4.2 | Improved eukaryotic detection compatible with large-scale automated analysis of metagenomes | metagenomics, classification |
| If you want to find microbial eukaryotes in metagenomic data | the EukDetect database used by this program lives in | ||
Mastiff | 0.0.3 | unpublished | metagenomics, public data |
| ||||
MaxBin2 | 2.2.7 | assembly |
| |||||
1.2.9 | An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph and A Fast and Scalable Metagenome Assembler driven by Advanced Methodologies and Community Practices | assembly, metagenomics |
| If you want to assemble genomes from metagenomic data | ||||
4.06 | Metagenomic microbial community profiling using unique clade-specific marker genes and Extending and improving metagenomic taxonomic profiling with uncharacterized species with MetaPhlAn 4 | metagenomics, classification |
|
| ||||
Micro | 2.0.11 | unpublished | utility |
| Anytime you need to edit a text file in the terminal. It's far better than vim or nano! | |||
0.3.4 | NGS tools |
| ||||||
1.14 | unpublished | QA/QC |
| Our preferred choice for quickly and easily summarizing QC metrics, as well as outputs from MANY other programs, in a convenient html report | ||||
23.04.2.5870 | unpublished | workflow management |
| If you want to set up an automated workflow on our server | ||||
nf-core | 2.10 | The nf-core framework for community-curated bioinformatics pipelines | workflow management |
| ||||
nvitop | 1.1.2 | unpublished | utility, GPU |
| ||||
nvtop | 3.0.1 | unpublished | utility, GPU |
| ||||
3.0.0 | unpublished | NGS tools |
| One of the main places we use this is for filtering out PCR duplicates in our ATAC-seq workflow | ||||
1.07 | Second-generation PLINK: rising to the challenge of larger and richer datasets | comparative genomics |
| Used for GWAS and other popgen analyses | ||||
0.2.4 (no longer maintained/supported) | unpublished | QA/QC, nanopore |
| When you have Nanopore reads and you want to trim off the adapter sequence | ||||
1.14.6 | annotation |
| Great for quickly (and accurately) annotating a bacterial genome | |||||
2023.5.1 | Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2 | 16S |
| Anytime you want to figure out microbial community composition from 16S data | ||||
1.9.1 | QIIME allows analysis of high-throughput community sequencing data | 16S |
| |||||
Rosella | 0.4.2 | unpublished | metagenomics, binning, MAGs | |||||
rust | 1.26.0 | programming language |
| |||||
1.16.1 | NGS tools |
| A powerful suite of tools for working with alignment files (BAM, SAM, etc.) | |||||
1.3-r106 | working with fasta/fastq |
| I use this anytime I want to quickly subsample a fastq file | |||||
2.3.0 | SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation | working with fasta/fastq |
| Anytime you need to manipulate a FASTA/FASTQ file. Some overlap in functionality with seqtk | ||||
5.1d | unpublished | SNPs/INDELs |
| |||||
3.15.4 | SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing | assembly, metagenomics |
| If you have metagenomic sequencing data and want to assemble microbial genomes de novo | ||||
SWGA2 | 1.0.0 | A fast machine-learning-guided primer design pipeline for selective whole genome amplification | metagenomics |
| when you want to design primers for carrying out selective whole genome amplification (SWGA) | |||
4.8.2 | metagenomics, classification |
| Fantastic software that takes an alignment-free approach to compare two or more fastq files to each other, or to all of refseq or genbank to understand what organisms might be present in the data. | |||||
2.1.1 | unpublished | single cell |
| When you have spatial gene expression data from the 10x Visium platform | ||||
3.0.5 | public data |
| ||||||
2.7.10b | read alignment |
| Very fast and popular base-wise aligner | prebuilt STAR indexes for several species present in /data/reference_db/star | ||||
StrainPhlAn | 4.0.6 | Microbial strain-level population structure and genetic diversity from metagenomes | metagenomics, classification |
| ||||
4.1.0 | Sunbeam: an extensible pipeline for analyzing metagenomic sequencing experiments | metagenomics, classification |
| |||||
0.39 | QA/QC |
| Anytime you need to trim or filter raw reads from a fastq file based on base quality scores or length | |||||
0.5.0 | Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads | assembly |
| If you have short (Illumina) and long (Nanopore or PacBio) reads from a bacterial isolate and want to get a complete genome assembly | ||||
VCFtools | 0.1.17 | variants |
| |||||
velocyto | 0.17 | single cell |
| |||||
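
Before starting an analysis, it can help to confirm that the tools you need from the table above are actually on your `PATH`. The snippet below is a minimal sketch, not part of the server's documentation: `check_tool` is a hypothetical helper name, and the command names in the loop are assumptions taken from entries in the table (a tool may be installed under a different command name, or only inside a conda environment or container).

```shell
#!/bin/sh
# check_tool is a hypothetical helper: it prints "<name><TAB>found" when the
# command is on PATH and "<name><TAB>missing" otherwise.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s\tfound\n' "$1"
    else
        printf '%s\tmissing\n' "$1"
    fi
}

# Assumed command names, taken from the table above; edit the list as needed.
for tool in bcftools bcl2fastq bcl-convert docker htop; do
    check_tool "$tool"
done
```

A tool reported as `missing` is not necessarily uninstalled; it may simply need its environment activated first.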