We already have a lot of software installed on the server, covering applications such as QC analysis and preprocessing of raw sequence data, transcriptome analysis of RNAseq data, 16S and shotgun metagenomics pipelines, WGS tools, and more. If you have an account on our cluster, then you already have access to all of the software below, so get started!
If you’re looking for a piece of software and don’t find it below, just reach out to Dan Beiting to inquire about getting it installed.
Software | Version | Website | Citation | Category | How to run | When to use | Reference files or databases |
---|---|---|---|---|---|---|---|
anvi’o | 7.0 | | | metagenomics, visualization | source activate anvio-7 | When you're ready to dive into Metagenome Assembled Genomes (MAGs). | |
BamM | 1.7.3 | | unpublished | NGS tools | bamm | | |
Bandage | NA | | | genome assembly visualization | Bandage (you must be sitting at the Linux machine to use it, since it is a graphical interface) | | |
BaseSpace Sequence Hub CLI tool suite | 1.1.0 | | unpublished | NGS tools | bs | | |
bcftools | | | | NGS tools | bcftools | | |
bedtools | 2.26.0 | | | NGS tools | bedtools | Anytime you want to calculate genomic metrics from sequence data (e.g. coverage) | |
BLAST | 2.9.0 | | | sequence search | options include blastn, blastp, or blastx | | |
bowtie2 | 2.3.4.1 | | | read alignment, microbial ecology pipeline for 16S rRNA data | bowtie2 | One of the best and most popular base-wise aligners. Even if you don't use it as your primary aligner, it is still used by many other software tools under the hood. | prebuilt bowtie2 indexes for many species are located in /data/reference_db in folders named by genus and species |
BWA | 0.7.17-r1188 | | | read alignment | bwa | I don't use this directly, but it is used by other programs for alignment | |
CellRanger | 7.0.0 | | | single cell | cellranger | If you want to preprocess single cell genomic data from the 10x platform | |
CellRanger-arc | 2.0.1 | | | single cell | cellranger-arc | If you want to preprocess single cell genomic data from the 10x platform | |
Circos | | | | visualization | circos | | |
CheckM | 1.1.3 | | | metagenomics, genome assembly, QA/QC | checkm lineage_wf | If you have a bacterial genome assembly and want to check the quality of the assembly | uses checkm_db, which is located in /data/reference_db |
Clust | 1.17.0 | | | transcriptomics | clust | If you have an RNAseq dataset with multiple timepoints and want to identify 'tight' modules of co-regulated genes across these timepoints. Clust also allows comparison of modules between datasets/experiments. | |
CONCOCT | 1.1.0 | | | metagenomics | concoct | When you have metagenomic data and want to put de novo assembled contigs into genome bins | |
Cytoscape | | | | network analysis | navigate to the /home/shared/softwares/Cytoscape_v3.5.1 folder and double-click to open the program | | |
deeptools | 3.5.1 | | | visualization, NGS tools | deeptools | | |
DESMAN | | | | metagenomics | desman | If you want to identify strains in your metagenomic data | |
DIAMOND | 0.8.22.84 | | | multiple sequence alignment | diamond | If you have a bunch of protein AA or translated DNA sequences that you want to align. | |
EMIRGE | | | | metagenomics, 16S | emirge.py or emirge_amplicon.py | | |
Fastp | 0.20.1 | | | QA/QC | fastp | | |
FastQC | 0.11.7 | | unpublished | QA/QC | fastqc | The preferred choice for rapid quality control assessment of raw reads in a fastq file | |
FastQ Screen | 0.15.1 | | | decontamination | fastq_screen | Simple tool for figuring out if your fastq file has 'contaminating' reads from specific species. Uses bowtie2 under the hood. | |
Filtlong | 0.2.0 | | unpublished | QA/QC | filtlong | If you have Oxford Nanopore long read data and want to filter your raw data to remove reads based on length or quality | |
Freebayes | 1.3.6 | | | variants, SNPs/INDELs | freebayes | For variant calling | |
GATK | 4.1.7.0 | | | variants | gatk | When you are working with SNPs/variants | |
Grabseqs | 0.7.0 | | | public data | grabseqs | A convenient wrapper around the fasterq-dump software that makes it easy to grab sequences from SRA, ENA, MG-RAST and iMicrobe | |
GraPhlAn | 1.1.3 | | | metagenomics visualization | graphlan.py or graphlan_annotate.py | If you want to create a visual link between microbiome samples, their phylogenetic relationships, and sample or patient metadata | |
GroopM | 0.3.4 | | | metagenomics | groopm | | |
HISAT2 | | | | | | | |
Humann3 | 3.0.0.alpha.2 | | | metagenomics | humann2 | If you have shotgun metagenomic data from a microbial community and want to understand functional content (e.g. bacterial metabolic pathways). Note that humann2 uses DIAMOND, MinPath and Bowtie2 under the hood | |
HTSeq | 0.13.5 | | | transcriptomics | htseq-count or htseq-qa | If you’ve already aligned data using a base-aware aligner (e.g. STAR or Bowtie2) and you want to summarize reads to genomic features (genes, exons, etc.) | |
iRep | 1.1.14 | | | metagenomics | iRep or bPTR | | |
Kallisto | 0.46.0 | | | transcriptomics, read alignment | kallisto | Our preferred choice for mapping RNA-seq raw reads to a reference transcriptome | prebuilt kallisto indexes for a few species are in /data/reference_db/kallisto |
Kallisto-BUStools | 0.27.3 | | | single cell | conda activate kb then kb | A great alternative to CellRanger for preprocessing single cell data from the 10x platform. | prebuilt kallisto indexes for a few species are in /data/reference_db/kallisto |
KneadData | 0.6.1 | | unpublished | decontamination | kneaddata | If you want to remove 'contaminating' reads from a fastq file. Uses bowtie2 under the hood | |
MinPath | | | | biological pathway reconstructions using protein family predictions | MinPath1.4.py | | |
Kraken2 | 2.0.6 beta | | | metagenomics | kraken2 | | Kraken2 reference database is in /data/reference_db/kraken2db_standard/ |
MACS3 | 3.0.0a6 | | | epigenetics | conda activate macs3 then macs3 | Anytime you have ATAC-seq or ChIP-seq data and want to identify 'peaks' or read pile-ups at specific positions in the genome | |
Mash | 2.0 | | | comparative genomics | mash | We don't use this as standalone software, but it is needed by Sourmash, which we use a lot | |
MEGAHIT | 1.2.9 | | | assembly, metagenomics | megahit | If you want to assemble genomes from metagenomic data | |
MetaPhlAn3 | 3.0 | | | metagenomics | metaphlan | | |
MosDepth | 0.3.1 | | | NGS tools | mosdepth, and plot-dist.py for plotting | | |
Mothur | 1.44.1 | | | 16S | mothur | | |
MultiQC | 1.13 | | unpublished | QA/QC | multiqc | Our preferred choice for quickly and easily summarizing QC metrics, as well as outputs from MANY other programs, in a convenient HTML report | |
Nextflow | 20.01.0.5264 | | unpublished | workflow management | nextflow | If you want to set up an automated workflow on our server | |
Picard tools | | | unpublished | NGS tools | java -jar /usr/local/bin/picard.jar | One of the main places we use this is for filtering out PCR duplicates in our ATAC-seq workflow | |
Plink | 1.9 | | | comparative genomics | plink | Used for GWAS and other popgen analyses | |
Porechop | 0.2.4 (no longer maintained/supported) | | unpublished | QA/QC | porechop | When you have Nanopore reads and you want to trim off the adapter sequence | |
Prokka | 1.14.6 | | | annotation | conda activate prokka then prokka | Great for quickly (and accurately) annotating a bacterial genome | |
QIIME2 | | | | 16S | source activate qiime2 | Anytime you want to figure out microbial community composition from 16S data | |
QUAST | | | | QA/QC | quast.py | | |
ROP | | | | | /usr/local/bin/rop/rop.sh | | |
RSEM | 1.3.0 | | | read alignment, transcriptomics | rsem-prepare-reference, rsem-calculate-expression, rsem-tbam2gbam, rsem-bam2wig | | |
samtools | 1.7 | | | NGS tools | samtools | A powerful suite of tools for working with alignment files (bam, sam, etc.) | |
seqtk | 1.2-r101-dirty | | | working with fasta/fastq | seqtk | I use this anytime I want to quickly subsample a fastq file | |
Sourmash | 4.5 | | | metagenomics | conda activate sourmash then sourmash | | |
seqKit | 0.12.0 | | | working with fasta/fastq | seqkit | Anytime you need to manipulate a fasta or fastq file. Some overlap in functionality with seqtk | |
Sickle | 1.33 | | unpublished | QA/QC | sickle se or sickle pe | | |
snpEff | 5.1d | | unpublished | SNPs/INDELs | java -jar /usr/local/bin/snpEff/snpEff.jar for snpEff and java -jar /usr/local/bin/snpEff/SnpSift.jar for SnpSift | | |
Sourmash | 4.5.0 | | | comparative genomics | sourmash | Fantastic software that takes an alignment-free approach to compare two or more fastq files to each other, or to all of RefSeq or GenBank, to understand what organisms might be present in the data. | RefSeq and GenBank microbial reference 'sketches' are in /data/reference_db/sourmash_refs |
SPAdes | 3.12.0 | | | assembly, metagenomics | spades.py [options] -o <output_dir> | If you have metagenomic sequencing data and want to assemble microbial genomes de novo | |
SQUID | 1.4 | | | transcriptomics | squid | If you have some RNAseq data and want to find fusion and non-fusion transcript sequence variants | |
SRA toolkit | 2.9.1 | | | public data | fasterq-dump, sam-dump, and more | | |
STAR | 2.6.1c | | | read alignment | STAR (all caps) | Very fast and popular base-wise aligner | prebuilt STAR indexes for several species are in /data/reference_db/star |
Sunbeam | | | | metagenomics | source activate sunbeam | | |
Trimmomatic | 0.39 | | | QA/QC | java -jar /usr/local/bin/Trimmomatic-0.39/trimmomatic-0.39.jar | Anytime you need to trim or filter raw reads from a fastq file based on base quality scores or length | |
Unicycler | 0.4.8 | | | assembly | unicycler | If you have short (Illumina) and long (Nanopore or PacBio) reads from a bacterial isolate and want to get a complete genome assembly | |
VCFtools | 0.1.16 | | | variants | vcftools | | |
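
Once you are logged in, one quick way to confirm that the commands from the table are reachable is to ask the shell whether each one is on your PATH. The snippet below is only a minimal sketch, not an official script for this server: `check_tools` is a hypothetical helper name, and the tool list is just a handful of entries from the table that are launched directly by name.

```shell
#!/bin/sh
# Report whether each named command can be found on the current PATH.
# check_tools is a hypothetical helper name, not something installed on the server.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\tfound\n' "$tool"
        else
            printf '%s\tMISSING\n' "$tool"
        fi
    done
}

# A few commands taken from the "how to run" column of the table above.
check_tools fastqc multiqc kallisto bowtie2 samtools STAR
```

Tools that live in their own conda environment (for example, those listed as `conda activate prokka then prokka`) will likely show up as found only after that environment has been activated, and tools run through `java -jar` are not covered by this check.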