We have a lot of software already installed on the server, covering applications that range from QC analysis and preprocessing of raw sequence data to transcriptome analysis from RNAseq data, 16S and shotgun metagenomics pipelines, WGS tools, and more. If you have an account on our cluster, then you already have access to all of the software below, so get started!
If you’re looking for a piece of software and don’t find it below, just reach out to Dan Beiting to inquire about getting it installed.
|Software||Version||Website||Citation||Category||How to run||When to use||Reference files or databases|
When you're ready to dive into Metagenome Assembled Genomes (MAGs).
genome assembly visualization
BaseSpace Sequence Hub CLI tool suite
Anytime you want to calculate genomic metrics from sequence data (e.g. coverage)
read alignment
Microbial ecology pipeline for 16S rRNA data
One of the best and most popular base-wise aligners. Even if you don't use it as your primary aligner, it is still used by many other software tools under the hood.
prebuilt bowtie2 indexes for many species are located in /data/reference_db in folders named by genus and species
I don't use this directly, but it is used by other programs for alignment
If you have a bacterial genome assembly and want to check the quality of the assembly
uses checkm_db, which is located in /data/reference_db
If you have an RNAseq dataset with multiple timepoints and want to identify 'tight' modules of co-regulated genes across these timepoints. Clust also allows comparison of modules between datasets/experiments.
When you have metagenomic data and want to put de novo assembled contigs into genome bins
Navigate to the /home/shared/softwares/Cytoscape_v3.5.1 folder and double-click to open the program
If you want to identify strains in your metagenomic data
multiple sequence alignment
If you have a bunch of protein AA or translated DNA sequences that you want to align.
The preferred choice for rapid quality control assessment of raw reads in a fastq file
Simple tool for figuring out if your fastq file has 'contaminating' reads from specific species. Uses bowtie2 under the hood.
If you have Oxford Nanopore long read data and want to filter your raw data to remove reads based on length or quality
When you are working with SNPs/variants
A convenient wrapper around the fasterq-dump software that makes it easy to grab sequences from SRA, ENA, MGRAST and iMicrobe
If you want to create a visual link between microbiome samples, their phylogenetic relationship and sample or patient metadata
You have shotgun metagenomic data from a microbial community and want to understand functional content (e.g. bacterial metabolic pathways). Note that humann2 uses DIAMOND, MinPath and Bowtie2 under the hood
If you’ve already aligned data using a base-aware aligner (e.g. STAR or Bowtie2) and you want to summarize reads to genomic features (genes, exons, etc.)
Our preferred choice for mapping RNA-seq raw reads to a reference transcriptome
prebuilt kallisto indexes for a few species in /data/reference_db/kallisto
If you want to remove 'contaminating' reads from a fastq file. Uses bowtie2 under the hood
biological pathway reconstructions using protein family predictions
kraken reference database is in /data/reference_db/kraken2db_standard/
Anytime you have ATAC-seq or ChIP-seq data and want to identify 'peaks' or read pile-ups at specific positions in the genome
We don't use this as standalone software, but it is needed by Sourmash, which we use a lot
If you want to assemble genomes from metagenomic data
Our preferred choice for quickly and easily summarizing QC metrics, as well as outputs from MANY other programs, in a convenient html report
If you want to set up an automated workflow on our server
One of the main places we use this is for filtering out PCR duplicates in our ATAC-seq workflow
0.2.4 (no longer maintained/supported)
When you have Nanopore reads and you want to trim off the adapter sequence
Great for quickly (and accurately) annotating a bacterial genome
Anytime you want to figure out microbial community composition from 16S data
A powerful suite of tools for working with alignment files (bam, sam, etc.)
working with fasta/fastq
I use this anytime I want to quickly subsample a fastq file
working with fasta/fastq
Anytime you need to manipulate a fastq or fasta file. Some overlap in functionality with seqtk
Fantastic software that takes an alignment-free approach to compare two or more fastq files to each other, or to all of refseq or genbank to understand what organisms might be present in the data.
refseq and genbank microbial reference 'sketches' are in /data/reference_db/sourmash_refs
If you have metagenomic sequencing data and want to assemble microbial genomes de novo
If you have some RNAseq data and want to find fusion and non-fusion transcript sequence variants
Very fast and popular base-wise aligner
prebuilt STAR indexes for several species present in /data/reference_db/star
Anytime you need to trim or filter raw reads from a fastq file based on base quality scores or length
If you have short (Illumina) and long (Nanopore or PacBio) reads from a bacterial isolate and want to get a complete genome assembly
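Several rows above point to prebuilt indexes and databases under /data/reference_db. As a rough sketch, the following Python snippet checks which of those folders exist and counts what each contains. The folder names are taken from the table; the function name `list_reference_dirs` is only illustrative, and the actual layout on the server may differ.

```python
from pathlib import Path

# Locations quoted in the table above; adjust if the server layout differs.
REFERENCE_ROOT = Path("/data/reference_db")
NAMED_SUBDIRS = ["kallisto", "star", "sourmash_refs", "kraken2db_standard"]


def list_reference_dirs(root=REFERENCE_ROOT, subdirs=NAMED_SUBDIRS):
    """Return {subdir name: sorted entry names, or None if the folder is missing}."""
    root = Path(root)
    found = {}
    for name in subdirs:
        folder = root / name
        if folder.is_dir():
            # Only the entry names are collected; nothing is opened or modified.
            found[name] = sorted(p.name for p in folder.iterdir())
        else:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, entries in list_reference_dirs().items():
        status = "missing" if entries is None else f"{len(entries)} entries"
        print(f"{REFERENCE_ROOT / name}: {status}")
```

Run on the server, this prints one line per reference folder, which is a quick way to confirm that an index you need is already there before building your own.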