QIIME2: analysis of data pulled from MicrobiomeDB

There may be times when you want to use QIIME2 to analyze microbiome data from published studies. One simple way to do this uses our MicrobiomeDB.org resource to bypass the need to locate, download and preprocess raw sequence files. MicrobiomeDB.org is home to microbiome data from over 30,000 samples pulled from the published literature.

Downloading data from MicrobiomeDB.org
Importing MicrobiomeDB data into QIIME
Preparing your metadata

Downloading data from MicrobiomeDB.org

Open your preferred web browser and navigate to MicrobiomeDB.org.

Once on the website, find the 'card' for the study that you're interested in analyzing. For example, here is the card for the DIABIMMUNE study that was carried out by the Broad Institute. Click on the 'Download Data' icon at the bottom of the card (circled in red below).

Clicking on the 'Download Data' icon will take you to an FTP download site (below). There will be multiple files available for download, and the exact number and names of these files will depend on the dataset. Here's an example for the DIABIMMUNE dataset. You'll want to download the .biom file corresponding to the DADA2 taxon abundance output, as well as the sample details in .tsv format (indicated in red below).

Importing MicrobiomeDB data into QIIME

Import data using QIIME2. Note that the .biom files obtained from MicrobiomeDB are .biom version 1, so we have to let QIIME2 know this during import.

qiime tools import \
  --input-path DIABIMMUNE.16s.DADA2.taxon_abundance.biom \
  --output-path DIABIMMUNE.qza \
  --input-format BIOMV100Format \
  --type "FeatureTable[Frequency]"
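If you're not sure which flavour of .biom file you have, a quick look at the first few bytes can help before you pick an input format. The sketch below is only a rough heuristic: it assumes a BIOM 1.0 file is JSON text that begins with `{`, and that a BIOM 2.x file is HDF5 and so carries the `HDF` signature in bytes 2 to 4. The function name `biom_flavour` is made up for this example.

```shell
# Guess the BIOM flavour from the first bytes of a file.
# Prints "json" (BIOM 1.0 style), "hdf5" (BIOM 2.x style), or "unknown".
# Heuristic only: a JSON file with leading whitespace would print "unknown".
biom_flavour() {
  first=$(head -c 1 "$1")              # first byte of the file
  magic=$(head -c 4 "$1" | tail -c 3)  # bytes 2-4, "HDF" for HDF5 files
  if [ "$first" = "{" ]; then
    echo "json"
  elif [ "$magic" = "HDF" ]; then
    echo "hdf5"
  else
    echo "unknown"
  fi
}
```

Running `biom_flavour DIABIMMUNE.16s.DADA2.taxon_abundance.biom` should print `json` for a version 1 file, which is the case where `--input-format BIOMV100Format` applies.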

If you have more than one .biom file (e.g. two from MicrobiomeDB, or one from MicrobiomeDB and one of your own .biom files), you can merge them as follows:

qiime feature-table merge \
  --i-tables DIABIMMUNE.qza \
  --i-tables DATASET2.qza \
  --o-merged-table merged.qza
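Before merging, it can be worth checking whether the two datasets share any sample IDs, since overlapping samples may cause the merge to stop or be combined, depending on the overlap setting. Here is a minimal sketch that compares two sample details files. It assumes each file is tab-separated, has a single header line, and holds the sample ID in the first column; the function name `shared_sample_ids` is hypothetical.

```shell
# Print every sample ID that appears in both sample details files.
# Assumes: tab-separated files, one header line, sample ID in column 1.
shared_sample_ids() {
  awk -F '\t' '
    FNR == 1  { next }                        # skip the header of each file
    $1 == ""  { next }                        # ignore rows with no ID
    NR == FNR { first[$1] = 1; next }         # remember IDs from the first file
    ($1 in first) && !printed[$1]++ { print $1 }  # report each shared ID once
  ' "$1" "$2"
}
```

If the command prints nothing, the two files have no sample IDs in common under the assumptions above.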

Preparing your metadata

Open the sample_details.tsv file you downloaded from MicrobiomeDB.org and copy/paste the contents into a new Google Sheet. Install the Keemei plugin for Google Sheets, and use the plugin to check whether your sample details are correctly formatted for QIIME2.

Once Keemei reports that your data is valid, copy/paste the contents back into the .tsv file (if no changes were needed, you can use the original .tsv).
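If you'd like a quick command-line sanity check as well, the sketch below looks for two common problems: empty sample IDs and duplicate sample IDs. It is not a replacement for Keemei and checks far less. It assumes the file is tab-separated with one header line and the sample ID in the first column; the function name `check_sample_ids` is made up for this example.

```shell
# Report empty or duplicate sample IDs in a tab-separated sample details file.
# Exits with status 0 if none are found, 1 otherwise.
# Assumes: one header line, sample ID in column 1.
check_sample_ids() {
  awk -F '\t' '
    NR == 1          { next }   # skip the header line
    /^#/             { next }   # skip comment lines
    /^[[:space:]]*$/ { next }   # skip blank lines
    $1 == "" { printf "line %d: empty sample ID\n", NR; bad = 1; next }
    seen[$1]++ { printf "line %d: duplicate sample ID %s\n", NR, $1; bad = 1 }
    END { exit (bad ? 1 : 0) }
  ' "$1"
}
```

For example, `check_sample_ids DIABIMMUNE.16s_DADA2.sample_details.tsv` prints one line per problem it finds and prints nothing when the IDs look fine.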

You're now ready to use the sample details alongside your existing QIIME2 artifact, for example to summarize the feature table with the sample metadata attached:

qiime feature-table summarize \
  --i-table DIABIMMUNE.qza \
  --o-visualization table.qzv \
  --m-sample-metadata-file DIABIMMUNE.16s_DADA2.sample_details.tsv