- What sequencing hardware do you have?
- Which library prep kit should I use?
- How ‘deep’ should I sequence each sample?
- Single-end or paired-end?
- How many replicates should I have?
- How much will it cost?
- How long will it take?
- Do you analyze the data for us?
- So I can just give you my samples then?
What sequencing hardware do you have?
We operate an Illumina NextSeq 2000, which produces reads 75-150 bp in length. These reads can be single- or paired-end (dictated by the reagents, not the hardware). Reagents are available for three run sizes: P1 kits produce approximately 100 million reads or read-pairs, P2 kits approximately 400 million, and P3 kits approximately 1.2 billion. The total data output depends on library quality and clustering density (more on that below).
We also have access to an Illumina MiniSeq, which also produces reads 75-150 bp in length, single- or paired-end. The MiniSeq has mid- and high-output modes, which generate 8 or 25 million reads (or read-pairs), respectively. This machine is ideal for sequencing PCR amplicon libraries (or other low-diversity libraries), and for sequencing relatively small genomes (bacterial, viral, etc.).
Which library prep kit should I use?
Choosing a library prep kit can be overwhelming. We created this decision tree to help you find the kit that matches your needs. There are many other library prep technologies and vendors, but these are the kits we use in our lab.
We put together this decision tree to help guide your choice of library prep kit.
How ‘deep’ should I sequence each sample?
This depends on the size of the transcriptome (mammalian vs bacterial), as well as the complexity of the library that is prepared (mRNA vs total transcriptome). That said, there is no generally accepted rule for depth of sequencing, only a range that has been reported in the literature and which serves as a useful, albeit broad, set of guidelines. Generally, for RNA-seq on mammalian cells or tissues, it would be typical to aim for 20-40 million reads per sample if you were sequencing an mRNA library, or 40-80 million reads per sample for a library containing both mRNAs and non-coding RNAs. The added sequencing depth in the latter case helps ensure adequate coverage across the more diverse pool of molecules present in your library. Illumina provides this tool for estimating the depth required for your organism. Remember, our machine produces up to 1.2 billion reads per run, so if you sequence 30 samples on one run, your depth would ideally be about 40 million reads per sample. For this reason, experiments involving mammalian cells or tissues will generally have up to 30 samples per run on our machine.
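The arithmetic above can be sketched in a few lines of Python. This is a rough planning aid only: it assumes reads are split perfectly evenly across samples and that each kit delivers the approximate output quoted earlier, whereas real yields vary with library quality and clustering density. The function names (`reads_per_sample`, `max_samples`, `smallest_kit`) are hypothetical and used purely for illustration.

```python
# Approximate reads (or read-pairs) per NextSeq 2000 run, using the
# kit figures quoted in this FAQ. Actual yield varies from run to run.
KIT_READS = {
    "P1": 100_000_000,
    "P2": 400_000_000,
    "P3": 1_200_000_000,
}


def reads_per_sample(kit: str, n_samples: int) -> float:
    """Ideal per-sample depth, assuming an even split across samples."""
    return KIT_READS[kit] / n_samples


def max_samples(kit: str, target_depth: int) -> int:
    """Largest number of samples that still reaches target_depth each."""
    return KIT_READS[kit] // target_depth


def smallest_kit(n_samples: int, target_depth: int):
    """Smallest kit whose approximate output covers the whole experiment.

    Returns None if even the largest kit is too small for a single run.
    """
    needed = n_samples * target_depth
    for kit, reads in sorted(KIT_READS.items(), key=lambda kv: kv[1]):
        if reads >= needed:
            return kit
    return None


if __name__ == "__main__":
    # 30 samples on a P3 run -> about 40 million reads per sample
    print(reads_per_sample("P3", 30))
    # At 40 million reads per sample, a P3 run holds 30 samples
    print(max_samples("P3", 40_000_000))
    # 8 samples at 40 million reads each need 320 million reads -> P2
    print(smallest_kit(8, 40_000_000))
```

Treat the results as upper bounds. Leaving some headroom below the nominal kit output is a sensible precaution against uneven pooling.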
Single-end or paired-end?
My default answer to this question is almost always to go with single-end. I recommend this because it is MUCH cheaper, and numerous studies, like this one, have shown that when it comes to measuring gene expression, very little is gained once you go beyond about 50 bp of sequence. I recommend paired-end data if your reference genome is poor (or non-existent), or if your primary goal is to understand alternative splicing. In these cases, it is worth spending the extra money.
How many replicates should I have?
As many as you can reasonably afford. Seriously. When it comes to boosting your statistical power to detect differentially expressed genes, you are always better served by increasing group sizes than by increasing the depth of sequencing. This has been shown many times in the literature, such as here. As a general rule, the more variance you expect in your experiment, the more replicates you should try to include. This means that simple in vitro experiments with a homogeneous cell line will likely need fewer replicates than a study of primary cells derived from human subjects.
How much will it cost?
Costs vary depending on the specifics of your experiment, but are primarily driven by three factors: 1) the number of samples you plan to sequence; 2) the kind of library you will prepare from these samples; and 3) the length and depth of sequencing carried out on these samples. Don't hesitate to contact us for a quote.
How long will it take?
Once you have isolated high-quality RNA from your samples, expect to spend two days preparing mRNA or total transcriptome libraries. We QC these libraries in a few minutes using our Agilent TapeStation. Barring any issues in library prep, sequencing can begin almost immediately. Plan to devote half a day to diluting and denaturing your library, thawing the reagent pack for sequencing, and setting up the sequencer. Each run takes 16-36 hours depending on the type of sequencing being done. Putting all these steps together and allowing for some amount of troubleshooting and scheduling around other runs, we typically take 1-2 weeks to get from start to finish.
Do you analyze the data for us?
CHMI offers a semester-long course on the bioinformatics associated with RNA-seq data analysis. Assistance is also available through 1:1 consultations with Dan Beiting.
So I can just give you my samples then?
Our tiered service model affords flexibility in how we work with you. The tiers are summarized below. If you are interested in using our reagents, we recommend starting with Tier 2 while you are learning the protocol, then moving to Tier 1 for future projects once you are comfortable.
- Tier 1 – I just need the reagents; I can do the bench work myself.
- Tier 2 – I need reagents and some hands-on help.
- Tier 3 – I need reagents and would like you to do all the bench work.