This 2-day workshop is hosted by Dan Beiting and is based on DIYtranscriptomics, the online course he runs each year. The course website will serve as the source for lecture videos, code, data and other resources for this workshop. This is a fast-paced workshop best suited to individuals with experience working in R and using the RStudio IDE. Experience with the Tidyverse suite of R packages is also recommended. The workshop page below is meant to serve as a high-level overview as we move through the material.
Workshop schedule
Day 1
Time | Description | Topics covered | Video lectures | Comments |
---|---|---|---|---|
9:00 - 9:30 | Grab some coffee and get settled in | None | | |
9:30 - 11:15 | Module 3 | Read mapping with Kallisto | 2 videos | You will NOT have time to do the read mapping yourself. However, these two videos explain HOW read alignment and 'pseudoalignment' work, and how you could do read mapping on your laptop with the workshop dataset on your own time. For the purposes of this workshop, you'll start your analysis using the outputs from Kallisto, which are already in your RStudio Cloud project |
11:15 - 11:30 | Coffee/stretch break | | | |
11:30 - 12:30 | Module 4 | Measuring gene expression | Part 1 video only | This lecture covers concepts that are fundamental to understanding counts, RPKM/FPKM, TPM and normalization |
12:30 - 1:30 | Break for lunch | | | |
1:30 - 2:15 | Discussion + coding start | Importing Kallisto counts directly into R/Bioconductor | | This is really the first place where you'll be working in the RStudio Cloud project. I'll join you by Zoom to guide you through launching the project and running the code in the Step 1 script to get all the data imported into your R environment. |
2:15 - 3:45 | Module 6 | Filtering and normalization | 2 videos | Start and finish the code in the Step 2 script |
3:45 - 4:00 | Coffee/stretch break | | | |
4:00 - 5:00 | Module 7 | Principal Component Analysis (PCA) | Part 1 video only | Start and finish the code in the Step 3 script |
5:00 - 6:00 | Discussion and Day 1 wrap-up | We'll pick up with the PCA result and work together to plot, explore and discuss it | | A chance to review key concepts and to have a Q&A discussion |
Day 2
Time | Description | Topics covered | Video lectures | Comments |
---|---|---|---|---|
9:00 - 9:30 | Grab some coffee and get settled in | None | | |
9:30 - 11:15 | Module 9 | Differential gene expression | Videos for parts 1, 2 and 3 only | Start and finish the code in the Step 5 script |
11:15 - 11:30 | Coffee/stretch break | | | |
11:30 - 12:30 | Module 10 | Module identification | Videos for parts 1 and 2 only | Start and finish the code in the Step 6 script |
12:30 - 1:30 | Break for lunch | | | |
1:30 - 2:15 | Discussion and code review | Review the Step 5 and 6 scripts together; discuss DEG analysis and what to do if you don't have any DEGs, or have too many; Q&A | | |
2:15 - 3:15 | Module 11 | Functional enrichment analysis | Videos for parts 1 and 2 | Begin Step 7 script |
3:15 - 3:30 | Coffee/stretch break | | | |
3:30 - 4:30 | Module 11 | Functional enrichment analysis | Videos for parts 3 and 4 | Finish Step 7 script |
4:30 - 6:00 | Discussion and wrap-up | Q&A, demo Rmarkdown report, highlight reproducibility topics (e.g. Code Ocean) | | |
Getting started
In the interest of time, we will not install R, RStudio, Bioconductor, Kallisto, or any other software on your computers. Instead, I have set up an RStudio Cloud instance that is pre-stocked with all the software, R packages, data and code that you'll need for the workshop. You should have received an email with a link that allows you to join this RStudio Cloud space. The space includes all the materials you'll need for the workshop, as well as those for the larger DIYtranscriptomics course, should you decide to pursue other content after the workshop ends. You will need the following to participate in the workshop:
- A computer
- An internet connection
- Access to the RStudio Cloud space for the workshop (join using the link you received by email)
Helpful tips
- 'Work while you watch' - Try to work in your RStudio Cloud project alongside the videos. When I'm working on a 'step' script, you should open the same script and begin working along with me.
- If you get stuck on something, don't let it derail you. Watch the video and try to understand the concepts and main ideas. You can always go back and work on the scripts later.
- Write down any questions as they come to mind. Save these for our discussion sessions each day.
Working with raw fastq files
We won't have time to do read alignments during the workshop, but newer 'pseudoalignment' software such as Kallisto and Salmon has made read mapping possible even with the modest resources available on most modern laptops. If you're interested in installing Kallisto on your own computer (not in our RStudio Cloud project) and trying read alignments, you can see our SOP for installing Kallisto here. Again, this is not part of the workshop and should only be tried on your own time. Similarly, we'll discuss the software I like to use for quality checking raw reads, and you can read more about our SOP for QC here, but we won't actually install or run these programs during the workshop.
Practice after the workshop
Practice makes perfect (or at least better!). If you enjoyed the workshop, there's a lot more to explore on the DIYtranscriptomics website. Here are a few suggestions:
- Module 3 - try installing FastQC and MultiQC for quality checking reads, then install Kallisto and align raw data
- Module 7 - explore plotting of PCA results and making interactive graphics (videos 2 and 3)
- Module 8 - learn how to access large amounts of publicly available RNAseq data that have already been mapped to human or mouse using Kallisto
- Module 9 - explore differential transcript usage (DTU) analysis (video 4)
- Module 10 - install clust and explore its usage for making 'tight' clusters from timecourse data
- Module 12 - create an Rmarkdown document from your analysis
- Module 13 - learn to use custom functions and R packages to document and share your analyses
- 'Hackdashes' - the course website will include three hackdashes, each a 2-hour coding event that challenges you to apply your skills to a different RNAseq dataset.