- Activate the Sunbeam Conda environment using the following command:

      conda activate sunbeam4.1.0

- Move to the directory where you plan to set up your run, then initialize it:

      sunbeam init analysis_sunbeam --data_fp /path/to/directory \
      --force --format {sample}_R{rp}_001.fastq.gz

  This will create a folder titled analysis_sunbeam and generate a config file and a .csv file listing the samples located in /path/to/directory. As currently written, the files in /path/to/directory must be paired .fastq.gz files named according to the pattern {sample}_R{rp}_001.fastq.gz, where {sample} is the sample name and {rp} is the read-pair number (1 or 2).
- After building your directory, you will need to make a few edits to your config file. Open the sunbeam_config.yml file in the analysis_sunbeam folder in your editor and make the following changes:
  - Optionally, under the all header, edit the output_fp parameter if you want to change the name of your analysis output folder.
  - Under the qc header, set the host_fp parameter to the location of the .fasta files for the host/contaminant genomes; all of these files should be located in the same directory.
  - Under the sbx_kraken header, set the kraken_db_fp parameter to the path of the Kraken database that you wish to use for taxonomic assignment.
- Run Sunbeam using the following command:

      sunbeam run --profile analysis_sunbeam all_decontam

  Choose which step you need to run: all_decontam can be replaced with other targets in the pipeline (e.g., all_classify, all_assembly, all_reports).
  If you are resuming from where a previous run left off, add the following to the end of the command:

      --rerun-triggers mtime --rerun-incomplete -k
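Because sunbeam init expects paired files that follow the {sample}_R{rp}_001.fastq.gz pattern, it can help to check the data directory before initializing. The following is a minimal sketch, not part of Sunbeam: check_pairs is a hypothetical helper name, and it assumes bash and the exact suffix used in the init command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Sunbeam): report every R1 file in a
# directory that has no matching R2 file under the naming pattern
# {sample}_R{rp}_001.fastq.gz.
check_pairs() {
    local data_dir="$1" missing=0 found=0 r1 r2
    for r1 in "$data_dir"/*_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        found=$((found + 1))
        # Derive the expected R2 name by swapping the read-pair suffix.
        r2="${r1%_R1_001.fastq.gz}_R2_001.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "missing mate for: $(basename "$r1")"
            missing=$((missing + 1))
        fi
    done
    echo "checked $found R1 file(s), $missing without a mate"
    # Succeed only if at least one R1 file exists and every one has a mate.
    [ "$found" -gt 0 ] && [ "$missing" -eq 0 ]
}
```

For example, check_pairs /path/to/directory returns a non-zero exit status if no R1 files are found or if any R1 file lacks its R2 partner, so it can be chained with && in front of the sunbeam init command. The sketch only looks for R1 files without a mate; it does not detect stray R2 files or files with other suffixes.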
Additional Resources
https://sunbeam.readthedocs.io/en/stable/
https://github.com/sunbeam-labs/sunbeam
https://github.com/kylebittinger/metagenomics-workshop/blob/main/bioinformatics-steps.md