Sunbeam: How to Set Up and Run

Activate the Sunbeam Conda environment using the following command:

conda activate sunbeam4.1.0

Move to the directory where you plan to set up your run, then initialize it:

sunbeam init analysis_sunbeam --data_fp /path/to/directory \
	--force --format {sample}_R{rp}_001.fastq.gz

This creates a folder named analysis_sunbeam containing a config file and a .csv file listing the samples found in /path/to/directory. As written, the command expects paired-end files in .fastq.gz format named according to the pattern {sample}_R{rp}_001.fastq.gz, where {sample} is the sample name and {rp} is the read-pair number (1 or 2).
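
Before initializing, it can help to confirm that every file in the data directory has its read-pair mate. The following is a minimal sketch, not part of Sunbeam: check_pairs is a hypothetical helper name, and it assumes the {sample}_R{rp}_001.fastq.gz naming pattern described above.

```shell
# Hypothetical helper (not a Sunbeam command): report FASTQ files that lack
# their read-pair mate. Assumes names of the form
# {sample}_R1_001.fastq.gz and {sample}_R2_001.fastq.gz.
check_pairs() {
    dir="$1"
    status=0
    for f in "$dir"/*_R[12]_001.fastq.gz; do
        [ -e "$f" ] || continue    # skip the literal pattern when nothing matches
        case "$f" in
            *_R1_001.fastq.gz) mate="${f%_R1_001.fastq.gz}_R2_001.fastq.gz" ;;
            *)                 mate="${f%_R2_001.fastq.gz}_R1_001.fastq.gz" ;;
        esac
        if [ ! -e "$mate" ]; then
            echo "missing mate for: $(basename "$f")"
            status=1
        fi
    done
    return "$status"
}
```

Calling check_pairs /path/to/directory prints one line per unpaired file and returns a non-zero status if any are found.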

After initializing the run, you will need to make a few edits to the config file. Open sunbeam_config.yml in the analysis_sunbeam folder in a text editor and make the following changes:

Optionally, under the all header, edit the output_fp parameter to change the name of the analysis output folder.
Under the qc header, set the host_fp parameter to the directory containing the .fasta files for host/contaminant genomes; all files must be located in the same directory.
Under the sbx_kraken header, set the kraken_db_fp parameter to the path of the Kraken database that you wish to use for taxonomic assignment.
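
To double-check the edits, the three parameters can be printed with their line numbers. This is a minimal sketch using grep; show_config_paths is a hypothetical helper name, and it assumes each parameter sits on its own line in the form "name: value".

```shell
# Hypothetical helper: print the lines of a Sunbeam config file that set
# output_fp, host_fp, or kraken_db_fp, prefixed with their line numbers.
show_config_paths() {
    grep -nE '^[[:space:]]*(output_fp|host_fp|kraken_db_fp):' "$1"
}
```

For the run set up above, this would be called as show_config_paths analysis_sunbeam/sunbeam_config.yml.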

Run Sunbeam using the following command:

sunbeam run --profile analysis_sunbeam all_decontam

Choose which step you need to run: all_decontam can be replaced with other steps in the pipeline (e.g., all_classify, all_assembly, all_reports).

If restarting a run that was interrupted or that already completed an earlier step, append the following options to the command:

--rerun-triggers mtime --rerun-incomplete -k
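
Put together with the run command, the restart looks like the sketch below. restart_sunbeam is a hypothetical wrapper name, and the comments describe the options as Snakemake documents them, on the assumption that sunbeam run passes them through to Snakemake.

```shell
# Hypothetical wrapper: rerun a Sunbeam step with the restart options appended.
# $1 = profile folder (e.g., analysis_sunbeam), $2 = step (e.g., all_decontam)
restart_sunbeam() {
    # --rerun-triggers mtime : rerun a job only when its inputs are newer than its outputs
    # --rerun-incomplete     : redo jobs whose outputs were left incomplete
    # -k                     : keep going with independent jobs if one job fails
    sunbeam run --profile "$1" "$2" --rerun-triggers mtime --rerun-incomplete -k
}
```

For example, restart_sunbeam analysis_sunbeam all_decontam resumes the decontamination step for the run initialized above.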

Additional Resources

https://sunbeam.readthedocs.io/en/stable/

https://github.com/sunbeam-labs/sunbeam

https://github.com/kylebittinger/metagenomics-workshop/blob/main/bioinformatics-steps.md