Sunbeam: How to Set Up and Run

  1. Activate the Sunbeam Conda environment:

    conda activate sunbeam4.1.0

  2. Move to the directory where you plan to set up your run, then initialize the project:

    sunbeam init analysis_sunbeam --data_fp /path/to/directory \
        --force --format {sample}_R{rp}_001.fastq.gz

    This creates a folder named analysis_sunbeam containing a config file and a .csv file listing the samples found in /path/to/directory. As written, the command expects the files in /path/to/directory to be paired, gzipped FASTQ files named {sample}_R{rp}_001.fastq.gz, where {sample} is the sample name and {rp} is the read number within the pair (1 or 2).

  3. After initializing the project, edit the config file. Open sunbeam_config.yml in the analysis_sunbeam folder and make the following changes:
    • Optionally, under the all header, edit the output_fp parameter to change the name of the analysis output folder.
    • Under the qc header, set the host_fp parameter to the directory containing the .fasta files for the host/contaminant genomes. All of these files must be in the same directory.
    • Under the sbx_kraken header, set the kraken_db_fp parameter to the path of the Kraken database you want to use for taxonomic assignment.
  4. Run Sunbeam:

    sunbeam run --profile analysis_sunbeam all_decontam

    Choose the step you need to run: all_decontam can be replaced with another step in the pipeline (e.g., all_classify, all_assembly, all_reports).

    If you are restarting a run from a previous step, append the following options to the command:

    --rerun-triggers mtime --rerun-incomplete -k
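
    Before launching a run, it can help to sanity-check the input naming and the config edits described above. The following is a minimal sketch, not part of Sunbeam: check_pairs and show_config_paths are hypothetical helper names introduced here, and the config check is a plain line-based grep that assumes the parameter names given in this guide (output_fp, host_fp, kraken_db_fp) rather than a real YAML parser.

```shell
#!/usr/bin/env bash
# Pre-flight sketch. check_pairs and show_config_paths are hypothetical
# helpers written for this guide; they are not Sunbeam commands.

# Report any *_R1_001.fastq.gz or *_R2_001.fastq.gz file whose mate is missing.
# Returns 0 only if at least one file was found and every file has a mate.
check_pairs() {
    local data_dir="$1"
    local f mate found=0 status=0
    for f in "$data_dir"/*_R[12]_001.fastq.gz; do
        [ -e "$f" ] || continue   # an unmatched glob stays literal; skip it
        found=1
        case "$f" in
            *_R1_001.fastq.gz) mate="${f%_R1_001.fastq.gz}_R2_001.fastq.gz" ;;
            *)                 mate="${f%_R2_001.fastq.gz}_R1_001.fastq.gz" ;;
        esac
        if [ ! -e "$mate" ]; then
            echo "missing mate for $(basename "$f")"
            status=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no files matching the naming scheme in $data_dir"
        return 1
    fi
    return "$status"
}

# Print (with line numbers) the config lines that set the three parameters
# this guide asks you to review.
show_config_paths() {
    grep -nE '^[[:space:]]*(output_fp|host_fp|kraken_db_fp):' "$1"
}

# Example usage (paths are the placeholders used in this guide):
#   check_pairs /path/to/directory
#   show_config_paths analysis_sunbeam/sunbeam_config.yml
```

    Both helpers only read files, so they are safe to run at any point before or after sunbeam init. Files that do not follow the {sample}_R{rp}_001.fastq.gz scheme are ignored by check_pairs rather than reported.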

Additional Resources

https://sunbeam.readthedocs.io/en/stable/
https://github.com/sunbeam-labs/sunbeam
https://github.com/kylebittinger/metagenomics-workshop/blob/main/bioinformatics-steps.md