Bash basics

This document accompanies the first lab exercise from the DIYtranscriptomics course.

Helpful command line tips

If you're new to Bash, there are lots of online resources for learning, but here are a few of the commands that will help you move around and carry out basic tasks. Note that some of these commands may only work when run with sudo.

Note: on Windows, you will need to install Git for Windows and add C:\Program Files\Git\usr\bin to your system PATH variable for all of the common Bash commands to work.

Common bash commands

typing this (if you're on a Mac)             or this (if you run Windows)             does this
cd /                                         cd /                                     takes you to the root directory
cd ~                                         no direct shortcut                       takes you to your home directory
cd ..                                        cd ..                                    takes you up one level in your file directory
cd ../../                                    cd ../../                                takes you up two levels in your file directory
cd path/to/some/folder                       cd path/to/some/folder                   takes you to some folder on your computer
tar -xvzf [fileName.tar.gz]                  tar -xvzf [fileName.tar.gz]              extracts a gzip-compressed tar archive (.tar.gz)
gzip [filename]                              gzip [filename]                          compresses a file to filename.gz
ls and ls -l and ls -a                       ls and ls -l and ls -a or dir            lists files and folders in your working directory; -l adds details such as permissions, -a also shows hidden files
ls -l | wc -l                                ls -l | wc -l                            counts entries in a directory (the result is one too high, because ls -l also prints a "total" line)
mv [fileName or folderName] [directory]      mv [fileName or folderName] [directory]  moves a file or folder to a new location. Important: if the new location doesn't exist, mv renames your file to the destination name
du -a -h | sort -hr                          no direct shortcut                       lists all files and folders in your working directory, sorted by size (largest first)
du -sh *                                     du -sh *                                 simpler version of the command above; lists the files and folders in a directory with their sizes
df -h                                        df -h                                    shows free/used disk space by drive
tree -d                                      tree -d                                  lists the folders (directories only) in your working directory as a tree structure
lsblk                                        no direct shortcut                       lists drives and partitions with their sizes (Linux; use df -h for used/free space)
pressing up arrow                            pressing up arrow                        recalls the previous command
chmod ### [fileName]                         chmod ### [fileName]                     edits permissions on a file. See graphic below for the appropriate numbers to use in place of ###
chown [yourUserName] [fileName]              chown [yourUserName] [fileName]          makes you the owner of a file
chgrp [yourUserName] [fileName]              chgrp [yourUserName] [fileName]          assigns you as the group for the file
rm -rf [directoryName]                       rm -rf [directoryName]                   removes a folder and all of its contents (permanently; there is no undo)
wget [URLtoFile]                             no direct shortcut                       downloads a file from the web
nano [file.txt]                              nano [file.txt]                          opens a text file in a text editor directly in your Bash application
export PATH="/path/to/your/software/:$PATH"  no direct shortcut                       adds a piece of software to the system PATH so it is executable from anywhere
alias something="something else"             doskey something=something else          add lines like this to your ~/.bash_profile to create a command shortcut; in this case, typing 'something' actually runs 'something else'
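
To make a few of these commands concrete, here is a minimal sketch that runs entirely inside a throwaway directory. The file and folder names (sample1.txt, sample2.txt, results, renamed.txt) are made up for illustration, and the sketch assumes mktemp, gzip and tar are available, as they are on most Mac and Linux systems.

```shell
# Work in a throwaway directory so nothing important is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create two empty files and a folder (hypothetical names for this demo).
touch sample1.txt sample2.txt
mkdir results

# Destination is an existing folder: mv moves the file into it.
mv sample1.txt results

# Destination does not exist: mv renames the file instead.
mv sample2.txt renamed.txt

# Count entries with plain ls, which avoids the extra "total" line of ls -l.
n_entries=$(ls | wc -l)
echo "entries in scratch directory: $n_entries"

# Compress a single file, then bundle and compress the folder.
gzip renamed.txt                    # produces renamed.txt.gz
tar -czf results.tar.gz results     # -c create, -z gzip, -f archive name
tar -tzf results.tar.gz             # -t lists the archive contents
```

Because everything lives under the directory created by mktemp, the whole experiment can be discarded afterwards with rm -rf on that directory.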

Understanding file permissions

If you're trying to do something to a file or folder in Bash (e.g., delete, move, edit, create) and are unable to, chances are you need to modify the permissions (chmod), the owner (chown) or the group (chgrp). See below and read more about permissions here.

[Image: chart of the numeric permission values to use with chmod]
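
As a rough illustration of how the three chmod digits work, the sketch below sets two common permission patterns on a temporary file and reads them back. Each digit is a sum of read (4), write (2) and execute (1), applied to the owner, the group, and everyone else, in that order. The variable names are made up for this example.

```shell
# Each digit is a sum: read = 4, write = 2, execute = 1.
# The three digits apply to owner, group, and everyone else, in that order.
demo_file=$(mktemp)          # throwaway file; mktemp picks the name

chmod 750 "$demo_file"       # owner 7 = 4+2+1, group 5 = 4+1, others 0
perms_750=$(ls -l "$demo_file" | cut -c1-10)
echo "$perms_750"            # -rwxr-x---

chmod 644 "$demo_file"       # owner 6 = 4+2, group 4, others 4
perms_644=$(ls -l "$demo_file" | cut -c1-10)
echo "$perms_644"            # -rw-r--r--

rm -f "$demo_file"
```

The first character printed by ls -l is the file type (- for a regular file); the next nine are the read/write/execute flags for owner, group and others.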

Getting to know Bash parameters

Create your first parameter.

dna="ATCG"

Use the echo program to see what you created.

echo $dna 
echo ${dna} are my favorite bases

Try closing and relaunching your Bash application window. Notice that your 'dna' variable no longer exists, because it is a 'shell variable', which means it is contained exclusively within the shell in which it was set or defined. You can read more about variable types here.
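
One way to see the difference between a shell variable and an exported (environment) variable without closing your window is to start a child shell with bash -c. This is only a sketch of the idea, and a child shell is a stand-in rather than the same thing as relaunching the application: a brand-new window will not see either kind of variable unless it is defined in a startup file such as ~/.bash_profile.

```shell
unset dna                                   # start clean in case dna was already exported
dna="ATCG"                                  # plain shell variable

# A child shell does not inherit it, so the fallback text is printed.
before=$(bash -c 'echo "${dna:-not set}"')
echo "child shell before export: $before"

export dna                                  # now it is an environment variable
after=$(bash -c 'echo "${dna:-not set}"')
echo "child shell after export: $after"
```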

Before moving on, go ahead and re-create your dna variable.

We forgot about poor uracil...what kind of RNA-seq class is this?! Let's add it to our variable now.

echo $dnaU #note that this didn't work!

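It failed because Bash looked for a variable named dnaU, which doesn't exist. We can fix this by taking advantage of parameter expansion using curly brackets {}, which tell Bash exactly where the variable name ends.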

echo ${dna}U
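
To see both behaviours side by side, the short sketch below stores each result in its own variable (no_braces and with_braces are names invented for this example) and prints them inside square brackets so that an empty value is visible.

```shell
dna="ATCG"
unset dnaU                 # make sure no variable called dnaU exists

no_braces="$dnaU"          # Bash looks up a variable named "dnaU": empty
with_braces="${dna}U"      # braces end the name at "dna", then U is appended

echo "without braces: [$no_braces]"
echo "with braces:    [$with_braces]"
```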

Getting to know 'for loops'

For loops allow you to iterate over a list of items (in our case, files in a folder) and carry out any number of actions on those files. Here's the general format of a for loop.

# don't try running this code
for item in [LIST]
do
  [COMMANDS]
done

Now we'll create a simple for loop that actually runs.

for i in TACG CTAG CCTC GAAT
do 
	echo "$i is a DNA oligonucleotide"
done

Let's apply the concept of parameter expansion to our new loop in order to easily find/replace 'T's with 'U's. Two things to take note of here. First, we're declaring a new parameter within the loop, and it is reassigned on every pass (note that in Bash it still exists after the loop finishes, holding its last value). Second, we have taken advantage of some handy syntax that lets us find and replace parts of our original parameter: the // marks the text to find (every occurrence), and the following / marks what to replace it with.

for i in TACG CTAG CCTC GAAT
do 
	FIX=${i//T/U}
	echo "$FIX is an RNA oligonucleotide"
done
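
The find/replace syntax has a few close relatives that are worth knowing. The sketch below, using a made-up sequence and a hypothetical file name, contrasts a single slash (first match only) with a double slash (every match), and shows the % form, which strips a matching suffix from the end of a value.

```shell
oligo="TTACGT"

first_only=${oligo/T/U}      # single slash: replace the first match only
all_matches=${oligo//T/U}    # double slash: replace every match

file="sample1.fastq.gz"      # hypothetical file name
stem=${file%.fastq.gz}       # % strips a matching suffix from the end

echo "$first_only"           # UTACGT
echo "$all_matches"          # UUACGU
echo "${stem}_mapped"        # sample1_mapped
```

For file extensions the % form is often the safer choice, since it only ever matches at the end of the name.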

Now we put this all together to make creating a shell script really simple. As written, the loop below only prints the fastqc and kallisto commands for each file rather than running them. What do you think would happen if you removed echo and the " " enclosing the fastqc and kallisto commands?

for FASTQ in *fastq*
do
	OUT=${FASTQ//.fastq.gz/_mapped}
	LOG=${FASTQ//.fastq.gz/_mapped.log}
	echo "fastqc $FASTQ"
	echo "kallisto quant -i Homo_sapiens.GRCh38.cdna.all.index -o $OUT --single -l 250 -s 30 $FASTQ &> $LOG"
done
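
One way to turn printed commands into an actual script is to redirect the output of the whole loop into a file. The sketch below does this in a temporary directory with two empty stand-in files; the file names and the script name run_all.sh are hypothetical, and neither fastqc nor kallisto is actually executed here.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch sampleA.fastq.gz sampleB.fastq.gz     # empty stand-in files

# Everything echoed inside the loop is redirected into run_all.sh.
for FASTQ in *.fastq.gz
do
	OUT=${FASTQ//.fastq.gz/_mapped}
	echo "fastqc $FASTQ"
	echo "kallisto quant -i Homo_sapiens.GRCh38.cdna.all.index -o $OUT --single -l 250 -s 30 $FASTQ"
done > run_all.sh

cat run_all.sh                              # inspect before running anything
n_lines=$(wc -l < run_all.sh)
```

Reviewing the generated file first and only then running it (for example with bash run_all.sh) gives you a chance to catch mistakes in file names or options before any long job starts.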

For loops can be combined with conditional statements (if/then) to act on only certain files in a directory.

for FASTQ in *fastq*
do
  if [[ $FASTQ == *subsample* ]]
  then
    echo "$FASTQ was mapped"
  else
    echo "$FASTQ was not processed"
  fi
done
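
A common variation on this pattern is to skip unwanted files early with continue, so the rest of the loop body only deals with the files you care about. The sketch below assumes three made-up file names in a temporary directory and simply counts the matches instead of running any real analysis.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch liver_subsample.fastq.gz liver_full.fastq.gz brain_subsample.fastq.gz

n_selected=0
for FASTQ in *fastq*
do
	# Skip anything that does not have "subsample" in its name.
	if [[ $FASTQ != *subsample* ]]
	then
		continue
	fi
	n_selected=$((n_selected + 1))
	echo "would process $FASTQ"
done
echo "$n_selected files selected"
```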