This document accompanies the Git lab from the DIYtranscriptomics course. The chocolate muffin recipe file that is used as an example in this tutorial can be downloaded from the lab page.
- Set up Visual Studio Code to work with Git and GitHub
- Connecting our local Git repo with GitHub
- Time to make some muffins!
- Understanding .gitignore files
- No more muffins!
- Describing your repo
- Cloning repos (a.k.a. future-proofing your work)
- Tips and tricks for getting the most from Git
Set up Visual Studio Code to work with Git and GitHub
There are many ways to work with Git and GitHub, including the command line as well as third-party point-and-click applications. I think that VS Code is probably one of the easiest ways to interact with both Git and GitHub, and it eliminates the need to remember command-line functions.
- To begin, open up VS Code and click on the Git icon on the left-hand toolbar.
- You may see that VS Code is telling you that you need to download and install Git in order to use this feature. If so, go ahead and do that now.
- Click on the extension tab of your toolbar and search for "GitHub Repositories" in the search bar. Click on this extension and choose "install".
- Start a new folder on your computer. Open this directory in VS Code (File > Open Folder) and click on the Git tab of VS Code again. You should now see an option to "Initialize Repository". Choose that option.
- After we've initialized version control in our folder, let's take a look at the contents of that folder in our bash application. Navigate to your folder using Terminal, and type ls -a to view all files, including hidden files. Notice that you can now see a hidden folder, .git. This is your local Git database that was created when you clicked "Initialize Repository" above.
- Now let's drag and drop our chocolate muffin recipe file into our working directory. What do you see in VS Code? This is a great opportunity to make our first "commit", talk about what is happening under the hood, and discuss the purpose of staging and committing changes. Note that none of these operations require a connection to GitHub... it's all happening locally and can continue that way as long as you want.
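If you're curious what VS Code is doing under the hood, the clicks above correspond roughly to the Git commands below. This is only a sketch for illustration, not a required part of the lab: the throwaway folder, the file name muffins.txt, and the name and email are all placeholders I made up.

```shell
# Work in a throwaway folder so nothing on your computer is touched
# (the folder name muffin-demo is a placeholder).
demo_dir="$(mktemp -d)/muffin-demo"
mkdir -p "$demo_dir"
cd "$demo_dir"

# Same idea as clicking "Initialize Repository": creates the hidden .git folder.
git init

# Identify yourself for this repo only (placeholder values).
git config user.name "Your Name"
git config user.email "you@example.com"

ls -a                     # .git now appears in the listing

# Stand-in for the recipe file (hypothetical name and contents).
printf 'Chocolate muffins\n' > muffins.txt

git status --short        # the new file is listed as untracked ("??")
git add muffins.txt       # stage: choose what goes into the next commit
git commit -m "Add original chocolate muffin recipe"
git log --oneline         # one commit, stored locally in .git
```

Staging (git add) is where you pick which changes belong together, and committing (git commit) records that set in your local history. Nothing here talks to GitHub.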
Connecting our local Git repo with GitHub
- Right now, we have a version-controlled working directory that is not connected to GitHub. Git operates independently of GitHub, but it's incredibly useful to connect our local Git version database to GitHub so we can store our version history on the web. Let's revisit our Git tab in VS Code and choose "Publish to GitHub". If you don't see the large blue button at the top of VS Code, you should look for the small cloud icon with an up arrow at the bottom of VS Code (see screenshot below). Both accomplish the same thing. If you've never used VS Code to access GitHub, you're going to be asked to authenticate. You'll only need to do this once. Just say yes/OK to any and all questions during this authentication. Once you've authenticated, you will be asked if you'd like to publish to GitHub as a public or private repo (your choice!).
- Now that our local repo is connected to a remote repo on GitHub, we can continue to work on our files locally, commit changes locally, and then at any point decide when we want to push these changes up to GitHub. Awesome!
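For the command-line curious, connecting a local repo to a remote boils down to registering the remote and pushing to it. The sketch below runs offline by using a local "bare" repository as a stand-in for GitHub, which is my assumption for demonstration purposes. With a real GitHub repo, the remote would be a URL of the form https://github.com/<user>/<repo>.git. File names and identity are placeholders.

```shell
work="$(mktemp -d)"

# Stand-in for an empty GitHub repo: a local bare repository.
git init --bare "$work/remote.git"

# A small local repo with one commit (placeholder identity and file).
mkdir "$work/local"
cd "$work/local"
git init
git config user.name "Your Name"
git config user.email "you@example.com"
printf 'Chocolate muffins\n' > muffins.txt
git add muffins.txt
git commit -m "Add original chocolate muffin recipe"
git branch -M main        # make sure the branch is called main

# Register the remote under the conventional name "origin", then push.
git remote add origin "$work/remote.git"
git push -u origin main   # -u remembers origin/main as the upstream

git remote -v             # shows where fetch and push go
git status -sb            # first line shows main tracking origin/main
```

After the first push with -u, a plain git push or git pull on main knows which remote branch to use.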
Time to make some muffins!
- For the remainder of the lab, let's use our chocolate muffin recipe to get more comfortable using VS Code to work with Git and GitHub. I suggest you use branches in Git to make three separate changes to our muffin recipe (each change that we'll be making is documented at the bottom of the recipe, but feel free to make your own changes). At the end of the demo, you will have your original recipe on the "main" development branch, and each of the three recipe changes living on its own named branch.
- To create a new branch in VS Code, simply click on the "main" branch icon at the bottom of your screen, which will reveal a dropdown menu at the top of VS Code. Choose "+ Create new branch" and give your branch an informative name. From now on you can toggle between main and any branch by revisiting this button at the bottom of your screen.
- After you've made a few branches and committed some changes locally, try pushing your changes up to GitHub. This is as easy as clicking on the "Sync Changes" button at the top, or selecting "sync" from the source control menu, or clicking the circular arrow at the bottom... lots of options (see screenshot below)! Congratulations, you're now familiar with 95% of the Git/GitHub functionality that you'll ever need for your research!
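Here is a rough command-line sketch of the branching workflow described above, in case it helps to see what toggling between branches actually does to a file. The branch name less-sugar and the recipe contents are made up for illustration, and only one of the three changes is shown.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init
git config user.name "Your Name"          # placeholder identity
git config user.email "you@example.com"

# Original recipe on main (hypothetical contents).
printf 'Chocolate muffins\nsugar: 1 cup\n' > muffins.txt
git add muffins.txt
git commit -m "Add original chocolate muffin recipe"
git branch -M main

# Same idea as "+ Create new branch": make a branch and switch to it.
git checkout -b less-sugar
printf 'Chocolate muffins\nsugar: 3/4 cup\n' > muffins.txt
git commit -am "Reduce sugar so the chocolate stands out"

# Toggle back to main: the file returns to the original recipe.
git checkout main
cat muffins.txt

git branch                # lists less-sugar and main, with * on the current one
```

With a remote configured, git push -u origin less-sugar would publish the branch, which is roughly what "Sync Changes" does for you.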
Understanding .gitignore files
- Everything we've discussed above dealt with simple text files (e.g. our recipe). One critical thing to understand is that Git and GitHub are not intended for large files! For example, large raw data files (e.g., fastq files) should never be tracked with Git because they are raw data and should never be changing. If a file doesn't or shouldn't change over time, then we don't need to keep a version history of it! Even if we did accidentally start tracking a large file, we would not be able to push this version history up to GitHub because GitHub blocks files that are larger than 100 MB. We could simply leave large files untracked, but it's easy to accidentally commit or push them. To avoid this, we will use a special hidden file that Git can read, called a .gitignore.
- A .gitignore file tells your local Git program which files it should always leave untracked. If a file isn't tracked, then it has no version history stored in that hidden .git database on your computer, and therefore cannot be pushed up to GitHub.
- Use the gitignore.txt file I provided in the section above. Move this file to your test working directory with the chocolate recipe file and rename it from gitignore.txt to .gitignore.
- With the .gitignore file in place we can now test it out. Drag and drop a file type that is included in your gitignore (e.g. one of the small fastq files from the last lab) into your working directory. This file will appear in your working directory (because you just put it there) but is not visible in VS Code's file browser... because you told it to ignore this file type.
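To see the same behavior from the terminal, here is a minimal sketch. The two patterns in this .gitignore are my own assumption for the example, since the course's gitignore.txt may list different ones, and the file names are placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init

# A tiny .gitignore: one pattern per line (hypothetical contents).
printf '*.fastq\n*.fastq.gz\n' > .gitignore

# Stand-ins for a recipe and a raw data file.
printf 'Chocolate muffins\n' > muffins.txt
printf '@read1\nACGT\n+\nIIII\n' > sample.fastq

git status --short                  # lists .gitignore and muffins.txt, not sample.fastq
git check-ignore -v sample.fastq    # reports which pattern matched
git status --short --ignored        # ignored files show up marked "!!"
```

git check-ignore is handy when a file is unexpectedly missing from (or present in) your list of changes, because it tells you which line of which .gitignore is responsible.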
No more muffins!
- The recipe example proved useful to understand some of the key concepts behind Git and the relationship with GitHub, but now it's time to apply this to our course directory (you can delete your test directory and the associated GitHub repo whenever you'd like).
- Take some time to organize your course working directory... I'll show you what mine looks like.
- Download another copy of my example gitignore.txt file, add it to this directory, and rename it to .gitignore using the terminal (VS Code might not let you do this).
- Now use what you learned from working with the muffin recipe to initialize a repo for your working directory, stage and commit all the files, and push everything up to GitHub. Confirm that you now have a repo (I don't care if it is public or private) that mirrors what you have on your laptop.
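The terminal rename mentioned above is a one-liner. The sketch below also adds a quick check for files over 100 MB before staging, which is my own suggestion rather than a step from the lab. The temporary folder and the contents of gitignore.txt are stand-ins.

```shell
course_dir="$(mktemp -d)"             # stand-in for your course working directory
cd "$course_dir"
printf '*.fastq\n' > gitignore.txt    # stand-in for the downloaded file

# Rename in the terminal; the leading dot makes the file hidden.
mv gitignore.txt .gitignore
ls -a

git init

# List any files over 100 MB (outside .git) so you can confirm
# they are covered by .gitignore before you stage anything.
large_files="$(find . -type f -size +100M ! -path './.git/*')"
echo "Files over 100 MB: ${large_files:-none}"

git add -A                # stage everything that is not ignored
git status --short        # review what is about to be committed
```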
Describing your repo
No matter what you're working on, it's critical that you take the time to write a thorough README. All GitHub repos include a README text file. When you published your local repo to GitHub for the first time, GitHub took the liberty of creating a blank README. It's fine to leave this blank when your repo is just getting started, but eventually you'll want to use this file to explain to others what the repo contains and how one might use this code moving forward. You can edit this file locally (after pulling from GitHub) or directly on GitHub. The more details you can provide, the better.
- Download the README.md file from today's lab page.
- Drop this into your active working directory for the course.
- Stage, commit, and push this change to the GitHub repo.
- Check out your repo on GitHub to see how this file appears. You can now edit this file either locally, or on GitHub. If you choose to do the latter, you'll want to "pull" your changes down to your local computer at some point.
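To make the "pull" step concrete, here is a sketch that runs entirely offline. It assumes a local bare repository as a stand-in for GitHub, and it simulates an edit made on the GitHub website by committing from a second clone. Folder names, identity, and README contents are all placeholders.

```shell
work="$(mktemp -d)"
git init --bare "$work/remote.git"    # stand-in for the GitHub repo

# Your laptop copy, published with an empty README.
git init "$work/laptop"
cd "$work/laptop"
git config user.name "Your Name"
git config user.email "you@example.com"
: > README.md
git add README.md
git commit -m "Add empty README"
git branch -M main
git remote add origin "$work/remote.git"
git push -u origin main

# Simulate editing README.md on GitHub: a second clone commits and pushes.
git clone --branch main "$work/remote.git" "$work/web-edit"
cd "$work/web-edit"
git config user.name "Your Name"
git config user.email "you@example.com"
printf '# Course repo\nScripts and notes.\n' > README.md
git commit -am "Describe what the repo contains"
git push origin main

# Back on the laptop, pull to bring the edit down.
cd "$work/laptop"
git pull --ff-only
cat README.md             # now shows the text written "on GitHub"
```

The --ff-only flag is a cautious choice: the pull succeeds only if your local branch can simply be moved forward, and stops with a message if your local history has diverged.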
Cloning repos (a.k.a. future-proofing your work)
At some point you'll want to either work with someone else's code, or revisit your own code from a previous project. To do this, we'll explore the process of "cloning" a repo from GitHub.
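As a rough illustration, cloning from the command line is a single git clone command. So that the sketch runs without a network connection, it first builds a small local stand-in for a GitHub repo, which is an assumption of this example. With GitHub you would skip that part and pass a URL of the form https://github.com/<user>/<repo>.git instead. The folder name muffins-copy is a placeholder.

```shell
work="$(mktemp -d)"

# Build a small stand-in for a GitHub repo (skip this with a real one).
git init "$work/seed"
cd "$work/seed"
git config user.name "Your Name"          # placeholder identity
git config user.email "you@example.com"
printf '# Muffin recipes\n' > README.md
git add README.md
git commit -m "Add README"
git clone --bare "$work/seed" "$work/remote.git"

# The actual cloning step: copies the files and the full version history.
cd "$work"
git clone "$work/remote.git" muffins-copy
cd muffins-copy

git log --oneline         # the history came along with the files
git remote -v             # origin points back at the repo you cloned from
```

Because the clone already has origin configured, you can pull later updates without any extra setup.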
Tips and tricks for getting the most from Git
- Make meaningful commit messages (i.e. not just "I updated some stuff"). Good commit messages say why you made a change, not what change was made (anyone can figure that out from just diff'ing a file).
- Smaller commits are better than bigger ones. My developer friends like to say that they have never seen a commit that was too small, but frequently see ones that are too big. Big commits are much more likely to have multiple problems that need to be separated out.
- You don't necessarily need to use branches. Different people have different styles of working with Git. If you're the only person working on a repo, for example, then you could consider never creating a branch and just committing all your changes to main. You'll never lose anything, because it's all version controlled and you can always return to any previous state of a document. On the other hand, if you're working on a repo with other people, you will definitely want to work on a branch to avoid problems.
- Merged branches should be deleted. If you ever get to the point where you merge a branch with main, then you'll want to delete the branch afterward so no one (including your future self) mistakenly thinks that the branch is still under active development.
- Always begin your work with a "Pull". If you're collaborating on a shared GitHub repo, you need to start your work with the most current version of the code, so it is best practice to always begin by pulling the repo from GitHub to ensure that you're up to date.
- The entire DIYtranscriptomics.com website is just a bunch of text files stored in a GitHub repo. When I update the course, I simply edit the relevant files in VS Code and push these changes up to GitHub and the site automatically updates.
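A few of these tips can be sketched in one short terminal session: a small commit with a message that says why, followed by merging a branch into main and deleting it. The branch name, recipe contents, and commit messages are invented for the example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init
git config user.name "Your Name"          # placeholder identity
git config user.email "you@example.com"
printf 'sugar: 1 cup\n' > muffins.txt
git add muffins.txt
git commit -m "Add original chocolate muffin recipe"
git branch -M main

# One small, focused change on its own branch.
git checkout -b less-sugar
printf 'sugar: 3/4 cup\n' > muffins.txt
git --no-pager diff       # the diff already shows WHAT changed...
git commit -am "Reduce sugar: testers found the muffins too sweet"   # ...so say WHY

# Merge into main, then delete the merged branch.
git checkout main
git merge less-sugar
git branch -d less-sugar  # -d refuses to delete a branch that is not merged
git branch                # only main remains
```

Using -d rather than -D is a deliberate safety net: Git will stop you from deleting a branch whose work has not been merged yet.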