Version control systems allow programmers to keep track of changes to their programs. They work by incrementally saving backups of source files, which lets us recover lost files, look back at earlier versions, and undo changes.
Git is the most popular version control program.
When you use Git, you create a repository for your project, which contains all of the files as well as their history. As you work, you commit your changes, which saves the current state of the project.
Before you use Git, it needs to know who you are. Run these commands to tell it, replacing the arguments with your actual name and email:
finlaysoni@myvm:~$ git config --global user.name "Your Name"
finlaysoni@myvm:~$ git config --global user.email "email@example.com"
Git records this information in every commit you make, which is especially useful when several people work on the same project.
You also should tell Git which editor you want to use for writing commit messages:
finlaysoni@myvm:~$ git config --global core.editor vim
If you like, you can also tell Git to use vimdiff for displaying differences between files and for merging multiple copies of files. We will see how to use this later on, but you can set it up now as you're configuring Git:
finlaysoni@myvm:~$ git config --global diff.tool vimdiff
finlaysoni@myvm:~$ git config --global merge.tool vimdiff
By default, Git also prompts you before opening each file in the diff tool, which I find annoying. To turn the prompt off, use the following setting:
finlaysoni@myvm:~$ git config --global difftool.prompt false
These commands store their settings in a file in your home directory called ".gitconfig". You can edit this file directly if you wish. Note that this is one of those "hidden files" whose names start with a ".".
Git works by creating a repository to store your code changes. A repository is created in a directory by entering the command:
finlaysoni@myvm:~$ git init
This will create a hidden sub-directory called ".git". This directory contains the repository (or "repo" for short), in which Git will store the current state and history of the files that comprise your project.
I'd really recommend creating a Git repository for each project that you work on. Even a few hours of work is too much to lose. When starting a new project, first create a directory for it, then go into the directory and create a Git repository:
finlaysoni@myvm:~$ mkdir project1
finlaysoni@myvm:~$ cd project1
finlaysoni@myvm:project1$ git init
Initialized empty Git repository in /home/finlaysoni/project1/.git/
Note that a directory can only contain one Git repository. This is one reason to make a directory for each project you work on.
Git does not track your changes automatically. You need to tell it which files to keep track of. Suppose we have three files, "input.py", "output.py", and "main.py".
To add these files to the repository, we'll use the git add command:
finlaysoni@myvm:project1$ git add input.py main.py output.py
Generally, everything you create yourself (code, input files, tests etc.) should be added. Compiled files (such as Java .class files) should not be added as they can be easily re-created even if they are lost.
Once we have added files, we can commit changes:
finlaysoni@myvm:project1$ git commit -a
The "-a" flag tells Git to commit all tracked files that have been changed since the last commit. Each time you run git commit, it creates a checkpoint of your work.
When you run this command, it will launch Vim for you to write a commit message which should describe what changes were made for your own reference. Write the message at the top of the file, and then save and quit Vim.
If you quit Vim without writing a commit message, Git will abort the commit.
You should perform a git commit every time you want to checkpoint your work.
Each commit that you make creates a revision of your project. A revision is the state of your project files at a point in time. We can see the revision history with the git log command:
finlaysoni@myvm:project1$ git log
commit 9cf99039f4d1a4de2acda4dc2dea80f0d8389f08
Author: Ian Finlayson <firstname.lastname@example.org>
Date:   Tue Jun 11 13:00:05 2018 -0400
    Add error checking of user input.
Because this project only has one revision so far, that is all that is displayed. The string "9cf99039f4d1a4de2acda4dc2dea80f0d8389f08" after "commit" is a hash, which is how Git identifies individual revisions. A hash is a fixed-size number computed from some data, in such a way that different data almost always gives a different number. Git runs the contents of the commit (the state of the files, the author, the date, the message, and the hash of the previous commit) through a hash function called SHA-1, which produces a 160-bit number that is written as 40 base-16 digits.
The log also shows the author, date of the commit, and the commit message.
If we add some more revisions, those are shown under git log as well:
finlaysoni@myvm:project1$ git log
commit 18325e137dd43af4857c1501cb2df849ec0a73e3
Author: Ian Finlayson <email@example.com>
Date:   Tue Jun 11 13:15:02 2018 -0400
    Fix bug in sorting feature.

commit e18012e8cacc16044f20a9c0b3d7636b91c31cba
Author: Ian Finlayson <firstname.lastname@example.org>
Date:   Tue Jun 11 13:14:28 2018 -0400
    Add sorting feature for output.

commit 9cf99039f4d1a4de2acda4dc2dea80f0d8389f08
Author: Ian Finlayson <email@example.com>
Date:   Tue Jun 11 13:00:05 2018 -0400
    Add error checking of user input.
The oldest revisions are shown at the bottom, and the newest are on top. Note that these are not very good commit messages, but are just put in as an example. A good commit message is more detailed and lets you know what the change accomplishes.
We can ask Git for the differences between any two revisions with git diff:
finlaysoni@myvm:project1$ git diff 18325e137dd43af4857c1501cb2df849ec0a73e3 e18012e8cacc16044f20a9c0b3d7636b91c31cba
diff --git a/input.py b/input.py
index 7756d0a..5d7c164 100644
--- a/input.py
+++ b/input.py
@@ -1,5 +1,4 @@
-# get a number and return it
 def get_input():
     return int(input("Enter a number N: "))
diff --git a/output.py b/output.py
index 8379835..021ac16 100644
--- a/output.py
+++ b/output.py
@@ -1,5 +1,4 @@
-# show the output
 def show_output(value):
     print("The value is:", value)
The output of git diff can be hard to read. We will talk about comparing files with diff tools later on in this course.
Using the full hashes is actually not necessary. We only need to include the first four characters (or more if that would be ambiguous):
finlaysoni@myvm:project1$ git diff 1832 e180
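Shortened hashes work because Git only needs enough leading characters to pick out exactly one revision. The Python sketch below illustrates that rule using the three hashes from the log above. The names resolve_prefix and HISTORY are hypothetical, and the real lookup inside Git is more involved than this.

```python
from typing import Sequence


def resolve_prefix(prefix: str, hashes: Sequence[str]) -> str:
    """Return the one full hash that starts with the given prefix.

    Raises ValueError if the prefix is shorter than four characters,
    matches no hash, or matches more than one hash.
    """
    if len(prefix) < 4:
        raise ValueError("prefix must be at least 4 characters")
    matches = [h for h in hashes if h.startswith(prefix.lower())]
    if not matches:
        raise ValueError("unknown revision: " + prefix)
    if len(matches) > 1:
        raise ValueError("ambiguous prefix: " + prefix)
    return matches[0]


# The revisions from the example log, newest first.
HISTORY = [
    "18325e137dd43af4857c1501cb2df849ec0a73e3",
    "e18012e8cacc16044f20a9c0b3d7636b91c31cba",
    "9cf99039f4d1a4de2acda4dc2dea80f0d8389f08",
]

if __name__ == "__main__":
    print(resolve_prefix("1832", HISTORY))
    print(resolve_prefix("e180", HISTORY))
```

With only a handful of revisions, four characters are almost always enough. In a project with many thousands of commits, two hashes are more likely to share a prefix, and you would need to type a few more characters.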
The git status command is also helpful to see what changes have been made since the last commit:
finlaysoni@myvm:project1$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)
        modified:   main.py
no changes added to commit (use "git add" and/or "git commit -a")
This tells us that we have changed the main.py file since our last commit. If we are up to date, git status will tell us:
finlaysoni@myvm:project1$ git status
On branch master
nothing to commit, working directory clean
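One way to picture what git status does is as a comparison between the files in your directory and a record of the last commit. The Python sketch below is a deliberately simplified model of that idea, not how Git is actually implemented. The function names snapshot and status and the sample file contents are invented for illustration.

```python
import hashlib
from typing import Dict, List


def snapshot(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each file name to a hash of its contents."""
    return {name: hashlib.sha1(data).hexdigest() for name, data in files.items()}


def status(committed: Dict[str, str], working: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Compare the working files against the last committed snapshot."""
    current = snapshot(working)
    return {
        # Present in both, but the contents no longer hash the same.
        "modified": sorted(
            name for name in current
            if name in committed and current[name] != committed[name]
        ),
        # Recorded in the last commit, but missing from the directory.
        "deleted": sorted(name for name in committed if name not in current),
        # In the directory, but never added to the repository.
        "untracked": sorted(name for name in current if name not in committed),
    }


if __name__ == "__main__":
    last_commit = snapshot(
        {"input.py": b"a\n", "main.py": b"b\n", "output.py": b"c\n"}
    )
    working = {"input.py": b"a\n", "main.py": b"b changed\n", "Main.class": b"\xca\xfe"}
    print(status(last_commit, working))
```

Comparing hashes rather than whole files is what makes this kind of check cheap: two files with the same hash can be treated as unchanged without looking at them byte by byte.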
If we accidentally delete a file, we can recover the latest committed version of it with git checkout:
finlaysoni@myvm:project1$ rm main.py
finlaysoni@myvm:project1$ git checkout main.py
finlaysoni@myvm:project1$ ls
input.py main.py output.py
Git is worth using even just for this. You will accidentally delete files at some point, and you will be very happy if they are under Git.
We can also move through our history. If we want to get to a previous revision, we can do so by checking it out.
Below I go all the way back to the initial version with empty files:
finlaysoni@myvm:project1$ cat main.py
import input
import output
# the simplest three file program ever
value = input.get_input()
output.show_output(value)
finlaysoni@myvm:project1$ git checkout 9cf9
Note: checking out '9cf99039f4d1a4de2acda4dc2dea80f0d8389f08'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
  git checkout -b new_branch_name
HEAD is now at 9cf9903... Just added blank files
finlaysoni@myvm:project1$ cat main.py
All of the files in the project1 directory have now been set back to the state at the first commit. Git gives us some good information in the output. We will not discuss branches in detail, but they allow you to have separate versions of the same code base at once.
We can look at this version and test it. Getting back to the current state is accomplished with:
finlaysoni@myvm:project1$ git checkout master
Previous HEAD position was 9cf9903... Just added blank files
Switched to branch 'master'
finlaysoni@myvm:project1$ cat main.py
import input
import output
# the simplest three file program ever
value = input.get_input()
output.show_output(value)
"master" is the main "branch" of our project, so checking it out moves us back to the current state.
Being able to go back and forth through our history is very valuable for working on larger programs. If you notice a bug and aren't sure where it was introduced, you can go back and test past versions until you find the one where it first appears.
Another use is for when you removed code, but decide later that you want it. You can go back in time, copy the code someplace, and then paste it into the project's current state.
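Checking out an old revision only lets us look at it. To actually undo a commit, we can use git revert, which adds a new commit that reverses the earlier one's changes: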
finlaysoni@myvm:project1$ git revert e18012
[master e21dbb9] Revert "Add error checking of user input."
 2 files changed, 2 deletions(-)
finlaysoni@myvm:project1$ git log
commit e21dbb9f16e81cb4c50d636732f3f2a3de51c0a1
Author: Ian Finlayson <firstname.lastname@example.org>
Date:   Tue Jul 11 13:43:30 2018 -0400
    Revert "Add error checking of user input."
    This reverts commit e18012e8cacc16044f20a9c0b3d7636b91c31cba.

commit 7bb278f9fe1529db9a16f80beba46209ae9ae462
Author: Ian Finlayson <email@example.com>
Date:   Tue Jul 11 13:27:53 2018 -0400
    Fix bug in sorting feature.

commit 18325e137dd43af4857c1501cb2df849ec0a73e3
Author: Ian Finlayson <firstname.lastname@example.org>
Date:   Tue Jul 11 13:15:02 2018 -0400
    Add sorting feature for output.

commit e18012e8cacc16044f20a9c0b3d7636b91c31cba
Author: Ian Finlayson <email@example.com>
Date:   Tue Jul 11 13:14:28 2018 -0400
    Add error checking of user input.

commit 9cf99039f4d1a4de2acda4dc2dea80f0d8389f08
Author: Ian Finlayson <firstname.lastname@example.org>
Date:   Tue Jul 11 13:00:05 2018 -0400
    Just added blank files
You must commit everything you have before performing a revert. This way if you change your mind about reverting, you can "undo" the revert by reverting the revert commit itself. Git makes it really hard to lose your work accidentally!
You can revert your most recent commit, or prior ones as well. git revert undoes only the specific changes made by the commit you are reverting.
In order for this to work well, it's best to make each commit one discrete thing. For example, if you have a commit which adds a couple of features to the program, fixes a few bugs, and also changes some output messages, then it won't make as much sense to revert that commit. If you want to remove one of the bug fixes from that commit, you will need to do some extra work. However, if each feature/bug fix/change is its own commit, reverting them is easy.
Binary files such as compiled executables or compiled Java .class files should not be put under version control. For one, it does not matter if we lose them because they are generated from files which will be under Git. Also, their changes will show up in git diff, making it hard to see important changes.
To tell Git to ignore these files, we can create a file in our project directory called ".gitignore". This file contains a list of file names or wildcard patterns, one per line. Each one is something Git will ignore.
For example, by default, Git will warn us about files not being managed:
finlaysoni@myvm:project1$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        Main.class
Here Git is warning us about Main.class being untracked. To fix this, we can add the pattern *.class to .gitignore:
finlaysoni@myvm:project1$ vim .gitignore
finlaysoni@myvm:project1$ cat .gitignore
*.class
finlaysoni@myvm:project1$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        .gitignore
Now Main.class is not listed as being untracked. Git is now ignoring it. Of course, now the ".gitignore" file itself is untracked. We should add it to the repository to fix this:
finlaysoni@myvm:project1$ git add .gitignore
finlaysoni@myvm:project1$ git commit
[master 948c205] Added the .gitignore file to the repository
 1 file changed, 1 insertion(+)
 create mode 100644 .gitignore
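The wildcard matching that ".gitignore" relies on can be sketched in a few lines of Python. This is only a rough approximation that handles simple patterns such as *.class applied to plain file names. Real ".gitignore" files support more, such as directory patterns and negation with "!", which this sketch leaves out. The function names load_patterns and is_ignored are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List


def load_patterns(text: str) -> List[str]:
    """Read patterns from the text of an ignore file, one per line.

    Blank lines and lines starting with '#' (comments) are skipped.
    """
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(filename: str, patterns: List[str]) -> bool:
    """Return True if the file name matches any of the patterns."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


if __name__ == "__main__":
    patterns = load_patterns("# compiled Java files\n*.class\n")
    for name in ["Main.class", "main.py", ".gitignore"]:
        print(name, "ignored" if is_ignored(name, patterns) else "not ignored")
```

The useful point is that one pattern covers every file it matches, including files that do not exist yet, so you rarely need to touch ".gitignore" again once the patterns for your language are in place.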
The basic workflow is:
git add every file you want to track.
git commit -a each time you want to save your progress.
It is good to know that you can see and revert to older versions, but you will not normally need to do it very often.
As we will see, Git can also be used to:
Version control tools are used universally by professional programmers. Taking the time to get used to Git now will make your life easier and prepare you well for working as a programmer.
Copyright © 2022 Ian Finlayson | Licensed under a Creative Commons Attribution 4.0 International License.