Revision Control / SCM

Overview

Revision control (a.k.a. source code management, or SCM) is the greatest thing since sliced bread. I know we’re throwing a lot at you all at once, but this will simplify your life tremendously going forward, and is a ubiquitous tool in software development. Read about it. You won’t be working in large groups just yet, but even on your own, these systems provide you with lots of nifty benefits including:

No more saving a billion copies of the same project on your computer. No more worries about losing a once-working piece of code. No more commenting hundreds of lines that you really wanted to delete, but couldn’t quite let go of. It’s just better. Now on to the details…

Git

There are a lot of SCM systems out there, but we’re going to use Git. It has already been installed on the virtual machine, but there are clients for most any OS.

For in-depth reading, you can visit the references below. Note: git is vast, and trying to take it all in at once will likely be overwhelming. For now just focus on the few features we need, which are outlined below.

Cloning a Repository with Git

To get started, let’s try a simple example. We’ll begin by cloning a repository. (A repository is just a database of source code.) Cloning just makes a local copy of a repository on your machine. You’ll use this when getting the homework assignment skeletons, for example. First, get a terminal running. Now try the following commands:

$ cd
$ git clone https://bitbucket.org/wes_ccny/csc103-lectures.git

This should create a new subdirectory called csc103-lectures. Let’s see what’s in there:

$ cd csc103-lectures/testing/
$ ls

In this case, there should just be a hello.cpp file. When we cloned the repository, we actually imported the entire history as well. That’s right – even without a network connection, we can now take a look at everything that’s happened to this code.
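If you want to convince yourself that a clone really carries the whole history, here is a minimal sketch that needs no network. It assumes git is installed and builds a throwaway repository under a temporary directory; the names upstream and my-copy and the file contents are made up for illustration and are not part of the course repository.

```shell
# Scratch stand-in for a repository you might clone (hypothetical contents).
work="$(mktemp -d)"
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'int main() { return 0; }' > hello.cpp
git add hello.cpp
git commit -q -m "first version"
echo '// a comment' >> hello.cpp
git commit -q -a -m "second version"

# Clone it, the same way you would clone a URL.
cd "$work"
git clone -q upstream my-copy
cd my-copy
ls                  # the working files
git log --oneline   # both commits came along with the clone
```

Cloning from a local path works the same way as cloning from a URL, which is why the sketch can stay offline.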

Viewing (and making) history – Commits

Now run the command gitk and a window ought to appear with the revision graph on the left top, and with the bottom of the screen detailing the changes to the files. This project doesn’t have much history. Let’s make a few edits to see what is going on. Close the gitk window, and get back to the terminal. Open up the file hello.cpp with whatever editor you like.[1] (Just type $ open hello.cpp to use the default.) Make a few edits – it doesn’t matter what. Maybe delete the comments at the top, and change the “hello…” message to something else. Save the file, and get yourself back to the terminal.

Do you remember exactly what you just typed? Maybe you do this time, but in general, it is hard to keep track of. Git, on the other hand, does a fantastic job keeping track. To see precisely what you’ve done since the last commit, run this command:

$ git diff

And in the terminal you’ll see exactly which lines you changed, and how you changed them. (Note: you’ll see removed lines in red, prefixed with a dash, and added lines in green, prefixed with a plus.) Press q to exit. If you’re less comfortable with the terminal, you can also run the command git gui, and you’ll see similar information in a graphical window.
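Here is a small sketch of what git diff reports, assuming git is installed. It works in a throwaway repository, and the file greeting.txt and its contents are invented for the example.

```shell
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Demo User"
git config user.email "demo@example.com"
printf 'hello, world\ngoodbye\n' > greeting.txt
git add greeting.txt
git commit -q -m "initial greeting"

# Edit the file: change the first line only.
printf 'hello, everyone\ngoodbye\n' > greeting.txt

# Compare the working copy with the last commit.
# The old line shows up with a leading "-", the new one with a leading "+".
git --no-pager diff
```

The --no-pager option just prints the result directly, so there is nothing to quit out of.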

Now run git status and you’ll see some messages from git telling you that hello.cpp was changed (that is, there are changes in the working copy which aren’t yet in the repository). To tell git that you’re done with a batch of changes, and are ready to put them in the repository, run the command

$ git add hello.cpp

Now run git status again, and you will see a slightly different message from git, telling you that it has some changes which are ready to commit. At this point, you can record your changes in the repository (making them part of the history) via the command

$ git commit -m 'meaningless changes; just testing'

The stuff following -m is the commit message. It should be a brief summary description of your changes. Note: there is a shortcut for the above steps: if you want to commit all of the changes you made (in case there were several files you’ve edited) then run this command:

$ git commit -a -m 'message goes here...'

The -a option tells git to add all of your changes to files it already tracks (brand-new files still need a git add). All right – so now let’s again look at the history by running gitk. You’ll now see your commit at the top of the revision graph, and if you select it, you’ll find the details below. Select the hello.cpp file, and you’ll see on the left exactly what you removed, and what you added. Useful!

Note: if the bit about adding changes (with git add or git commit -a) is confusing, I would recommend you follow up with this guide to get a better idea of what is going on.

Note: if you type git commit -a and don’t add a commit message, you will be dumped into an editor (most likely Vim) and asked to type a commit message. Don’t panic; if you are uncomfortable with Vim and just want to cancel the commit, press Esc, type :q! and hit enter. Git aborts the commit when the message is left empty.

You can also choose what to commit, write the commit message, and make the commit itself by running git gui.
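To see the difference between git add and git commit -a in action, here is a rough sketch in a throwaway repository. It assumes git is installed; the file names tracked.txt and untracked.txt are hypothetical.

```shell
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "version 1" > tracked.txt
git add tracked.txt
git commit -q -m "add tracked.txt"

echo "version 2" > tracked.txt     # modify a file git already tracks
echo "brand new" > untracked.txt   # create a file git has never seen

git status --short                 # tracked.txt is modified, untracked.txt is unknown
git commit -q -a -m "update tracked.txt"
git status --short                 # untracked.txt is still not committed

git add untracked.txt              # new files have to be added explicitly
git commit -q -m "add untracked.txt"
git log --oneline                  # three commits in total
```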

Pulling Changes from Others

Git was designed with collaboration in mind, and has very rich features to help people work together. You won’t do too much collaboration in this course, but nevertheless, there are some basic things that you’ll need. Here is the typical scenario: you cloned a repository for the class notes, or the projects, and (as I have instructed you) you’ve been editing away, and making commits as you go. The history of the files would look something like this:

      A---B---C   "A,B,C" are your new commits
     /
D---E             "E" is what you cloned from me

Meanwhile, I’ve been working too! Perhaps I added some more class notes, or the next project skeleton. I too, have been committing my changes, so now my repository looks something like this:

D---E---F---G  "G" is the latest work in my repository.

Leaving the combined history like this:

      A---B---C  yours
     /
D---E---F---G    mine

At some point, you will want to import (“pull”) my changes. Your repository and mine are different, but they still have the commit E in common. To bring in my changes, git will figure out all the things I’ve done since E, and try to apply them to your repository. A convenient command for this is:

$ git pull --no-edit

The result will be as follows:

      A---B---C---X  "X" contains both of our work.
     /       /
D---E---F---G

Note: there is a new commit X, which corresponds to the merge. This is actually what the --no-edit parameter in the git pull command is about: it instructs git to use a default commit message for the commit X, rather than having you edit one. If for some reason you want to edit the message, just run git pull by itself.
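The scenario in the diagrams can be reproduced offline with two throwaway repositories. This is only a sketch: it assumes git is installed, the directory names mine and yours and the file names are invented, and it uses a single commit A rather than three. One more assumption worth flagging: recent versions of git may refuse to pull diverged histories until you say whether you want a merge or a rebase, so the sketch passes --no-rebase to ask for a merge explicitly.

```shell
work="$(mktemp -d)"

# "mine": stand-in for the instructor's repository, with commits D and E.
git init -q "$work/mine"
cd "$work/mine"
git symbolic-ref HEAD refs/heads/master
git config user.name "Instructor"
git config user.email "instructor@example.com"
echo "notes" > notes.txt
git add notes.txt
git commit -q -m "D"
echo "more notes" >> notes.txt
git commit -q -a -m "E"

# "yours": clone at E, then commit A on top of it.
git clone -q "$work/mine" "$work/yours"
cd "$work/yours"
git config user.name "Student"
git config user.email "student@example.com"
echo "my solution" > homework.txt
git add homework.txt
git commit -q -m "A"

# Meanwhile, the instructor commits F.
cd "$work/mine"
echo "project skeleton" > project.txt
git add project.txt
git commit -q -m "F"

# Pull: the histories have diverged, so git records a merge commit (X).
cd "$work/yours"
git pull -q --no-rebase --no-edit
git log --oneline --graph
```

Because the two sides touched different files, the merge goes through without any conflicts.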

Some minor points

You should probably tell git a little about yourself so that the commit logs don’t have erroneous information. Edit the .gitconfig file in your home directory:

$ cd
$ open .gitconfig

and change the user name and email address.
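If you would rather not edit the file by hand, git config can write the same settings for you. The sketch below is a dry run against a scratch file so that it does not touch your real settings; the name and email are placeholders. Replacing --file "$cfg" with --global should make the same commands update the .gitconfig in your home directory.

```shell
# Scratch config file standing in for ~/.gitconfig.
cfg="$(mktemp)"
git config --file "$cfg" user.name "Your Name"
git config --file "$cfg" user.email "you@example.com"

# Show the resulting [user] section.
cat "$cfg"
```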

Pushing to your own remote

To keep multiple machines synchronized (or just to have a backup copy of your work), you might find it convenient to set up an account on bitbucket or github.[2] In this case, you will need to set up an additional remote corresponding to your bitbucket / github / whatever.[3] For the purposes of this example, suppose that the url of your copy is https://me@bitbucket.org/me/csc212-projects.git. Assuming that this repository is newly created and empty, proceed as follows.[4] First, list your current remotes:

$ cd csc212-projects/
$ git remote -v
origin  https://bitbucket.org/wes_ccny/csc212-projects.git (fetch)
origin  https://bitbucket.org/wes_ccny/csc212-projects.git (push)

Now we’ll add yours, push the changes there, and furthermore set it up as the default remote for pushes and pulls:

$ git remote add mine https://me@bitbucket.org/me/csc212-projects.git
$ git remote -v
mine    https://me@bitbucket.org/me/csc212-projects.git (fetch)
mine    https://me@bitbucket.org/me/csc212-projects.git (push)
origin  https://bitbucket.org/wes_ccny/csc212-projects.git (fetch)
origin  https://bitbucket.org/wes_ccny/csc212-projects.git (push)
$ git push -u mine master
Counting objects: 25, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (18/18), done.
Writing objects: 100% (25/25), 16.04 KiB | 0 bytes/s, done.
Total 25 (delta 4), reused 25 (delta 4)
To https://me@bitbucket.org/me/csc212-projects.git
 * [new branch]      master -> master
Branch master set up to track remote branch master from mine.

Important: because of the -u in the push command, from now on git pull and git push will default to using your copy of the repository. So, the next time you want to get the latest stuff from me, you should do this:

$ git pull origin master
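The whole two-remote arrangement can be rehearsed offline before you try it on a real account. In this sketch, local bare repositories stand in for the hosted ones; the paths instructor.git, me.git, seed, and projects are all invented, and it assumes git is installed.

```shell
work="$(mktemp -d)"

# Seed repository playing the instructor's working copy.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Instructor"
git config user.email "instructor@example.com"
echo "skeleton" > project.txt
git add project.txt
git commit -q -m "project skeleton"

# Stand-ins for the two hosted repositories.
git clone -q --bare "$work/seed" "$work/instructor.git"
git init -q --bare "$work/me.git"    # new and empty, like a fresh hosted repo

# Your working clone: origin is the instructor's copy.
git clone -q "$work/instructor.git" "$work/projects"
cd "$work/projects"
git config user.name "Student"
git config user.email "student@example.com"
git remote add mine "$work/me.git"
git push -q -u mine master           # master now tracks mine/master
git remote -v

# The instructor publishes something new...
cd "$work/seed"
echo "assignment 2" > next.txt
git add next.txt
git commit -q -m "next assignment"
git push -q "$work/instructor.git" master

# ...which you get by naming origin explicitly, then back up to mine.
cd "$work/projects"
git pull -q origin master
git push -q                          # no arguments: goes to mine
```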

If you have another computer you want to work from, I would recommend cloning my version initially and then adding your remote to keep the names consistent. Here is an example (which you would run from your other machine!):

$ git clone https://bitbucket.org/wes_ccny/csc212-projects.git
$ cd csc212-projects/
$ git remote add mine https://me@bitbucket.org/me/csc212-projects.git
$ git fetch mine
$ git branch -u mine/master

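At this point, the two (or more) computers you work from should behave the same way. To summarize: a plain git pull or git push talks to your own remote (mine), while git pull origin master brings in the latest changes from me.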

Hints for Collaborating

Remotes for Collaborators

The above instructions on adding your own remote can be used just as well to add another remote corresponding to the repository of one of your collaborators. If the remote (and perhaps your collaborator) is named sally, then the command git push sally master would push the changes in your master branch to Sally’s (provided you have access).

Avoiding Merge Conflicts

Once you have all the remotes set up, perhaps the most likely source of trouble will be merge conflicts. This happens when two people edit the same exact piece of the same file. There’s no way for git to know what to do, so the burden falls on you to sort it out, and this usually isn’t a pleasant experience. The best thing is to not let it happen in the first place. I’d recommend a strategy like this:

  1. Everyone can work independently on whichever parts they feel like.

  2. Before committing anything, coordinate with your collaborators (e.g., maybe I’ll commit function blah() and Sally will commit function yay()).

  3. Use git add -p. This command lets you choose carefully what to commit. In particular, you can stage pieces of a file for committing. Git will ask what to do with each piece of the diff, and I’ll say “yes” when it asks about adding blah(), but “no” regarding yay(). Then commit.

  4. Use branches or git stash if needed to save your own versions. If you made some changes to yay() and then try to pull from Sally, it will likely fail. If you want to scrap your yay() changes, you can just check out the file (git checkout -- followed by the file name) before pulling. If instead you want to save the work you did on yay() somewhere, there are a few strategies. This is probably the best option:

     $ git checkout -b my-yay-branch   # make a new branch
     $ git commit -a -m "my version of yay"   # commit changes
     $ git checkout master   # switch back to the master branch

    Now you’ll have your own version in a new branch if you ever need it. The other option is git stash. It is the quickest solution, but not as robust as a new branch. See the man pages for details.
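Both ways of setting work aside can be tried safely in a throwaway repository. The sketch below assumes git is installed; yay.txt is a hypothetical stand-in for the file containing yay(), and the branch name matches the one used in the list above.

```shell
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "yay v1" > yay.txt
git add yay.txt
git commit -q -m "initial yay"

# Option 1: park uncommitted work on its own branch.
echo "my yay experiment" > yay.txt
git checkout -q -b my-yay-branch     # the edit travels to the new branch
git commit -q -a -m "my version of yay"
git checkout -q master
cat yay.txt                          # master is back to the committed version

# Option 2: stash uncommitted work, then bring it back later.
echo "another experiment" > yay.txt
git stash                            # working copy is clean again
cat yay.txt                          # shows the committed version
git stash pop                        # restores the stashed edit
cat yay.txt                          # shows the experiment again
```

With the working copy clean (after either option), a pull has nothing of yours to trip over.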

References

You can find some good references here, and an excellent visual guide here. If you really are in need of more details, you can read the git book. Don’t worry if not all of this makes perfect sense just yet.




  1. Once again, please do yourself a favor and learn vim as the semester goes on.

  2. It is also very easy to host your own git repositories if you have a server, although sharing them will take a little more work.

  3. Remotes are just versions of a repository, typically on a physically different machine (hence the name “remote”).

  4. If you are using bitbucket or github, you’ll have to log on to the web interface to create the repository, as far as I know. If you are hosting this yourself, it may be as easy as git init --bare.