github

Version control Part 2: Remote repository

The second stage in my version control workflow is to push my local changes to a remote repository. A remote repository is basically an identical repository to the one stored locally on your computer, but is on a remote server somewhere in the internet ether. Much like using dropbox, this provides an additional layer of backup for your project (with the advantage of a full version history). So if you ever lose your local copy of your project for some reason, you can just re-clone it from the remote repository to get everything back (not including files that were never committed, of course). ** NOTE that I don’t recommend using this, or any one tool, as your only backup: your scientific projects should be backed up with multiple means, in multiple locations, all the time **.

However, the main advantage of pushing things up to a remote repository is that this facilitates sharing. With various methods that I’ll outline below, you can keep the remote repository private and invite your collaborators to use it, or you can make it public so that anyone can see it, clone it, etc (though of course in this case, you control whether to use anyone else’s changes or not).

I use two services to host remote repositories: Github and Bitbucket. These companies offer similar services with a few key differences, which means that in my current workflow I switch between both.

Github

Github is the “one that started it all”. They have a really slick web interface, awesome graphs for looking at repository activity, great tools for interacting with other members, wikis and issue trackers that can be associated with a repo, and a big user base. Plus they offer the free GUI that I talked about in my last post. However, their pricing structure is that they charge you to have private repositories. That is, they host unlimited public repositories – i.e. anyone can see the respository’s contents, contributors and history. If you want to keep your repository to a few invited collaborators however, you need a paid account. Seven dollars a month gets you 5 private repositories. The idea here would be that you have some projects on the go, then when one is ready for sharing (say, the article is accepted), you switch the repo from “private” to “public”. Now everyone can see your code and data, and you have another private repo slot to use.

However, since I know that I would need more than 5 private repos (projects languishing, maybe one day, etc), I’ve so far avoided a paid Github account (the idea of just working with everything open is for another post). Thankfully we’re helped by Bitbucket.


UPDATE: Thanks to Ariel Rokem for pointing out in the comments that Github actually offer a Micro plan (5 private repositories) for educational users. Send them an email with your educational email address at this site.


 

Bitbucket

Bitbucket is basically Github with a different pricing structure. Their web interface and user community is a fair bit behind Github. For scientists however, the advantage is that they offer unlimited free private code repositories. The catch is that you’re only allowed 5 collaborators (i.e. people who have joined any of your repositories, like co-authors). However, an academic email address will get you unlimited collaborators too, so this is essentially a free service.

Using Remote Repositories on Bitbucket with the Github GUI

Generously, Github have not restricted their GUI to use Github repositories. So what I do is basically use the Github GUI to manage my version control day-to-day, but push the local repository to a remote repository on Bitbucket. I can share this with collaborators and keep it private.

Here’s how:

  1. Set up a local repository as explained here.
  2. Log in to your Bitbucket account in a web browser.
  3. Follow the steps to set up a new repository. Select “git” as the version control flavour.
  4. This should then give you an option to “push up an existing repository”.
  5. On the command line that starts `git remote add origin`, copy the following link to your repository (something like `git@bitbucket.org:tomwallis/test.git`. This might look different, depending whether you’re using SSH or a password to authenticate (if you’re using a password, your link will start with https; either works). The Bitbucket / Github help pages will explain how to set up an SSH key if you’d like to do that.
  6. In the Github GUI, open your local repository and go to the “Settings” pane. On the line that says “Primary Remote Repository”, paste in the link to your repo. Hit “Update Remote”.
  7. Switch back to the “Changes” pane of the Github GUI. See the button in the top left? It should have changed from “Push to Github” to “Sync Branch” (if not, close and re-open the Github GUI).
  8. Press this button. You might be asked for your password (depending whether you’ve set up an SSH key).
  9. Github should synch, and the list of “unsynched commits” should disappear.
  10. Refresh your web browser on your Bitbucket account. Your code should now be in your private repository on Bitbucket!

You can now share this repository with your co-workers, friends and family, and take advantage of all the nice things about collaborating with version control!

When the code is ready to be made public, you can simply push the repository to a public repository on Github by changing the primary remote in the settings pane. This lets you take advantage of the bigger user community (for example, see the PsychtoolboxPsychoPy,  Psychopy_ext, Julia) and better web tools on Github for publishing your code. Maybe in the future I would just use Github exclusively (i.e. paying for private repos), but for now the dual solution (Bitbucket, Github) works well. Of course, you can also just make your Bitbucket repository public and not worry about using Github at all, but then you’re stuck with their (relatively) clunky web interface.

As always, test this process out for yourself to see that it works for you before using it for important stuff, and always keep independent backups of your project and data (Dropbox, Time Machine, etc).