Developing HPX with Git: A Tutorial

EDIT: Heller pointed out that “newer subversion versions come with a single .svn directory in the top level directory; the single .git is not the reason why you can have multiple local branches”. He also advised using Git over SSH instead of HTTPS and deploying your public key on github.com.
EDIT2: Revised the “feature change” section and improved the workflow.

 

The HPX code base has moved from SVN to Git at github.com. This post is meant to serve as a tutorial to help with the transition.

SVN vs. Git – the Repo Conceptual Difference

From git-scm.com:

Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

The distributed nature is reflected in the differences between the repo models. As summarized by Ole Morten Amundsen, with a few edits:


Subversion: 2 levels:

  • local checkout  – Files on your computer, in the file system ( svn checkout https://svn.xx.yy/projectxyz)
  • server, the central subversion repo – The server which every user checks in to (svn commit) and checks out from (svn up)

Git: 3+ levels:

  • local file system of a clone (git clone git@github.com:oma/projectxyz, like checkout in svn). As above, yet different… It must be seen in the context of the local repo, below.
  • local repo of a checkout (clone) in the same file system. This is similar to the server above. You mess around with the local files as always (level 1), then you commit to the local repo (git commit -a -m "msg"). The git repo server is still unchanged. Then make more changes, commit, and so on. Any commits you make are stored locally in the .git directory. Unlike older subversion versions, which created a .svn directory within each directory, git has only one .git directory at the top level (newer subversion versions do the same, so this is not what makes multiple local branches possible). Git lets you keep multiple branches in the same directory in the file system, and it deletes and adds files automatically when switching. Make as many commits as you want locally. When satisfied, push them all to the server. Read on below.
  • server, the central git repo (or one of the many!). When you’re satisfied with your local changes and have committed them locally (git commit), you perform a git push, which pushes the changes to the repo you cloned from. To get changes from this repo/server, you perform git pull.
  • the “4th” level is the git server repo of each developer, which your server needs to pull from and vice versa.
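The levels above can be tried out without touching github.com. The following is a minimal sketch, not part of the original summary: a bare repo on local disk (the hypothetical server.git) stands in for the server, and the "Demo" identity and directory names are made up for illustration.

```shell
set -e
# Hypothetical identity so the sketch does not depend on your git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"

# Stand-in for the server: a bare repo holding one commit on master.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "HPX" > seed/README.rst
git -C seed add README.rst
git -C seed commit -q -m "initial commit"
git clone -q --bare seed server.git

# A clone holds both the working files (level 1) and the local repo (level 2).
git clone -q server.git projectxyz
cd projectxyz

# Edit a file, then commit to the local repo only.
echo "a local change" >> README.rst
git commit -q -a -m "msg"

local_count=$(git rev-list --count HEAD)
server_before=$(git --git-dir="$work/server.git" rev-list --count master)
echo "before push: local=$local_count server=$server_before"

# Publish the local commit to the server repo (level 3).
git push -q origin master
server_after=$(git --git-dir="$work/server.git" rev-list --count master)
echo "after push:  local=$local_count server=$server_after"
```

The commit count on the stand-in server only changes after the push, which is the main practical difference from svn commit.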

Install Git and Get HPX

Install a Git client for your platform. github.com has Windows and Mac clients. Linux distributions should have a Git package in their standard repositories, check here. After Git is installed, it is beneficial to set up the username and email, which will be recorded on your commits.

$ git config --global user.name "Ste||ar" # set up username
$ git config --global user.name # verify username

$ git config --global user.email "stellar@cct.lsu.edu" 
  # set up email
$ git config --global user.email # verify email
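To see where these settings end up, here is a small sketch that makes one commit in a throwaway repo and prints the author recorded on it. It assumes repo-local settings (no --global) so that your real configuration stays untouched; the directory and file names are hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q identity-demo
cd identity-demo

# Repo-local settings (no --global): they apply to this repo only.
git config user.name "Ste||ar"
git config user.email "stellar@cct.lsu.edu"

echo "test" > file.txt
git add file.txt
git commit -q -m "check identity"

# Print the author name and email stored in the commit.
author=$(git log -1 --format='%an <%ae>')
echo "$author"
```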

To get HPX, run one of the following commands in your working directory.

$ git clone https://github.com/STEllAR-GROUP/hpx.git

$ git clone git@github.com:STEllAR-GROUP/hpx # over ssh instead of https
  # check github:help on generating SSH keys

Then a copy of the HPX code base will reside under the folder hpx/. Whenever someone has made commits on the server repo, the changes can be pulled to the local repo.

$ cd hpx/  # hpx/ contains the .git folder
$ git pull # update local repo

To check the status of the working directory, type

$ git status

The output includes 'On branch master'; master is the default branch name. The remote server repo also has a name, origin. In other literature, the remote server repo may be called the upstream repo. In what follows, the term server repo is used most of the time.
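Both names can be queried directly. The sketch below uses a local bare repo (hypothetical server.git) in place of github.com, with a made-up "Demo" identity, and asks a fresh clone for its current branch, its remote, and the remote branch it tracks.

```shell
set -e
# Hypothetical identity so the sketch does not depend on your git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"

# Stand-in for the server repo.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "HPX" > seed/README.rst
git -C seed add README.rst
git -C seed commit -q -m "initial commit"
git clone -q --bare seed server.git

git clone -q server.git hpx-demo
cd hpx-demo

branch=$(git rev-parse --abbrev-ref HEAD)          # current local branch
remote=$(git remote)                               # names of configured remotes
upstream=$(git rev-parse --abbrev-ref --symbolic-full-name "@{upstream}")

echo "branch:   $branch"
echo "remote:   $remote"
echo "upstream: $upstream"
git remote -v    # also shows the URL behind the name origin
```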

Simple Changes: Single-Branch Workflow

If the changes are simple and do not deserve a branch of their own, they can be made in the working directory, committed to the local repo, and then pushed to the server repo. As an example,

$ vi README.rst # did some changes
$ git status # will show modified file
$ git commit -a -m "changes on README.rst" # commit to local repo
$ git status # will show 'nothing to commit'

Rest assured, everything done so far has happened only in the local repo. The server is not aware of any of it. Finally, after some more changes and commits, the work is ready for submission.

$ git push # push to server repo

If nobody pushed new commits while you were editing, the push will go smoothly, provided the github.com username and password are correct. Otherwise, a big error shows up and suggests that "Maybe you are not up-to-date and need to pull first?". Here comes the trick: it is preferable to use the following commands

$ git pull --rebase  # instead of just 'git pull' for clean commits
$ git push  # should make it if no conflict occurs
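The rejected push and its fix can be reproduced locally. This is a sketch under assumptions: two clones named alice and bob (hypothetical) share a bare repo on disk that plays the server, and the remote and branch are spelled out as origin master to keep the script explicit.

```shell
set -e
# Hypothetical identity so the sketch does not depend on your git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"

# Stand-in for the server repo.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "HPX" > seed/README.rst
git -C seed add README.rst
git -C seed commit -q -m "initial commit"
git clone -q --bare seed server.git

git clone -q server.git alice
git clone -q server.git bob

# Alice pushes first.
cd "$work/alice"
echo "alice" > alice.txt
git add alice.txt
git commit -q -m "alice's change"
git push -q origin master

# Bob commits without pulling, so his plain push is refused.
cd "$work/bob"
echo "bob" > bob.txt
git add bob.txt
git commit -q -m "bob's change"
if git push -q origin master 2>/dev/null; then
    first_push="accepted"
else
    first_push="rejected"
fi
echo "first push: $first_push"

# Replay Bob's commit on top of Alice's, then push again.
git pull -q --rebase origin master
git push -q origin master

total=$(git rev-list --count HEAD)
merges=$(git rev-list --count --merges HEAD)
echo "commits: $total, merge commits: $merges"
```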

To explain the reasons: first of all, Git is distributed and the local repo contains the whole history. The local repo can be seen as a branch of the server repo, and 'git pull' is about merging them. It will create a new commit on the local repo with the message

Merge branch 'master' of https://github.com/STEllAR-GROUP/hpx,

which contains all the pushes made by others after your last pull. There is nothing wrong with this; it is totally standard for a merge. However, it looks a little clumsy on the server repo after the push, because the 'merge' commit is actually empty: the server repo already had these changes before your push. It is still harmless, but annoying when many empty 'merge' commits show up.

On the other hand, with '--rebase', Git acts as if the two, local and server, were one branch, as in SVN. Git does three things:

  1. Un-commits all of your local commits that are not already in server repo.
  2. Stuffs the new commits from the server repo onto the stack
  3. Plays your changes back on top of them

You no longer merge; you actually commit on top of the server repo. This results in a cleaner commit history on both the local and server repos. This is also where the name single-branch workflow comes from. This post has a graphical comparison of pull vs. rebase.
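The difference between the two kinds of pull can be measured by counting merge commits. The sketch below is an illustration with hypothetical clone names (alice, bob_merge, bob_rebase) and a local bare repo as the server; --no-rebase and --no-edit are passed only to make the plain merging pull explicit and non-interactive.

```shell
set -e
# Hypothetical identity so the sketch does not depend on your git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"

# Stand-in for the server repo.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "HPX" > seed/README.rst
git -C seed add README.rst
git -C seed commit -q -m "initial commit"
git clone -q --bare seed server.git

git clone -q server.git alice
git clone -q server.git bob_merge
git clone -q server.git bob_rebase

# Someone else (alice) pushes a commit to the server repo.
cd "$work/alice"
echo "alice" > alice.txt
git add alice.txt
git commit -q -m "alice's change"
git push -q origin master

# Both bob clones make a local commit before pulling.
for clone in bob_merge bob_rebase; do
    cd "$work/$clone"
    echo "bob" > bob.txt
    git add bob.txt
    git commit -q -m "bob's change"
done

# Merging pull: joins the two histories with an extra merge commit.
cd "$work/bob_merge"
git pull -q --no-rebase --no-edit origin master
merge_count=$(git rev-list --count --merges HEAD)

# Rebasing pull: replays the local commit on top of the server's commits.
cd "$work/bob_rebase"
git pull -q --rebase origin master
rebase_count=$(git rev-list --count --merges HEAD)

echo "merge commits after 'git pull':          $merge_count"
echo "merge commits after 'git pull --rebase': $rebase_count"
```

Running git log --oneline --graph in each clone shows the same difference as a picture.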

Feature Changes: Two-Branches Workflow

When adding a new feature or working on a complicated task, it is encouraged to create a branch and later merge/rebase it back into the master branch, the default name of the main branch. The name two-branches workflow refers to creating a new branch alongside the master branch. Git keeps the whole commit history locally, so creating a branch is very cheap. Let’s create a branch ‘newfeature’ as an example.

$ git checkout -b newfeature  # new branch
Switched to a new branch "newfeature"

[.. changed some files, made a few commits ..]

During development, if there are new commits on master that need to be brought into newfeature, the following command can be issued while on the newfeature branch.

$ git rebase -i master
   # bring the branch up-to-date with master
   # check Interactive rebase for more information
   # will put branch changes on top of master

If conflicts are found while attempting to play back your changes, Git stops and puts you on an unnamed branch (a detached HEAD), giving you a chance to merge everything together. One article about resolving merge conflicts might help. After correcting the conflict(s) and staging the corrected files with git add, you would type

$ git rebase --continue

as directed by the conflict message.
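A conflict during a rebase can be provoked on purpose in a scratch repo. The following sketch assumes a hypothetical file config.txt that both branches edit on the same line; it uses a plain git rebase (without -i) and GIT_EDITOR=true so that no editor opens.

```shell
set -e
# Hypothetical identity so the sketch does not depend on your git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"
git init -q conflict-demo
cd conflict-demo
git symbolic-ref HEAD refs/heads/master

echo "version = 1" > config.txt
git add config.txt
git commit -q -m "initial commit"

# The feature branch and master both change the same line.
git checkout -q -b newfeature
echo "version = feature" > config.txt
git commit -q -a -m "feature edit"
git checkout -q master
echo "version = 2" > config.txt
git commit -q -a -m "master edit"

# Replaying the feature commit on top of master stops on the conflict.
git checkout -q newfeature
git rebase master || echo "rebase stopped on a conflict"

# While the rebase is stopped, HEAD is not on any named branch.
if git symbolic-ref -q HEAD >/dev/null; then
    head_state="attached"
else
    head_state="detached"
fi
echo "HEAD is $head_state"

# Resolve by writing the wanted content, stage it, and continue.
echo "version = 2 plus feature" > config.txt
git add config.txt
GIT_EDITOR=true git rebase --continue

branch=$(git rev-parse --abbrev-ref HEAD)
total=$(git rev-list --count HEAD)
echo "back on $branch with $total commits"
```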

You will probably want to share these changes with colleagues for code review. One way is to push the branch to the server repo.

$ git push -u origin newfeature 
  # push branch to the server repo
  # -u tells git that you want to be able to easily 
  # push and pull changes to that branch in the future

Afterwards, a colleague can get the branch with

$ git fetch origin 
  # update local repo with changes from server repo
$ git checkout --track origin/newfeature
  # creates a local branch named newfeature that 
  # matches the origin/newfeature branch 
  # and tells git that he wants to be able to 
  # easily push and pull from the branch
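The exchange can be rehearsed on one machine. In this sketch the two clones named author and colleague are hypothetical, as is the file feature.txt, and a local bare repo plays the server.

```shell
set -e
# Hypothetical identity so the sketch does not depend on your git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"

# Stand-in for the server repo.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "HPX" > seed/README.rst
git -C seed add README.rst
git -C seed commit -q -m "initial commit"
git clone -q --bare seed server.git

git clone -q server.git author
git clone -q server.git colleague

# The author creates the branch and publishes it.
cd "$work/author"
git checkout -q -b newfeature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
git push -q -u origin newfeature

# The colleague fetches and creates a tracking branch of the same name.
cd "$work/colleague"
git fetch -q origin
git checkout -q --track origin/newfeature

branch=$(git rev-parse --abbrev-ref HEAD)
upstream=$(git rev-parse --abbrev-ref --symbolic-full-name "@{upstream}")
echo "colleague is on $branch, tracking $upstream"
```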

After the code review is done, it is time to rebase/merge into the master branch. Here rebase is preferred for the same reason mentioned in the previous section.

$ git rebase -i master
   # first bring the branch up-to-date with master

$ git checkout master 
Switched to branch "master"

$ git rebase -i newfeature
   # move master forward to the tip of newfeature; newfeature already
   # sits on top of master, so no commits need to be replayed

Thus the changes of newfeature have been successfully merged into master, nice and clean.
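The result can be checked in a scratch repo. The sketch below runs the same two rebases without -i (so no editor opens) and confirms that master ends up at the tip of newfeature with no merge commits; file names and the "Demo" identity are hypothetical.

```shell
set -e
# Hypothetical identity so the sketch does not depend on your git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"
git init -q twobranch-demo
cd twobranch-demo
git symbolic-ref HEAD refs/heads/master

echo "HPX" > README.rst
git add README.rst
git commit -q -m "initial commit"

# One commit on the feature branch, one newer commit on master.
git checkout -q -b newfeature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
git checkout -q master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix on master"

# First bring the branch up-to-date with master.
git checkout -q newfeature
git rebase -q master
feature_tip=$(git rev-parse newfeature)

# Then move master forward to the branch.
git checkout -q master
git rebase -q newfeature
master_tip=$(git rev-parse master)

total=$(git rev-list --count master)
merges=$(git rev-list --count --merges master)
echo "master: $total commits, $merges merge commits"
```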

Finally, the newfeature branch can be thrown away and the commits pushed to the server repo.

$ git branch -d newfeature # delete locally
$ git push origin :newfeature # delete on server repo, mind the ":"

$ git push # push master to server repo
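The effect of the ":" form can be observed with git ls-remote. This sketch uses a local bare repo as the server and hypothetical names throughout; it counts how many branches named newfeature the server has before and after the deletion.

```shell
set -e
# Hypothetical identity so the sketch does not depend on your git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"

# Stand-in for the server repo.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "HPX" > seed/README.rst
git -C seed add README.rst
git -C seed commit -q -m "initial commit"
git clone -q --bare seed server.git

git clone -q server.git hpx-demo
cd hpx-demo
git checkout -q -b newfeature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
git push -q -u origin newfeature
before=$(git ls-remote --heads origin newfeature | wc -l | tr -d ' ')

# Move master forward to newfeature so the branch counts as merged.
git checkout -q master
git rebase -q newfeature

git branch -q -d newfeature      # delete locally
git push -q origin :newfeature   # same effect as: git push origin --delete newfeature
after=$(git ls-remote --heads origin newfeature | wc -l | tr -d ' ')

git push -q origin master        # publish the commits on master
echo "server branches named newfeature: before=$before after=$after"
```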

This completes a two-branch workflow.

One thing to mention: although rebase keeps things tidy, it can be destructive and potentially dangerous if you don’t know what you are doing. Details can be found in the git-rebase manual page. Jarrod Spillers proposes a simple rule.

Don’t rebase branches you have shared with another developer.

It is always good to ask around if uncertain.
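One way to see why the rule matters: a rebase replaces commits with new ones that have different hashes, so anyone who already has the old commits no longer shares history with you. A small sketch in a scratch repo, with hypothetical file names and identity:

```shell
set -e
# Hypothetical identity so the sketch does not depend on your git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"
git init -q rewrite-demo
cd rewrite-demo
git symbolic-ref HEAD refs/heads/master

echo "HPX" > README.rst
git add README.rst
git commit -q -m "initial commit"

git checkout -q -b newfeature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
before=$(git rev-parse HEAD)   # the hash a colleague would have fetched

git checkout -q master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix on master"

# Rebasing re-creates the feature commit on a new parent.
git checkout -q newfeature
git rebase -q master
after=$(git rev-parse HEAD)
subject=$(git log -1 --format=%s)

echo "before rebase: $before"
echo "after rebase:  $after ($subject)"
```

The commit message is unchanged, but the commit itself is a different object.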

Acknowledgment

This post would not have been possible without the kind discussion with falcou on the #ste||ar IRC channel. For more information, check the Pro Git book and other good sources. Happy Gitting.
