Wildfalcon

Laurie Young: Scrum Master, Dancer, Photographer and Entrepreneur

Best Practice work-flow with git

Introduction

A quick look around the web will bring you up to speed on pretty much all you need to know about git. There are some great introductions to what it is, detailed manuals, and, best of all, an explanation of how it works aimed at people who understand computer science (and if you can’t follow that, you’re not going to earn much working as a programmer). However, there is something missing from all these pages: guidance on how you should actually use git. What work-flow should you use, and what best practices should you follow?

This is really important. The problem with subversion was not that some of its operations could be slow. Personally I never found myself staring at my screen, twiddling my thumbs, or going for a quick round of Mario Kart while waiting for subversion to finish something. The problem I always had with subversion was that my team and I were always treading on each other’s toes. We had a number of releases that were late because we were all committing code over each other’s work, and introducing unnecessary complexity.

However, there is a solution, and we found it. To fully understand it though, you need a good understanding of the problems that need solving.

The real problem with subversion

When I first came across continuous integration, I thought it was an absolutely great idea. If, as is often suggested, integrating work from different developers is hard, with the difficulty increasing roughly quadratically with time since the last integration, it makes a huge amount of sense to integrate as often as you can. But I learned the hard way that it’s not true. At least not at the small scale. When someone else in my team is working on code, they, like all developers, often go through a phase of sketching out their solution in code. This is normally pretty bad code from a production point of view. Their next step is to tidy this up, and make it into production quality code. It’s at this point that integration is good. Any fool can see that trying to integrate my production code with a colleague’s sketch code is bad. This is what happens with subversion though. Everything gets committed, otherwise you risk losing your work if there is an issue (you won’t believe how often developers leave their laptops in bars). Branches exist, of course, for exactly this reason, and I will talk about them later. For now let’s just say I don’t know anyone using them successfully in subversion.

The problem with subversion then is that there is a tension, between trying to integrate your code with the rest of your team, and trying to get far enough down the route of maturing your code that you don’t create a bottle-neck. I experienced at least one case personally where one developer was doing a major chunk of refactoring, and it acted as a bottle-neck, preventing any bug fixes from other developers being committed and deployed. Subversion makes avoiding this too expensive.

The solution is branches, merging, and testing at each stage

Hopefully you knew that already, but does your team actually do it? This was always the answer I would give if asked how to manage a code base, but no team I worked in ever managed it.

Why? Well quite simply, merging is hard. Subversion merges don’t work well with code that moves. If I move a chunk of code from one directory to another, subversion no longer tracks it well between branches. This is something I do a lot when refactoring code, and it breaks subversion.

On the other hand, because git expects merges, and moves, to be regular events, it handles them very well. This is probably because under the hood it tracks the contents of your files rather than the files themselves, but at this level of understanding, all I care about is that it works. I can create a branch, work on it, and merge, and apart from some annoying glitches between how editors handle whitespace, most things just work.
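
To make the point about moved code concrete, here is a minimal sketch that can be run in a throwaway directory. It assumes git is installed; the file names, the branch name refactor and the demo identity are all made up for illustration. One branch moves a file to a new directory while master edits it at its old path, and the merge carries the edit across to the new location.

```shell
#!/bin/sh
# Sketch: a file moved on one branch still picks up edits made on another.
set -e
demo=$(mktemp -d)
cd "$demo"

# Throwaway identity so commits work in a clean environment (illustrative values).
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"

git init -q .
git symbolic-ref HEAD refs/heads/master   # call the first branch "master" on any git version

mkdir lib
printf 'line one\nline two\nline three\nline four\nline five\n' > lib/report.txt
git add lib/report.txt
git commit -q -m "Add report"

# On a branch called "refactor" (hypothetical name): move the file.
git checkout -q -b refactor
mkdir core
git mv lib/report.txt core/report.txt
git commit -q -m "Move report into core/"

# Back on master: edit the file at its old path.
git checkout -q master
printf 'line six\n' >> lib/report.txt
git commit -q -am "Add a sixth line"

# Merge master's edit into the refactor branch.
git checkout -q refactor
git merge -q -m "Merge master into refactor" master

moved_copy="$demo/core/report.txt"
tail -n 1 "$moved_copy"   # the edit made on master, now in the moved file
```

After the merge the old path no longer exists on the refactor branch, and the sixth line shows up in core/report.txt without any manual conflict resolution.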

Git work-flow

This is the work-flow we chose (and if you’re skimming this article, this is the best bit to read).

First of all, we had a centrally hosted repository. I’m of the opinion that trying to run git with no repository being authoritative can work, but adds various complications, and pretty much no benefits. It might be cool, but that’s what the kids who give you cigarettes at school always said.

Then we had an authoritative branch on that repository. We all set our authoritative repository to be called “origin” and the branch was called master. Thus origin/master represented the state of the art production code. However, no-one, absolutely no-one was allowed to work on master. Most of the time everyone in the team (except the gatekeeper) did not even have master set up as a local branch.

Secondly, we had another branch on origin, called stable. Stable was always an ancestor of origin/master, but lagged behind a bit. Stable had various tags placed on it, which represented the actual public releases we made. More on this later.
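
The relationship between stable and master can be checked mechanically. The sketch below is an assumption-laden illustration rather than what we actually ran: it uses a single local repository instead of origin, and the tag name release-1 is invented. It shows stable sitting on a tagged release while master has moved ahead.

```shell
#!/bin/sh
# Sketch: verify that stable is an ancestor of master and see how far it lags.
set -e
demo=$(mktemp -d)
cd "$demo"

# Throwaway identity so commits work in a clean environment (illustrative values).
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"

git init -q .
git symbolic-ref HEAD refs/heads/master

echo "version 1" > app.txt
git add app.txt
git commit -q -m "First release"

git branch stable                              # stable starts at the released commit
git tag -a release-1 -m "Release 1" stable     # hypothetical tag name

echo "version 2 in progress" >> app.txt
git commit -q -am "Work merged after the release"   # master moves ahead of stable

# Exit status 0 means every commit on stable is also reachable from master.
if git merge-base --is-ancestor stable master; then
  echo "stable is an ancestor of master"
fi

behind=$(git rev-list --count stable..master)
echo "stable lags master by $behind commit(s)"

git tag --merged stable                        # release tags reachable from stable
```

If the ancestry check ever fails, someone has committed to stable directly, which this work-flow is meant to rule out.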

Next, each developer had as many branches as they wanted. Foremost, though, each developer had a branch named after themselves. So I mostly worked in the ‘laurie’ branch, which was also on origin as origin/laurie. Along with this, each developer had a copy of the deployment platform on their workstation and on a staging server; in my case this was called laurie-stage. Each developer then works there. They write their tests, modify their code, etc., making lots of commits along the way (the local nature of git commits makes regular small commits a very easy habit to get into, and it’s a very good one when you need to debug something that went wrong a while ago). When I am happy with my work, and it’s tested and working locally, I merge master into it:

git fetch && git merge origin/master && rake spec

This command gets the latest version of master, and applies any new changes to my code base. Master is not changed. I run all my tests again, and then deploy to laurie-stage. I then pass this over to my quality assurance guys (which could be me in another hat, but we were lucky enough to have a secondary team of people who were in a position to do the testing instantly). They test the product, checking that the feature has been added correctly, or the bug fixed, and that no new bugs have been introduced (though your unit tests will catch that, right?). This is continuous integration happening right here.
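
For anyone who wants to try that step without a real project, here is a self-contained sketch. It assumes git is installed and builds a throwaway central repository in a temporary directory; the file names are invented, and run_tests is a hypothetical stand-in for rake spec.

```shell
#!/bin/sh
# Sketch: bring origin/master into a personal branch, then re-run the tests.
set -e
work=$(mktemp -d)

# Throwaway identity so commits work in a clean environment (illustrative values).
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# Hypothetical stand-in for the project's real test command (rake spec above).
run_tests() { grep -q "feature" app.txt && grep -q "bugfix" notes.txt; }

# Central repository with a master branch, seeded by the gatekeeper.
git init -q --bare "$work/origin.git"
git --git-dir="$work/origin.git" symbolic-ref HEAD refs/heads/master
git init -q "$work/gatekeeper"
cd "$work/gatekeeper"
git symbolic-ref HEAD refs/heads/master
echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git remote add origin "$work/origin.git"
git push -q origin master

# Developer clone: work happens on "laurie", never on master.
git clone -q "$work/origin.git" "$work/laurie"
cd "$work/laurie"
git checkout -q --no-track -b laurie origin/master
echo "feature" >> app.txt
git commit -q -am "Add feature on laurie"

# Meanwhile someone else's fix lands on master.
cd "$work/gatekeeper"
echo "bugfix" > notes.txt
git add notes.txt
git commit -q -m "Land a bug fix"
git push -q origin master

# The step itself: fetch, merge origin/master into laurie, test.
cd "$work/laurie"
git fetch -q
git merge -q -m "Merge origin/master into laurie" origin/master
run_tests && echo "tests pass"
```

The merge only creates a commit on laurie. origin/master still points at the gatekeeper's last push, which is what "Master is not changed" means in practice.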

After I have gotten my code to a point where this all passes, I push the state of my local branch to origin/laurie, and I go and talk to the gatekeeper.

In our team the gatekeeper was a person, though if you’re brave you could automate him. The gatekeeper has a local master branch. After I have told him that the changes in laurie are good to go, he asks round the rest of the team whether any other branches are good to go; generally there will be about 2-3 branches ready at any one time. He then gets a summary of what the changes are, and ranks them by business value. Then, starting with the most critical change, he merges it into master, and runs all the tests. He then does this for the next most critical fix, and so on. If at any point one of the merges results in code that fails the tests, he can simply un-merge (moving the post-it note mentioned in the Git for Computer Scientists article; you did read it, I hope).
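
If you did want to automate the gatekeeper, the loop might look roughly like the sketch below. This is not the script we used (ours was a person). It assumes git is installed, the developer names sam and alex are invented, run_tests is a hypothetical stand-in for the real test suite, and the branch list is assumed to be already sorted by business value. The un-merge is done with git reset --hard ORIG_HEAD, since git merge records the pre-merge position of the branch in ORIG_HEAD.

```shell
#!/bin/sh
# Sketch: merge ready branches in priority order, undoing any merge that fails the tests.
set -e
work=$(mktemp -d)

# Throwaway identity so commits work in a clean environment (illustrative values).
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# Hypothetical test suite: fails if any text file contains the word BROKEN.
run_tests() { ! grep -q "BROKEN" *.txt; }

# Central repository with master plus three developer branches.
git init -q --bare "$work/origin.git"
git --git-dir="$work/origin.git" symbolic-ref HEAD refs/heads/master
git init -q "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/master
echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git remote add origin "$work/origin.git"
git push -q origin master

for dev in laurie sam alex; do               # sam's branch breaks the tests
  git checkout -q -b "$dev" master
  if [ "$dev" = sam ]; then echo "BROKEN" > "$dev.txt"; else echo "ok" > "$dev.txt"; fi
  git add "$dev.txt"
  git commit -q -m "Work from $dev"
  git push -q origin "$dev"
done

# Gatekeeper: the only clone that works on a local master.
git clone -q "$work/origin.git" "$work/gatekeeper"
cd "$work/gatekeeper"
merged=""
for branch in laurie sam alex; do            # assumed already ordered by business value
  git merge -q --no-ff -m "Merge $branch into master" "origin/$branch"
  if run_tests; then
    merged="$merged $branch"
  else
    git reset -q --hard ORIG_HEAD            # un-merge: put master back where it was
    echo "rejected $branch"
  fi
done
git push -q origin master
echo "merged:$merged"
```

Using --no-ff is a choice made for this sketch so that every accepted branch shows up as its own merge commit; a plain merge would work too.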

Other developers can help with this process on their workstations. After my changes (which are of course the most important) are successfully merged into master, everyone else can pull master again, and merge it back into their branch, preempting any conflicts and fixing them.

Once the gatekeeper has merged in all the changes, or at least all the ones that don’t conflict and break tests, he deploys this to a master staging server. Once again the quality team takes a look, this time concentrating on making sure that no existing functionality has been blatted by any of the changes. Assuming that passes, the gatekeeper then merges origin/master into origin/stable, tags it with the latest release revision number, and deploys onto our production environment.
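
The release step can be sketched the same way. The tag name release-2 is hypothetical, the repository is a throwaway one built in a temporary directory, and the deployment itself is left out because it depends entirely on your platform. Because stable is always an ancestor of master, the merge is asked to be fast-forward only, so it fails loudly if that assumption has been broken.

```shell
#!/bin/sh
# Sketch: promote master to stable, tag the release, and push both to origin.
set -e
work=$(mktemp -d)

# Throwaway identity so commits work in a clean environment (illustrative values).
export GIT_AUTHOR_NAME="Demo Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Demo Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# Central repository where stable and master start at the same released commit.
git init -q --bare "$work/origin.git"
git --git-dir="$work/origin.git" symbolic-ref HEAD refs/heads/master
git init -q "$work/gatekeeper"
cd "$work/gatekeeper"
git symbolic-ref HEAD refs/heads/master
echo "release 1" > app.txt
git add app.txt
git commit -q -m "First release"
git remote add origin "$work/origin.git"
git branch stable
git push -q origin master stable

# Changes that have since been merged into master and passed QA.
echo "release 2" > app.txt
git commit -q -am "Changes that passed QA on master staging"
git push -q origin master

# The release step itself.
release_tag="release-2"                      # hypothetical naming scheme
git fetch -q origin
git checkout -q stable
git merge -q --ff-only origin/master         # refuses to run unless stable is an ancestor
git tag -a "$release_tag" -m "Production release $release_tag"
git push -q origin stable "$release_tag"
```

After this, origin/stable and origin/master point at the same commit, and the tag marks exactly what went to production.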

We found this flow worked really well. Conflicts and merge-related issues did occur, but always when merging the master branch into a developer’s local branch, so at most one team member was held up by this.

We took the policy of releasing as often as possible, so we would often release a new production code-base 2-3 times a day, each time with fully tested code. Sure, there were a few mistakes, but even when we were making big changes, and developing the work-flow, no bug serious enough to need us to roll back the production code got through the safety nets.

If you need to guarantee that there are no mistakes, then like any project, you need to increase the depth of your test phase. Ours was relatively fast, as most of the users were alpha/beta testers :)

Summary

Releasing production code that often was a great asset too. The management team could see that work was progressing. Even if it wasn’t going at the speed they wanted (is such a thing possible?) they were greatly comforted to know that the users would see several improvements per day. As a team we had the freedom to allow one or two developers to pick up a slightly longer scale project, such as refactoring an important sub-system, while the rest of the team got on with pushing out live improvements, and of course the users got the experience of a system constantly being updated. As we were sensible about listening to the users before choosing the next piece of work, they also got the feeling that the application was very responsive to any change request they made.

  • http://ninjaverification.wordpress.com/ Martin d’Anjou

    Laurie,

    What do developers pull from to populate their local repos in the work-flow you describe? stable or master?

    Also, in my case, when multiple branches are “good to go”, development cannot stop while deltas are queued up for QA tests. But we don’t know yet if they will be accepted. Still developers may need to build on top of the changes that are pending in the QA queue. How do you think this should be addressed? – Thanks, Martin.

  • Laurie

    Hi Martin. I would suggest that developers populate their local branches from master, as it will make it easier for them to merge back into master; git may get a bit confused otherwise.

    We worked slightly differently to you. We tried to stick to the concept of “Done done”, which means we do not start on a new feature until the first one is totally done, and has been QA’ed. This feels a bit slower at first, but forces you to smooth out this process, and saves you all the time of context switching back to the old feature if there are QA issues (and there always are! :-) )

  • murdocj

    I don’t understand why people think branches are a problem in subversion.  Branches are cheap and quick to create and easy to use in subversion.  We used them quite successfully, including doing some major development for about a year on a branch while we released several times from trunk.

  • http://wildfalcon.com Wildfalcon

    The real problem comes with merging, and consists of two parts:

    * You need to remember what the last merge point was, and provide it as a command line option when doing the merge. Other tools like git work even if you can’t remember the last point you did a merge at
    * The conflict resolution algorithm in SVN is not as advanced as with most other tools. That means you spend more time having to resolve the conflicts after a merge

  • murdocj

    I think the latest version of subversion does remember merge points between branches, but we were using an older version that didn’t. 

    The merging did work most of the time.  There were conflicts where two people edited the same line in the file, but I don’t see how you can get around that sort of thing. 

    BTW, we’re trying to settle on a workflow for using Git, which is why I was interested in this page.  I’m used to the flow of committing to a main line or trunk, and when a release is done, branching at that point to support the release (e.g. fixes made on the branch and merged to trunk).  Do people work that way with Git?  For a small group with a release cycle of every month or so that seems more straightforward than the flow of having a gatekeeper merge my branch to master and then master to stable.  We’re small enough that the gatekeeper is likely to be the person making the change.