I was honored to be part of the lecturer team for Continuous Delivery 3.0 in Utrecht, an 8-week training course in which I taught the parts on Continuous Integration and Continuous Deployment.
Version control was, and perhaps always will be, a subject of debate. At times it seems we are searching for a singular “best” way to manage versions of code under all circumstances. Since about 2008 git has grown significantly in popularity, and it has been the most popular version control system since 2012. This meant a new round of debate on how to use it. Since it is 2017 now: has the dust settled, and are there any conclusions?
Git is conceptually very different from its competitors: it is a distributed version control system, whereas the other, older popular version control systems were non-distributed (Mercurial excluded). Having no central server is a critical feature when working in an open source setting; for closed source in a corporate environment it is less critical, but it opens up a lot of new possibilities. Whether a version control system is distributed or not does not make much of a difference to the branching policy. With a distributed version control system everyone has a full, authoritative copy of all versions of the code. A central git repository only becomes central because you deem it to be central, unlike the olden days, when there was just the one server and everything else was a client to that central system.
Central questions in usage/policy of software versions were, are and perhaps always will be:
- When do you branch?
- What branches are there?
- When do you merge?
If you standardize on a branching policy there are a couple of options to choose from: Gitflow, GitHub flow and GitLab flow. Because of their popularity I’ll only discuss the first two: Gitflow and GitHub flow.
Gitflow, first described by Vincent Driessen in 2010 and detailed on his blog, is a straightforward way of using git with a number of standard branches, each of them serving a different purpose:
- master: stable, production-ready software only; everything that is pushed to production lives here
- develop: Integration and ongoing development: things that will be in the next release
Then there are 3 supporting branch types for the workflow:
- feature: branches off develop for the development of a single feature; this is where the main debate is
- release: the link between develop and master, used for integration work
- hotfix: branches off master for an emergency fix
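The branch roles above can be written down as a small lookup table. This is only a sketch: the merge targets follow the original gitflow description (release and hotfix branches merge into both master and develop), while the `feature/`, `release/` and `hotfix/` name prefixes and the helper `merge_targets` are assumptions made for illustration.

```python
from typing import Dict, NamedTuple, Tuple


class BranchType(NamedTuple):
    branches_from: str            # where a branch of this type starts
    merges_into: Tuple[str, ...]  # where it is merged when the work is done


# Supporting branch types in gitflow. The two long-lived branches,
# master and develop, are the fixed points they hang off.
SUPPORTING: Dict[str, BranchType] = {
    "feature": BranchType("develop", ("develop",)),
    "release": BranchType("develop", ("master", "develop")),
    "hotfix": BranchType("master", ("master", "develop")),
}


def merge_targets(branch: str) -> Tuple[str, ...]:
    """Return the branches that `branch` should be merged into.

    Assumes names of the form '<type>/<description>', for example
    'feature/login' (a naming convention, not a git requirement).
    """
    kind = branch.split("/", 1)[0]
    if kind not in SUPPORTING:
        raise ValueError(f"not a gitflow supporting branch: {branch!r}")
    return SUPPORTING[kind].merges_into
```

For example, `merge_targets("hotfix/payment-crash")` returns `("master", "develop")`: the emergency fix goes to production and is also carried back into ongoing development.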
Below is a diagram detailing all branch types and how they merge back into their originating branch.
The gitflow process is straightforward, well tested and a good way to start, especially if your current deploy process is 100% release-focused and has many manual steps in it. There are a couple of downsides, or ‘misbehaviours’, that I’ve seen in the wild. Chief among them is the policy of starting a feature branch for everything. Step 1 in this antipattern is to create a feature branch whenever you start work on a new item (bug, feature, fix, improvement). A negative consequence of this is that the branch creator stops communicating with the rest of the world (their own team, other teams). It is this pattern that I think is counter-productive to a normal version control policy.
As an alternative to gitflow, the people at GitHub described their own version control process. It is clearly based on gitflow but has quite a few things removed: essentially, gitflow made simpler. GitHub did this to allow for high-frequency releases.
As you can see from the diagram, the branching and merging are simpler. The idea behind it is that, in principle, you work on the master branch. Only when you suspect that a feature will pose a problem when merging, or that you cannot quickly reach a stable working state using things like feature toggles or branch by abstraction, do you create a branch for it: a feature branch. When you finish the work you merge the feature branch back into master.
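To make the feature toggle idea concrete, here is a minimal sketch of unfinished work living on master but switched off by default. Everything in it is hypothetical: the `FEATURE_` environment variable convention, the `new_discount` toggle and the checkout function are made up for illustration and do not come from any particular toggle library.

```python
import os
from typing import Iterable, Mapping, Optional


def is_enabled(name: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Read a toggle from the environment; toggles are off unless set."""
    env = os.environ if env is None else env
    value = env.get(f"FEATURE_{name.upper()}", "off")
    return value.lower() in {"1", "on", "true"}


def checkout_total(prices_in_cents: Iterable[int],
                   env: Optional[Mapping[str, str]] = None) -> int:
    """Sum a basket; the new discount is merged but hidden behind a toggle."""
    total = sum(prices_in_cents)
    if is_enabled("new_discount", env):
        # New, still-in-progress behaviour: 10% off the whole basket.
        return total * 90 // 100
    # Existing behaviour stays the default until the toggle is flipped.
    return total
```

With the toggle unset, `checkout_total([1000, 2000])` keeps returning the old total, so the half-finished discount can be integrated into master every day without being released to users.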
So now that we have covered the background of version control policies, it is time to use one of them. I’d suggest starting by analyzing the components and systems and how they are currently managed in terms of version control. Are there long integration cycles with 1 or 2 releases per year? Chances are you are using neither of the two flows described earlier. I bet you use something more traditional, where every upcoming release has its own project branch.
The change from such a model to a gitflow model is relatively small: you make sure that all development is integrated into the ‘develop’ branch first, and you make teams work together on the same component where needed. Use stakeholders external to the teams (for instance architects, leads, ‘component stewards’) to play an active role in facilitating concurrent changes on shared components. These people must not become component gatekeepers, or judge and jury. It will take some time to adopt this process and to get all teams to accept this new way of working. Our tendency as engineers is to be component dictators rather than component stewards. Some engineers go to great lengths to defend the position that they, and only they, are allowed to change a component. This is sometimes even codified in branching policies. The long-term goal is not to make all engineers able to do all changes all the time; it is to make sure that the policies do not cause friction in delivering changes to production. A dictator-based bottleneck is a form of friction that we need to remove ASAP.
So now we have a simple version control policy and are changing the way we deliver software. The next step is to remove more and more of the feature branches, as engineers in all teams are likely to create them by default as soon as they pick up any item from the backlog. The reason for trying to stop this mass creation of feature branches lies in the communication breakdown between people and teams. That breakdown is always there when you branch: every form of branching diminishes the communication between parts of the organization. This can be as simple as the breakdown between 2 people working in the same team on different features. When you branch you create an isolated world for you or your team to work in: it allows you to not coordinate with other people who may be working on the same code. If you do this for too long (longer than 1 day) you run the risk of missed coordination. Missed coordination can result in merge conflicts, but more importantly it is a missed opportunity to learn from your colleagues. With some communication and coordination you might have had the opportunity to refactor the code, or to design it in such a way that both features could be implemented.
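The one-day rule of thumb is easy to check mechanically. The sketch below assumes you already know, for each branch, the moment it diverged from the shared branch; how you collect those timestamps from your repository is left open, and the function name and example data are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Mapping


def long_lived_branches(diverged_at: Mapping[str, datetime],
                        now: datetime,
                        max_age: timedelta = timedelta(days=1)) -> List[str]:
    """Return, sorted by name, the branches isolated for longer than max_age."""
    return sorted(
        name for name, since in diverged_at.items()
        if now - since > max_age
    )


# Made-up example: two feature branches, checked on a Wednesday at noon.
now = datetime(2017, 5, 10, 12, 0)
branches = {
    "feature/search": datetime(2017, 5, 10, 9, 0),   # three hours old
    "feature/billing": datetime(2017, 5, 8, 9, 0),   # more than two days old
}
stale = long_lived_branches(branches, now)
```

Here `stale` contains only `feature/billing`. A list like this is best used as a prompt for a conversation between the people involved, not as a gate that blocks their work.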
When you compare gitflow to GitHub flow you can see that GitHub flow is less bothersome in its branching, since it prescribes fewer branches, and that it is better suited to smaller batches (releases). This way of working is ideal for Continuous Delivery and is in fact a real Continuous Integration strategy. It is not for everyone, though. You can compare it to the way you implement Lean in a manufacturing process: one of the hallmarks of Lean is that you strive to have as little inventory as possible. The worst way to implement this is by removing all inventory racks in one go. Striving for less stock inventory will mean that, in time, you have fewer inventory racks. The same can be said of implementing Continuous Integration. Every branch is a form of stock inventory: if your process is currently very high on inventory (many, many branches), gitflow will be a better initial fit. Only in time, with lots of focus on the architecture, workflow and processes, will you be able to lower your stock inventory through continuous improvement. You will then be able to release software in smaller batches and will find that gitflow branching is becoming bothersome. Right before you reach this moment is the time to switch from gitflow to GitHub flow.
Feature branches may be a necessity because of the situation the software is in. Still, I believe you must strive to remove as many feature branches as possible. A change you can make right away is to make the creation of a feature branch conditional rather than the default. If you implement this simple policy you encourage teams to work together constantly on shared components through the use of common branches and delivery. The resulting tension between teams should lead to improvements, so that multiple changes can take place concurrently on a component. Perhaps the component should not be shared and needs to be split up. This all depends on the effort the team puts into the technical design of the product(s).
In the end, by practicing some restraint in the use of feature branches with gitflow you remove some of the main risks. These risks are all related to delayed communication, and they result in merge conflicts, defects, technical debt and optimizations that do not apply to today’s situation. What remains in gitflow are a number of very useful branches to facilitate integration: a release branch for stabilization and a hotfix branch for production emergencies. This base will work nicely when your release frequency is somewhere between once every 3 months and once every 2 weeks. When you start to move into the realm of continuous delivery and continuous deployment you will find that, even with feature branches banned, the training wheels that helped you so well (the develop, release and hotfix branches) start to be a bother. This is an ideal moment to start moving towards GitHub flow. By now you have probably shrunk the size of each deploy: smaller components, a better aligned architecture. In my experience there is little truth in the ‘singular best way to version control’. You can summarize it with a common design mantra: “It depends”. Not all software is immediately ready for a low-inventory way of working like GitHub flow. Using gitflow makes the transition easy, as long as you do not fall into the “create a feature branch for absolutely everything” pitfall. Therefore gitflow is a great way to start, if you use it “responsibly”.