Continuous Integration vs. Feature Branching: Best Practices for Team Collaboration

Explore the differences between continuous integration and feature branching, and learn best practices for effective team collaboration in software development.

Continuous Integration vs Feature Branch Workflow

Added on 09/30/2024

Speaker 1: When writing code, how do you organize your work so that it's compatible with that of your team? Do you create a feature branch, work until the feature is complete, and then issue a pull request for the feature to be merged? If you do, then this isn't really continuous integration. In this episode, I want to talk about feature branching and continuous integration. Hi, I'm Dave Farley of Continuous Delivery. Welcome to my channel. If you haven't already, please hit subscribe, and if you enjoy the video, hit like at the end. Continuous integration is something very specific. Here is the description that first introduced continuous integration, published on the C2 wiki. They described it in the following terms: "Development teams use code ownership to minimize conflicts among people editing. The longer engineers hold on to modules, the more important it is to minimize conflicts. What if engineers didn't hold on to modules for more than a moment? What if they made their correct change and presto, everybody's computer instantly had that version of the module? You wouldn't ever have integration hell because the system would always be integrated. You wouldn't ever need code ownership because there wouldn't be any conflicts to worry about. The fundamental assumption of continuous integration is that there is only one interesting version of the code, the current one." I think that's a really nice description. I've actually spoken about that before on my channel. But what it means is that continuous integration is not about the tools, it's about the way in which we think about change, and that we should think about change differently. Continuous integration is about working so that our changes are always visible, at least to our teammates. This presents a problem when we start to think about branching, because branching is exactly the opposite of that. Branching is about isolating change.
By definition, we create a branch to hide changes in one part of the code from another. As a result, I have been known in the past to offer the following advice on branching: "Don't branch, don't branch, don't branch." I do that as something of a joke. It may not be very funny, but in information terms, a branch is the same as that kids' game where you fold a piece of paper and then draw funny pictures. Each person takes a turn at drawing a part of the picture. They don't get to see the other parts of the picture, and then we all laugh at the results. If we all get together and can see what everybody else is doing, then drawing a nice picture is really easy. But if we each work independently of one another, then we may end up with a funny result, if you're playing a game, or an unfortunate one, if you're writing software. This is the problem that continuous integration was invented to solve. Continuous integration is like everybody seeing everybody else's drawing as it evolves. The subtitle of the book that popularised continuous integration was "Embrace Change", not "Hide Change". Branches used to be costly. They were badly implemented in most version control systems, so there was a natural pressure to avoid them because they were a pain in the neck to create. Modern distributed version control systems changed all of that. Git, in particular, made branching really easy. It vastly improved the merge tools too, the tools that allow us to recombine separate branches at the point at which we want to. This made branches feel cheap. Unfortunately, while branches are cheap to create, they can become quite costly in other ways very quickly. The problem is that whenever we have a branch, we have two different versions of the truth. So how do we tell which one is the real truth? Often, to be honest, it's neither.
It's usually some combination of the two versions that forms a truth in the sense of code that's going to make it into production. Let's look at this from the perspective of testing for a moment. If we want to test our systems, what we're trying to do is evaluate some kind of idea to a standard where we're comfortable with releasing that idea into production. So we'd like to be able to evaluate our ideas quickly and efficiently on the route to production. We're going to inevitably create branches of some kind. As we're working on the code, there are certainly going to be times when we don't want to share the code that we're working on with everybody else. My argument is going to be that those times should be very short, but we'll come to that. So we're going to create these branches. Now, if we want to be definitive, if we want to really control the variables, if we want to really understand whether our change is correct or not, safe or not to release into production, then where do we test those changes? If we've got these separate branches, then clearly we must consider the points at which we merge the changes back into the line of code that is going to end up in production. Master, trunk, head, whatever you want to call it. Certainly, we must be running tests at these points to have confidence in our changes. If that's the only place where we run our tests, though, there's a problem. Because if that's the only place where we run our tests, what that means is that if I'm working on a branch and my branch lasts for a long time, I don't get to know whether my changes are good and safe alongside everybody else's until I merge at the end, at the point at which I think I'm finished. That's much too late for me to find out that I've been doing something stupid. I want to find out that I'm doing something stupid much sooner than that so that I can stop myself doing stupid things.
What some teams do is that they will run what they call continuous integration on their feature branches. The downside with this is that they are now testing a version of the code that is unlikely to ever be the version of the code that will go into production. We're spending time and effort and money on evaluating code that isn't the truth. We're evaluating a different version of the truth in some way. A better strategy is to essentially eliminate those branches and always work on the line of truth in the code. That's what continuous integration is. As I said, there are always versions of the code. At the point at which I'm making local changes, they're going to be local. What we are really talking about is shrinking those branches down to their absolute minimum. Typically, when I'm working in continuous integration, I'm going to write a test, run it, see it fail, write some code, run it, see it pass, refactor the code and the test. At that point, I'm going to commit. If I'm doing well, I'm probably committing every 10 or 15 minutes throughout the working day. That means that I'm only ever 10 or 15 minutes away from the truth of master, head, trunk, whatever you want to call it. That is continuous integration. That means we still evaluate every time we commit our change to master, but that keeps the changes flowing. It allows us to get much deeper insight into what our teammates are doing and, more importantly, how our changes work with their changes. The definition of continuous integration says that members of a team integrate their work frequently. Usually, each person integrates at least daily. That means that each member of a team is involved in the team. If you are not integrating at least daily, it doesn't really count as continuous integration. The DevOps Handbook says the data from Puppet Labs' 2015 State of DevOps Report is clear.
Trunk-based development, that is continuous integration, predicts higher throughput and better stability and even higher performance. The man who invented Git, the version control system that nearly all of us use these days, says that if you merge every day, suddenly you never get to the point where you have merge conflicts that are hard to resolve. All of this points in the same direction. We should work in tiny steps. We should commit those steps often, at least daily, and by preference a lot more often than that. We want to commit many times, many small changes every day. As one of my friends once said when describing continuous integration, "continuous is quite a lot more often than you think." Inevitably, this way of working implies a trade-off. We can't afford to wait until our feature is complete before we commit. That represents a significant change to the way in which we organise our work. It means that we need to become comfortable with the idea that we can commit partially complete features. If we're practising continuous delivery, it gets worse. Not only do we need to feel comfortable committing partially complete features, we need to be comfortable with them being deployed into production too. Continuous delivery is working so that our software is always in a releasable state. That means that any of our changes may end up in production. How do we do this? How do we organise to work in this way? The jargon in continuous delivery parlance is to separate deployment from release. The idea is that we are happy to deploy a partially completed feature into production that is not yet released in the sense of being available for people to use. There are more strategies than this, but there are three common ones that I want to talk about today. The first is dark launching. We deploy something that people can't see yet.
It's not wired into the production system in a way that's useful or available to the people using the system. It could be some code in a test that isn't connected up anywhere else. It could be that the code just isn't used yet. Or maybe we build the back end of some feature before adding the UI at the end, or we don't connect the new UI to the rest of the UI, so people can't find it. This is an excellent strategy and allows us to do some really quite sophisticated things. I once built a very high performance messaging system, which involved clustering and failover and distributed asynchrony, using this strategy step by step by step and putting that into production step by step by step. The next of these key strategies is branch by abstraction. This is a great technique for allowing us to release big changes incrementally. We improve the abstraction in our existing code step by step, evolving cleaner interfaces over time to improve its modularity. Then we can start to replace small components of the system with new ones, potentially over many deployments and releases. We can use this in combination with dark launching to really good effect. We can even run old versions of a module in parallel with the new so that we can figure out whether the new one fulfills its intended behaviours. Finally, the strategy that everybody has heard of is feature flags. We hide a new incomplete feature behind a software switch, a flag. We continue to develop the new feature over time, over many different deployments perhaps, behind the flag until it's ready for release, and then we make it available by flipping the flag. Again, this works really, really well in combination with the other strategies. I would recommend that you adopt these approaches pretty much in the order that I've mentioned. Feature flags are often the first thing that people jump to. I would suggest that dark launching is probably a little less risky in some ways.
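
As a rough illustration of the feature flag idea described above, here is a minimal Python sketch. Everything in it is hypothetical and invented for the example: the `FeatureFlags` class, the `new_checkout` flag name, and the two checkout functions. It only assumes that the incomplete feature is committed and deployed alongside the existing code, and that a flag decides at runtime which behaviour is released.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureFlags:
    """Hypothetical in-memory flag store; a real system might load flags from config."""
    enabled: set = field(default_factory=set)

    def is_on(self, name: str) -> bool:
        return name in self.enabled


def legacy_checkout(total: float) -> str:
    # Existing, released behaviour.
    return f"Total: {total:.2f}"


def new_checkout(total: float) -> str:
    # Partially complete feature: deployed with every commit, but not yet released.
    return f"Total: {total:.2f} (new flow)"


def checkout(total: float, flags: FeatureFlags) -> str:
    # The flag is the software switch: flipping it releases the new behaviour.
    if flags.is_on("new_checkout"):
        return new_checkout(total)
    return legacy_checkout(total)
```

Calling `checkout(10, FeatureFlags())` takes the old path, while `checkout(10, FeatureFlags({"new_checkout"}))` takes the new one. Because both paths live in the same committed code, both can be covered by tests on every commit.
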
Feature flags have the downside of raising the question: which version of the code do you test? But all of these strategies are really useful. If you've been thinking carefully about what I've been saying as I've been describing this, maybe you're thinking of this question. Isn't this just information hiding, the same way that a branch is? Well, yes, it is. All of these strategies involve a form of branching. But they are quite different in an important aspect and have different properties as a result. Imagine for a moment making a big change in your code base. If you're in a team working with feature branches, imagine making that change, replacing a widely used function or class with something a little different, perhaps. If I work on a team where every developer has a long-lived feature branch, that is, one that lasts longer than a day, then my options are to race them to try and get my commit in before they do, so that they don't force on me some horrible merge as a result of the widespread changes that I've just made. If I do that, then really what I'm trying to do is give my colleagues the pain rather than take the pain myself. My colleagues are going to have an unpleasant merge when it comes time for them to merge. Or I wimp out and I don't make the change at all because it's too hard. That's pretty much the choice that I have. The CI strategies branch, but they branch on behavior, not on code. Behavioral branching is different. We can have multiple behaviors in a committed set of code that we can all test in parallel with each other and evaluate in parallel with each other and change in parallel with each other, but they're not all available for use at runtime. So, if I create a merge problem for my colleagues by doing the same kind of refactoring in a CI world, then at worst I've cost them a few minutes' worth of merge pain or work.
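
To make the idea of branching on behaviour rather than on code more concrete, here is a small Python sketch of branch by abstraction, combined with the earlier remark about running an old module in parallel with the new one. All names (`PriceCalculator`, `OldCalculator`, `NewCalculator`, `ParallelRun`) are made up for this illustration, and the pricing logic is a stand-in. It is one possible shape under those assumptions, not a definitive implementation.

```python
from typing import List, Protocol, Tuple


class PriceCalculator(Protocol):
    """The abstraction that callers depend on (hypothetical)."""
    def price(self, quantity: int) -> int: ...


class OldCalculator:
    def price(self, quantity: int) -> int:
        return quantity * 100


class NewCalculator:
    # Replacement implementation, committed to the same code line in small steps.
    def price(self, quantity: int) -> int:
        return sum(100 for _ in range(quantity))


class ParallelRun:
    """Serves the old behaviour while exercising the new one alongside it."""

    def __init__(self, old: PriceCalculator, new: PriceCalculator,
                 mismatches: List[Tuple[int, int, object]]) -> None:
        self.old = old
        self.new = new
        self.mismatches = mismatches

    def price(self, quantity: int) -> int:
        expected = self.old.price(quantity)
        try:
            actual = self.new.price(quantity)
        except Exception as exc:
            # A failure in the unreleased path must not affect callers.
            self.mismatches.append((quantity, expected, repr(exc)))
        else:
            if actual != expected:
                self.mismatches.append((quantity, expected, actual))
        # Callers only ever see the old, released behaviour.
        return expected
```

Both implementations live in the same committed code and can be tested side by side, but only the old one is visible at runtime. Once the recorded mismatches stay empty for long enough, the new implementation can be swapped in behind the same abstraction.
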
In the very worst case, if they had to throw away their changes and redo them, maybe it's 10 or 15 minutes' worth of work if they're working in the way that I described. Because if you're committing regularly throughout the day, your changes will be small and simple, and so the amount of investment in those changes is small too. Version control is a foundational idea in continuous delivery. Our aim should be to ensure that every bit and byte that ends up in production is the one that we intended it to be, and to a large extent, the one that we tested. Thank you very much for watching.