Continuous Integration Made Simple, Step-by-Step Guide on Easier Software Development

Have you heard of Continuous Integration?

Do you think you practice Continuous Integration?

Do you really know what it is?

What business or technical problem does Continuous Integration solve?

Have you experienced at least one of the following?

  • Facing a lot of merge conflicts because a lot has changed while you were working on your complex feature.
  • Your code merges, but it doesn’t work properly because the behavior of other code has changed considerably.
  • There is too much code to review, and the reviewer misses some issues, which then show up in production.
  • Code review takes too much time, and when an issue is found you have to delay what you are already working on to refocus on a feature you almost forgot about.
  • Working for weeks on a feature only to find out at review time that you took the wrong approach and wasted your time.
  • Your boss or your client constantly asking for a long-awaited feature. If only there was a way to show them your great progress.

Yes? Then continuous integration might help you.

It is popular to talk about continuous integration (also known as CI). Every startup, consultancy or enterprise company talks about it. Yet everybody looks only briefly at what problems it solves and what its benefits are, as if it were all obvious.

It is not.

What’s even worse, some people claiming to practice continuous integration actually don’t. They think that running automated tests each time you make a change is CI. While this is essential, it is not even half of it.

What problems does continuous integration solve?

In his classic article on continuous integration, Martin Fowler tells us the story of a large company that works on a product for years, with each team working separately.

Then one day, when all features are implemented, they begin merging all code bases together. What’s more, they also have to make them work together.

Together, those two processes are what is called integration: the second half of the famous term continuous integration.

In his story, the integration process is expected to take months or maybe even years; nobody really knows.

But that’s not me…

I know what you are thinking, because I was thinking the same thing the first time I heard this story.

This is not how I or the people I work with develop software. I don’t have that problem and therefore I don’t need continuous integration.

And I was wrong

That story shows an extreme, and sure, in the daily development life of most of us you rarely reach it. Yet I know that I was closer to that extreme than to continuous integration, and maybe you are too.

How I think software development works

Here is how I used to develop features. A feature is assigned to me. Naturally, I go to our main repository, make a local copy and then create a branch. This branch is only for that feature. After all, with tools like Git and Mercurial, branching is cheap and easy.

Then I work hard to meet the feature specs and deadlines. Once I am done, I ask a colleague of mine to do a code review, and then, if everything is fine and all tests pass, my code is merged into the main branch of the main repository.

Sounds familiar? Yeah, this is how most of us do software development, or at least something similar.

It sounds like a good approach and it doesn’t really look like the extreme from the story above.

Well, I was wrong again.

There are a lot of issues with this process, and I’ve encountered them many, many times over the years.

How software development really works

Sometimes a feature is short and sweet, but sometimes it is complex, and it takes a lot of time before it is done.

However, you rarely work alone on a project. When it comes time for merging, somebody else has already merged their code and you might have some serious conflicts to solve.

What’s more, your complex feature has a lot of code, so the conflicts to solve can be quite a lot, too.

At the end of the day, it takes some time, but you manage to fix all of them, hopefully not after months or years like those poor developers from the story.

However, then your tests don’t seem to pass anymore, but why?

It turns out that while you were working on your feature, part of the code in the main branch has changed how it works.

So now you have to change your code, maybe even some part of your solution, so that it can work with that new behavior.

So far, even though the situation doesn’t look as bad as the story at the beginning, you are starting to face the exact same issues.

Unfortunately, it is not the end of it.

Then comes time for code review, and your reviewer has tons of code to review, because you’ve added so many things during the development of this feature.

The reviewer is very enthusiastic initially, but as the review goes on and on, he starts to lose focus and misses an issue here and another there.

Finally, the reviewer is done. Then one of three things happens:

Case 1: Everything looks good, until…

The reviewer is happy with your work, the code merges well, and you ship it to production.

Until, of course, the issues he missed because of the sheer amount of code to review begin popping up in production, right in front of your customers.

Case 2: No worry. You just have to stop whatever you are doing, right now…

The reviewer tells you that there are a few things that you have to fix. No big deal.

But you are already working on another feature. You are so focused on it that you have nearly forgotten how the first feature worked.

This interruption is a huge pain because you have to refocus and also delay the feature you are currently working on.

Case 3: You’ve wasted your time…

In the worst-case scenario, the reviewer tells you that you wasted your time following a wrong approach. The work that you did was not what was supposed to be done.

You have to start all over again, hopefully doing everything right this time.

In either of the last two cases, you feel demoralized and have no desire at all to work on that feature again, ever. You just want to forget about it.

The innocent process that I proposed to you doesn’t look so innocent any more.

I’ve made all those mistakes, far more often than I can admit without being embarrassed.

Is there a better way?

At some point, I decided that there should be a better way, a way which allows you to:

  • Make merging simple and avoid merge conflicts in almost all cases.
  • Easily integrate with the rest of the system, so that everything is working as expected.
  • Do simple reviews so that no issue is missed.
  • Finish reviews in very little time, while you are still focused on that work.
  • Spot critical issues early on and avoid wasting your time.

There are some agile techniques, like daily standups, pair programming and more, which might help, especially with communication issues, but other problems still persist.

Continuous Integration to the rescue

I remembered then that I had heard about something called Continuous Integration. So I learned as much as I could about it, and then I began practicing it on all my projects.

Before going deep, let’s have a look at how to practice continuous integration.

You have your central code base. It might be stored using Git on GitHub, or it might just be a folder on some server. It doesn’t really matter.

You get your copy of the code. You work on a feature or on a bugfix. You commit regularly the changes that you make. Each time you commit, several things happen.

First, automated tests are executed to ensure that your changes don’t break anything, and everything is working as expected.

Second, after successful tests your code is reviewed and then merged. Then automated tests are executed again on the merged code, and then if and only if they pass you can work on something else.

It doesn’t sound very different from the process we talked about earlier. How can it be better, then?

Let’s look at the key difference.

You commit regularly the changes that you make.

By regularly I mean that you have to commit at least once a day, even when your feature is only half finished, or even only 10% finished.

Now it sounds different.

What’s more, you have to commit your unfinished feature. That unfinished code should still not break the system. Remember, all tests should be passing successfully.

The system should always be in such a state that, if it were deployed right now, everything would be fully operational.

And now you see the whole picture. This process is very different from the process we talked about earlier.

Goals

Let’s see why we do what we do in CI.

The cornerstone of continuous integration is regular committing and merging: in other words, regular integration. When you integrate once a day, or even multiple times a day, you cannot introduce too much change into the whole system between integrations. There is simply not enough time, even if you are the fastest coder in the world.

The first goal is that when you break tests, you can quickly pinpoint what breaks them and apply a fix.

The second goal is that when everybody practices continuous integration, nobody else can introduce too big a change between your successive commits. You always know how the rest of the system works, and integration problems stay small.

The third goal is that, because you and your teammates make small changes, merge conflicts are much less likely, and when one does occur, it is simple to resolve.

The fourth goal is that, with very little code to review, the reviewer can finish in very little time, keep his focus, and spot issues. Far fewer issues are missed, and if you take a wrong approach to a problem, it can be identified early on, before you lose too much time going in the wrong direction.

In addition, you won’t need to wait for days or even weeks for your code to be reviewed.

How to practice?

Now that you know what continuous integration is, how do you practice it?

What problems might arise and how do you solve them?

I am glad you asked.

Sometimes a feature is very simple. You can finish it in a few hours, with just a little bit of code. Then the rest of the process is very easy to apply.

Long Features & Bugfixes

Sometimes, however, your task will be a big one: a feature which takes days, weeks or maybe months to be fully completed and ready for production.

I know, you don’t want to put a half-finished feature in your app. Imagine that the next time you open Chrome, only half of each web site loads, because the developers have only done half of the new work on displaying pages. Not pretty.

What do you do about it?

There is no single solution. Instead, there are many best practices that you can use.

The first and most important is

Breaking down large features into small, even tiny components.

I know that sometimes, you look at a feature, and there is just no way that it can be broken into small working parts. However, the more practice you get, the more you will find out that this is almost never the case.

For example, with object-oriented programming you will have different classes or modules, so a single class can be your tiny task. It is not an entire feature, but it can be built, tested and reviewed much more quickly.

Why not break it down even further? A function in a class, which depends only on functions which are already implemented, can be your tiny task. You write that function, you add its tests, then you commit and follow the rest of the steps.
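
To make this concrete, here is a minimal sketch of what such a tiny task could look like: one function plus its tests, small enough to write, review and merge in one sitting. The `apply_discount` function and its pricing rule are made up for illustration, and the tests are written as plain functions that a test runner such as pytest could pick up.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price in cents after a percentage discount (made-up rule)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Integer arithmetic keeps the example free of floating-point surprises.
    return price_cents * (100 - percent) // 100


# The tests are committed together with the function they cover.
def test_regular_discount() -> None:
    assert apply_discount(20000, 25) == 15000


def test_no_discount() -> None:
    assert apply_discount(8000, 0) == 8000


def test_invalid_percent_is_rejected() -> None:
    try:
        apply_discount(8000, 120)
    except ValueError:
        return
    raise AssertionError("expected a ValueError for percent > 100")
```

Nothing else in the application needs to call `apply_discount` yet. The commit is still complete on its own: it builds, its tests pass, and it takes minutes to review.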

Those practices and principles can be applied to visible UI elements just as well. For example, you don’t need to implement the entire web design. You can implement it for a specific page initially, or maybe only a single component of the page.

Hide half-finished work from end users

Naturally, these tiny tasks will not be the fully finished work, so it is best if your users don’t interact with them.

In some cases, the changes in your code might not lie on the execution path of the rest of the application. Then you can merge as often as you like, because the work in progress is not yet visible to the rest of the system.

This is true for both backend code and UI elements.

However, sometimes this is not the case. Then you have multiple options for handling it.

The first option is to temporarily hide the entry points of the new code, basically removing them from the execution path as discussed above.

The second option is to use flags to keep the new code from executing.

One kind is the release flag: when the code is built for production, the part that should not execute is removed automatically. However, that requires additional tool support.

Runtime flags are a simpler option. During runtime, the code decides what should be executed based on flags. In production, the flags are usually off, except maybe for some testers, and your end users never see the new code.

Several new questions might come to your mind.

First, these flags, where do we use them?

The best practice is to only use them at the entry points of your new code or of your UI, so that you have the least amount of them.

The same goes for UI. If you are developing a feature for a web app, which has its own page, it is enough to only hide the link to that page.
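
Here is a minimal sketch of a runtime flag checked at a single entry point. It assumes flags are read from environment variables named `FEATURE_<NAME>`, which is just one possible scheme, and the `/reports` page and function names are hypothetical.

```python
import os


def flag_enabled(name: str) -> bool:
    """Read a runtime flag from an environment variable (assumed scheme)."""
    return os.environ.get(f"FEATURE_{name.upper()}", "off") == "on"


def navigation_links() -> list[str]:
    """Build the menu; the only flag check sits at the feature's entry point."""
    links = ["/home", "/orders"]
    if flag_enabled("reports"):
        # Half-finished page: its link appears only when the flag is switched on.
        links.append("/reports")
    return links
```

With the flag off, which is the production default here, the link is never rendered, so the code behind the unfinished page can be merged every day without end users running into it.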

Second, what do we do about testing? Will the flags interfere?

Only a minor change is required in how you test. You should run your tests twice.

Once, with no flags enabled, because this is how your code will run in production.

A second time, with all possible flags enabled, because this is how you test your new code together with your existing code.
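
The two runs can be sketched as a small loop. This is only an illustration, assuming the flags live in a module-level dictionary called `FLAGS`; the `greeting` function and its test are hypothetical, and a real project would more likely run its whole test command twice with different settings.

```python
FLAGS = {"new_greeting": False}  # production default: every flag is off


def greeting(name: str) -> str:
    """Hypothetical code under test with one flagged code path."""
    if FLAGS["new_greeting"]:
        return f"Welcome back, {name}!"
    return f"Hello, {name}!"


def test_greeting_mentions_name() -> None:
    """A test that must hold whichever path is active."""
    assert "Ada" in greeting("Ada")


def run_suite_twice(tests) -> list[str]:
    """Run every test with all flags off, then again with all flags on."""
    executed = []
    original = dict(FLAGS)
    try:
        for state in (False, True):
            for flag in list(FLAGS):
                FLAGS[flag] = state
            for test in tests:
                test()  # an AssertionError here fails the whole run
                label = "on" if state else "off"
                executed.append(f"{test.__name__} flags={label}")
    finally:
        FLAGS.update(original)  # restore the production defaults
    return executed
```

Calling `run_suite_twice([test_greeting_mentions_name])` exercises both the existing path and the new one, and leaves the flags as it found them.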

What if there is no real entry point, and I just need to modify some existing code?

If the work is a very small modification to existing code, then you will probably finish everything in less than a day and have it all as part of the next commit. So no issue.

However, sometimes you might want to modify a large and complex function (though you should keep your functions small). Then you can find the nearest “entry point”, for example the beginning of the function.

Then you check the flag, and either run the existing code, or you run the modified code. You might need to copy a lot of the existing code, so that you have a single flag check, instead of checking all over the place.
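
A minimal sketch of that single check follows. The `FLAGS` dictionary, the function names and the bulk-discount rule are all made up for illustration; the point is only that the existing code stays untouched and the flag is consulted in exactly one place.

```python
FLAGS = {"new_pricing": False}  # hypothetical runtime flag, off in production


def _total_price_existing(items: list[tuple[int, int]]) -> int:
    """The existing behaviour, left exactly as it was."""
    return sum(price_cents * quantity for price_cents, quantity in items)


def _total_price_new(items: list[tuple[int, int]]) -> int:
    """Work-in-progress copy: adds a bulk discount (made-up rule)."""
    total = 0
    for price_cents, quantity in items:
        line = price_cents * quantity
        if quantity >= 10:
            line = line * 9 // 10  # 10% off for ten or more units
        total += line
    return total


def total_price(items: list[tuple[int, int]]) -> int:
    """Entry point: the single place where the flag is checked."""
    if FLAGS["new_pricing"]:
        return _total_price_new(items)
    return _total_price_existing(items)
```

Once the new version is finished and the flag has been on in production for a while, the old copy and the flag check can be deleted in one small commit.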

These practices should be enough to help you keep the most important goal:

The code that is merged should always be ready for production, even when it contains half-finished features.

How often should you commit?

This is another question that came to my mind the first time I heard about continuous integration.

You should commit at least once a day. Yes, the bare minimum is once a day.

A lot of people begin to think: well, if one day is OK, two days is OK, too. After all, in most companies people don’t commit that often, so two days sounds like continuous integration, too.

Yes, it sounds like it, but it is not. As with many other things, fitness for example, once you begin to compromise it is easy to fall back into bad habits.

You start with two days, soon three days doesn’t sound bad either, then you move to a week, and finally you don’t care anymore and only merge when the feature is fully completed.

We are back where we began.

This is why you should stick to committing at least once a day. It is the only way to practice continuous integration consistently.

Actually, as we discussed earlier, you should be breaking your tasks into very small ones.

As you get better at this, you will be able to break them into parts so small that you can finish each in less than two hours. As a result, you will naturally merge multiple times a day.

Tests, tests, everywhere

So far, we’ve spent a lot of time looking into frequent commits. Frankly, every other article I’ve read spends almost no time at all on this critical part.

They all focus on the second most important thing in continuous integration: testing.

I know that everybody is a fan of testing, including me, yet I’ve seen too many companies sacrifice testing in the name of development speed.

The idea is that writing tests takes time that you could spend moving your project forward.

This could not be further from the truth. When you don’t write tests and things break, it is much harder to find the issue.

In addition, you have no way to be certain that your new changes don’t break some other functionality.

Last but not least, when you avoid testing, you usually end up doing a lot of manual testing, which is quite slow, whereas once you have written a test you can run it as many times as you want, and a fast unit test takes only milliseconds.

Let’s automate

Now that we’ve established that testing is not bad, we should go a step further.

In Continuous Integration, testing is mandatory: it helps you keep your pace and be confident in your changes.

It is easy to forget to run your tests, especially when you have made just a tiny change.

There is no way that a tiny change could bring your system down, except when it does.

Automated tests come to the rescue. In Continuous Integration, each time you commit your changes, tests run automatically. Each time you merge your changes, tests run again automatically.

These two steps are critical.

Even when your tests pass after you commit, it doesn’t mean that they will pass when your code is merged into the main repository. Only after that happens can you be sure that everything looks and works well, and move on to the next piece of your task.

How can this benefit small development teams or even one-person development team?

It is easy to imagine how useful Continuous Integration might be for larger teams, but maybe you are a solo developer, or a very small team.

You may be thinking that in this case you cannot reap any benefits.

Actually, that was what I was thinking, and I was wrong, one more time.

Even when you are one person, you will probably be part of a larger team later on, so you should start practicing good habits now, or it will be much harder to introduce them when the time comes.

What’s more, even if you are a team of only two developers, you can still face all the issues we talked about earlier.

Notice that each of the examples above involved at most two people: you and a reviewer, or you and a colleague who also made changes.

As you see, two people are already enough to start reaping the benefits of continuous integration, or to see the problems when you don’t apply this practice.

And when you are a single developer, it is a great time to start using CI, too. It is very easy to introduce, and you build good habits now instead of having to change them later on.

How does it fit with Code Review & Pair programming?

Several times I have mentioned code reviews and how they get better with CI: there is less code to review, so it is easier to focus and find issues, and the review takes less time.

When you have a lot of code to review, it is easy to procrastinate. With less code, you are less likely to.

What happens when a reviewer doesn’t review your code quickly?

It might be that your reviewer is called into an urgent meeting, or is otherwise unable to review your code quickly. The best approach in that case is to ask someone else for a review.

However, this is not always possible. Most of the time you don’t actually know that the reviewer will not be able to quickly review your changes.

In addition, you cannot realistically expect that your reviewer will stop whatever he is working on and immediately review your code.

You don’t want to waste your time, so you should just continue working. When you are ready to commit again, in case your previous commit is not yet merged, you can add this new work to it.

In a system like GitHub, this would look like more commits to your existing Pull Request.

The longer the review takes, and the more commits you produce in the meantime, the closer you get to your initial problem.

To prevent that situation, you should be proactive and ask your reviewers to review, or, when they really cannot, ask someone else. Depending on your company and your team, that might or might not be easy.

Pair programming is another practice used by some companies. The idea is that you and someone else work together on the same piece of code on the same screen.

It used to be on the same machine, but with more people working remotely you can also use tools (tmux & ssh, web tools and more) to do that from two different machines.

With pair programming, the review happens while you are developing the solution, which in some cases can improve the pace of merging new code.

As soon as all tests pass, you can just merge, because the navigator in the pair, the person who is not typing, has already reviewed the code.

From the point of view of Continuous Integration, pair programming complements it very well, as it makes some aspects even simpler, but it is not required. If you enjoy it, use it.

Is branching really bad? Can I sometimes use long lived branches for something?

I haven’t talked much about branching so far because, frankly, branching or not branching is not what continuous integration is about.

However, this is a topic people ask questions about, and there are a lot of opinions.

With source code management tools like Git and Mercurial, branching is easy and cheap. This is also one of the reasons why people tend to keep long-lived branches, making a lot of changes and then stumbling upon all the issues we discussed at the beginning.

So no, branching is not bad, as long as you don’t use it in a bad way. Long-lived branches for feature development, on the other hand, are usually wrong, as they are an incentive to not commit often enough.

That said there are at least two very useful cases for them.

The first one is doing experiments. Sometimes some code should never reach your end users in any form, or at least not until you are sure that this is the way to go.

In that particular case, a long-lived branch for the experiment is very useful. You can easily switch between the experiment and the main code.

However, there is one caveat. When you decide that the experiment works well, it is very easy to turn the branch from an experiment into a feature, and then dump a lot of code for merge and review into your main repository or your main branch.

You should avoid that. When you are experimenting, you should be very clear, from the start, about the point at which this stops being an experiment, and that point should come as early as possible, so that you can begin applying continuous integration.

Another useful case for long-lived branches is keeping old versions. Maybe you are working for a company selling databases, and you have your latest version, but you still have to support your two previous versions with bug and security fixes.

In that case, long-lived branches are great for keeping versions separate.

However, it doesn’t mean that you should not be doing continuous integration on them. The process is identical, except that you integrate your changes into the old version’s branch instead of the main branch.

Benefits

Let’s recap and check out what benefits you can expect from adding continuous integration to your development process:

  • Make merging simple and avoid merge conflicts in almost all cases.
  • Easily integrate with the rest of the system, so that everything is working as expected.
  • Do simple reviews so that no issue is missed.
  • Finish reviews in very little time, while you are still focused on your current work.
  • Spot critical issues early on and avoid wasting your time.

This is all good, but if it is that good, why isn’t everybody practicing CI?

What is wrong with CI?

Living and eating healthily is no secret, and there is tons of information out there, yet the number of people with weight-related health issues rises year over year. The same is true for CI.

First, there are people who just don’t believe in it and refuse to try it at any cost.

Then there are people who think that it is great but not for them and that they are in some unique state or condition which prevents them from doing it.

Also, there are quite a lot of people who have never heard about it.

Then there are the people who have a lot of automated tests and think that this is CI, or at least more than enough.

Last but not least, the biggest problem with both having a healthy life and doing continuous integration is your habits. Changing habits takes time and consistency.

It is easy to fall back into your old habits and give up.

People have their current development habits, and sometimes the pain is not that big, so they continue to develop the way they do.

This is why you should start right now and continue tomorrow, and the next day and the day after and so on. True for both CI and living healthy.

Tools to help you with CI

I’ve talked a lot about the process but processes usually have tools to help them. Here are some which I’ve tried and found really good.

Git & GitHub will help you manage your code. They are an easy way to keep the code written by different people in one place and to merge different changes together.

Jenkins and TeamCity are tools which can automate some of your processes. In this specific case, the process that you want to automate is running the tests whenever there is a commit or a merge.

These tools are easy to install and manage. However, if you don’t want to install and manage anything, you are in luck: CircleCI and Travis CI are two great services that can help you.

These are just a few popular tools. For each there are at least five alternatives, but I won’t list them here, as the tools are actually not what matters.

What is important is how to follow the best practices of continuous integration.

What’s next?

What’s next you may ask? It is not more learning. You’ve got everything you need to begin applying continuous integration and reap its benefits immediately.

Yes, this is the first step. Start immediately. If you decide to start at the next comfortable date, like when you begin your next task or, even worse, when you start your next project, your chance of actually doing it will diminish, and you will need much more willpower to do it.

However, start small, even tiny. The worst thing would be to dive head first into CI and feel discouraged by a change that is too big or too hard. Your initial step should be a daily commit & push to your repository, even if it is not to the main branch.

From then on, you can add automated tests, ask reviewers to review incomplete features and more.

The more you do those tiny tasks, the more comfortable you will feel and the more benefits you will reap.