Continuous Integration – Integrating People
Introduction
I know it has been a while since I wrote anything and I can come up with a number of good excuses as to why:
- New job at globo.com. Learning A LOT and working hard to improve our current development ecosystem.
- Pyccuracy and Skink maturing. Both projects have reached a stage where I can call them mature, and as such they require some extra attention. The Python community has been awesome in this, since we are getting some cool patches.
- Talking about Pyccuracy and Skink at FISL 10. It took some time to prepare both the presentations and myself. It paid off more than tenfold!
As I said before, those are excuses. Mainly I haven’t written because I’m lazy. I need to write more as a way to remind myself of the cool stuff I’m doing. That’s the main reason for this post. The guys (and gals) at globo.com and I are using some pretty neat agile techniques and today I want to share our vision for Continuous Integration.
Continuous Integration – Do we care?
Much has been said about Continuous Integration before. I’m sure you can easily grok the concept just by googling a little. That’s not what I want to discuss today.
At my current assignment at globo.com, we have a pretty unusual setting – a nine-pair team. That’s right, 18 devs + 2 Scrum Masters + 1 QA + 3 Product Owners = 24 people. I know that sounds astoundingly big for an agile team, since we’ve always learned that agile teams ought to be small. I’ll discuss the benefits and drawbacks of this approach in another post.
We needed a way for all these people to communicate about the project’s current status, and it made sense for this communication to revolve around the codebase. If you keep track of my blog, it should be no news to you that I’ve built my own Continuous Integration server from scratch. Several people asked me why, to which I always replied that I didn’t find one that fit all my needs. Today I can show another use for it.
First I’ll show a picture of the main screen for Skink at globo.com, so you can see what the hell I am talking about. We have a projector displaying the current build status on the wall, so that ANYONE involved in the project can see what our build status is at ANY given time. This is a major philosophy in our team: we want information to be as visual as possible. If something’s wrong EVERYONE should know immediately.
We can extract some important concepts from these two pictures:
- Our current project is code-named Vegetables. We have a pipelined build for it (we only go to the next step if the previous one succeeds) – Vegetables-Unit, Vegetables-Func and Vegetables-Acc (meaning unit, functional and acceptance tests).
- As you can see we also integrate changes into EVERY single tool we’re using. It might seem pointless to integrate these tools since we’re using stable versions, but these builds are great to have if we ever try to migrate to a newer version of one of the tools, or even to the head (most up-to-date) version.
- You can see next to each build status the name of a person and a message. Those are the commit author and message used for the last build. I’ll discuss this portion in a minute.
- In the first picture you can see a yellow row (Vegetables-Acc). In Skink, yellow means ‘Building’. This is great for anyone looking at the dashboard, since you can easily tell which project is currently being built.
- In the second picture you can see a red row (Vegetables-Acc). That means “Broken”. This is our visual tip that something’s amiss. It’s not the only one, though, as I’ll explain.
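To make the pipelined build from the list above a bit more concrete, here is a minimal Python sketch. This is not Skink’s actual code: the `run_pipeline` function, the stage callables and the "skipped" status are all made up for illustration. The only thing taken from our setup is the rule that a stage runs only if the previous one succeeded.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: a stage is a name plus a callable that
# returns True when that stage's tests pass.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Run stages in order and stop running them after the first failure."""
    results = []
    failed = False
    for name, run in stages:
        if failed:
            # A previous stage broke, so this one is never executed.
            results.append((name, "skipped"))
            continue
        ok = run()
        results.append((name, "green" if ok else "red"))
        failed = not ok
    return results


if __name__ == "__main__":
    # Stand-ins for the real unit, functional and acceptance test runs.
    stages = [
        ("Vegetables-Unit", lambda: True),
        ("Vegetables-Func", lambda: False),
        ("Vegetables-Acc", lambda: True),
    ]
    for name, status in run_pipeline(stages):
        print(f"{name}: {status}")
```

In this sketch the acceptance stage is never started once the functional stage fails, which is the whole point of pipelining: the slow tests only run when the fast ones are green.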
All of this probably shows how much we care about the Continuous Integration of our code. I’ll go even further: Continuous Integration of our team members and stakeholders.
Code Ownership and Collaboration
If you’ve ever worked with other people using some form of Revision Control, you know that all of us commit code that does not work or does not integrate properly with other people’s code. We try to minimize this as much as we can, but even with tests (unit, functional and acceptance) and a distributed Revision Control system (git), we still get a lot of broken builds. There’s nothing wrong with that. Getting a broken build just means that your Continuous Integration server WORKS. Keeping a broken build, now that’s something we do not tolerate lightly.
There are two reasons why we’ve added the name and comment of the commit that broke the build in the MAIN page for the CI Server (the one that gets projected).
The first and most important reason is to integrate team members into helping each other:
Bernardo - “Oh, it looks like Guilherme’s commit broke the build – maybe I can help him. Hey Guilherme, do you need help to fix the build?”
Guilherme - “That would be great, because some acceptance tests are breaking and I’m not sure why.”
(This dialog is completely hypothetical – we do not break builds, now do we?)
What just happened is that the tool facilitated the communication between two team members working on different things. This is great, because it probably means less downtime (Broken build = NO COMMITS or RELEASES).
Another common situation is that someone else’s commit broke the build, not yours. Due to the way the CI keeps a queue of builds, it just looks like it’s your commit. What happens usually is that the actual person who broke the build looks at the broken build, goes to the person whose name is showing and explains the situation. This way no false assumptions are made and we keep going.
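Here is a rough Python sketch of how that mix-up can happen. It assumes (and this is only an assumption for the sake of the example, not a description of Skink’s internals) that a queued build picks up several commits at once and that the dashboard shows only the most recent one. The `Commit` class and both functions are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Commit:
    author: str
    message: str


def displayed_commit(commits_in_build: List[Commit]) -> Commit:
    """What the dashboard shows in this sketch: the latest commit in the build."""
    return commits_in_build[-1]


def possible_culprits(commits_in_build: List[Commit]) -> List[str]:
    """Every author with a commit in the build, oldest first, without repeats."""
    authors: List[str] = []
    for commit in commits_in_build:
        if commit.author not in authors:
            authors.append(commit.author)
    return authors


if __name__ == "__main__":
    # Two commits were waiting in the queue when the build started.
    build = [
        Commit("Guilherme", "Refactor acceptance steps"),
        Commit("Bernardo", "Fix typo in README"),
    ]
    print("Shown on the wall:", displayed_commit(build).author)
    print("Could have broken it:", ", ".join(possible_culprits(build)))
```

The name on the wall is a starting point for a conversation, not a verdict, which is why the quick chat between the two people involved matters.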
This goes a long way toward improving the collaboration of any team, and especially one as big as ours.
The second reason is that we’ve learned that peer pressure works magic. At globo.com we always joke with whoever breaks the build. That’s a good-humored way of saying: “please do not break our build again”. It works magic. People just want to get the red build off the wall and their names off it. The sense of ownership is improved here and thus the quality of the project increases.
Warnings on Failures
Given that our golden rule is that Broken build = NO COMMITS or RELEASES, we need to be warned about a broken build in all possible ways.
If you think that a HUGE image on the wall is not enough to warn people that the build is broken, SO DO WE.
Our build server plays an INCREDIBLY annoying siren every time someone breaks the build. It’s annoying to the point that you just can’t ignore it.
Thus it’s not possible for anyone in the team to ignore that we have a NO GO status.
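The golden rule and the siren boil down to two very small checks, sketched below in Python. The function names and the status strings ("green", "building", "broken") are hypothetical, and the sketch assumes the siren fires only at the moment a build goes from not broken to broken, which is my simplification rather than a description of the real implementation.

```python
from typing import Dict


def can_commit_or_release(statuses: Dict[str, str]) -> bool:
    """Golden rule: any broken build means NO COMMITS or RELEASES."""
    return all(status != "broken" for status in statuses.values())


def should_sound_siren(previous: str, current: str) -> bool:
    """Sound the siren when a build has just become broken (assumed behavior)."""
    return current == "broken" and previous != "broken"


if __name__ == "__main__":
    wall = {
        "Vegetables-Unit": "green",
        "Vegetables-Func": "green",
        "Vegetables-Acc": "broken",
    }
    print("GO" if can_commit_or_release(wall) else "NO GO")
    if should_sound_siren(previous="building", current=wall["Vegetables-Acc"]):
        print("Siren!")
```

Checking the transition rather than the current state keeps the siren from blaring on every poll while the build stays red; the red row on the wall covers that part.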
Stakeholders’ Opinions
Now, what do the non-developers in our team think of all this PRO-CI mindset?
The feedback we have gathered so far from all of them (Scrum Masters, Product Owners and QA) is that the CI server is a GREAT addition to our development infrastructure.
The product owners told us that they love ALWAYS knowing whether they can deploy the application to a QA environment for testing. They’ve also reported that they feel A LOT safer when all the builds are green, so it improves their confidence that we’re doing our jobs. This is a major pro because trust is a very valued artifact in our team, and we are always trying to be more transparent as to what we’re doing.
The scrum masters can keep an eye on the build status, and if a build has been red for too long, they can come to us (devs) and ask if we have any impediments that are blocking us from getting a green build. Once again we are using the CI Server as a means of achieving greater collaboration among team members.
As for the QA, it helps to know whether it’s safe to test the project or not. If the build is broken (meaning automated tests failed) why bother doing exploratory testing anyway?
Conclusion
Skink has proven an invaluable tool at globo.com. Its ease of configuration, allied to the great information radiation it provides, has helped us achieve all of the above.
If you are struggling to configure a CI server and you’re using git as an SCM provider, just send me an email or leave a comment and we can help you get up and running (it’s a lot easier than you expect).
Just to finish the post I’d like to say that the reason we care so much about our CI server, other than the ones I already mentioned, is that we do not want to become Schrödinger’s programmers. Excerpt from Guilherme Chapiewsky’s post (Translated from Portuguese):
I’m saying this because of the Schrödinger’s Cat Theory. Summarizing the story, the physicist Erwin Schrödinger once suggested that if we could put a cat in a closed box, where its life depended on the state of a sub-atomic particle, there would be a superposition of states that, from the point of view of quantum mechanics, would make the cat both dead and alive at the same time, until the box was opened. It would be impossible to determine the cat’s state until the box was opened. In my head I keep picturing the moment the box is opened and a live cat (or dead one) shows up. What happened to the other cat? Probably he’s living in some parallel universe where all things are backwards (and people find umbrellas instead of losing them). Anyway, you can’t tell what could happen.
I’ve said all of that as a means of explaining that Schrödinger Programmers simply do not know what will happen to their system when it goes live in production. It might not display any bugs and they will get rich and buy private jets. Or not!
I’d rather write tests.
Guilherme Chapiewsky