What’s the ROI of Continuous Integration?

2020-05-24 20:43发布

问题:

Currently, our organization does not practice Continuous Integration.

In order for us to get an CI server up and running, I will need to produce a document demonstrating the return on the investment.

Aside from cost savings by finding and fixing bugs early, I'm curious about other benefits/savings that I could stick into this document.

回答1:

My #1 reason for liking CI is that it helps prevent developers from checking in broken code which can sometimes cripple an entire team. Imagine if I make a significant check-in involving some db schema changes right before I leave for vacation. Sure, everything works fine on my dev box, but I forget to check-in the db schema changescript which may or may not be trivial. Well, now there are complex changes referring to new/changed fields in the database but nobody who is in the office the next day actually has that new schema, so now the entire team is down while somebody looks into reproducing the work you already did and just forgot to check in.

And yes, I used a particularly nasty example with db changes but it could be anything, really. Perhaps a partial check-in with some emailing code that then causes all of your devs to spam your actual end-users? You name it...

So in my opinion, avoiding a single one of these situations will make the ROI of such an endeavor pay off VERY quickly.



回答2:

If you're talking to a standard program manager, they may find continuous integration to be a little hard to understand in terms of simple ROI: it's not immediately obvious what physical product that they'll get in exchange for a given dollar investment.

Here's how I've learned to explain it: "Continuous Integration eliminates whole classes of risk from your project."

Risk management is a real problem for program managers that is outside the normal ken of software engineering types who spend more time writing code than worrying about how the dollars get spent. Part of working with these sorts of people effectively is learning to express what we know to be a good thing in terms that they can understand.

Here are some of the risks that I trot out in conversations like these. Note, with sensible program managers, I've already won the argument after the first point:

  1. Integration risk: in a continuous integration-based build system, integration issues like "he forgot to check in a file before he went home for a long weekend" are much less likely to cause an entire development team to lose a whole Friday's worth of work. Savings to the project avoiding one such incident = number of people on the team (minus one due to the villain who forgot to check in) * 8 hours per work day * hourly rate per engineer. Around here, that amounts to thousands of dollars that won't be charged to the project. ROI Win!
  2. Risk of regression: with a unit test / automatic test suite that runs after every build, you reduce the risk that a change to the code breaks something that use to work. This is much more vague and less assured. However, you are at least providing a framework wherein some of the most boring and time consuming (i.e., expensive) human testing is replaced with automation.
  3. Technology risk: continuous integration also gives you an opportunity to try new technology components. For example, we recently found that Java 1.6 update 18 was crashing in the garbage collection algorithm during a deployment to a remote site. Due to continuous integration, we had high confidence that backing down to update 17 had a high likelihood of working where update 18 did not. This sort of thing is much harder to phrase in terms of a cash value but you can still use the risk argument: certain failure of the project = bad. Graceful downgrade = much better.


回答3:

CI assists with issue discovery. Measure the amount of time currently that it takes to discover broken builds or major bugs in the code. Multiply that by the cost to the company for each developer using that code during that time frame. Multiply that by the number of times breakages occur during the year.

There's your number.



回答4:

Every successful build is a release candidate - so you can deliver updates and bug fixes much faster.

If part of your build process generates an installer, this allows a fast deployment cycle as well.



回答5:

From Wikipedia:

  • when unit tests fail or a bug emerges, developers might revert the codebase back to a bug-free state, without wasting time debugging
  • developers detect and fix integration problems continuously - avoiding last-minute chaos at release dates, (when everyone tries to check in their slightly incompatible versions).
  • early warning of broken/incompatible code
  • early warning of conflicting changes
  • immediate unit testing of all changes
  • constant availability of a "current" build for testing, demo, or release purposes
  • immediate feedback to developers on the quality, functionality, or system-wide impact of code they are writing
  • frequent code check-in pushes developers to create modular, less complex code
  • metrics generated from automated testing and CI (such as metrics for code coverage, code complexity, and features complete) focus developers on developing functional, quality code, and help develop momentum in a team
  • well-developed test-suite required for best utility

We use CI (Two builds a day) and it saves us a lot of time keeping working code available for test and release.

From a developer point of view CI can be intimidating when Automatic Build Result, sent by email to all the world (developers, project managers, etc. etc.), says: "Error in loading DLL Build of 'XYZ.dll' failed." and you are Mr. XYZ and they know who you are :)!



回答6:

Here's my example from my own experiences...

Our system has multiple platforms and configurations with over 70 engineers working on the same code base. We suffered from somewhere around 60% build success for the less commonly used configs and 85% for the most commonly used. There was a constant flood of e-mails on a daily basis about compile errors or other failures.

I did some rough calculations and estimated that we lost an average of an hour a day per programmer to bad builds, which totals nearly 10 man days of work every day. That doesn't factor in the costs that occur in iteration time when programmers refuse to sync to the latest code because they don't know if it's stable, that costs us even more.

After deploying a rack of build servers managed by Team City we now see an average success rate of 98% on all configs, the average compile error stays in the system for minutes not hours and most of our engineers are now comfortable staying at the latest revision of the code.

In general I would say that a conservative estimate on our overall savings was around 6 man months of time over the last three months of the project compared with the three months prior to deploying CI. This argument has helped us secure resources to expand our build servers and focus more engineer time on additional automated testing.



回答7:

Our biggest gain, is from always having a nightly build for QA. Under our old system each product, at least once a week, would find out at 2AM that someone had checked in bad code. This caused no nightly build for QA to test with, the remedy was to send release engineering an email. They would diagnose the problem and contact a dev. Sometimes it took as long as noon before QA actually had something to work with. Now, in addition to having a good installer every single night, we actually install it on VM's of all the different supported configurations everynight. So now when QA comes in, they can start testing within a few minutes. Now when you think of the old way, QA came in grabbed the installer, fired up all the vms, installed it, then started testing. We save QA probably 15 minutes per configuration to test on, per QA person.



回答8:

There are free CI servers available, and free build tools like NAnt. You can implement it on your dev box to discover the benefits.

If you're using source control, and a bug-tracking system, I imagine that consistently being the first to report bugs (within minutes after every check-in) will be pretty compelling. Add to that the decrease in your own bug-rate, and you'll probably have a sale.



回答9:

The ROI is really an ability to provide what the customer wants. This is of course very subjective but when implemented with involvement of the end customer, you would see that customers starts appreciating what they are getting and hence you tend to see less issues during User Acceptance.

  • Would it save cost? may be not,
  • would the project fail during UAT? definitely NO,
  • would the project be closed in between? - high possibility when the customers find that this would not deliver the expected result.
  • would you get real-time and real data about the project - YES

So it helps in failing faster, which helps mitigate risks earlier.