How do you estimate a ROI for clearing technical d

Posted 2019-03-08 16:23

Question:

I'm currently working with a fairly old product that's been saddled with a lot of technical debt from poor programmers and poor development practices in the past. We are starting to get better and the creation of technical debt has slowed considerably.

I've identified the areas of the application that are in bad shape and I can estimate the cost of fixing those areas, but I'm having a hard time estimating the return on investment (ROI).

The code will be easier to maintain and will be easier to extend in the future but how can I go about putting a dollar figure on these?

A good place to start seems to be going back through our bug tracking system and estimating costs based on the bugs and features relating to these "bad" areas. But that seems time-consuming and may not be the best predictor of value.

Has anyone performed such an analysis before, and do you have any advice for me?

Answer 1:

Managers care about making money, first and foremost through growth (e.g. new features that attract new customers) and second through optimizing the process lifecycle.

Your proposal falls into the second category, so it will almost certainly rank behind goal #1 and get prioritized down even if it could save money, because saving money usually means spending money first ;-).

Now, putting a $ figure on the "bad technical debt" could be given a more positive spin (assuming the following applies in your case): "if we invest in reworking component X, we could introduce feature Y faster and thus get Z more customers".

In other words, evaluate the cost of technical debt against cost of lost business opportunities.
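
As a rough illustration of that comparison, here is a minimal sketch in Python. The function name `rework_roi` and all of the figures are hypothetical; the real inputs would have to come from your own estimates and sales data.

```python
def rework_roi(rework_cost, extra_customers, revenue_per_customer):
    """Return ROI as a fraction: (gain - cost) / cost.

    All inputs are assumptions you would have to estimate yourself:
    the cost of reworking component X, and the Z extra customers
    (times revenue per customer) won by shipping feature Y sooner.
    """
    if rework_cost <= 0:
        raise ValueError("rework_cost must be positive")
    gain = extra_customers * revenue_per_customer
    return (gain - rework_cost) / rework_cost


# Hypothetical numbers: $50,000 of rework, 30 extra customers at $2,500 each.
print(f"ROI: {rework_roi(50_000, 30, 2_500):.0%}")  # prints: ROI: 50%
```

The figure is only as good as the guess for Z, so it is probably worth presenting it alongside a pessimistic and an optimistic value.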



Answer 2:

Sonar has a great plugin (the technical debt plugin) that analyzes your source code to produce just such a metric. While you may not be able to use it for your build, as it is a Maven tool, it should provide some good metrics.

Here is a snippet of their algorithm:

Debt (in man-days) =
    cost_to_fix_duplications +
    cost_to_fix_violations +
    cost_to_comment_public_API +
    cost_to_fix_uncovered_complexity +
    cost_to_bring_complexity_below_threshold

Where:

    Duplications = cost_to_fix_one_block * duplicated_blocks
    Violations   = cost_to_fix_one_violation * mandatory_violations
    Comments     = cost_to_comment_one_API * public_undocumented_api
    Coverage     = cost_to_cover_one_of_complexity * uncovered_complexity_by_tests
                   (80% coverage is the objective)
    Complexity   = cost_to_split_a_method * (function_complexity_distribution >= 8) +
                   cost_to_split_a_class * (class_complexity_distribution >= 60)
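
If you cannot run the plugin, the formula above is simple enough to approximate by hand. The following is a minimal Python sketch, not the plugin's actual code: the `UnitCosts` defaults are made-up placeholders in man-days, and the counts are assumed to come from whatever static-analysis tooling you do have.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitCosts:
    """Per-item remediation costs in man-days (hypothetical defaults)."""
    fix_one_block: float = 0.25
    fix_one_violation: float = 0.1
    comment_one_api: float = 0.025
    cover_one_complexity: float = 0.2
    split_a_method: float = 0.5
    split_a_class: float = 1.0


def debt_in_man_days(duplicated_blocks, mandatory_violations,
                     public_undocumented_api, uncovered_complexity,
                     methods_over_threshold, classes_over_threshold,
                     costs=UnitCosts()):
    """Sum the five components of the debt formula.

    methods_over_threshold: number of methods with complexity >= 8
    classes_over_threshold: number of classes with complexity >= 60
    """
    duplications = costs.fix_one_block * duplicated_blocks
    violations = costs.fix_one_violation * mandatory_violations
    comments = costs.comment_one_api * public_undocumented_api
    coverage = costs.cover_one_complexity * uncovered_complexity
    complexity = (costs.split_a_method * methods_over_threshold
                  + costs.split_a_class * classes_over_threshold)
    return duplications + violations + comments + coverage + complexity


# Hypothetical counts for one module.
print(debt_in_man_days(4, 10, 40, 5, 2, 1))
```

Multiplying the result by a loaded daily rate gives a cost figure to set against the estimated benefits.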


Answer 3:

I think you're on the right track.

I've not had to calculate this but I've had a few discussions with a friend who manages a large software development organisation with a lot of legacy code.

One of the things we've discussed is generating some rough effort metrics from analysing VCS commits and using them to divide up a rough estimate of programmer hours. This was inspired by Joel Spolsky's Evidence-based Scheduling.

Such data mining would also let you identify clusters of maintenance activity in the code and compare them with bug completion in the tracking system (unless you are already blessed with tight integration between the two and accurate records).

Proper ROI needs to calculate the full Return, so some things to consider are:

- decreased cost of maintenance (obviously)
- opportunity cost to the business of downtime or missed new features that couldn't be added in time for a release
- ability to generate new product lines due to refactorings
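
One way to turn commit history into rough effort figures is sketched below in Python. It assumes you already have a flat list of changed file paths (for example from the output of `git log --name-only --pretty=format:`), and it assumes effort is proportional to how often files in an area are touched, which is a crude proxy. The function name `hours_by_area` is hypothetical.

```python
from collections import Counter


def hours_by_area(changed_paths, total_hours, depth=1):
    """Split total_hours across code areas by share of file changes.

    changed_paths: one entry per file per commit, e.g. "billing/invoice.py"
    depth: how many leading path components define an "area"
    """
    counts = Counter("/".join(path.split("/")[:depth])
                     for path in changed_paths if path)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {area: total_hours * n / total for area, n in counts.items()}


# Hypothetical history: four file changes, 400 programmer hours in total.
paths = ["billing/a.py", "billing/b.py", "billing/a.py", "ui/x.js"]
print(hours_by_area(paths, 400))  # {'billing': 300.0, 'ui': 100.0}
```

Weighting by lines changed instead of file touches is an obvious variation; either way the point is to have a repeatable rule, not a precise one.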

Remember, once you have a rule for deriving data, you can have arguments about exactly how to calculate things, but at least you have some figures to seed discussion!



Answer 4:

+1 for jldupont's focus on lost business opportunities.

I suggest thinking about those opportunities as perceived by management. What do they think affects revenue growth -- new features, time to market, product quality? Relating debt paydown to those drivers will help management understand the gains.

Focusing on management perceptions will help you avoid false precision. ROI is an estimate, and it is no better than the assumptions made in its estimation. Management will be suspicious of purely quantitative arguments because they know some qualitative judgment is buried in there somewhere. For example, over the short term the real cost of your debt paydown is the other work the programmers aren't doing, rather than the cash cost of those programmers, because I doubt you're going to hire and train new staff just for this. Are the improvements in future development time or quality more important than the features these programmers would otherwise be adding?

Also, make sure you understand the horizon for which the product is managed. If management isn't thinking about two years from now, they won't care about benefits that won't appear for 18 months.

Finally, reflect on the fact that management perceptions have allowed this product to get to this state in the first place. What has changed that would make the company more attentive to technical debt? If the difference is you -- you're a better manager than your predecessors -- bear in mind that your management team isn't used to thinking about this stuff. You have to find their appetite for it, and focus on those items that will deliver results they care about. If you do that, you'll gain credibility, which you can use to get them thinking about further changes. But appreciation of the gains might be a while in growing.



Answer 5:

I can only speak to how to do this empirically in an iterative and incremental process.

You need to gather metrics to estimate your demonstrated best cost/story-point. Presumably, this represents your system just after the initial architectural churn, when most of the design trial-and-error has been done but entropy has had the least time to cause decay. Find the point in the project history when velocity/team-size is the highest. Use this as your cost/point baseline (zero-debt).

Over time, as technical debt accumulates, the velocity/team-size begins to decrease. The percentage decrease of this number with respect to your baseline can be translated into "interest" being paid on each new story point. (This is really interest paid on technical and knowledge debt.)

Disciplined refactoring and annealing cause the interest on technical debt to stabilize at some value higher than your baseline. Think of this as the steady-state interest the product owner pays on the technical debt in the system. (The same concept applies to knowledge debt.)

Some systems reach the point where the cost + interest on each new story point exceeds the value of the feature point being developed. This is when the system is bankrupt, and it's time to rewrite the system from scratch.
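
The baseline, interest and bankruptcy ideas above can be put into a few lines of Python. This is only a sketch: the function names, the per-developer cost per iteration and the value per story point are all hypothetical inputs you would have to supply.

```python
def velocity_per_dev(points_done, team_size):
    """Story points completed per developer in one iteration."""
    return points_done / team_size


def interest(baseline, current):
    """Fractional drop in velocity/team-size relative to the baseline."""
    return (baseline - current) / baseline


def cost_per_point(dev_cost_per_iteration, per_dev_velocity):
    """Money spent per story point at a given velocity/team-size."""
    return dev_cost_per_iteration / per_dev_velocity


def is_bankrupt(current_cost_per_point, value_per_point):
    """True when a new story point costs more than it is worth."""
    return current_cost_per_point > value_per_point


# Hypothetical history: best iteration 40 points with 5 devs,
# latest iteration 30 points with 6 devs, $8,000 per dev per iteration.
baseline = velocity_per_dev(40, 5)   # 8.0
current = velocity_per_dev(30, 6)    # 5.0
print(interest(baseline, current))   # 0.375
print(cost_per_point(8_000, baseline), cost_per_point(8_000, current))
```

With these made-up numbers a story point costs 1000.0 at the baseline and 1600.0 today, so the difference is what the debt adds to every new point.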

I think it's possible to use regression analysis to tease apart technical debt and knowledge debt (but I haven't tried it). For example, if you assume that technical debt correlates closely with some code metrics, e.g. code duplication, you could determine the degree to which the interest being paid is increasing because of technical debt versus knowledge debt.



Answer 6:

Being a mostly lone or small-team developer this is out of my field, but to me a great solution to find out where time is wasted is very, very detailed timekeeping, for example with a handy task-bar tool like this one that can even filter out when you go to the loo, and can export everything to XML.

It may be cumbersome at first, and a challenge to introduce to a team, but if your team can log every fifteen minutes they spend due to a bug, mistake or misconception in the software, you accumulate a basis of impressive, real-life data on what technical debt is actually costing in wages every month.
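
Once such a log exists, turning it into a monthly wage figure is straightforward. The Python sketch below assumes one log entry per fifteen-minute slot, tagged with a category; the category names, the function name `monthly_debt_cost` and the hourly wage are hypothetical, and a real tool's export format will differ.

```python
# Categories counted as debt-related time (assumed labels).
DEBT_CATEGORIES = {"bug", "mistake", "misconception"}


def monthly_debt_cost(entries, hourly_wage, slot_minutes=15):
    """Wage cost of all logged slots whose category is debt-related.

    entries: iterable of category strings, one per logged time slot
    """
    debt_slots = sum(1 for category in entries
                     if category in DEBT_CATEGORIES)
    return debt_slots * slot_minutes / 60 * hourly_wage


# Hypothetical log: three debt-related slots out of five, at $80/hour.
log = ["bug", "feature", "mistake", "bug", "meeting"]
print(monthly_debt_cost(log, 80))  # 60.0
```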

The tool I linked to is my favourite because it is dead simple (it doesn't even require a database) and provides access to every project/item through a task bar icon. Additional information on the work carried out can also be entered there, and timekeeping is literally activated in seconds. (I am not affiliated with the vendor.)



Answer 7:

It might be easier to estimate the amount it has cost you in the past. Once you've done that, you should be able to come up with an estimate for the future with ranges and logic even your bosses can understand.
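
As one possible way to get from past costs to a future estimate with a range, here is a small Python sketch. It assumes you have a debt-related cost figure per past period (say, per quarter) and uses the mean plus or minus one sample standard deviation as the range, which is a deliberately simple choice rather than a proper forecast. The function name and the figures are hypothetical.

```python
from statistics import mean, stdev


def projected_cost_range(past_costs, periods_ahead):
    """Return (low, expected, high) total cost over periods_ahead.

    Assumes future periods look like past ones and simply scales
    mean +/- one sample standard deviation by the number of periods.
    """
    if len(past_costs) < 2:
        raise ValueError("need at least two past periods")
    avg = mean(past_costs)
    spread = stdev(past_costs)
    low = max(avg - spread, 0) * periods_ahead
    high = (avg + spread) * periods_ahead
    return low, avg * periods_ahead, high


# Hypothetical: three past quarters of debt-related cost, projected one year.
low, expected, high = projected_cost_range([10_000, 12_000, 14_000], 4)
print(f"{low:,.0f} to {high:,.0f}, expected {expected:,.0f}")
```

Showing the range rather than a single number makes the logic easier to defend when the assumptions are questioned.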

That being said, I don't have a lot of experience with this kind of thing, simply because I've never yet seen a manager willing to go this far in fixing up code. It has always just been something we fix up when we have to modify bad code, so refactoring is effectively a hidden cost on all modifications and bug fixes.