Thursday, April 12, 2012

3Q's Method: Measuring the Progress of Code Refactoring

Currently, I'm assisting in several projects to refactor large code bases of legacy applications. Some of these are products of very bad, or rather deteriorated, design; others did not follow any design principles whatsoever.

First of all, let's agree on the basic idea that you cannot manage what you cannot measure. So, how would we measure how good or bad the code design is?

The strategy that I will employ is the 3Q's strategy, described in the diagram below:

Systematic refactoring of legacy code using the 3Q's strategy

The First Q: Quick Wins

The first stage is to pick the low-hanging fruit: identifying and removing dead code, removing duplicate code, and reducing method length. At this stage, the following measures would be helpful (a minimal measurement sketch follows the list):
  • Cyclomatic complexity: should be 10 or below
  • Average method length: should be 15 lines or below
  • Code duplication: detected for blocks of 3 or more lines of code
  • Overall code size: should be monitored, with targets set for reducing it
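
As a minimal sketch of how the first two thresholds can be checked automatically, assuming a Python code base (for other languages, tools such as SonarQube or NDepend report the same metrics, and duplication detection is best left to such tools anyway):

import ast

COMPLEXITY_LIMIT = 10  # cyclomatic complexity threshold from the list above
LENGTH_LIMIT = 15      # method length threshold (lines)

def cyclomatic_complexity(func):
    # Approximate McCabe complexity: 1 + the number of decision points.
    decisions = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.BoolOp, ast.IfExp)
    return 1 + sum(isinstance(node, decisions) for node in ast.walk(func))

def report(source):
    tree = ast.parse(source)
    for func in [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]:
        length = func.end_lineno - func.lineno + 1  # requires Python 3.8+
        complexity = cyclomatic_complexity(func)
        if complexity > COMPLEXITY_LIMIT or length > LENGTH_LIMIT:
            print(f"{func.name}: complexity={complexity}, length={length}")

Running report() over each source file gives a per-method worklist, and tracking the count of offending methods week by week is a simple way to monitor progress through this stage.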


The Second Q: Divide & Conquer

The next stage is to start pulling apart components, whether business (or functional) components or utility components. This re-organization of the code is essential to break down its complexity, and it will open up a long list of refactoring opportunities. At this stage, we will add the following two measures (see the sketch after this list):
  • Instability: to measure the correct use of design layers
  • Efferent and afferent coupling between component interfaces and the outer world
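
Instability is commonly computed, following Robert C. Martin, as I = Ce / (Ca + Ce), where Ce is efferent (outgoing) coupling and Ca is afferent (incoming) coupling. Here is a minimal sketch, with hypothetical component names and hand-counted dependency data:

def instability(efferent, afferent):
    # I = Ce / (Ca + Ce): 0 means maximally stable, 1 means maximally unstable.
    total = efferent + afferent
    return efferent / total if total else 0.0

# Hypothetical dependency counts, e.g. extracted by a dependency analysis tool
components = {
    "BillingCore":  {"efferent": 2, "afferent": 14},  # many depend on it: keep it stable
    "ReportExport": {"efferent": 9, "afferent": 1},   # depends on many: free to change
}
for name, counts in components.items():
    print(f"{name}: I = {instability(**counts):.2f}")

The design-layer check is then simple: dependencies should point from unstable components toward stable ones, so a heavily depended-upon component with a high I value is a red flag.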

The Third Q: Build Quality In

The last stage is to start writing unit tests against the component interfaces. This is necessary in order to baseline the code quality and start doing more profound refactorings. At this stage, we will add the following measure (a minimal test sketch follows the list):
  • Unit tests code coverage
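
Here is a minimal sketch of such a test, written against a hypothetical component interface (the billing module and TaxCalculator class are placeholders for whatever interfaces the second stage produced). Coverage can then be measured with coverage.py via `coverage run -m unittest` followed by `coverage report`:

import unittest
from billing import TaxCalculator  # hypothetical interface from stage two

class TaxCalculatorTests(unittest.TestCase):
    def test_standard_rate_applied(self):
        # Characterization test: pin down current behavior before
        # attempting deeper refactorings behind this interface.
        calc = TaxCalculator(rate=0.14)
        self.assertAlmostEqual(calc.tax_for(100.0), 14.0)

if __name__ == "__main__":
    unittest.main()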

Development Process Effectiveness and Efficiency Measures

All of the above are measures of "good & healthy code". However, how would I measure the improvement of the development process itself? In other words, how would I know whether or not the refactoring guided by these measures improved the development process's effectiveness and efficiency? The following measures would serve (a calculation sketch follows the list):
  • Ripple effect (aka change impact): the number of touched methods per change. This measure should start high and decrease over time
  • Fault feedback ratio (FFR): the number of injected bugs divided by the number of resolved bugs. In healthy projects, this measure should be less than 0.3
  • Average defect density: the number of defects per code size unit, averaged over all changes in an iteration. This measures the volume of defects, whereas FFR measures the healthiness of the code-fixing process
  • Average cost of one change per size unit: this is a bit tough to measure, but, depending on the nature of the product, changes can be sized and the cost can be normalized by the change size
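
As a minimal sketch of how the middle two measures can be computed per iteration, using hypothetical numbers exported from an issue tracker:

def fault_feedback_ratio(injected_bugs, resolved_bugs):
    # FFR = bugs injected while fixing / bugs resolved; below 0.3 is healthy.
    return injected_bugs / resolved_bugs if resolved_bugs else 0.0

def defect_density(defects, changed_kloc):
    # Defects per KLOC of changed code, averaged over the iteration's changes.
    return defects / changed_kloc if changed_kloc else 0.0

# Hypothetical iteration data from the issue tracker and version control
iteration = {"resolved": 20, "injected": 6, "defects": 9, "changed_kloc": 4.5}
print(f"FFR = {fault_feedback_ratio(iteration['injected'], iteration['resolved']):.2f}")
print(f"Defect density = {defect_density(iteration['defects'], iteration['changed_kloc']):.2f} per KLOC")

Ripple effect can be derived in a similar way by counting, for each change, the methods touched in its corresponding commits.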
It is worth mentioning that we should record readings for the development process measures starting from day 1. These readings will be our only evidence of improvement when reporting to higher management. It is far more indicative to tell senior management that the FFR has decreased from 0.7 to 0.34 than that the overall code size has decreased from 35 KLOC to 20 KLOC :)

If you have previous experience with similar projects, which measures did you use?
