Sunday, July 20, 2008

Of EMMA's and Eclipse's

So last time I covered the joys of testing. If you do Test Driven Development, then you never have to worry about what you have tested and what is untested, but what about the scores of projects which aren't developed in a TDD fashion? How do I figure out which of my thousands of classes need tests most urgently right now? Is the class which usually gets the most bug fixes my prime target, or are there even worse classes that should be tested first?

One solid, easy way to find out is to run code coverage analysis on your code. What is code coverage, you ask? It is a measurement of how much of your code, and which of your classes, are actually executed by your tests, and which hotspots in your code the tests ignore completely. You can collect coverage without a test suite by running your program manually, but it provides the most bang for your buck (especially considering it is free) when you run it along with the tests. Code coverage tools generally report on a per package, per class, per method, per block and per line basis, so you can dig in as deep as you like.

EMMA is a free, open source tool which generates code coverage information for Java code. And if you have issues with handing your source code to some random tool, fear not, for you can give it a jar file instead, which it can instrument and generate coverage information for. It can also generate nice HTML reports, which you can again dig into as deep as you like.

The above image, grabbed from EMMA's official website, shows a sample HTML report for a single class. Notice how nicely it highlights the source lines. The green lines represent code which was covered by one test or another, the red ones were not covered at all, and the yellow ones represent code which was only partially covered. EMMA is smart enough to detect partially covered lines, as in the multi-condition statement above.
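To make the idea of a partially covered line concrete, here is a minimal sketch. The class and method names are hypothetical and are not taken from the EMMA report. Both conditions live on one source line, and because `&&` short-circuits, the single call below never evaluates the second condition. A tool that tracks coverage at the block level, as EMMA does, would be expected to flag that line as partially covered.

```java
// Hypothetical class, used only to illustrate partial line coverage.
class DiscountPolicy {
    // Two conditions on one source line; && stops at the first false operand.
    static boolean qualifies(int orderTotal, boolean isMember) {
        return orderTotal > 100 && isMember;
    }
}

class DiscountPolicyDemo {
    public static void main(String[] args) {
        // orderTotal > 100 is false here, so isMember is never evaluated.
        // The line in qualifies() is executed, but only in part.
        System.out.println(DiscountPolicy.qualifies(50, true)); // prints "false"
    }
}
```

Adding calls such as `qualifies(150, true)` and `qualifies(150, false)` would exercise the rest of the line.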

This image, also grabbed from EMMA's official website, shows code coverage information at the package level. Notice how it breaks the information down to the method, block and line level. This report makes it easy to identify classes that lack tests, so that you can tackle those hotspots first.

To make it even easier for developers, EMMA is available as a plugin for most IDEs, including Eclipse, where it is available as EclEmma. The plugin can be run along with your tests to generate code coverage information, so you get instant feedback on any new test that you have written, instead of having to leave the IDE and run EMMA separately. This also gives you feedback on the test itself, to ensure that you are exercising the code paths that you intended.

Generally, it has been found that projects with code coverage below 50-60% tend to have many more bugs and fixes than projects with higher coverage. And projects which are developed using TDD tend to end up with high code coverage numbers, generally above 80%.

But this comes with a few caveats. Even if you do attain 100% code coverage, it does not mean that your job is done. In the end, code coverage is a statistic and can be bent or twisted by a knowledgeable person. A high code coverage number tells you that you are hitting a lot of your code paths and that your tests exercise a lot of the system, but it does not necessarily mean that you covered all possible cases, nor does it mean that your source code itself is testable or maintainable. It is also entirely possible to end up with good code coverage numbers by writing a handful of large tests which exercise big chunks of the system and don't provide much value, rather than small, fast unit tests which each exercise just a small part of the system.
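As a small illustration of the first caveat, consider the sketch below. The class is hypothetical and is not from any real project. The single check in `main` executes every line of `mean`, so a line coverage report would show the method as fully covered, yet the method still fails on an input the check never tries.

```java
// Hypothetical class, used only to show that full line coverage can hide a bug.
class Average {
    // Bug: an empty array causes an integer division by zero.
    static int mean(int[] values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum / values.length;
    }
}

class AverageDemo {
    public static void main(String[] args) {
        // This one check runs every line of mean().
        int result = Average.mean(new int[] {2, 4, 6});
        if (result != 4) {
            throw new AssertionError("expected 4 but got " + result);
        }
        System.out.println("mean = " + result);
        // Never tried here: Average.mean(new int[] {}) throws ArithmeticException.
    }
}
```

The coverage number says the method was executed, not that every meaningful input was checked.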

But that said, code coverage is an excellent tool when used as part of a greater set of tools to evaluate your project, and it can reveal startling trends about your projects. Tracking how your code coverage changes over time is an interesting metric in its own right and can reveal the practices of your developers as well. And considering that it is free for Java, I don't see any reason not to start using it for your projects.

1 comment:

Alok said...

A nice article. Lot of information. I have a small problem, though. I plugged in EclEmma to Eclipse and it works fine for stand alone applications. How does it work for web applications?