Monday, October 06, 2008

Coverage isn't everything

A few posts ago, I mentioned Emma and code coverage, and how it could be useful. Code coverage, in layman’s terms, is the lines of code covered by your tests. So generally, you want a high code coverage, implying that most of your code is well tested.

But since my post, I have talked with a few friends and colleagues, and one point has been brought up again and again. First, it was a discussion with my friend on his homework assignment. It was a hypothetical situation about a manager wanting 100 % code coverage before he launches his product. That discussion quickly devolved into a rant about why code coverage isn’t the alpha and the omega.

And then, just one or two days ago, I had another discussion with a colleague, who was telling / reminding me of the fact that a project having great code coverage doesn’t mean that they don’t need to test it anymore. I politely agreed and nodded my head, and both of us ended up repeating the same facts to each other. Which was that code coverage, like a lot of other things, can be faked.

The trick with code coverage is that a high code coverage number doesn’t actually tell you anything in and by itself. For it is just as easy to get a high code coverage with a single line of test which exercises the entire system as it is to do it the hard way and write a lot of small unit tests. A single unit test which executes main will most likely result in a high code coverage number, even without having any assertions.

Similarly, consider the following example :

void someMethod(int a, String b) {

if (a < someThreshold || isValid(b)) {
// Do first thing
} else if (a == someThreshold && isNumber(b)) {
// DO second thing
}
}

Now, it is possible to to write just two unit tests, and still attain 100% coverage. Here, if I call this test with a < a ="="">

And these are exactly the type of things which can demean the value of code coverage. Any manager worth his salt should understand this aspect of code coverage. It is possible and not too difficult to attain a very high code coverage even with a sub par quality product. The trick is in understanding how the tests are written, and how comprehensive they are.

Code coverage is a great tool to find out spots which are completely untested, and projects which develop their code in a Test Driven fashion often end up with high code coverage. But that does not imply that projects with a high code coverage are of the greatest quality. Careful consideration of their testing practices and comprehensiveness is essential in these cases. Track stuff like bugs, amount of test code written compared to amount of production code, etc to get a overall picture, rather than relying on just one metric.

Tuesday, September 30, 2008

My code's untestable

I have frequently heard this complaint, from some really great engineers in the past year. My code’s untestable, there’s no way i can test this, the only way to test my code is to write a full on end to end test. And in some cases, it was actually true. But the thing is, it doesn’t need to be that way. There are always ways to twist and turn the untestable code into testable code, in other words refactor.

But before you get excited, and go aha, thats what I am going to do, hold your horses for a second. What are you going to go refactor ? Well, don’t scratch your heads, there are a few things to look out for. Similar to the code smells we had, there are similar smells we can look for, which indicate that the code is untestable. And there are a few standard ways to tackle them, and make your classes testable. So without further adieu, lets take a look at the more common and annoying testability smells.

  1. Constructor doing work :
    This is one of the biggest things preventing a class from being testable. There are many names to this smell, including constructor doing work, Breaking the law of demeter, etc. but all it comes down to is the constructor doing more than just assigning stuff to local variables. For example, something like :

    XPathConvertor() {
    this.xpathDatabase = XPathDatabaseFactory.getDatabase();
    XPathMapper mapper = new SimpleXPathMapper(”Simple Mapper”);
    this.xpathTranslator = XPathTranslatorFactory.getTranslator(this.xpathDatabase,
    mapper);
    }

  2. While the above may seem a contrived example, or too simple, it exhibits what is at the centre of most bad constructors. One, its a default constructor. Then it goes out, grabs a database out from ether (Static factory method call), and then creates a mapper, and then passes those two to get a XpathTranslator object. Now, take my word for it, but the XPathConvertor only needs xpathTranslator. SO what is it doing with the darn database and the mapper. This is breaking the law of demeter, which states that “The constructor should only ask for what it needs, and nothing else.”

    Why is this bad ? Well, for one, if the thing your constructor is creating is a heavy piece of service like a database or something, there’s a huge hit in your test. Your test is no longer a unit test, but an integration test. Each call now has to travel to the DB and back, and just makes everything slower. Secondly, there are some cases in which it picks up a service which you just can’t work with in a test. Something which either needs the whole production setup, or just doesn’t work in a unit test. And since it reaches into static factories to get that, there’s no way for you to slip in your mock.

    So instead, start passing in what is needed to your constructor. This forms the basis of Dependency Injection, or slipping in a fake, or whatever you want to call it. Basically, your constructor takes in what it needs, and all it does is assign stuff to local variables. No work is done there. So the above code becomes something like :

    XPathConvertor(XPathTranslator translator) {
    this.xpathTranslator = translator;
    }

    So much cleaner, and only has what it needs. So in our test, you can create a translator and pass it in which uses a mock db, or pass in a fake translator or whatever. The point is, testing becomes easier.

  3. Global State :
    The second biggest complaint with untestable code. It usually has to do with Global state. Or as some people like to call it, putting things in and pulling things out of ether. This might be anything from using global static singletons to static method calls inbetween your method.

    How is this bad ? Consider some method you are testing. What if it suddenly reaches out into the ether, and grabs some object and uses it to perform its calculations. You say, ok, I can somehow add a setter which allows me to set its state. Now what if there are multiple tests running in parallel…. Yes, exactly. Not good. Furthermore, you can’t mock a static method, which makes life miserable.

    Consider this question then, why does it need to be static ? What benefit are you getting, other than the fact that you don’t need to create an object ? Is it really worth making code untestable in order to reduce one line of code of creating an object ? The answer 9 times out of 10, I find, is no.

Wednesday, August 20, 2008

Fakes and Mocks and Stubs, Oh My !!!

So we covered how to use EasyMock to write tests in one of the previous posts, but Mocking is not the only option when you want to test something that depends on a slow and expensive service. Mocking allows you to expect calls and return values when a call is made, and make sure the number of calls were correct and the order, but sometimes, you want a bit more, or not even that much. So in those cases, you have the options of using Stubs or Fakes. So without further delay, your options are :

  1. Mocks :
    Well, Mocks... What can I say. I have already ranted and raved about the awesomeness that is Mocks. Mocks and Mocking frameworks allow you to replace heavy and expensive services with Mocks, with which you can set expectations on what calls are made to the service, and return values. This gives you the advantage of knowing exactly what is called and returned, and makes your test deterministic and super fast. You can even set expectations on the number of calls and the order, though this is something that is not extremely recommended, since that makes the test dependent on the implementation, which is never a good thing.

    Advantages
    :
    • Fast, Reliable
    • Deterministic
    • Lightweight
    • Control over expectations and return values.

    Disadavantages
    :
    • Can become dependent on implementation of method being tested.
    • Can become a mockery if not careful, that is, can be testing interaction between mocks and nothing of the actual code. Especially when all that the method under test is doing is delegating to a bunch of service calls.
    • Can involve complicated setup for expectations, especially with unwieldy objects

  2. Stubs :
    Now, whereas with Mocks, you can specify what calls will be received and what will be returned making them somewhat intelligent, Stubs are at the other end of the spectrum. The dumbest of the dumb, the easiest of the easiest, these just stub out the methods of your expensive and heavy class. So a null Stub could return null for each method, or just return some constant value each time it is called. That means that regardless of what the method is called with, it returns the exact same thing, day in and day out. It doesn't adjust, it doesn't react, just returns. Of course, this means that you might not be able to test all the cases you want to inspect / test. But its simple, its easy, and of course, its fast.

    Advantages :
    • Fast, reliable
    • Simple
    • Consistent

    Disadvantages
    :
    • Dumb
    • Can't exercise different cases without different stubs
    • Returns only one value consistently

  3. Fakes :
    And finally, we come to the smartest (figuratively) of the lot. The Fakes. For once, being a fake can be good. While Stubs return the same thing again and again, and mocks return what you tell it to, Fakes are smarter. The easiest way to explain a Fake is with an example. Say your code depends on a heavy service like a Database. A FakeDatabase would be an in-memory implementation of the DB, thus making it faster while at the same time, providing the same logic as the normal DB would. There are different types of Fakes, like a listening / recording Fake which records all the values passed to it.

    Advantages
    :
    • Can test while preserving behavior of dependencies
    • Faster than using actual services
    • Can test complicated logic
    • Most comprehensive testing approach

    Disadvantages
    :
    • Can be complicated to set up
    • Not as fast as mocks / stubs
    • Not as easy to define expectations
    • When / Where do you test the fake (especially the complicated ones) ?

Thats my 2 cents on the different approaches to testing with dependencies. Now that we know what we can do with those darn heavy dependencies, I'll talk more on how to make sure you can use these different approaches to test your code soon.

Sunday, August 10, 2008

Your code smells!!!

I was planning to talk about Dependency Injection or the difference between Mocks, Stubs and Fakes, but I think I would prefer to get this one out first. For those of you who haven’t heard the term before, a code smell is something that indicates there is something wrong with the source code, be they design problems or signs that the code needs refactoring. So in this post, I would like to mention a few common code smells, their identifying patterns and how to fix them. So without any further delay, lets start :

  1. Too many comments
    The Problem : Lets start with the easiest to identify code smell. Comments are good, if they are describing a class or a method. But when you start having comments in your code explaining what a particular block of code does, you know you are in trouble.
    The Fix : If there is a block of code which does some complicated stuff, and you feel you need to comment it to make it easily understood, pull it out into a well named method, and let the method name describe what it does.
    .
  2. Long Class / Method
    The Problem :
    These two code smells actually are similar. If you have a method which goes beyond 10 - 15 lines, then you have a long method. While there is nothing technically wrong with long methods, its not as easy to comprehend as a nice small method, and can make maintenance a pain in the rear. Long classes also are similar, having too many things that its trying to do. If you generally try describing a class and have to use ands and ors, then you have a long class.
    The Fix : Pull out parts of the long method into smaller well named methods. The advantage is two fold. One, your method is much more readable now. And two, you can now test individual methods you have pulled out, making testing a much easier task than one giant method. Same with classes, break it up into multiple smaller easily testable classes.
    .
  3. Primitive Obsession
    The Problem :
    How many times have you had to write a class which takes in, say a phone number ? And how many times have you passed in a String or a long to the class which asks for the phone number ? If you raised your hand, then congrats, you have a code smell known as primitive obsession. This happens when instead of creating an object, you pass around primitives, and write functions to operate and convert that primitive from one form to another. So you might have to create utility classes which give you a phone number with brackets from a string, and so on and so forth.
    The Fix : Just give the poor thing a class. If you operate on phone numbers, create a PhoneNumber class which has methods to operate on the number. Makes it easier for anyone using the class as well, and of course, its testable :D.
    .
  4. Feature Envy / Inappropriate Intimacy
    The Problem :
    Feature Envy is when a method on a class is more interested in other classes than the class to which it belongs. The reason for this could be as simple as the method being in the wrong class to something more non trivial. Inappropriate intimacy is when two classes are so tightly coupled that one depends on the other to work a particular way for it to work.
    The Fix : For feature envy, just move the method to where it belongs. If it does work solely on some other class, then maybe it belongs on that class instead of where it is currently. For inappropriate intimacy, you need to figure out if the problem is something simple or something more complicated. It might be that the interfaces weren’t selected appropriately, or you might need to introduce a layer to keep the coupling loose, or even refactor the code to make sure it does not depend on another tightly coupled class.
    .
  5. Lazy class / Dead code / Duplicate code / Speculative Generality
    The Problem
    : All of the above are usually simple code smells which indicate you have code which you don’t need. Lazy class is when a class doesn’t do enough to justify its existence. Dead code is obvious, code which is not used or dead. Duplicate code, duh, Its duplicated. Speculative generality is the most interesting of the lot. Its when you write some code for something which you don’t need yet, but may need at some point in the future.
    The Fix : For the first three code smells, the fix is trivial. Delete it. Dont even think about it, just blindly delete it. Duplicate code is a pain to maintain as well, pull it out into a method and then delete the duplicates. Speculative generality is one thing people don’t realize they are doing or feel they need to do it now since otherwise, it might become difficult to do in the future. The interesting thing is that the feature they added speculatively is rarely ever used. Its an additional overhead to maintain and test for something you never use. Don’t do it. If you can implement it now, you can implement it when you need it. Just delete that darn code.

There are a lot more code smells than I could list out here, but these are a few of the most common ones you should keep a lookout for. Google search Code smells if you want to learn more about these insidious creations :D. Next time, before I start dependency injection, I think I will rant about things which make code untestable. So in a sense, Testability code smells.

Tuesday, August 05, 2008

Mock, Mock! Who's there ?

The concept of mocking is not a new idea, but its one that has been gaining traction recently. But still, whenever I tell people to mock out their dependencies when they are trying to test, they look at me as if wondering what the heck I’m smoking. Well, if I was smoking something, I would share it. But mocking is a great thing for writing small unit tests.

I get usually one of the two statements / questions when I tell someone to just mock something out.

  1. Well, if I am mocking things out, then I am not really testing it, am I ? OR I don’t want to mock things out. I need to test the method’s interaction with other classes.
  2. What about the mocked class ? We have to make sure it works too.

Well, to number 1, I say, if by mocking things out, you have nothing but a series of expected calls and returns, and there’s no actual class specific behavior there, then does that Class really deserve to be a class ?

And the second part of number 1, well…. Is that really a unit test then ? Ideally, you should have a lot of unittests to make sure the class specific logic is sound, and then a few integration tests to make sure that everything is hooked up correctly.

And number 2… This is where you go and write unit tests for the class you have just mocked. Don’t depend on large scale integration tests to test all your classes. Write nice small unit tests which are fast and precise. The larger a test gets, the harder it is to find why a particular test broke and how.

So hopefully by now, I have you convinced that mocking may not be all that bad. Fine, you say, so how do I go about this mocking thing ? Glad you asked. While there are many mocking frameworks out there, I am going to just talk about EasyMock for Java. JMock is very similar and could be pretty interchangable.

The first requirement before you can use any mocking framework is the ability to inject mocks into your class. This is Dependency Injection at its core, without which you will have to find workarounds like protected setter methods and the like. But basically, if a class uses some Service or Database, make sure that you can override it with a Mock Database by passing it into a constructor or setting it via a method.

Once thats done, the first step in using a mock is creating the mock object. In EasyMock, its as simple as :

MyClass myObject = EasyMock.createMock(MyClass.class);

Thats it, nothing fancier than that. It helps if MyClass is an interface, but I believe EasyMock supports mocking non interface classes as well. Once you have done that, you can inject this mock object into the classes which you are testing.

The next step is setting expectations. For void methods, it as simple as just calling the method with the expected parameter. For example :

myObject.myVoidMethod(”ThisStringWillBePassed”);
EasyMock.expect(myObject.methodWithReturnValue(”ExpectedArg”)).andReturn(”ReturnValue”);
EasyMock.replay(myObject);

The EasyMock.replay(myObject) tells EasyMock that you are done setting expectations and that the next time a method on the object is called, treat it as an actual call. So then in your test, you proceed as normal invoking the methods you care about, and then finally, you call :

EasyMock.verify(myObject);

This ensures that all the expectations set on myObject were met. Now EasyMock also supports additional features like setting expectations on number of method calls, throwing exceptions, flexible argument matchers and so much more. For more information, check out the EasyMock home page.

Now a few caveats with regards to mocking. It is very easy to degenerate some tests into what we call a mockery, where we end up testing mocks and their interaction instead of the actual class we want to test. So Don’t overuse mocks. Use them when you have to test something which depends on Database or expensive service calls. Also if your test ends up exercising a bunch of mock calls and nothing else, that might be a hint that your class really does not belong. And of course, it goes without saying that don’t mock the class you are testing.

Also, don’t set up Mock layers where a class which indirectly depends on some service object uses the mock layer. You should always mock the classes which your Class Under Test directly depends on, and not classes which it indirectly depends on. And sometimes, a mock might not be what you are looking for. Instead, a simple Stub or a Fake might be more useful. I might talk more on this or Dependency Injection in my next post.

Monday, July 28, 2008

The Testability Explorer cometh....

So last time, we explored on how to find hotspots and untested code in your code base. But then you start looking at your code, and then you realize there is a reason why you didn't test the darn thing. The code's untestable. Whoo hoo..

Well, there is no such thing as untestable code. Or rather, all untestable code can be refactored to make it much nicer and easier to test, through a variety of techniques. The first and foremost reason for untestable code ends up being "Constructor doing work." If the constructor of a class does anything more than stuff like "this.x = x" or if it tries to call a constructor itself or use a *GASP* static factory, bingo, you have a problem.

But fixing that isn't the target of this post. That will be covered in a later one, cause its a doozie. No, in this post, I want to talk about how to find these untestable code snippets without any manual effort. Every code base has atleast a few of these gems, which turn up being a nightmare to test, and in turn make everything depending on it a nightmare as well. Well, fear not, for the Testability Explorer cometh...

The Testability Explorer (http://www.testabilityexplorer.org) is an open source tool which looks at classes and does cyclomatic complexity analysis on it. What does that mean ? Well, it looks for things which make testing hard, like conditionals, and recursively dives in to objects it instantiates to find their testability score. In that respect, it is a static and recursive analysis of a code base. It takes all these into consideration and assigns a score to each class. Based on these scores, a class is either
  • An excellent to test class
  • A Good class but could use some work
  • A horrible class to test, needs a lot of work.
The following image, taken from the Testability Explorer website, shows a sample report generated by the tool :

As you can see, it generates html reports with bar graphs and pie charts. It can even, depending on the options you specify, allow you to dig in deep into the problem classes and find the method and line which causes you the most problem to test. This can give you great insight on deciding what classes need refactoring first to make it testable. A lot of times, fixing the most problematic one causes a ripple effect, which fixes a bunch of problems in classes depending on it.

Another great thing is that the Testability Explorer can take jar files, so if you don't want to expose your code directly to it, you have an option. Though sadly, the Testability Explorer is currently only available for Java code. It stands to reason that something similar could be done for C++, though you Javascript guys are out on your own.

All in all, a great tool. But don't depend solely on it. It is great as one tool in a repertoire of tools, but not just by itself. Testability Explorer is also a great way to notice trends, of whether your code is growing more testable or untestable, and other great things, just like code coverage. Though leading you to nothing more than testable code, you would be surprised at how much positive impact increased testability and tests can have on the quality of a project.

So go check it out. And enjoy.

Sunday, July 20, 2008

Of EMMA's and Eclipse's

So last time I covered the joys of testing. Now if you do Test Driven Development, then you never have to worry about what you have tested and what is untested, but what about the scores of projects which aren't developed in a TDD fashion ? How do I figure out which among my thousands of classes needs tests most urgently right now ? Is the class which usually gets the most bug fixes my prime target, or are there even worse classes that should be tested ?

One solid easy way to identify this is to run some code coverage analysis on your code. What is code coverage, you ask ? It is one of the single most brilliant things which gives you coverage information about your code. Coverage information in this case basically provides you with knowledge on how much of your code and classes are executed by your tests, and which hotspots in your code are completely ignored by the test. Though you can run it without having a suite of tests and running your program manually, it provides the most bang for your buck (especially considering it is free) when you run it along with the tests. Code coverage tools generally provide information on a per package, per class, per method, per block and per line basis, so you can dig in as deep as you like.

EMMA is a free, open source tool which allows you to generate code coverage information for Java code. And if you have some issues with providing some random tool with your source code, fear not, for you can provide it with a jar file which it can instrument and generate code coverage information for. And it generates nice Html reports if you prefer, which you can again dig into as deep as you like.

The above image, grabbed from EMMA's official website, shows a sample html report for a single class. Notice how nicely it highlights the class. The green lines represent code which was covered by one test or the other, the red ones were lines not covered at all and the yellow ones represent code which was partially covered. EMMA is smart enough to distinguish partial matches, as in the mutli condition statement above.

This image, also grabbed from EMMA's official website, shows code coverage information on a package level. Notice how it breaks down the information to a method, block and line level. So this report can be used to easily identify classes lacking in testing and allows surveyors to tackle these hotspots.

To make it even easier for developers, EMMA is available as a plugin to most IDEs, including Eclipse. Available at EclEmma. This tool can be run along with the tests to generate code coverage information. SO you can instant feedback on any new test that you have written, instead of having to come out and run EMMA separately. This can also help give you feedback on your test, to ensure that you are testing the code paths that you intended.

Generally, it has been found that projects with code coverage less than 50 - 60% generally tend to have much more bugs and fixes than projects with higher coverage. And projects which are developed using TDD tend to end with high code coverage numbers, generally above 80%.

But this comes with a few caveats. Even if you do attain 100% code coverage, it does not mean that your job is done. In the end, code coverage is a statistic and can be bent or twist by a knowledgeable person. A high code coverage number ensures you are hitting a lot of your code paths and your tests exercise a lot of the system, but it does not necessarily mean that you covered all possible cases nor does it mean that your source code itself is testable or maintainable. It is also entirely possible to write as few tests as possible which exercise bigger amounts of systems and don't provide much value, rather than writing small fast unit tests which exercise just a small part of the system and still end up with good code coverage numbers.

But that said, code coverage is an excellent tool when used as part of a greater set of tools to evaluate your project and can reveal startling trends about your projects. Maintaining a historical trend of how your code coverage grows is an interesting metric and can reveal the practices of your developers as well. And considering how it is free for Java, I don't see any reason to not start using this for your projects.