During my geekSpeak screencast last week, one of the attendees asked:
Any recommendations for refactoring existing code to insert interfaces? (e.g., what’s the best dependency to break first, the database?)
Excellent question! Most of us do not have the luxury of working on greenfield projects, but instead work on brownfield projects – existing applications that could use some tender loving care. Brownfield projects are often inflicted with legacy code. What do I mean by legacy code? I agree with Michael Feathers’ definition:
Legacy code is simple code without tests.
Michael elaborates further saying:
Code without tests is bad code. It doesn’t matter how well written it is; it doesn’t matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don’t know if our code is getting better or worse.
If you haven’t read Michael’s book, Working Effectively with Legacy Code, you really must. It is on my short list of must-read development books. Michael provides excellent strategies and patterns for safely implementing tests around untested code, which is crucial in order to make non-breaking changes to an existing application AKA refactoring. If you can refactor your code, you can improve your code.
Now back to the question at hand… Which dependency to break first? Let me turn the question around. Which dependency is causing you the most pain?
In my experience, it is typically the database as round-tripping to a database on each test to fetch or save data dramatically slows the tests down. If you have slow tests, you’ll be unlikely to run them as often and then the tests start to lose their value as a safety net. (N.B. You still want integration tests that access the database. You just don’t want each and every unit test to do so.) As well it requires a lot of effort to keep consistent data for tests, often using test data setup scripts or rolling back transactions at the end of tests.
Other areas that often cause pain are integration points – web services, DCOM/Enterprise Services, external text files, … Anywhere your application is relying on an external application. Integration points are problems for tests because if you’re relying on them, your tests will fail when the external applications are unavailable due to crashes, service outages, integration point upgrades, external network failures, behaviour changes, … Imagine that your e-commerce website integrates with 6 external systems (credit card processor, credit check, inventory, sales, address verification, and shipping). Your development environment integrates with DEV/QA versions of each of these services. Each service has 95% uptime, which translates into 1.5 days of downtime a month for maintenance, upgrades, and unexpected outages. The chance of all systems being available is the product of their availabilities or 95%*95%*95%*95%*95%*95%=73.5% uptime for all 6 integration points. If your tests directly use these test systems, your test suite will fail over 25% of the time for reasons beyond your control. Now is that because you introduced a breaking change or because one of your integration points is temporarily unavailable or misbehaving? Life gets worse when you integrate with more systems or when the availability of those systems is lower. Imagine you have to integrate with 12 integration points with 95% availability – your test suite will only pass 54% of the time. Or if your 6 test systems only have 90% availability, your test suite only passes 53% of the time. In each case, it’s a coin toss whether you know for certain whether the change you just made broke the application. When you start getting a lot of false negatives (failing tests when the problem is in an integration point), you stop trusting your tests and you’re essentially flying without a parachute.
By decoupling yourself from your integration points by using interfaces for the majority of your tests, you know that the code base is healthy and you can separately test the interactions with your integration points.
To figure out which dependency to break first, take a critical look at your codebase. What is causing you the most pain? Pain can manifest itself as long debug sessions, waiting for external services to be available, high bug counts, … Solve the pain by decoupling yourself from that dependency. Then look at what is now causing you the most pain and solve that. Lather, rinse, repeat. It might take a few weeks or months, but eventually you’ll have a codebase that is a pleasure to work with. If you don’t believe me, read this email that I received from Karl (originally posted here):
I’m writing you because I just wanted to thank you!
It was about two months ago and I attended TechEd/Orlando. I have to say that it was the first time for me and it really was an honor to be the one and only chosen from approximately 300 software developers working in my company to go to Orlando. I was very impressed about the good quality of the sessions, especially on the architecture track, but your tiny little discussion on Tuesday evening really opened my mind. At that time I had been working as a software developer for 7 years with about 5 years experience in software design and project management and 5 years of .NET experience. I was fighting with a 400,000 LOC .NET legacy monster that’s used by public safety agencies around the world in security related areas. I have a team of 12 developers and we were trying really hard to keep this beast up and running and extend it with new features. I think you can imagine that most of the time we were using the trial and error approach to get new features in (just hack it in and prepare for long debugging sessions hunting weird bugs in parts of the application you never expected to be affected by the new feature…). The main problem was – guess what – the dependencies. There were about 50 classes (all singleton “Managers”), and every single manager was tied to at least 10 other managers – no interfaces at all – and that’s just one of several layers of the app… We knew that dependencies were our problem, but we had no clue how to solve it – there was this chicken/egg problem – I want to decouple my system, which needs a lot of refactoring. To ensure that I don’t break anything I’d need unit tests but I can’t use them because my system is so highly coupled We have tried TypeMock, but my feeling was that this went in the wrong direction. Needless to say that this attempt failed.
During the discussion after your session you gave me some quite useful hints:
1. Read Applying Domain Driven Design and Patterns by Jimmy Nilsson
2. Read Working Effectively with Legacy Code by Michael Feathers
3. Use ReSharper (especially for Refactoring)
4. Use a Mock-Framework (preferably RhinoMocks)
5. Use a Dependency Injection Framework like Spring.NET
I bought Jimmy Nilsson’s book in the conference store and read it cover to cover on my flight back to Vienna. Then I bought the second book and read it within one week. I started to use ReSharper more extensively to apply some of the patterns from Michael Feathers’ book to get some unit tests in place. I extracted a lot of interfaces, brought Spring.NET into action and used RhinoMocks and the VS2005 built in UnitTest-Framework to write some useful unit tests. I also used the built in code coverage functionality in combination with the unit tests. In addition we already started Design for a messaging based service application that we want to develop in a very TDDish style.
As you can see there’s a lot going on since I attended your session. It was really this discussion about agile principles that gave me sort of a boost.
So again – thanks for opening my mind and keep on doing this great work!
In my opinion, Karl and his team are the real heroes here. You can be a hero too by taming your software dependencies!