How to Test Legacy Code: Best Practices to Follow
“Code without tests is bad code. It doesn’t matter how well written it is; it doesn’t matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don’t know if our code is getting better or worse.”
Michael C. Feathers, author of “Working Effectively with Legacy Code”.
Testing legacy code is somewhat of a punishment for most developers. How do you even begin creating tests for legacy code in a messy codebase?
In this article, we’ll help you understand how to approach legacy code testing. You’ll discover some of the most common problems you’ll come across while testing and refactoring legacy codebases and techniques to help you solve them. More importantly, we’ll share our best practices for automated testing of legacy codebases and reveal the issues you may face with legacy code testing. So, let’s begin!
Why Testing Legacy Code Is Different from Testing Regular Code
In software QA terms, legacy code is defined as code that doesn’t have a suite of automated tests. It may initially seem like ‘untestable code’: difficult to modify because of the high risk of introducing vulnerabilities and critical bugs.
When writing tests for untested code, you often have three options: automated testing, manual testing, and unit testing. In manual testing, test cases are executed by hand, without any automated tools. With unit testing, you write tests for the smallest units of your code; this approach is lengthy and not the best suited to legacy code.
Automated testing of legacy code requires you to use particular software tools to automate the process of reviewing and validating code. The goal is to make legacy code testable, maintainable, and scalable — with minimal effort.
Why is testing legacy code different from testing the systems we’re currently building?
- Legacy codebases have very low test coverage. This means the chances of breaking something are very high.
- Hard-wired dependencies: Legacy code often has loads of classes, methods, and control structures that were not written to be tested, which is one of the thorniest disadvantages of legacy systems. Isolating dependencies becomes next to impossible.
- Poor or no code modularization, so it’s hard to mock out parts.
- The code is very bulky, so unit testing it is a bad idea or simply won’t work.
- You might be required to change the production code in order to add the tests. Doing this adds risk because you may end up introducing a bug when manipulating code manually.
Regardless of the challenges, testing legacy applications is very possible.
8 Best Practices for Legacy Code Testing
When developers are asked to write tests for some legacy codebase, they will be writing tests as they would with regular code, but with a few key changes. Here are some of the best practices to be adopted when writing tests on legacy code. Also covered are techniques used when approaching such code.
1. Practices developers should give up from the start
- Desire for 100% test coverage. You’re not testing regular code, so don’t focus on reaching 100% coverage. It’s almost impossible! Good or even medium coverage is better than nothing.
- Legacy code unit testing: Legacy codebases are generally huge and complex. Instead of trying hard to test a particular loop of a certain method of a certain class, focus on “big” tests like integration and functional tests.
- Frequent context switches: Instead of alternating between test code and production code as you normally would, get all your tests passing without changing the production code.
2. Get the build and existing tests working
Getting your system build and existing tests working is essential. To do this, you must first evaluate the current state of your codebase.
- What tests already exist?
- What code is covered by these tests and how much coverage does that provide?
- When do these tests run?
Make sure all existing tests, if any, are up and fully working first. Then, write your first tiny test. It doesn’t matter whether it’s on a getter or a toString method; just write that first test. The point is to get the build system set up. But make sure not to eliminate failing tests until you understand why they’re failing.
Think of what to test next. Maybe the main method. It could be a little messy to do so, but it’s sometimes a good place to start if you can. Testing the main method may bring up threads, GUI, etc.
In a healthcare management system, for example, a test on the main method would check the code’s functionality end to end using JUnit.
Whatever method or class is convenient, write your first tests. The simpler, the better. The goal is to begin the transformation by raising the coverage one step at a time.
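As a rough illustration of a first tiny test plus a broad test on the entry point, here is a minimal sketch in Python; the same shape carries over to JUnit. The `Patient` class and the `main` function are hypothetical stand-ins for legacy code, not part of any real system.

```python
import io
import unittest
from contextlib import redirect_stdout


class Patient:
    """Hypothetical legacy class; stands in for real production code."""

    def __init__(self, name, patient_id):
        self.name = name
        self.patient_id = patient_id

    def __str__(self):
        return f"Patient({self.patient_id}, {self.name})"


def main(argv):
    """Hypothetical entry point of the legacy application."""
    patient = Patient(argv[0], int(argv[1]))
    print(f"Registered {patient}")
    return 0


class FirstTests(unittest.TestCase):
    def test_patient_to_string(self):
        # The first tiny test: trivial, but it proves the build and runner work.
        self.assertEqual(str(Patient("Ada", 7)), "Patient(7, Ada)")

    def test_main_runs_end_to_end(self):
        # A broad test: call main and check its exit code and console output.
        out = io.StringIO()
        with redirect_stdout(out):
            exit_code = main(["Ada", "7"])
        self.assertEqual(exit_code, 0)
        self.assertIn("Registered Patient(7, Ada)", out.getvalue())


def run_first_tests():
    """Run the tests above and report whether they all passed."""
    suite = unittest.TestLoader().loadTestsFromTestCase(FirstTests)
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return result.wasSuccessful()
```

Neither test says much about correctness yet. Their value is that the test runner, the build, and the application entry point are now wired together.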
3. Go big
With legacy code, it’s better to see the bigger picture and look at the tests from a wider perspective. You must write tests for legacy code on a larger scale. Avoid tiny unit tests (class -> method -> loop -> if statements…). Instead, write broad tests for particular tasks, such as GUI tests or integration tests that verify large processes.
More importantly, prioritize test cases. In general, test case prioritization is the process of determining which tests are most important to execute first. It is usually performed in order to improve the efficiency or effectiveness of testing. The goal is to maximize the probability of detecting a defect or other failure condition as early as possible.
The first thing to do is to identify the most important metrics for prioritization. And here is a list of some of them:
- How critical is the feature to be tested? If a feature is key and does not work, give it a higher priority.
- Test case execution time (the less, the better).
- Test case code coverage (the more, the better)
- Number of test cases passed (the more, the better)
- Risk associated with the feature or bug. Risks can impact your ability to deliver new features or fix bugs quickly. Prioritize tasks based on the impact they have on your business.
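One way to combine the metrics above is a simple weighted score. The sketch below is only an illustration: the `CandidateTest` fields, the weights, and the sample values are all assumptions, and a real team would tune them to its own priorities.

```python
from dataclasses import dataclass


@dataclass
class CandidateTest:
    name: str
    criticality: int        # 1 (minor) to 5 (key feature), assigned by the team
    risk: int               # 1 (low) to 5 (high business impact)
    coverage: float         # fraction of code exercised, 0.0 to 1.0
    runtime_seconds: float


def priority_score(tc):
    """Higher is better: reward criticality, risk and coverage; penalize slow tests.

    The weights (2, 2, 5) and the per-minute penalty are arbitrary assumptions.
    """
    benefit = tc.criticality * 2 + tc.risk * 2 + tc.coverage * 5
    return benefit / (1 + tc.runtime_seconds / 60)


def prioritize(test_cases):
    """Return test cases ordered from most to least important to run first."""
    return sorted(test_cases, key=priority_score, reverse=True)


# Hypothetical candidates for a healthcare management system.
candidates = [
    CandidateTest("billing_end_to_end", 5, 5, 0.30, 120),
    CandidateTest("report_footer_layout", 1, 1, 0.02, 5),
    CandidateTest("patient_registration_api", 4, 4, 0.20, 30),
]
ordered = [tc.name for tc in prioritize(candidates)]
```

With these made-up numbers the registration test comes first: it is nearly as critical as the billing test but runs four times faster.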
Other examples of big-picture tests are tests that drive the UI or exercise APIs.
4. Implement testing within CI/CD pipeline
CI/CD pipelines ensure that new changes don’t break existing functionality and that automated tests are run on every check-in to the code repository. Continuous Integration (CI) helps organizations deliver new features faster.
The best way to test legacy code is to automate tests and run them in a Continuous Integration (CI) environment. Here’s how to do it:
1. Write a set of tests (test suite) for each class in your legacy application, using whatever language or tooling you prefer. If you’re using JUnit or another unit testing framework, this step should be easy enough. But if you’re using something more exotic, then writing those tests might be a bit trickier than just writing regular JUnit tests.
2. Make sure all of these tests pass before moving on to the next step. Don’t skip this! If even one of these tests fails, some aspect of your code needs fixing before you proceed. The goal is to create a test suite that ensures new changes don’t break existing functionality.
Tools: TeamCity, Jenkins.
5. Prepare the set of smoke tests
What if there was a way to figure out what parts of your legacy code are worth testing?
That’s where smoke tests come in. Smoke tests are basically like a snapshot of your application at a particular state. They let you see what needs testing and what doesn’t. They don’t test every possible scenario, just enough to get a general idea of how things work together.
A good smoke test suite is a must for any legacy code base. It’s a good practice to create such a set when you start working on a legacy project. You’ll use this to ensure that your changes don’t break anything in the system.
To prepare the set of smoke tests for your legacy code base, follow these steps:
1) Identify the main areas of application functionality.
For example, if your application is a healthcare management system, it will have several areas of functionality, such as patient registration, billing, and doctor scheduling. Identify each area separately because each may require different types of tests (for example, integration tests).
2) Create unit tests for all public methods in each area identified above (this should be done before doing anything else).
3) Create integration tests for all external dependencies (such as external services, databases, etc.).
Smoke tests aren’t meant for long-term use — they’re meant to give you an idea of where things are going wrong before writing more extensive tests or even working on the feature itself.
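A smoke suite along these lines can be as small as one shallow check per functional area. In the sketch below, the three area functions are hypothetical stand-ins for calls into the real system; only the shape of the runner is the point.

```python
def register_patient(name):
    """Hypothetical stand-in for the patient registration area."""
    return {"id": 1, "name": name}


def create_invoice(patient_id, amount):
    """Hypothetical stand-in for the billing area."""
    return {"patient_id": patient_id, "amount": amount, "status": "open"}


def schedule_doctor(patient_id, day):
    """Hypothetical stand-in for the doctor scheduling area."""
    return {"patient_id": patient_id, "day": day}


# One shallow check per functional area: does the happy path respond at all?
SMOKE_CHECKS = {
    "patient registration": lambda: register_patient("Ada")["id"] == 1,
    "billing": lambda: create_invoice(1, 50.0)["status"] == "open",
    "doctor scheduling": lambda: schedule_doctor(1, "Monday")["day"] == "Monday",
}


def run_smoke_tests(checks):
    """Run each check and return the names of the areas that failed or crashed."""
    failed = []
    for area, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a crash counts as a failed smoke check
        if not ok:
            failed.append(area)
    return failed
```

An empty result means every area at least responds; any names in the list point to where deeper tests should go first.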
6. Plan regression tests
The main problem with legacy code is that there may be no regression test suite. Regression tests are essential because they make sure that any changes to our codebase do not lead to code corruption or unexpected behavior changes.
There are two different types of regression tests:
Functional Regression Tests — These are the most common type of regression test and are also known as black-box or behavioral tests. Functional regression tests focus on testing an application or website against a set of functional requirements. A functional regression test is usually automated but can be manual if necessary.
If you have a large number of functional requirements, it’s not practical to write a test for every single one of them. Selecting which ones to automate will help you prioritize your effort and ensure that you cover the most important parts of the system first.
Acceptance Regression Tests — Acceptance testing is done by users and stakeholders to assess if their experience with the software meets their expectations and is working as expected.
Regression testing can also be performed manually, but automated tools are typically used because they are faster, more reliable, and easier to maintain over time than manual testing methods such as Excel spreadsheets.
7. Code coverage measurements
With legacy code, 10% code coverage is way better than 1% coverage. So another great practice is to run code coverage measurements to generate coverage reports that give you a clearer idea of what is yet to be done or what you’ve missed.
You want to make sure that the coverage you’ve obtained is well spread. That is, if you have 20% coverage, it should extend to the patient file management system and not just the consultation scheduling package, to use the healthcare management system example from above.
Tools: Emma, Cobertura, etc.
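To see why spread matters, consider how an overall figure can hide uncovered packages. The sketch below works on made-up line counts; the package names and numbers are hypothetical, and a real report would come from a coverage tool.

```python
def coverage_report(lines_by_package):
    """Compute overall and per-package coverage from (covered, total) line counts."""
    covered = sum(c for c, _ in lines_by_package.values())
    total = sum(t for _, t in lines_by_package.values())
    per_package = {name: c / t for name, (c, t) in lines_by_package.items()}
    return covered / total, per_package


def poorly_covered(per_package, threshold):
    """List packages whose coverage falls below the threshold."""
    return sorted(name for name, ratio in per_package.items() if ratio < threshold)


# Hypothetical numbers: (covered lines, total lines) per package.
counts = {
    "consultation_scheduling": (900, 1000),
    "patient_files": (50, 2000),
    "billing": (50, 1000),
}
overall, per_package = coverage_report(counts)
gaps = poorly_covered(per_package, threshold=0.10)
```

Here the overall figure is a respectable 25%, yet almost all of it comes from one package, and `gaps` flags the other two as the next places to add tests.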
8. Invest in technical documentation
What do developers dislike more than legacy code? Incomplete or missing documentation! That’s why your team must always document all the changes that have been made to the system.
Technical documentation can be an external document that explains the architecture of the system, or it can be embedded into your test code. The goal is to help future developers understand how to work with your code and reduce their learning curve.
Such documentation will help your developers understand how the application works and how the system interacts with other components, such as databases or servers.
Technical documentation is good practice in software development as a whole, but most companies write it on these occasions:
- When a feature needs to be rewritten or an existing feature has been modified;
- When adding new features and improving existing ones;
- When refactoring the codebase.
More importantly, your documentation will help keep track of what has been tested and what hasn’t, in case there are gaps in coverage. Yet the greatest pain point that documentation solves is the time and effort spent on maintenance and on developing new functionality.
Testing Legacy Code during Software Migration
Automated testing is a good fit for software migration projects. There are various techniques and approaches for moving your applications to modern environments, and each requires its own testing approach (roadmap). If you choose rewriting or re-architecting, which involves a lot of changes to the original system, you’ll get a real chance to restructure the software according to best practices.
We should also mention test-driven development (TDD) and behavior-driven development (BDD), both associated with effective software engineering. When building your solution from scratch, these methodologies are recommended, as developers and testers can start with a clean slate without breaking anything.
On the other hand, it’s almost impossible to use them with other software modernization approaches like refactoring or rehosting, where the risk of breaking the app is unusually high.
Most developers won’t touch your legacy codebase. It’s as they say… “very scary.” That’s where ModLogix comes in to offer you a cost-effective alternative to testing and refactoring legacy code. To get you started, you’ll receive a legacy platform code audit for your product.
How to Test Legacy Code: Avoiding Common Problems
One of the main goals of testing legacy codebases is to clean up code and avoid regressions — breaking things by accident. To give more insights into how to test legacy code, here are some of the most common problems you’ll face, and their solutions.
Hard-wired dependencies
This occurs when you have loads of classes that were not written to be tested, and you cannot isolate their dependencies. For example, a static class A references another class B that has special requirements, such as needing the system to be deployed on a web server.
Use seams to bypass the hard-wired dependency between the two classes.
“A seam is a place where you can alter behavior in your program without editing in that place.” Michael C. Feathers, author of “Working Effectively with Legacy Code”.
According to Feathers, there are different types of seams, such as preprocessing seams, link seams, and object seams. You can choose the most suitable one depending on the source code and the language.
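An object seam is probably the easiest of these to picture: the dependency is reached through a method that a test subclass can override. The sketch below is a minimal illustration in Python; `InvoicePrinter`, its `_send` method, and the print server it supposedly needs are all hypothetical.

```python
class InvoicePrinter:
    """Hypothetical legacy class with a dependency that cannot run in a test."""

    def _send(self, text):
        # Stands in for a hard-wired call that needs a deployed server.
        raise RuntimeError("needs the real print server")

    def print_invoice(self, patient, amount):
        text = f"Invoice for {patient}: {amount:.2f}"
        self._send(text)  # the seam: behavior changes without editing this line
        return len(text)


class FakeSendInvoicePrinter(InvoicePrinter):
    """Test subclass that overrides only the problematic dependency."""

    def __init__(self):
        self.sent = []

    def _send(self, text):
        # Record the text instead of contacting the server.
        self.sent.append(text)
```

The formatting logic in `print_invoice` is now testable through `FakeSendInvoicePrinter`, while the production class itself is left untouched.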
Problems with understanding elements of the codebase
Given that legacy code was often written by someone else, or a long time ago, you will likely have trouble understanding elements of the codebase.
If you’re wondering how to write unit tests for legacy code, that may be exactly what you should avoid; focus instead on the broader tests already mentioned above. Once you understand what the major sections of the code are meant to do, it becomes easier to write automated tests.
The characterization testing technique is another solution to the above problem. Here’s how it works for a particular code section under test…
- Assume that the code is providing the right output.
- Write a test that expects an arbitrary value, say ‘100’. See what value comes back in the test failure message. Then change your test so that it expects that value.
You can do this for all methods you’d want to test.
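The two steps can be sketched as follows. `legacy_discount` is a hypothetical routine standing in for opaque legacy code, and `probe` is an illustrative helper, not part of any test framework.

```python
def legacy_discount(age, visits):
    """Hypothetical opaque legacy routine whose intended behavior is undocumented."""
    rate = 5
    if age >= 65:
        rate += 10
    if visits > 3:
        rate += visits * 2
    return min(rate, 30)


def probe(func, *args, guess=100):
    """Step 1: compare against a deliberately arbitrary guess and report the mismatch."""
    actual = func(*args)
    if actual != guess:
        return f"expected {guess} but got {actual}"
    return None


def test_discount_for_senior_frequent_visitor():
    """Step 2: pin the observed value so any future change in behavior is flagged."""
    assert legacy_discount(70, 5) == 25
```

Calling `probe(legacy_discount, 70, 5)` reports the value the code actually returns, and that value is what the pinned test then asserts. The test documents current behavior, whether or not that behavior is correct.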
Should you test or debug?
You’ll face the choice of whether to write and run tests first or to debug the code straight away.
Troubleshoot the code to find the bug, and then decide. How do you decide? You should fix the bug if…
- It is simple, obvious, and local.
- You understand the code where the bug appears.
- You understand the fix.
Otherwise, spend time expanding your test suite.
What to test?
There’s a lot to test, so it can be confusing where to begin, what to prioritize, and what to ignore.
Focus on the main path of the application, not the edge conditions. Start with obvious inputs/test conditions (say 5!… for a factorial method). As the test suite expands, start looking at the edge cases — especially if you get a bug report.
You’ll still need tests for the main path of the application.
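For the factorial case, that ordering might look like the sketch below. The `factorial` function is a hypothetical stand-in for the legacy routine under test.

```python
def factorial(n):
    """Hypothetical legacy routine under test."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def test_main_path():
    # Obvious inputs first: the values everyday callers actually rely on.
    assert factorial(5) == 120
    assert factorial(3) == 6


def test_edge_cases():
    # Added later, as the suite grows or a bug report comes in.
    assert factorial(0) == 1
    assert factorial(1) == 1
```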
Without tests, code is as good as broken. So, by all means, get some tests written for whatever legacy codebase you’re working on.
In this article, you’ve discovered eight best practices for testing legacy code, along with techniques like seams and characterization tests. We’ve also explored some of the major problems that developers face during legacy code testing, plus solutions to deal with them.
But if you struggle with issues that are not described here, ask ModLogix for help to modernize old code with minimal effort from you.
Originally published at https://modlogix.com on July 14, 2022.