The First Step in Legacy Code Remediation
Most of the time we work with existing code, rather than writing completely new applications. Much of the code could benefit from some improvement, to make it more readable, more cohesive, less tightly coupled, easier to isolate, and so forth.
That sort of clean-up calls for refactoring, or changing the structure (design) of code without changing its externally-visible behavior.
To improve in an applied skill, it’s helpful to practice that skill in a mindful way. If we only apply the skill in the moment when we need it “for real,” it’s hard to improve.
We know how to practice writing new or “greenfield” code. We do it like this. But can we practice remediating legacy code?
Well, sure we can. Here are a few examples of legacy code remediation exercises easily found online:
- Refactoring Katas: Marco Emrich
- Loop to pipeline: Martin Fowler
- SOLID: Florin Preda
- Yahtzee: Jon Jagger
We use similar exercises when screening technical job candidates. Here’s the starter code for a Java refactoring exercise: Java Legacy. And here’s a sample solution and documented walkthrough for it: Java Legacy solution.
The starter code for that exercise is designed to contain a number of code smells. We ask the candidate to clean up the code.
That’s probably a little unfair. In real life, you wouldn’t approach a code base and start ripping it apart, just like that. Instead, you’d perform incremental refactoring in the normal course of making modifications to the application.
When we ask people to clean up the code in an open-ended way, they aren’t sure where to begin or end. On the other hand, for purposes of screening job candidates, we want to find out how they think. So maybe it’s fair enough in context.
In any case, it’s possible to practice refactoring, just as it’s possible to practice test-driven development, or karate, or music, or cooking. You wouldn’t try to learn self-defense techniques during a mugging. You wouldn’t play a piece of music for the first time ever during a concert. You wouldn’t learn how to boil water while preparing dinner for your boss. By the same token, there’s no need to try to figure out refactoring while you’re in the middle of fixing a bug or enhancing a feature.
Some of the refactoring exercises you find online (especially Fowler’s) lead you by the hand through a specific series of moves. That’s not necessarily the only way to solve the problem. It’s only one possible approach. That’s true of the sample solution to our own exercise as much as it is of any other.
When you’re looking at a “real” piece of code, no one is going to lead you by the hand. You have to figure out what to do. It’s interesting to see what people do when we ask them to clean up the messy code, and they don’t have a User Story to tell them which parts of the code are interesting at the moment.
Some people are reluctant to change any of the code unless they have a chance to talk to the original author, or the team’s technical lead, or the application architect. Others are on the opposite end of the spectrum: They dive right in, changing anything and everything that looks “funny” to them, without validating any of their assumptions.
Many candidates chase squirrels. They notice one code smell, start to remediate it, and notice a second code smell. They stop what they were working on and address the second code smell. Then they notice a third. And so on.
Get your bearings
I suggest the first step is to get your bearings. If there’s a User Story, it will tell you what application behavior is supposed to be modified or added. You only need to refactor code that’s relevant to that behavior. Even if you see another code smell that seems more egregious to you, your task is to complete the User Story, not to fix everything you don’t like about the code base.
If there isn’t a User Story, and you’ve been asked to “fix stuff” in an open-ended way, as we do when we’re screening job candidates, then you need some sort of rational approach.
When considering job applicants we don’t have any particular approach in mind, just so long as it’s rational. You’d be surprised how many people have no approach at all, but just latch onto whatever bit of code happens to catch their attention. Then they spend the next 90 minutes working on a relatively trivial issue in the code.
One rational approach is to look for low-hanging fruit. The first code smell noted in our sample solution for the Java remediation challenge is a worthless class-level comment. Is that such a horrible thing? Maybe not, but it’s an obvious fix, quick and easy. And there’s nothing like a quick win to get your confidence up.
Another approach is to look for problems that have a relatively large impact. Our Java remediation exercise comes with JUnit test cases. We notice whether the job candidate pays attention to the test cases. An XP practitioner would normally start to explore a code base by running whatever executable test cases were available.
A person who doesn’t have a baked-in test-oriented mindset might not even think to run the test suite (tsk, tsk). And indeed, many candidates never even look in the project’s test directory.
If you ran the tests, you’d see that the test cases itFindsAddisonTexasBy5DigitZipCode() and itFindsMaranaArizonaBy9DigitZipCode() take far longer to run than the rest of the test suite. Investigating further, you’d discover that the setZipCode() method in JobApplicant calls an internet-based service to look up the city name based on the zip code.
There’s no way to isolate this call from the rest of the method. So there’s a potential starting point for refactoring. Another rational approach.
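One way to act on that observation is to put the lookup behind an interface, so that tests can substitute a fast in-memory stand-in for the web call. The sketch below is not taken from the exercise or its sample solution. The CityLookup interface, the FixedCityLookup stub, and this simplified shape of JobApplicant are all assumptions made for illustration.

```java
// Hypothetical sketch: CityLookup, FixedCityLookup, and the fields shown here
// are illustrative assumptions, not the exercise's actual code.

/** Seam for the zip-code-to-city lookup, so tests can avoid the network. */
interface CityLookup {
    String cityFor(String zipCode);
}

class JobApplicant {
    private final CityLookup cityLookup;
    private String zipCode;
    private String city;

    // The lookup is handed in from outside instead of being created inline.
    JobApplicant(CityLookup cityLookup) {
        this.cityLookup = cityLookup;
    }

    void setZipCode(String zipCode) {
        this.zipCode = zipCode;
        // The slow call is now a collaborator that a test can replace.
        this.city = cityLookup.cityFor(zipCode);
    }

    String getZipCode() {
        return zipCode;
    }

    String getCity() {
        return city;
    }
}

/** In-memory stand-in for unit tests; always answers with the same city. */
class FixedCityLookup implements CityLookup {
    private final String city;

    FixedCityLookup(String city) {
        this.city = city;
    }

    @Override
    public String cityFor(String zipCode) {
        return city;
    }
}
```

With a seam like this, a unit test could construct new JobApplicant(new FixedCityLookup("Addison")) and check the zip code handling without touching the network. The production code would pass in an implementation that wraps the real service.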
Ask me no questions and I’ll tell you no lies
An almost-universal mistake job candidates make is that they don’t ask questions. At the outset, we tell them we (whichever of us is facilitating the session) can serve as the subject matter expert or business analyst or Product Owner or whatever, if there are questions about the intent of the code. Most people never ask a single question.
One person saw the variable named segundoNombre and immediately changed it to middleName, citing as his rationale that names should be consistent. He was right: Names should be consistent.
He didn’t ask the SME why segundoNombre was in the code in the first place. Had he asked, he would have learned that Spanish names don’t have a “middle name.” Segundo nombre is the correct domain term.
In a “real” application, assuming it followed the principle of using ubiquitous language, there would be documented standards about that name, such as:
- Display: Segundo Nombre
- Short Display: Seg. Nom.
- Database column: SEGUNDO_NOMBRE
- Java variable: segundoNombre
- Ruby variable: segundo_nombre
That person spent quite a while on the problem of renaming the variable. The issue with that part of the code is not the variable names. It’s the way Spanish names are jammed into the code without using Java’s internationalization support. A software archaeologist would immediately perceive that support for Spanish names had been added at some point after the application had been released, without considering how to incorporate support for internationalized names. This person never saw that problem, because he was preoccupied with the variable name, and he never asked for clarification.
Basic refactorings comprise a single step. Most fully-featured IDEs include support for the most common single-step refactorings via a keyboard shortcut, a context menu, or a selection from the main menu. Multi-step refactorings comprise a series of simple refactorings (and sometimes a little bit of manual editing) performed in a specific sequence so as to maintain safety.
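As a small illustration of a single-step refactoring, here is a before-and-after sketch of Extract Method. The class and method names are hypothetical and do not come from the exercise. The point is that the structure changes while the externally visible behavior stays the same.

```java
// Hypothetical example; none of these names come from the exercise.

/** Before: the null-and-trim logic is written out twice, inline. */
class NameFormatterBefore {
    String format(String first, String last) {
        String f = first == null ? "" : first.trim();
        String l = last == null ? "" : last.trim();
        return (f + " " + l).trim();
    }
}

/** After one Extract Method step: same behavior, duplication removed. */
class NameFormatterAfter {
    String format(String first, String last) {
        return (clean(first) + " " + clean(last)).trim();
    }

    // The extracted method gives the repeated logic one name and one home.
    private String clean(String value) {
        return value == null ? "" : value.trim();
    }
}
```

Most IDEs can perform this step automatically. A multi-step refactoring would chain several moves like this one, ideally re-running the tests after each step.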
It’s easy to find exercises to help you practice the sequence of steps to follow for frequently-used refactorings. But don’t overlook the very first step in the process: Think about what needs to be done.