A lot has been said about good and bad design decisions all over the web, and I hardly think I could add anything new to the list of code smells. Yet they still keep popping up in every single piece of legacy code I come across. My theory is that many of these patterns do not simply arise as the evil doings of a single bad programer, but more as a result of collaboration between brilliant programers, who just fail to realize the problem early. If you look at it that way it becomes apparent that bad designs are inevitable in the long run, and so eventually they need to be fixed.
Or do they? Well... it all depends of course. There is nothing more harmful than putting countless hours of work into changing code for the sole purpose of aesthetics. I'm an idealist at heart, and I'd love to spend all the time I can find on prettifying the existing code base before I ever change behaviour, but the reality is that beautiful code won't earn you a single penny unless it delivers business value. On the other hand we all know that bad code can hurt us big time: not only makes it expensive to add behaviour, it also gets hard just to keep things from falling apart.
Are you confused yet? Should you refactor or even rewrite that piece of ugly spaghetti code tomorrow morning, or not? Let's say that by default you should never touch legacy code that works as expected. (It gives me enormous pain to even think that this is a good idea...) Accepting the premise here is a list of the most important exceptions to this rule, when you should most definitely become obsessed with cleaning your code.
Change without tests
When you do not have tests for a code base you need to be very cautious. In this case I would suggest changing as little as possible. If you can not easily create a trustworthy set of characterization tests and you only make changes rarely you should keep things as they are. Checking in refactored code here can only get you in trouble.
However if it helps you with understanding a part of the code that you are about to change you may want to try exploratory refactoring. The catch is that what you learn from this unsafe frenzy can help you with making the change with more confidence. If you can save the refactored code somewhere for reference do it, but do not check it into production.
On the other hand if you find yourself making lots of changes and bug fixes in a poorly written code base that you are absolutely unable to put under any kind of automated testing, then you should consider if it implements something strategically important that your business heavily relies on. If occasional malfunction during development and maybe shortly after release is acceptable, you should try to make the code testable by changing it as little as possible. Since you have no tests to help you I would suggest the use of pair programming. Your goal should be to get to a point where you have characterization tests in as few steps as possible.
If you have a part in your software so crucial that a malfunction is unacceptable you should consider rewriting that part, since it might only be a matter of time before a bug takes the company down for good. Now here is the thing about rewrite projects. They usually fail. The problem is that they take a lot of time, and during that time the team is not delivering value, at least not something that the business guys can see, feel or touch. The only way to reach success with a project like that is to keep it very focused on the strategic parts of the code, and to be very transparent about it. If the rewrite takes more than a few days, you need full buy in from business side, and for that you need to convince them, that the part you are rewriting is really that important and in an inherently unstable state. Having to convince business about the importance of a rewrite is also a good way to make sure that the part under discussion is in fact really important, and it's not just your disgust that makes you believe so.
The scenario described above should be extremely rare. Most of the time you'll be able to add at least characterization tests that can aid you with your refactoring, but you should always remember that you are still walking on a tightrope, and if you accidentally fall you may miss your safety net. If you are careful it becomes unlikely that you introduce bugs without noticing, but it's never as safe as refactoring code that has been developed with TDD from day one.
The problem with characterization tests is that they take time to create but still break easily forcing you to throw them away at some point. That ofcourse also means that staring to write characterization tests with no clear goal is a complete waste of time. Only add those test when you are about to make a larger change, and if you do add them, try to accomplish as much as you can while they are still there. If you had to hack in something not long ago, now is the time to clean up that part. If you see an abstraction lurking among the lines of a God class, now is the time to extract it, and add proper unit tests. However always time box these changes: if you can't finish a refactoring within the time you though was enough roll back, and try to move forward in other directions.
When using characterization tests to aid your change there are two important cases. The easy ones are behaviour preserving changes, that only add new functionality, but keep around the old behaviour with a similar interface. In these cases your characterization test can be used as a substitute for regression tests that keep you from breaking former features. The only thing you should be careful about is to avoid any duplication of behaviour among your characterization tests, so that you can adjust them to the new interface easily.
Fixing bugs or adding features that change existing behaviour with the aid of characterization tests is a lot more tricky. The problem is that characterization tests are designed to preserve all existing behaviour, which of course means that they conserve bugs just as well as they conserve features. At some point during your bug fix you will inevitably break your safety net so badly that it just can't be fixed. When I have to fix a bug that will break some of my characterization tests, I first clean up the code to a point where the bug fix looks like a simple change. Once I'm sure that it will not break something else I do the fix and throw away the tests. Even better if along the way you added some proper unit tests that can already replace the now defunct characterization test.
Perfect test coverage
This post is mostly about refactoring in legacy code base, but my list wouldn't be complete if I didn't share my thoughts on refactoring code that is under proper automated tests. In this case you should refactor by default. When I know that I have pretty good test coverage I rarely think twice before refactoring something that is in my way. But even in this case I would advise you against refactorging parts of the code that you would not have to touch otherwise. The main reason for this is that automated tests do not catch all errors, and if you do not make other changes in the neighbourhood then probably QA won't test anything around there either. Another reason why l'art pour l'art refactoring may hurt you, is that people usually have different preferences, and because of that teams tend to change things back and forth.
Finally there is the case of project wide refactorings, where you want to get rid of concepts that used to look like a good idea, but turned out to be like plague: it's painful, and infects everything. If a refactoring seems to force a change nearly everywhere, and it takes several days you should consider it a small rewrite. As with big rewrites this involves a risk of failure if the team is not entirely transparent about their plan towards business or they fail to communicate the business value in the refactoring. If you decide to do a refactoring that affects most of the application be sure to consult several people: possibly everyone who may be affected by your change, but most importantly warn QA about possible risks.
One thing that I learned the hard way, is that not even the slack time between tagging the new version and it's actual release is a good time for such big refactorings. It is a real pain to maintain two completely different versions of a code base until the next release when your project wide refactoring makes it into production. What can make this worse is not finishing before the new tasks are in: than you'll end up trying to sneak in the remaining steps of the refactoring into the first few iterations, making it a seemingly endless struggle.
All in all refactoring is desirable and inevitable. If you have good test coverage, and your refactoring takes only a few minutes and makes your changes a little easier, then do it. If you see a bad concept polluting a small number of classes, get rid of it while it's easy to do so. On the other hand, when refactoring legacy code, or doing project wide refactoring that takes several days, always consider the cost and risk involved and compare that to the actual business value that is being delivered by going through with it.