Refactoring Complexity
posted on: Apr 28, 2018

The behavior of a complex system is often complicated. When we implement a complex behavior in a program, its complexity is generally not recognized at first, so the behavior ends up in a single method. Gradually that method grows, gaining more lines, more parameters, and more temporary variables, until it is a monstrous mess. This post explains a few patterns for refactoring complicated methods.


Pattern 1: Composed Method
Problem
  • Your program is doing too much, at various levels of abstraction.
  • Also, the reader of the program cannot understand what the code is doing.
Solution
  • Divide your program into methods that perform one identifiable task.
    • Top-down: While writing a method, invoke several smaller, well-named methods, without having an implementation first.
    • Bottom-up: If a method is doing too much, extract pieces of its code into separate, well-named methods.
  • Keep all of the operations in a method at the same level of abstraction.
  • Give each method an intention-revealing name.
Benefits
  • The reader can read your program much more quickly and accurately if they can chunk the internal implementation details of the method into higher level abstractions.
  • Ease of maintenance and flexibility of future changes.
Trade-offs
  • Messages (method calls) take time. The more small methods you create, the more calls you execute. You could avoid that overhead by writing one giant method that does everything, but that approach has enormous human costs, and it ignores the fact that well-structured code is easier to performance-tune where it actually matters.
  • Following the flow of control in programs with many small methods can be difficult.
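
As a rough illustration of the solution above, here is a minimal sketch of a composed method. The Order and OrderService types, their members, and the receipt format are all hypothetical names invented for this example; they do not come from any real codebase. The point is only that Checkout reads as three steps at the same level of abstraction.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types, invented for illustration only.
public class Order
{
    public List<int> ItemPricesInCents { get; } = new List<int>();
    public string CustomerEmail { get; set; } = "";
}

public class OrderService
{
    // Composed method: each line is one well-named step,
    // and all steps sit at the same level of abstraction.
    public string Checkout(Order order)
    {
        Validate(order);
        int total = CalculateTotal(order);
        return BuildReceipt(order, total);
    }

    // Rejects orders that cannot be checked out.
    private static void Validate(Order order)
    {
        if (order.ItemPricesInCents.Count == 0)
        {
            throw new ArgumentException("Order has no items.");
        }
    }

    // Sums the item prices.
    private static int CalculateTotal(Order order)
    {
        return order.ItemPricesInCents.Sum();
    }

    // Formats a one-line receipt (format is an assumption of this sketch).
    private static string BuildReceipt(Order order, int total)
    {
        return "Receipt for " + order.CustomerEmail + ": " + total + " cents";
    }
}
```

A reader can understand Checkout without opening any of the helpers, and each helper can change independently of the others.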


Pattern 2: Method Object

Some methods are so complicated that different segments of the code need access to the same variables. The local variables are so intertwined that you cannot cleanly extract methods.

If you apply Composed Method and extract parts of the code into different methods, it only obscures the situation. Since every part of such a method generally needs most of the temporary variables and parameters, any piece you break off ends up requiring six or eight parameters.

Problem
  • You have a method that does not simplify well with Composed Method.
  • How do you refactor a method where many lines of code share many arguments and temporary variables that are hard to isolate from each other?
Solution
  • Transform the method into a separate class so that the local variables become fields of the class. Ideally, the name of this class is derived from the action it performs, e.g. Processor, Executor, etc.
  • Then you can split the method into several methods within the same class.
  • Replace the body of the original method in the original class by creating a method object and calling its executor method.
Benefits
  • Isolating a long method in its own class prevents a method from ballooning in size.
  • This also allows splitting it into methods on the new class, which have shared access to the class instance variables.
Trade-offs
  • You have to add a new class, which might add to the complexity.
  • If the method you are trying to refactor is not hard to understand and the code is not duplicated, even though it’s big, leave it.
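
To make the solution above concrete, here is a minimal sketch of a method object. The Invoice and InvoiceTotalCalculator types and the discount and tax rules are hypothetical, chosen only to show intertwined locals (subtotal, discount, tax) becoming fields so that the extracted steps need no parameters.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types, invented for illustration only.
public class Invoice
{
    public List<int> LineAmountsInCents { get; } = new List<int>();
    public int DiscountPercent { get; set; }
    public int TaxPercent { get; set; }

    // The original long method now just creates the method object and runs it.
    public int TotalInCents()
    {
        return new InvoiceTotalCalculator(this).Compute();
    }
}

// Method object: the former local variables are now fields,
// shared by all of the small methods below.
public class InvoiceTotalCalculator
{
    private readonly Invoice invoice;
    private int subtotal;
    private int discount;
    private int tax;

    public InvoiceTotalCalculator(Invoice invoice)
    {
        this.invoice = invoice;
    }

    // The "executor" method: runs the steps in order.
    public int Compute()
    {
        SumLines();
        ApplyDiscount();
        ApplyTax();
        return subtotal - discount + tax;
    }

    private void SumLines()
    {
        subtotal = 0;
        foreach (int amount in invoice.LineAmountsInCents)
        {
            subtotal += amount;
        }
    }

    // Discount is taken on the subtotal (an assumption of this sketch).
    private void ApplyDiscount()
    {
        discount = subtotal * invoice.DiscountPercent / 100;
    }

    // Tax is charged on the discounted amount (an assumption of this sketch).
    private void ApplyTax()
    {
        tax = (subtotal - discount) * invoice.TaxPercent / 100;
    }
}
```

Callers keep using Invoice.TotalInCents() as before; only the implementation moved. The cost is the extra class and the shared mutable fields, which is the trade-off noted above.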


Pattern 3: Query Method
Problem
  • You need to evaluate a complicated boolean expression that involves multiple conditions.
Solution
  • Provide a method that returns a boolean by evaluating the above complicated expression.
  • Give it an intention-revealing name, usually in the form of a question, e.g. IsDate(), HasMarkups(), etc.
  • If the functionality it’s trying to execute is a concern of the object itself, make it a property/method on the object itself, rather than the caller code.
Benefits
  • Simplify complicated boolean tests. Understanding complicated boolean tests in detail is rarely necessary for understanding program flow.
  • Putting such a test into a function makes the code more readable because
    • the details of the test are out of the way, and
    • a descriptive function name summarizes the purpose of the test.
  • As a result, both the main flow of the code and the test itself become clearer.
Trade-off
  • If the test is very simple, keep it inline.
Before:
public void Process(Payment payment) 
{
    if (payment.Amount > 0 && payment.Status != Status.Cancelled && payment.IsDue)
    {
        // Process the payment
    }
}
After:
public void Process(Payment payment)
{
    if (IsValid(payment))
    {
        // Process the payment
    }
}

// Create a new method in the caller. 
public bool IsValid(Payment payment) 
{
    bool isValid = payment.Amount > 0 && payment.Status != Status.Cancelled && payment.IsDue;
    return isValid;
}

We can improve this code further. Since validation is really a concern of the Payment itself, add an IsValid property to the Payment class. In my opinion, this is more readable and flexible than the previous approach. Now, whenever a module needs to validate a payment, it can simply read the payment's IsValid property instead of copying the existing implementation.

public void Process(Payment payment)
{
    if (payment.IsValid)
    {
        // Process the payment
    }
}

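class Payment
{
    // Amount, Status, and IsDue are the Payment members used in the earlier examples.
    // IsValid is a read-only property, so it takes no parameters.
    public bool IsValid
    {
        get
        {
            return Amount > 0 && Status != Status.Cancelled && IsDue;
        }
    }
}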