Method Chaining vs Function Composition
Part of an occasional series on my move away from OO programming
Summary: using method chaining might look fluent and elegant, but it adds coupling while reducing design flexibility. For the last ten years I have been moving away from chaining, instead using various kinds of function composition. Let me tell you why I find it better.
The Appeal of Method Chains
Here’s a turtle graphics program that draws a square. It’s written in some arbitrary OO language.
turtle
.pen_down()
.move(1)
.turn(90)
.move(1)
.turn(90)
.move(1)
.turn(90)
.move(1)
.pen_up()
It’s a kind of Fluent Interface, a mini-Domain Specific Language. It’s made possible because each of the actions is a method defined in the Turtle class, and each does its business and then returns that same object.
def pen_down
# move the pen…
return self # (or this, or …)
end
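As a rough sketch of how such a class might hang together, here is a small Ruby turtle. Everything beyond pen_down, pen_up, move, and turn is an assumption made for illustration: the log attribute, the coordinate system, and the choice to record points rather than draw them are all invented here.

```ruby
# Hypothetical Turtle: records the points it visits instead of drawing.
class Turtle
  attr_reader :log

  def initialize
    @x = 0.0
    @y = 0.0
    @heading = 0        # degrees; 0 means "east" (an assumption)
    @pen = :up
    @log = []           # invented for this sketch: points visited with pen down
  end

  def pen_down
    @pen = :down
    self                # returning self is what lets the next call chain
  end

  def pen_up
    @pen = :up
    self
  end

  def move(distance)
    radians = @heading * Math::PI / 180
    @x += distance * Math.cos(radians)
    @y += distance * Math.sin(radians)
    # round to hide floating-point noise from the trig functions
    @log << [@x.round(6), @y.round(6)] if @pen == :down
    self
  end

  def turn(degrees)
    @heading = (@heading + degrees) % 360
    self
  end
end

turtle = Turtle.new
turtle.pen_down.move(1).turn(90).move(1).turn(90).move(1).turn(90).move(1).pen_up
p turtle.log            # the four corners visited while the pen was down
```

Each method mutates the turtle and hands the same object back, so the chain is really a sequence of state updates written as one expression.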
There’s no law that says it has to be the same object that is returned by each method:
File # → File class
.open("data.dat") # → File object
.read_line() # → string1
.trim() # → string2
.split(/\s+/) # → array of string
This is the method chain, and it has been a staple of OO development pretty much from the start.
I love them, and use them every day.
But the lure of the fluent-style method chain can lead developers down some pretty dangerous paths. Let’s look at a couple of these before considering an alternative.
“I Can Make It Like English”
(or French, German, Spanish…)
Testing is a big deal in the Ruby world. There are two schools: traditional and behavior-driven. Both use assertions to check expected values against the values produced by code. In the traditional camp, we have tools such as minitest:
assert(update_complete)
assert_equal(7, add(2, 5))
refute(count > 5)
In the BDD classrooms, they teach tools such as RSpec:
expect(update_complete).to be(true) # or be_truthy
expect(add(5, 2)).to eql(7)
expect(count).to_not be > 5
Both styles achieve the same thing, but RSpec attempts to look more like a specification, more like natural language. And it is pretty clever.
But, for the same reason there’s an (increasingly shallow) uncanny valley between artificial and real images, RSpec’s attempts at fluency leave me feeling uncomfortable.
Every test starts expect(expression).to (or .to_not). The .to method is there largely as window dressing; it gives us a semblance of English-like syntax. It’s followed by the actual assertion: a verb and possibly the expected value. RSpec calls these parts matchers.
They’re trying to be fluent, but failing. Like AppleScript, it’s fairly intuitive to read, but it is anything but to write. Why the space after to? (Because the matcher is actually a parameter.) Why be(true) but eql(7)? (Because these are two of the three different kinds of Ruby equality.) And why the split infinitive in the last case? Who knows?
I believe that part of the reasoning for this was to allow less technical people to read and agree on the tests. It may actually have happened that a manager peeked at this code, at some point, somewhere. But my guess is that this worked about as well as assuming managers could read COBOL because of its English-like syntax. The barrier that needs to be crossed is not one of syntax; it’s one of abstraction, of thinking like a developer.
Programming is all about layering abstractions, and method chaining is one way of doing that. But it can come with a high price.
Homogenous Chains Good,…
For a method chain to work, the methods that form the links must each return an object that implements the subsequent method in the chain. In
a().b().c()
the method a must return an object that implements a method b, and that method in turn must return an object that implements c.
There is a style of method chain that makes this easy. It’s when you start with some initial object that has a bunch of methods that can be used to update its state. Each of these methods returns the initial object, allowing these calls to be chained. Sometimes this is called a builder or a constructor. The turtle graphics example at the top of this article is an example of this:
turtle
.pen_down()
.move(1)
.turn(90)
The turtle object has an initial state, and methods such as pen_down and move update that state. The chain is homogenous; it’s turtles all the way down.
Another example of this is the ActiveRecord query builder:
Book.where("id > 100").limit(20).order("id desc")
Each of these methods adds to the query structure without actually executing any SQL; that SQL is only generated when you try to access data from the query.
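To make the "build now, generate SQL later" idea concrete, here is a minimal sketch of a deferred query builder. It is not how ActiveRecord is implemented; the Query class, its instance variables, and the to_sql method are hypothetical stand-ins, and the generated SQL is deliberately naive.

```ruby
# Hypothetical query builder: each call records a clause and returns self.
class Query
  def initialize(table)
    @table = table
    @wheres = []
    @limit = nil
    @order = nil
  end

  def where(condition)
    @wheres << condition
    self
  end

  def limit(count)
    @limit = count
    self
  end

  def order(clause)
    @order = clause
    self
  end

  # Nothing is turned into SQL text until someone asks for it.
  def to_sql
    sql = "SELECT * FROM #{@table}"
    sql += " WHERE #{@wheres.join(' AND ')}" unless @wheres.empty?
    sql += " ORDER BY #{@order}" if @order
    sql += " LIMIT #{@limit}" if @limit
    sql
  end
end

query = Query.new("books").where("id > 100").limit(20).order("id desc")
puts query.to_sql
```

Because every link returns the same kind of object, the calls can be made in any order and the chain still works.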
But there’s another style of chain, where we’re striving for some kind of fluent API, but we have to bend the code to make it work.
Heterogeneous Chains Bad
Another common use of chains is to traverse a bunch of objects, either to extract some particular piece of data or to execute some particular functionality.
Here we have a line of code in the fulfillment module trying to work out the charge for shipping based on a client’s address:
client.address("home").postcode().delivery_zone().shipping_cost(package)
We can get all academic and call this a violation of the supposed Law of Demeter, or we can just be pragmatic and say that this code has just increased the coupling in your application in manifold ways. Let’s see how.
Chains That Bind
Let’s look at dependencies.
We can infer that our client object depends on an Address class, which in turn depends on a PostCode, and so on.
This is a perfectly fine set of dependencies: if the PostCode class implementation changes, then that might affect the Address class, but that’s all (at least in this picture).
But add in our Fulfillment class, and the picture changes:
Our Fulfillment class now depends on all five of the other classes: a change in any of them could potentially cause the need to change Fulfillment. Maybe we decide that the PostCode class has no business knowing about its delivery zone, and we move that functionality somewhere else. Fulfillment breaks.
One way to fix this is to say that Fulfillment only depends on Client, and that Client must implement a shipping_cost() method. If we apply that same logic throughout, we’d also have to add a shipping_cost() method to Address, PostCode, and DeliveryZone as well. That’s clearly untenable.
Functions and Composition
One way out of this mess is to stop insisting that state and functions travel together: we no longer have to invoke methods only on objects. We’ll call these liberated methods functions, as they transform one or more parameters into a new value. This lets us write something like this:
shipping_cost(delivery_zone(postcode(address("home", client))), package)
Of course, this particular cure is worse than the disease; these kinds of nested calls are a nightmare to read and maintain.
This is where function composition and pipelines come into play.
In a functional language such as OCaml, function composition creates a new function that is the result of running the first, then passing the result to the second. We can create a function that does composition; we’ll call it >>.
let ( >> ) f g x = g (f x)
This says that >> is a function that takes three parameters: two functions, f and g, and a value x. It passes x to f, and then passes the result of that to g. We can use this function to build a new function:
let plus1 n = n + 1
let square n = n*n
let plus1_and_square = plus1 >> square
plus1_and_square 10 → 121
In this example, plus1_and_square is a new function that is the composition of plus1 and square.
We can parameterize these composite functions:
let plusn n = (+) n
let plusn_and_square n = plusn n >> square
plusn_and_square 3 2 → 25
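The same idea can be tried in Ruby, which has had a built-in >> on Method and Proc objects since version 2.6. The sketch below mirrors the OCaml examples; the method names are carried over from them, and wrapping the compositions in methods is just one convenient way to arrange it.

```ruby
def plus1(n)
  n + 1
end

def square(n)
  n * n
end

# Method#>> composes left to right: plus1 runs first, then square.
def plus1_and_square
  method(:plus1) >> method(:square)
end

# plusn returns a lambda that adds n to its argument.
def plusn(n)
  ->(x) { x + n }
end

# A parameterized composition: build the adder, then compose it with square.
def plusn_and_square(n)
  plusn(n) >> method(:square)
end

puts plus1_and_square.call(10)    # prints 121
puts plusn_and_square(3).call(2)  # prints 25
```

Nothing is computed until call is invoked; the composition itself is just a new callable value.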
Now we can create a new function that creates a function that calculates the shipping for our customer and package:
(* compose the steps into one function, then apply it to client *)
let shipping_client client package =
  (address "home"
   >> postcode
   >> delivery_zone
   >> shipping_cost package) client
Using that, I can create a function that calculates the shipping cost of delivering any package to me:
let shipping_to_dave package = shipping_client "pragdave" package
or, more concisely
let shipping_to_dave = shipping_client "pragdave"
See what we’ve done? We created a chain of functions which can be parameterized, and each is independent of the others.
And There Are Pipelines
Your language du jour may not support function composition, but an increasing number of languages support something similar, the function pipeline. Using the OCaml and Elixir notation, we can say
value |> f |> g
takes value, pipes it into the function f as a parameter, then pipes the value returned by f into g. It is similar to function composition, but it executes immediately. Using pipelines, we could write our shipping code as
client
|> address "home"
|> postcode
|> delivery_zone
|> shipping_cost package
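Ruby has no |> operator, but Object#then (Ruby 2.6 and later) gives a similar left-to-right flow. The following is only a sketch: the Shipping module, the hash-based client and package, the postcodes, and the rates are all invented for illustration, and only the four function names come from the example above.

```ruby
# Hypothetical data shapes and rates, made up for this sketch.
module Shipping
  ZONES = { "75001" => :local, "99501" => :remote }.freeze
  RATE_PER_KG = { local: 2.0, remote: 5.0 }.freeze

  module_function

  def address(kind, client)
    client.fetch(:addresses).fetch(kind)
  end

  def postcode(address)
    address.fetch(:postcode)
  end

  def delivery_zone(postcode)
    ZONES.fetch(postcode)
  end

  def shipping_cost(package, zone)
    package.fetch(:weight_kg) * RATE_PER_KG.fetch(zone)
  end

  # Each step is an independent function; `then` feeds one result to the next.
  def for_client(client, package)
    client
      .then { |c| address("home", c) }
      .then { |a| postcode(a) }
      .then { |p| delivery_zone(p) }
      .then { |z| shipping_cost(package, z) }
  end
end

client  = { addresses: { "home" => { postcode: "99501" } } }
package = { weight_kg: 3 }
puts Shipping.for_client(client, package)   # prints 15.0
```

The then calls are themselves a method chain, but a harmless one: every link is the same generic method, so the only thing the pipeline knows about is the functions passed to it.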
Try This…
I’ve found that this kind of decoupling of functions from state pays enormous dividends, to the point where I now write my code in this style even if I’m writing in an OO language. Typically I’ll use objects or read-only structs to hold state, and then I’ll write independent functions to transform that state. If the language doesn’t have a pipeline equivalent, I’ll either use temporary variables or a loop and a trivial counter-based state machine to chain together the function calls.
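Here is one way that style might look in Ruby, as a sketch under stated assumptions: TurtleState, TurtleOps, and run are hypothetical names, headings are limited to multiples of 90 degrees, and reduce stands in for the hand-written loop mentioned above.

```ruby
# Read-only state: every function returns a new frozen struct.
TurtleState = Struct.new(:x, :y, :heading)

module TurtleOps
  # Unit steps for the four supported headings (an assumption of this sketch).
  STEPS = { 0 => [1, 0], 90 => [0, 1], 180 => [-1, 0], 270 => [0, -1] }.freeze

  module_function

  def move(state, distance)
    dx, dy = STEPS.fetch(state.heading)
    TurtleState.new(state.x + dx * distance, state.y + dy * distance, state.heading).freeze
  end

  def turn(state, degrees)
    TurtleState.new(state.x, state.y, (state.heading + degrees) % 360).freeze
  end

  # Thread the state through a list of [function_name, argument] pairs.
  def run(state, steps)
    steps.reduce(state) { |current, (name, arg)| public_send(name, current, arg) }
  end
end

start = TurtleState.new(0, 0, 0).freeze

# Temporary variables: every intermediate state can be inspected.
s1 = TurtleOps.move(start, 1)
s2 = TurtleOps.turn(s1, 90)
s3 = TurtleOps.move(s2, 1)

# The same steps, driven by a list.
final = TurtleOps.run(start, [[:move, 1], [:turn, 90], [:move, 1]])
p s3 == final   # prints true
```

Because no function mutates its input, start is still at the origin afterwards, and each step can be tested by itself.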
It takes getting used to, but if your experience is anything like mine, you’ll find yourself writing code that is easier to test (because each function is independent), maintain (because state is easy to examine at every step), and update.