The silver bullet of collection pipelines: Functional composition
How simple functions help solve the problems with collection processing
The right way to process collections
In the previous episodes, we've seen that collection processing is both crucial and nontrivial to get right. We've already seen some potential solutions, and where they fall short.
In this post, we'll look into how concepts from functional programming can help solve this problem.
Pipeline composition
The basic idea is that a collection pipeline is ultimately a function that gets a collection and returns a new collection.
Something like this:
coll -> pipeline -> coll
Thinking about pipelines this way, we can merge two pipelines and still get a valid one:
coll -> pipeline1 -> pipeline2 -> coll
This is like UNIX's pipe
. pipeline1
gets the input collection, pipeline2
gets the output of pipeline1
,
and the final result is what comes out of pipeline2
.
Using composition, with only the fundamental pipeline functions, arbitrarily complex ones can be built on top of them.
Function composition
Let's translate this abstract concept into programming!
Since the pipelines are simple functions, merging is a function composition. Let's write a function that
gets two parameter functions (f
and g
) and returns a composite one! (Try it):
const flow = (f, g) => (arg) => g(f(arg));
The arg
is passed to f
, and it's result is passed to g
, and it's result is the final result.
A simple example is to add 1 to a number, then double it:
const composed = flow((i) => i + 1, (i) => i * 2);
console.log(composed(3)); // 8
But if we were to use this implementation to compose 3 or more functions, multiple flow
s would be needed:
flow(flow(f, g), h); // f -> g -> h
Instead, generalize the implementation to handle arbitrary amount of functions:
const flow = (...fs) => (arg) =>
fs.reduce((res, f) => f(res), arg);
And to use it:
flow(f, g, h); // f -> g -> h
Higher-order functions
Now that we can compose them, if we have the fundamental collection processing functions, arbitrarily complex ones
can be built from them. But how to modify our map
and filter
implementations to conform to the required
signature, while staying generic?
The solution is higher-order functions. This is when a function takes or returns another function.
Since the map
and filter
both need a collection
and an iteratee
, the key is to supply these parameters
in different runs. Since the pipeline function gets the collection, the iteratee
has to be the first one.
This is called "iteratee first, data last".
This way, the universality of the functions are retained, while making them compatible with the pipeline signature.
The modified map
implementation:
const map = (iter) => (coll) => {
const result = [];
for (const e of coll) {
result.push(iter(e));
}
return result;
}
And the filter
:
const filter = (iter) => (coll) => {
const result = [];
for (const e of coll) {
if(iter(e)){
result.push(e);
}
}
return result;
}
Putting it all together
Now we have all the pieces to make a collection pipeline (Try it).
To have a map
that adds three to every number:
const addThree = map((i) => i + 3);
And a filter
that keeps only the even numbers:
const filterOdds = filter((i) => i % 2 === 0);
And a composed pipeline that increments by three first, then drops the odd numbers:
const composed = flow(addThree, filterOdds);
And finally, to use it, just invoke it with the collection:
composed([5, 10, 15]); // [8, 18]
Of course, all these values are optional. The same pipeline:
flow(
map((i) => i + 3),
filter((i) => i % 2 === 0)
)([5, 10, 15]); // [8, 18]
Libraries
There are various libraries that embrace these concepts.
One is the Lodash/fp package (Try it):
_.flow(
_.map((i) => i + 3),
_.filter((i) => i % 2 === 0)
)([5, 10, 15]); // [8, 18]
Another one is Ramda (Try it):
R.pipe(
R.map((i) => i + 3),
R.filter((i) => i % 2 === 0)
)([5, 10, 15]); // [8, 18]
Conclusion
Functional composition has all the benefits of chaining, all while retaining the ability to easily extend it with
new functions. Adding a new one is just a matter of conforming to the interface of (coll) => coll
, making it a localized
change.
Also, unlike chaining, it has no magic. With a few lines of code, you can write all the required helper functions.