The silver bullet of collection pipelines: Functional composition

How simple functions help solve the problems with collection processing

Author's image
Tamás Sallai
4 mins

The right way to process collections

In the previous episodes, we've seen that collection processing is both crucial and nontrivial to get right. We've already seen some potential solutions, and where they fall short.

In this post, we'll look into how concepts from functional programming can help solve this problem.

Pipeline composition

The basic idea is that a collection pipeline is ultimately a function that gets a collection and returns a new collection.

Something like this:

coll -> pipeline -> coll

Thinking about pipelines this way, we can merge two pipelines and still get a valid one:

coll -> pipeline1 -> pipeline2 -> coll

This is like UNIX's pipe. pipeline1 gets the input collection, pipeline2 gets the output of pipeline1, and the final result is what comes out of pipeline2.

Using composition, with only the fundamental pipeline functions, arbitrarily complex ones can be built on top of them.

Function composition

Let's translate this abstract concept into programming!

Since the pipelines are simple functions, merging is a function composition. Let's write a function that gets two parameter functions (f and g) and returns a composite one! (Try it):

const flow = (f, g) => (arg) => g(f(arg));

The arg is passed to f, and it's result is passed to g, and it's result is the final result.

A simple example is to add 1 to a number, then double it:

const composed = flow((i) => i + 1, (i) => i * 2);
console.log(composed(3)); // 8

But if we were to use this implementation to compose 3 or more functions, multiple flows would be needed:

flow(flow(f, g), h); // f -> g -> h

Instead, generalize the implementation to handle arbitrary amount of functions:

const flow = (...fs) => (arg) =>
	fs.reduce((res, f) => f(res), arg);

And to use it:

flow(f, g, h); // f -> g -> h

Higher-order functions

Now that we can compose them, if we have the fundamental collection processing functions, arbitrarily complex ones can be built from them. But how to modify our map and filter implementations to conform to the required signature, while staying generic?

The solution is higher-order functions. This is when a function takes or returns another function.

Since the map and filter both need a collection and an iteratee, the key is to supply these parameters in different runs. Since the pipeline function gets the collection, the iteratee has to be the first one.

This is called "iteratee first, data last".

This way, the universality of the functions are retained, while making them compatible with the pipeline signature.

The modified map implementation:

const map = (iter) => (coll) => {
	const result = [];
	for (const e of coll) {
		result.push(iter(e));
	}
	return result;
}

And the filter:

const filter = (iter) => (coll) => {
	const result = [];
	for (const e of coll) {
		if(iter(e)){
			result.push(e);
		}
	}
	return result;
}

Putting it all together

Now we have all the pieces to make a collection pipeline (Try it).

To have a map that adds three to every number:

const addThree = map((i) => i + 3);

And a filter that keeps only the even numbers:

const filterOdds = filter((i) => i % 2 === 0);

And a composed pipeline that increments by three first, then drops the odd numbers:

const composed = flow(addThree, filterOdds);

And finally, to use it, just invoke it with the collection:

composed([5, 10, 15]); // [8, 18]

Of course, all these values are optional. The same pipeline:

flow(
	map((i) => i + 3),
	filter((i) => i % 2 === 0)
)([5, 10, 15]); // [8, 18]

Libraries

There are various libraries that embrace these concepts.

One is the Lodash/fp package (Try it):

_.flow(
	_.map((i) => i + 3),
	_.filter((i) => i % 2 === 0)
)([5, 10, 15]); // [8, 18]

Another one is Ramda (Try it):

R.pipe(
	R.map((i) => i + 3),
	R.filter((i) => i % 2 === 0)
)([5, 10, 15]); // [8, 18]

Conclusion

Functional composition has all the benefits of chaining, all while retaining the ability to easily extend it with new functions. Adding a new one is just a matter of conforming to the interface of (coll) => coll, making it a localized change.

Also, unlike chaining, it has no magic. With a few lines of code, you can write all the required helper functions.

December 19, 2017
In this article