The right way to process collections
In the previous episodes, we saw that collection processing is both crucial and nontrivial to get right, and we looked at some potential solutions and where they fall short.
In this post, we’ll look into how concepts from functional programming can help solve this problem.
The basic idea is that a collection pipeline is ultimately a function that gets a collection and returns a new collection.
Something like this:
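A minimal sketch of the idea (the names are illustrative, not from a particular library):

```javascript
// A pipeline is conceptually a function from a collection
// to a new collection: (coll) => coll
const pipeline = (coll) => coll.filter((x) => x % 2 === 0);

console.log(pipeline([1, 2, 3, 4])); // [2, 4]
```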
Thinking about pipelines this way, we can merge two pipelines and still get a valid one:
This is like UNIX's pipes: pipeline1 gets the input collection, pipeline2 gets the output of pipeline1, and the final result is what comes out of pipeline2.
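Sketched in code, with two hypothetical pipelines:

```javascript
// Two simple pipelines, each with the shape (coll) => coll
const pipeline1 = (coll) => coll.map((x) => x + 1);
const pipeline2 = (coll) => coll.filter((x) => x % 2 === 0);

// Merged: feed pipeline1's output into pipeline2
const merged = (coll) => pipeline2(pipeline1(coll));

console.log(merged([1, 2, 3, 4])); // [2, 4]
```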
Using composition, arbitrarily complex pipelines can be built on top of only the fundamental pipeline functions.
Let’s translate this abstract concept into programming!
Since the pipelines are simple functions, merging them is just function composition. Let's write a function that gets two functions as parameters (f and g) and returns a composite one! (Try it):
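A sketch of such a two-argument compose:

```javascript
// The arg goes through f first, then f's result goes through g
const compose = (f, g) => (arg) => g(f(arg));
```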
The arg is passed to f, its result is passed to g, and g's result is the final result.
A simple example is to add 1 to a number, then double it:
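For example (the helper names addOne and double are illustrative; compose is repeated so the snippet runs on its own):

```javascript
const compose = (f, g) => (arg) => g(f(arg));

const addOne = (x) => x + 1;
const double = (x) => x * 2;

const addOneThenDouble = compose(addOne, double);
console.log(addOneThenDouble(3)); // (3 + 1) * 2 = 8
```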
But if we were to use this implementation to compose 3 or more functions, multiple nested compose calls would be needed:
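A sketch of the nesting problem, reusing the two-argument compose from above (helper names are illustrative):

```javascript
const compose = (f, g) => (arg) => g(f(arg));

const addOne = (x) => x + 1;
const double = (x) => x * 2;
const square = (x) => x * x;

// Three functions already require nesting compose inside compose
const all = compose(compose(addOne, double), square);
console.log(all(2)); // ((2 + 1) * 2) ** 2 = 36
```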
Instead, let's generalize the implementation to handle an arbitrary number of functions:
And to use it:
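One way to sketch a generalized compose, together with its use (applying the functions left to right via reduce):

```javascript
// Variadic compose: thread the argument through every function in order
const compose = (...fns) => (arg) =>
  fns.reduce((acc, fn) => fn(acc), arg);

const addOne = (x) => x + 1;
const double = (x) => x * 2;
const square = (x) => x * x;

console.log(compose(addOne, double, square)(2)); // 36
```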
Now that we can compose them, if we have the fundamental collection processing functions, arbitrarily complex ones can be built from them. But how can we modify our map and filter implementations to conform to the required signature while staying generic?
The solution is higher-order functions: functions that take or return other functions.
Since map and filter both need a collection and an iteratee, the key is to supply these parameters in separate calls. Since the pipeline function gets the collection, the iteratee has to come first.
This is called “iteratee first, data last”.
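Sketched as curried, iteratee-first versions of map and filter (supplying only the iteratee returns a function with the (coll) => coll pipeline shape):

```javascript
// Iteratee first, data last: each call with an iteratee
// yields a (coll) => coll pipeline function
const map = (iteratee) => (coll) => coll.map(iteratee);
const filter = (iteratee) => (coll) => coll.filter(iteratee);

const doubleAll = map((x) => x * 2);
console.log(doubleAll([1, 2, 3])); // [2, 4, 6]
```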
This way, the universality of the functions is retained, while making them compatible with the pipeline signature.
Putting it all together
Now we have all the pieces to make a collection pipeline (Try it).
To have a map that adds three to every number:
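A sketch, with the curried map from the previous section repeated so the snippet runs standalone:

```javascript
const map = (iteratee) => (coll) => coll.map(iteratee);

const addThree = map((x) => x + 3);
console.log(addThree([1, 2])); // [4, 5]
```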
And a filter that keeps only the even numbers:
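Sketched the same way, with the curried filter repeated for self-containment:

```javascript
const filter = (iteratee) => (coll) => coll.filter(iteratee);

const even = filter((x) => x % 2 === 0);
console.log(even([1, 2, 3, 4])); // [2, 4]
```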
And a composed pipeline that increments by three first, then drops the odd numbers:
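A sketch of the composed pipeline, with the helpers repeated so it runs on its own:

```javascript
const compose = (...fns) => (arg) => fns.reduce((acc, fn) => fn(acc), arg);
const map = (iteratee) => (coll) => coll.map(iteratee);
const filter = (iteratee) => (coll) => coll.filter(iteratee);

const addThree = map((x) => x + 3);
const even = filter((x) => x % 2 === 0);

// First add three to every element, then keep only the evens
const pipeline = compose(addThree, even);
```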
And finally, to use it, just invoke it with the collection:
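Invoking it looks like this (helpers repeated so the snippet is self-contained):

```javascript
const compose = (...fns) => (arg) => fns.reduce((acc, fn) => fn(acc), arg);
const map = (iteratee) => (coll) => coll.map(iteratee);
const filter = (iteratee) => (coll) => coll.filter(iteratee);

const pipeline = compose(
  map((x) => x + 3),
  filter((x) => x % 2 === 0)
);

console.log(pipeline([1, 2, 3, 4])); // [4, 6]
```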
Of course, all these intermediate variables are optional. The same pipeline can be written inline:
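An inline sketch of the same pipeline (helpers repeated so the snippet runs standalone):

```javascript
const compose = (...fns) => (arg) => fns.reduce((acc, fn) => fn(acc), arg);
const map = (iteratee) => (coll) => coll.map(iteratee);
const filter = (iteratee) => (coll) => coll.filter(iteratee);

const result = compose(
  map((x) => x + 3),
  filter((x) => x % 2 === 0)
)([1, 2, 3, 4]);

console.log(result); // [4, 6]
```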
There are various libraries that embrace these concepts.
One is the Lodash/fp package (Try it):
Another one is Ramda (Try it):
Functional composition has all the benefits of chaining, while retaining the ability to easily extend it with new functions. Adding a new one is just a matter of conforming to the (coll) => coll interface, making it a localized change.
Also, unlike chaining, it involves no magic: with a few lines of code, you can write all the required helper functions yourself.