Collection pipelines

Most of programming is processing collections. Learn how to do it right.


Collections form the backbone of any program. Working with data is an essential part of software, and thus an essential part of development.

But they are non-trivial to get right. There are several approaches to choose from, each with its own advantages and drawbacks.

In a series of posts, we’ll look into how developers handled collections decades ago, and how the practice evolved over several milestones.

Avoid the common mistakes, and write code that is easier to understand by knowing how to approach collections effectively.


1. Better collection processing with collection pipelines

In the first post, we’ll look into how collections used to be processed. The for loop is the construct that dominated this area, thanks to its simplicity and its effectiveness.

But what is simple from the machine’s point of view is far from simple from the perspective of a fellow developer. For loops, while often the most efficient solution, are prone to getting too complicated.

But what is the alternative?

Learn the basic building blocks of the modern, functional collection pipelines, and how combining them forms complex processing steps.
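
As a rough illustration of these building blocks, the sketch below computes the same value once with a for loop and once with a pipeline of filter, map, and reduce. The sample data and its field names are made up for this example.

```javascript
// Hypothetical sample data; the field names are assumptions for this sketch.
const users = [
  { name: "Ann", age: 31, active: true },
  { name: "Bob", age: 17, active: true },
  { name: "Cid", age: 45, active: false },
  { name: "Dee", age: 28, active: true },
];

// Loop version: selection, transformation, and aggregation are interleaved.
let totalAgeLoop = 0;
for (let i = 0; i < users.length; i++) {
  if (users[i].active && users[i].age >= 18) {
    totalAgeLoop += users[i].age;
  }
}

// Pipeline version: each step does exactly one thing.
const totalAgePipeline = users
  .filter((user) => user.active)
  .filter((user) => user.age >= 18)
  .map((user) => user.age)
  .reduce((sum, age) => sum + age, 0);

console.log(totalAgeLoop, totalAgePipeline); // 59 59
```

Each step in the pipeline can be read, tested, and replaced on its own, which is the property the rest of the series builds on.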


2. Ditch for loops. Here is a case study to convince you

Simple for loops are simple to understand. In that sense, there is no need for another entirely different kind of collection processing.

But they are written for the machine, and that hardly translates to the specification developers are working from.

In a series of example problems, see how for loops diverge from the written requirements, making them ever harder to understand.
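
To make the divergence concrete, here is a small sketch built around an invented requirement: "Return the names of the two cheapest products that are in stock." The products and their fields are assumptions for illustration only.

```javascript
// Hypothetical data for the invented requirement above.
const products = [
  { name: "lamp", price: 30, inStock: true },
  { name: "desk", price: 120, inStock: false },
  { name: "pen", price: 2, inStock: true },
  { name: "chair", price: 60, inStock: true },
];

// Loop version: the requirement is buried in index bookkeeping.
const inStock = [];
for (let i = 0; i < products.length; i++) {
  if (products[i].inStock) {
    inStock.push(products[i]);
  }
}
inStock.sort((a, b) => a.price - b.price);
const cheapestLoop = [];
for (let i = 0; i < inStock.length && i < 2; i++) {
  cheapestLoop.push(inStock[i].name);
}

// Pipeline version: one step per clause of the requirement.
const cheapestPipeline = products
  .filter((product) => product.inStock) // "that are in stock"
  .sort((a, b) => a.price - b.price) // "cheapest"
  .slice(0, 2) // "the two"
  .map((product) => product.name); // "the names"

console.log(cheapestLoop, cheapestPipeline); // both are ["pen", "lamp"]
```

Note that sort runs on the array returned by filter, so the original products array is left untouched.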


3. Where Array#Extras fall short

Javascript Arrays come with several functions that can be found in modern pipelines. Using them makes the code easy to read and familiar to everybody.

But what if something is missing?

In that case, there is no good option, only bad ones to varying degrees.

Learn the downsides of amending the Array prototype, and why hacking the missing functions in is not the right approach.
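
One of these downsides can be sketched in a few lines. The example assumes a missing step called last (a made-up name) and adds it by plain assignment, which makes it visible on every array in the program.

```javascript
// Hypothetical missing step: "last". Plain assignment creates an
// enumerable property that every array in the program inherits.
Array.prototype.last = function () {
  return this[this.length - 1];
};

const lastValue = [1, 2, 3].last();
console.log(lastValue); // 3

// Side effect: for...in loops over any array now see the extra key.
const keys = [];
for (const key in ["a", "b"]) {
  keys.push(key);
}
console.log(keys); // ["0", "1", "last"]

// Clean up so the rest of the program is not affected.
delete Array.prototype.last;
```

Defining the property as non-enumerable would hide it from for...in, but the method would still be global and could still clash with other code that adds the same name.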


4. Demystifying chaining in Javascript

Chaining is a powerful technique offered by several libraries. If you’ve ever used Underscore or Lodash, you already know how effortlessly the functions can be composed.

It is half magic, and half ingenuity. But how does it work?

Learning by doing, we’ll iteratively build a simplified chain function from the ground up. You’ll see that it’s not magic, but a few tricks, working together to achieve the desired effect.

Also, learn about the downsides of such a function.
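
A minimal sketch of the idea follows. It is not how Underscore or Lodash implement chaining; the chain function here is a simplified stand-in that wraps an array, returns a new wrapper from every step, and unwraps the result with value().

```javascript
// A simplified, hypothetical chain(); not the Underscore or Lodash code.
const chain = (array) => ({
  // Every step returns a new wrapper, so calls can be chained.
  map: (fn) => chain(array.map(fn)),
  filter: (fn) => chain(array.filter(fn)),
  // value() unwraps the result at the end of the chain.
  value: () => array,
});

const result = chain([1, 2, 3, 4])
  .filter((n) => n % 2 === 0)
  .map((n) => n * 10)
  .value();

console.log(result); // [20, 40]
```

The sketch also hints at the downside: only the steps the wrapper knows about can appear in a chain, so adding a new kind of step means changing the wrapper itself.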


5. The silver bullet of collection pipelines: Functional composition

We’ve seen that many approaches to collection pipelines fail, mostly because of the lack of extensibility.

How, then, should such a construct be designed to get both expressiveness and the ability to add new kinds of steps freely?

Fortunately, functional programmers have long had the answer. It’s the combination of function composition and higher-order functions. And best of all, it is entirely doable in Javascript.

Learn how to design a pipeline that has all the advantages of the previous implementations, but none of their drawbacks.
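
A small sketch of how composition and higher-order functions can fit together is shown here. The names pipe, map, filter, and take are assumptions for this example rather than a specific library’s API.

```javascript
// pipe() composes steps left to right; each step maps an array to an array.
const pipe = (...steps) => (input) =>
  steps.reduce((acc, step) => step(acc), input);

// Higher-order functions that turn a callback into a pipeline step.
const map = (fn) => (array) => array.map(fn);
const filter = (fn) => (array) => array.filter(fn);
// A new kind of step is just another function; nothing needs to be patched.
const take = (n) => (array) => array.slice(0, n);

const firstTwoEvenSquares = pipe(
  filter((n) => n % 2 === 0),
  map((n) => n * n),
  take(2)
);

console.log(firstTwoEvenSquares([1, 2, 3, 4, 5, 6])); // [4, 16]
```

Because a step is any function from a collection to a collection, the pipeline stays readable while remaining open to extension.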


6. Reuse code with domain-specific steps in collection pipelines

Finally, let’s talk about the need for extending a pipeline. Most of the steps fit into a basic building block, and are thus trivial to implement.

But what if that’s not the case?

As software gets more complex, so does the need to reuse ever more complex parts of it. Soon enough, the reused part no longer fits into a map or a filter.

In that case, adding new functions becomes a necessity. And that is the point where having chosen the right implementation pays off.

Learn why that’s usually the case for real-world software, and what the right strategy to follow is.
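
As an illustration, the sketch below adds a domain-specific step to a composition-based pipeline. The step latestPerCustomer, the order data, and the pipe helper are all invented for this example; the point is only that the step needs to look at several elements at once, so it fits neither a map nor a filter.

```javascript
// Minimal pipeline helpers (assumed for this sketch).
const pipe = (...steps) => (input) =>
  steps.reduce((acc, step) => step(acc), input);
const filter = (fn) => (array) => array.filter(fn);
const map = (fn) => (array) => array.map(fn);

// Domain-specific step (hypothetical): keep only each customer's latest order.
const latestPerCustomer = (orders) => {
  const latest = new Map();
  for (const order of orders) {
    const seen = latest.get(order.customer);
    if (seen === undefined || order.day > seen.day) {
      latest.set(order.customer, order);
    }
  }
  return [...latest.values()];
};

const orders = [
  { customer: "ann", day: 1, total: 20 },
  { customer: "bob", day: 2, total: 5 },
  { customer: "ann", day: 3, total: 40 },
  { customer: "bob", day: 4, total: 70 },
];

// The custom step is used exactly like the built-in ones.
const largeLatestTotals = pipe(
  latestPerCustomer,
  filter((order) => order.total >= 30),
  map((order) => order.total)
);

console.log(largeLatestTotals(orders)); // [40, 70]
```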


Iterators, iterables, and generators in Javascript

Everything you need to know about function-generated collections in Javascript


Collections do not stop with Arrays, Sets, Maps, and similar data structures. There is a completely different approach to defining them.

They are called streaming, lazy, or function-generated collections. This is when you don’t necessarily have all the elements beforehand, but define a function that creates the next one when needed.

This opens up a whole new world. Collections that don’t fit into memory, or even infinite ones, become possible, making the solutions to some problems more expressive or efficient.

While this guide focuses on Javascript, the same concepts pop up everywhere. Most languages have some concept of Iterators and Iterables. Learn the protocols for the functions that generate them on the fly.

Level up your coding game and avoid the pitfalls along the way with our screencast series.


1. [Screencast] Iterators in Javascript

In this first post, we’ll look into the basis of all generated collections: The iterator protocol.

Iterators are both simple and elegant. They capture the essence of what a collection is good for: The ability to produce a sequence of elements. There is a generalized and standardized way to define them in Javascript.

In this post, you’ll learn how they are defined, how to write one, and also how to make use of one.
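
For orientation, a hand-written iterator can be sketched as follows. The countTo function is a made-up example; the protocol part is the next() method returning objects with value and done properties.

```javascript
// A hand-written iterator: an object with a next() method
// that returns { value, done } objects.
const countTo = (limit) => {
  let current = 0;
  return {
    next: () =>
      current < limit
        ? { value: ++current, done: false }
        : { value: undefined, done: true },
  };
};

// Using the iterator: call next() until done is true.
const iterator = countTo(3);
const seen = [];
let step = iterator.next();
while (!step.done) {
  seen.push(step.value);
  step = iterator.next();
}

console.log(seen); // [1, 2, 3]
```

Nothing is computed ahead of time: each element is produced only when next() is called.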


2. [Screencast] Iterables in Javascript

In the second post, we’ll move on to iterables. While iterators focus on how to use elements, iterables are the other end of the equation: how to produce iterators. They define how something can be iterated over.

Having a protocol is also useful because language constructs can support it. This makes different collection types interchangeable: the consumer does not care how the elements are produced as long as they are the right ones.

Learn how Javascript’s iterable protocol is defined, and some of the constructs that support it on the language level.
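
A short sketch of an iterable and a few of the constructs that consume it is given below. The range function is an invented example; the protocol part is the Symbol.iterator method that returns a fresh iterator on every call.

```javascript
// An iterable: an object whose [Symbol.iterator]() method
// returns a fresh iterator each time it is called.
const range = (from, to) => ({
  [Symbol.iterator]() {
    let current = from;
    return {
      next: () =>
        current <= to
          ? { value: current++, done: false }
          : { value: undefined, done: true },
    };
  },
});

const oneToFour = range(1, 4);

// Language constructs that accept any iterable:
const viaSpread = [...oneToFour]; // [1, 2, 3, 4]
const [first, second] = oneToFour; // 1 and 2
const doubled = Array.from(oneToFour, (n) => n * 2); // [2, 4, 6, 8]
let sum = 0;
for (const n of oneToFour) {
  sum += n; // ends up as 10
}

console.log(viaSpread, first, second, doubled, sum);
```

Because every construct asks the iterable for a new iterator, the same range can be consumed several times.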


3. [Screencast] Generators in Javascript

Generator functions arrived in Javascript with ES2015, and they are unique in the language.

They are a type of function that can be paused and resumed. Unlike traditional functions that run to completion, they produce multiple elements in the course of a single call. This behavior opens up a lot of potential.

In the context of collections, they greatly simplify the protocol you need to write for a lazy collection.

But alas, they break the clear separation between iterators and iterables.

Learn why it’s a problem and what to do about it.
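
The following sketch shows both sides: how little code a lazy, even infinite, collection needs with a generator function, and how the resulting object behaves as its own iterator. The helper names and the wrapper at the end are assumptions for this example; the wrapper is one possible workaround, not necessarily the one the screencast recommends.

```javascript
// A generator function: execution pauses at each yield
// and resumes when the next element is requested.
function* naturals() {
  let n = 1;
  while (true) {
    yield n++;
  }
}

// Hypothetical helper: the first `count` elements of any iterable.
function* take(count, iterable) {
  if (count <= 0) return;
  let taken = 0;
  for (const item of iterable) {
    yield item;
    if (++taken >= count) return;
  }
}

const firstThree = take(3, naturals());
const firstPass = [...firstThree]; // [1, 2, 3]
// The generator object is its own iterator, so a second pass finds it exhausted.
const secondPass = [...firstThree]; // []

// One possible workaround: an iterable that starts a new generator each time.
const reusable = { [Symbol.iterator]: () => take(3, naturals()) };
const passA = [...reusable]; // [1, 2, 3]
const passB = [...reusable]; // [1, 2, 3]

console.log(firstPass, secondPass, passA, passB);
```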


4. [Screencast] Generators and ImmutableJS

While iterators open up new ways of coding that are not possible with traditional Arrays, they still lack an important component: a developer-friendly API.

This is where third-party libraries come in handy. We’ll look into a specific one: ImmutableJS. It has support for its own version of a lazy collection: the Seq.

For many use cases, Seq does a great job. But it lacks the versatility of generator functions.

Learn how to combine generator functions with ImmutableJS’s Seq to get the best of both worlds. And also learn what to look out for when you do so.