Shape and contents with traversables

One of the first papers I could find that seriously studies the properties of Traversals is “The essence of the iterator pattern”, by Jeremy Gibbons and Bruno C. d. S. Oliveira (PDF)

In there, they show the idea of splitting a traversable collection in its contents and its shape, attributing this idea to Moggi et al. in “Monads, Shapely Functors and Traversals”. The idea is to traverse the collection extracting shape and elements in a way that would allow to reconstruct the original structure.

To represent the contents of a traversable collections Traversable t => t a we can use simply [a]. For the shape, we need to conserve the traversable structure, discarding the elements: Traversable t => t ().

Extracting contents

Let’s start with the contents. We want a function of type

The type class function traverse can do the job of iterating over all the elements giving us access to each of them, we just need to provide the right Applicative f:

What we want is to accumulate each element on a list, monoid style. Fortunately, every monoid, and lists in particular, can generate an applicative that uses the monoid operation to combine effects in <*>, and mempty for pure. Haskell calls this monoid Const apparently, because it looks like the const function, it just ignores the second argument:

So this Const applicative behaves like the monoid in its first argument

and it’s exactly what we need to implement our contents function:

To make the examples more interesting, let’s define a Tree type

and try contents on it

We will continue to use this tree t in future examples.

Extracting shape

To extract the shape of the collection we want to traverse it ignoring all elements. The right applicative to do that is Identity found in Data.Functor in the transformers package, or in a modern enough ghc base (>= 4.8.0.0)

This applicative basically “does nothing”, which is what we want to extract the shape, no effects. Using this applicative and traverse we can write

Contents and shape in one pass

If we want to compute both the contents and the shape, we can call traverse twice, but there is a better way. The product of two applicatives is guaranteed to be an applicative, unlike for instance the product of two monads. That means that we can write in a generic way the applicative instance for an arbitrary pair of applicatives. The base package already has this Product type in Data.Functor.Product

As we can see, this applicative tracks the effects of f and g in parallel, using a tuple-like Pair constructor.

With this, and in a single traversal we can compute both the contents and the shape:

Reconstructing

Now the paper proposes to reconstruct the original traversable from it’s shape and contents as extracted in the previous sections. This sounds like a fold, but we can also think about it as a stateful computation. The state being tracked is the list of elements, the contents. For each element of the desired shape, we extract the first element from the state and leave the rest in the new state. Since every monad is an applicative we know we’ll be able to use the State monad in a call to traverse.

But there is a one extra detail to take into account. If the number of elements provided as content are not enough to fill the shape, we won’t be able to recreate the datastructure. For this reason the end result has to be optional. So we have a combination of a State applicative with a Maybe, in a composition of both effects.

Just like in the case of Product the composition of two applicatives is also an applicative

In this form, Compose (State [a]) Maybe gives us the exact combination of effects we want. We can write the function to reassemble now

This reassembleBody function must take a () and return the composed stateful/optional computation

Now to reconstruct the datastructure we just need to feed the shape to reassemble and then run the stateful computation resulting

Here, we are discarding any extra elements provided.

Swapping data

With this machinery we can, for instance, write a generic way to swap the contents of two datastructures of differente shapes.

Final notes

It’s a great paper, I highly recommend to read it. This shape/contents thing is only a short section in the paper, it goes in several other directions with many other interesting ideas.

If you are an intermediate level Haskell programmer, reading classic papers is great, particularly old ones (this on is from 2009, so not that old). Lots of ideas, plainly explained.