Extending an Iterator in Rust

A common thing one has to do in Rust to leverage zero-cost abstractions and convenience is to extend types. By “extending” them I mean adding functionality (functions) to the type which make it more convenient to use and “improve” it for the domain at hand.

From time to time, one needs to extend generic types, which is a difficult thing to do. Especially when it comes to iterators, where extending is often tricky.

That's why I want to write some notes on it down.

How do filters work?

I have this crate called “filters” which offers nice functionality for building predicates with the builder pattern. A predicate is nothing more than a function which takes something by reference and returns a boolean for it:

fn(&self, some: &Thing) -> bool;

The function is now allowed to decide whether the Thing shall be used. This is useful when working with iterators:

let filter = LowerThan::new(5);
vec![1,2,3,4,5,6,7,8,9].iter().filter(|e| filter.filter(&e)).collect()

Filters can, as I mentioned, be build with the builder-pattern:

let filter = HigherThan::new(5)
    .and(LowerThan::new(20))
    .and(UnequalTo::new(15));

vec![1,2,3,4,5,6,7,8,9].iter().filter(|e| filter.filter(&e)).collect()

One can also define own filters (in fact, the HigherThan, LowerThan and UnequalTo types do not ship with the crate) by implementing the Filter trait:

struct LowerThan(u64);
impl Filter<u64> for LowerThan {
  fn filter(&self, elem: &u64) -> bool {
    *elem < self.0
  }
}

What are we going to do?

What we're going to do is rather easy. We want a functionality implemented on all iterator types which gives us the possibility to pass an object for filtering rather than writing down this nasty closure (as shown above):

iterator.filter(|e| filter.filter(&e))

We want to write:

iterator.filter_with(some_filter)

And we want to get back a nice iterator type.

Step 1: A type

First, we need a type representing a iterator which is filtered. This type, of course, has to be abstract over the actual iterator we want to filter (that's also how the rust standard library does this under the hood). The type also has to be generic over the filter in use. So basically we write down an entirely generic type:

pub struct FilteredIterator<T, F, I>(F, I)
    where F: Filter<T>,
          I: Iterator<Item = T>;

Here F is a filter over T and I is the iterator over T. The FilteredIterator holds both F and I.

Step 2: Implement Iterator on this type

The next step is implementing iterator on this new type. Because we want to use this iterator in a chain like this, for example:

forest
    .filtered_with(TreeWithColor::new(Color::Green))
    .map(|tree| tree.get_name())
    .enumerate()
    .map(|(i, tree_name)| {
      store_tree_number(tree_name, i)
    })
    .fold(Ok(()), |acc, e| {
      // ...
    })

we have to implement Iterator on it.

Because the type is generic, so is our iterator implementation.

impl<T, F, I> Iterator for FilteredIterator<T, F, I>
    where F: Filter<T>,
          I: Iterator<Item = T>
{
    type Item = T;

    fn next(&mut self) -> Option<Self::Item> {
        // ...
    }
}

The implementation is left out as task for the reader. Hint: The only function one needs is Filter::filter() and a while let Some(e).

Step 3: Transforming one iterator into a filtered iterator

Next, we want to be able to transform an existing iterator type into a filtered iterator. As I wrote above, we do this by providing a new trait for this. Because of the generic parameters our FilteredIterator type has, we need generic types in our trait as well:

pub trait FilterWith<T, F: Filter<T>> : Iterator<Item = T> + Sized {
    fn filter_with(self, f: F) -> FilteredIterator<T, F, Self>;
}

The trait provides only one function: filter_with(). This function takes self owning and a Filter which is used in the FilteredIterator later.

Step 4: Extending all iterators

Last but not least, we implement this trait on all Iterators. Even more generics here, of course:

impl<I, T, F: Filter<T>> FilterWith<T, F> for I
    where I: Iterator<Item = T>
{
    fn filter_with(self, f: F) -> FilteredIterator<T, F, Self> {
        FilteredIterator(f, self)
    }
}

The actual implementation is trivial, sure. But the type signature is not, so I'll explain.

  1. I is the type we implement the FilterWith trait for. Because we want to implement it for all iterators, we have to use a generic type.
  2. I is, of course, generic itself. We want to be able to filter all iterators over all types. So, T is the Item of the Iterator.
  3. F is needed because FilterWith is generic over the provided filter: We can filter with any Filter we want. So we have to be generic over that as well.

The output of the filter_with function is a FilteredIterator which is generic over T and F – but wait! FilteredIterator is generic over three types!

That's true. But the third generic type is the iterator it encapsulated – the iterator it actually filters. Because of that, we return FilteredIterator<T, F, Self> here – we don't need a generic type as third type here because we know that the iterator which will be encapsulated has exactly the type which the trait gets currently implemented for.

(I hope that explanation is easy to understand.)

Conclusion

Extending types in Rust is incredible easy – if one can grok generics. I know that generics are not that easy to grok for a beginner – I had a hard time learning how to use them myself. But it is really worth it, as our test shows:

#[test]
fn test_filter_with() {
    struct Foo;
    impl Filter<u64> for Foo {
        fn filter(&self, u: &u64) -> bool {
            *u > 5
        }
    }
    let foo = Foo;

    let v : Vec<u64> = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
        .into_iter()
        .filter_with(foo)
        .collect();

    assert_eq!(v, vec![6, 7, 8, 9]);
}

tags: #tools #software #rust #open-source