I have an iterator of lines from a very large file that need to be put in groups as I move along. I know where each group ends because there is a sentinel value on the last line of each group. So basically I want to write a function that takes an iterator and a sentinel value, and returns an iterator of groups each terminated by the sentinel value. Something like:
scala> groups("abc.defg.hi.jklmn.".iterator, '.')
res1: Iterator[Seq[Char]] = non-empty iterator
scala> groups("abc.defg.hi.jklmn.".iterator, '.').toList
res19: List[Seq[Char]] = List(List(a, b, c, .), List(d, e, f, g, .), List(h, i, .), List(j, k, l, m, n, .))
Note that I want the sentinel items included at the end of each of the groups. Here's my current solution:
def groups[T](iter: Iterator[T], sentinel: T) = new Iterator[Seq[T]] {
  def hasNext = iter.hasNext
  def next = iter.takeWhile(_ != sentinel).toList ++ List(sentinel)
}
I think this works, and I guess it is fine, but having to re-add the sentinel every time feels like a code smell. Is there a better way to do this?
Ugly, but should be more performant than your solution:
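A minimal sketch of one way this could look, assuming the idea is a mutable buffer filled by a plain `while` loop (the `ListBuffer` and the loop are illustrative choices here, not necessarily what this answer originally used). Each element is appended as it is read, so the sentinel never has to be re-added:

```scala
import scala.collection.mutable.ListBuffer

// Sketch: same signature as the question's `groups`, but each group is built
// in a mutable buffer so the sentinel is appended as part of the normal loop.
def groups[T](iter: Iterator[T], sentinel: T): Iterator[Seq[T]] = new Iterator[Seq[T]] {
  def hasNext: Boolean = iter.hasNext

  def next(): Seq[T] = {
    if (!iter.hasNext) throw new NoSuchElementException("next on empty iterator")
    val buf = ListBuffer.empty[T]
    var done = false
    // Read until the sentinel has been appended or the input runs out.
    while (!done && iter.hasNext) {
      val x = iter.next()
      buf += x
      done = x == sentinel
    }
    buf.toList
  }
}
```

With this shape, a final group that lacks a terminating sentinel is returned as-is instead of having a sentinel appended to it.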
Less readable than yours, but more "correct" when the final group doesn't have a terminating sentinel value:
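A sketch of one possible variant, assuming the question's `takeWhile` structure is kept and a flag records whether the sentinel was actually consumed (the `sawSentinel` name is hypothetical). Like the question's code, it assumes the underlying iterator can still be used after `takeWhile` has been drained, which the current standard library implementation allows in practice but the `Iterator` documentation does not promise:

```scala
// Sketch: only re-add the sentinel when takeWhile actually stopped on one.
def groups[T](iter: Iterator[T], sentinel: T): Iterator[Seq[T]] = new Iterator[Seq[T]] {
  def hasNext: Boolean = iter.hasNext

  def next(): Seq[T] = {
    var sawSentinel = false
    // The predicate records whether the element that ended the group was the sentinel.
    val body = iter.takeWhile { x =>
      sawSentinel = x == sentinel
      !sawSentinel
    }.toList
    // If the input simply ran out, return the partial group unchanged.
    if (sawSentinel) body :+ sentinel else body
  }
}
```

For input such as `"ab.cd"`, the last group comes back as `List(c, d)` rather than having a `.` appended to it.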
Or, recursively:
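A sketch of how a recursive version might look, assuming a tail-recursive local helper that accumulates one group at a time (the helper name `loop` and the reversed accumulator are illustrative choices, not necessarily this answer's original code):

```scala
import scala.annotation.tailrec

// Sketch: build each group with a tail-recursive helper instead of a while loop.
def groups[T](iter: Iterator[T], sentinel: T): Iterator[Seq[T]] = new Iterator[Seq[T]] {
  def hasNext: Boolean = iter.hasNext

  def next(): Seq[T] = {
    // Accumulates elements in reverse order and flips them once the group ends.
    @tailrec
    def loop(acc: List[T]): List[T] =
      if (!iter.hasNext) acc.reverse // input ended without a sentinel
      else {
        val x = iter.next()
        if (x == sentinel) (x :: acc).reverse // sentinel closes the group
        else loop(x :: acc)
      }
    loop(Nil)
  }
}
```

Because `loop` is tail-recursive, long groups do not grow the call stack, and the sentinel is included only when it was actually read.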