Lazily grouping a flat sequence in F#

2019-05-09 13:52发布

Given a sequence of items as follows:

[ ("a", 1); ("a", 2); ("a", 3); ("b", 1); ("c", 2); ("c", 3) ]

How can I convert this lazily into:

{ ("a", { 1; 2; 3}); ("b", { 1 }); ("c", { 2; 3}) }

You can assume that the input data source is already sorted on the grouping key element e.g. "a" "b" and "c".

I'm using the { } there to indicate that it's a lazily-evaluated sequence of items.

I've gotten it working imperatively with two while loops operating over the IEnumerator of the source sequence, but this involves creating reference variables and mutation etc. etc. I'm sure that there are better ways of doing this, perhaps with Recursion or using some of the operations in the Seq library e.g. scan or unfold?

4条回答
手持菜刀,她持情操
2楼-- · 2019-05-09 14:13
Seq.groupBy fst

Will do the trick

查看更多
我想做一个坏孩纸
3楼-- · 2019-05-09 14:15

If you want to implement this over IEnumerable<'T> (to make it lazy), then it is necessarily going to be somewhat imperative, because the IEnumerator<'T> type that is used to iterate over the input is imperative. But the rest can be written as a recursive function using sequence expressions.

The following is lazy in the first level (it produces each group lazily), but it does not produce elements of the group lazily (I think that would have pretty subtle semantics):

/// Group adjacent elements of 'input' according to the 
/// keys produced by the key selector function 'f'
let groupAdjacent f (input:seq<_>) = seq {
  use en = input.GetEnumerator()

  // Iterate over elements and keep the key of the current group
  // together with all the elements belonging to the group so far
  let rec loop key acc = seq { 
    if en.MoveNext() then 
      let nkey = f en.Current 
      if nkey = key then 
        // If the key matches, append to the group so far
        yield! loop key (en.Current::acc)
      else 
        // Otherwise, produce the group collected so far & start a new one
        yield List.rev acc
        yield! loop nkey [en.Current]
    else
      // At the end of the sequence, produce the last group
      yield List.rev acc
  }
  // Start with the first key & first value as the accumulator
  if en.MoveNext() then 
    yield! loop (f en.Current) [en.Current] }

Unfortunately, this (pretty useful!) function is not included in the standard F# library, so if you want to group adjacent elements (rather than arbitrary elements in the list using Seq.groupBy), you have to define it yourself...

查看更多
祖国的老花朵
4楼-- · 2019-05-09 14:17

In F#+ there is a generic function chunkBy that can be used to do that:

#r "FSharpPlus.dll"
open FSharpPlus

seq [ ("a", 1); ("a", 2); ("a", 3); ("b", 1); ("c", 2); ("c", 3) ]
    |> chunkBy fst 
    |> map (fun (x,y) -> x, map snd y)

And it works with seq, array and list.

The implementation for seq is pretty much the same as the groupdAdjacent from Tomas.

查看更多
劫难
5楼-- · 2019-05-09 14:22
let p = [("a", 1); ("a", 2); ("a", 3); ("b", 1); ("c", 2); ("c", 3)]
let l = p |> Seq.groupBy fst |> Seq.map(fun x -> fst x, snd x |> Seq.map snd) 
查看更多
登录 后发表回答