F# split sequence into sub lists on every nth elem

2019-01-15 13:54发布

Say I have a sequence of 100 elements. Every 10th element I want a new list of the previous 10 elements. In this case I will end up with a list of 10 sublists.

Seq.take(10) looks promising, how can I repeatedly call it to return a list of lists?

标签: f#
8条回答
啃猪蹄的小仙女
2楼-- · 2019-01-15 14:23

This is not bad:

let splitEach n s =
    seq {
        let r = ResizeArray<_>()
        for x in s do
            r.Add(x)
            if r.Count = n then
                yield r.ToArray()
                r.Clear()
        if r.Count <> 0 then
            yield r.ToArray()
    }

let s = splitEach 5 [1..17]
for a in s do
    printfn "%A" a
(*
[|1; 2; 3; 4; 5|]
[|6; 7; 8; 9; 10|]
[|11; 12; 13; 14; 15|]
[|16; 17|]
*)
查看更多
Animai°情兽
3楼-- · 2019-01-15 14:24

Perhaps this simple pure implementation might be useful:

let splitAt n xs =  (Seq.truncate n xs, if Seq.length xs < n then Seq.empty else Seq.skip n xs)

let rec chunk n xs =
    if Seq.isEmpty xs then Seq.empty
    else
        let (ys,zs) = splitAt n xs
        Seq.append (Seq.singleton ys) (chunk n zs)

For example:

> chunk 10 [1..100];;
val it : seq<seq<int>> =
  seq
    [seq [1; 2; 3; 4; ...]; seq [11; 12; 13; 14; ...]; 
     seq [21; 22; 23; 24; ...]; seq [31; 32; 33; 34; ...]; ...]

> chunk 5 [1..12];;
val it : seq<seq<int>> =
  seq [seq [1; 2; 3; 4; ...]; seq [6; 7; 8; 9; ...]; seq [11; 12]]
查看更多
女痞
4楼-- · 2019-01-15 14:31

I found this to be easily the fastest:

let windowChunk n xs =
    let range = [0 .. Seq.length xs]
    Seq.windowed n xs |> Seq.zip range 
                      |> Seq.filter (fun d -> (fst d) % n = 0)
                      |> Seq.map(fun x -> (snd x))

i.e. window the list, zip with a list of integers, remove all the overlapping elements, and then drop the integer portion of the tuple.

查看更多
对你真心纯属浪费
5楼-- · 2019-01-15 14:34

now there's Seq.chunkBySize available:

[1;2;3;4;5] |> Seq.chunkBySize 2 = seq [[|1; 2|]; [|3; 4|]; [|5|]]
查看更多
我想做一个坏孩纸
6楼-- · 2019-01-15 14:39

I think that the solution from Brian is probably the most reasonable simple option. A probelm with sequences is that they cannot be easily processed with the usual pattern matching (like functional lists). One option to avoid that would be to use LazyList from F# PowerPack.

Another option is to define a computation builder for working with IEnumerator type. I wrote something like that recently - you can get it here. Then you can write something like:

let splitEach chunkSize (s:seq<_>) =
  Enumerator.toSeq (fun () -> 
    let en = s.GetEnumerator()
    let rec loop n acc = iter {
      let! item = en
      match item with 
      | Some(item) when n = 1 -> 
          yield item::acc |> List.rev 
          yield! loop chunkSize []
      | Some(item) -> 
          yield! loop (n - 1) (item::acc)
      | None -> yield acc |> List.rev }
    loop chunkSize [] )

This enables using some functional patterns for list processing - most notably, you can write this as a usual recursive function (similar to the one you would write for lists/lazy lists), but it is imperative under the cover (the let! constructo of iter takes the next element and modifies the enumerator).

查看更多
Fickle 薄情
7楼-- · 2019-01-15 14:41

If in doubt, use fold.

let split n  = let one, append, empty = Seq.singleton, Seq.append, Seq.empty
               Seq.fold (fun (m, cur, acc) x -> 
                           if m = n then (1, one x, append acc (one cur))
                           else (m+1, append cur (one x), acc))
                        (0, empty, empty)
               >> fun (_, cur, acc) -> append acc (one cur)

This has the advantage of being fully functional, yet touch each element of the input sequence only once(*) (as opposed to the Seq.take + Seq.skip solutions proposed above).

(*) Assuming O(1) Seq.append. I should certainly hope so.

查看更多
登录 后发表回答