Can someone suggest a way to create batches of a certain size in LINQ?
Ideally I want to be able to perform operations in chunks of some configurable amount.
All of the above perform terribly with large batches or when memory is tight. I had to write my own version that pipelines (notice that no items are accumulated anywhere):
Edit: A known issue with this approach is that each batch must be enumerated, and enumerated fully, before moving on to the next batch. For example, this doesn't work:
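The code from this answer did not survive, so what follows is only a minimal sketch of how a pipelined batcher of this kind might look, not necessarily the original. The names `StreamingBatchExtensions`, `Batch`, `TakeFromEnumerator` and `StreamingBatchDemo` are assumptions made for illustration. It shows both the supported usage and the kind of misuse described in the edit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingBatchExtensions
{
    // Hypothetical helper: splits a sequence into lazily produced batches
    // without buffering items. All batches share a single enumerator.
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        return BatchIterator(source, size);
    }

    private static IEnumerable<IEnumerable<T>> BatchIterator<T>(IEnumerable<T> source, int size)
    {
        using (IEnumerator<T> enumerator = source.GetEnumerator())
        {
            // Each successful MoveNext lands on the first item of the next batch,
            // provided the previous batch was consumed completely.
            while (enumerator.MoveNext())
                yield return TakeFromEnumerator(enumerator, size);
        }
    }

    private static IEnumerable<T> TakeFromEnumerator<T>(IEnumerator<T> enumerator, int size)
    {
        int taken = 0;
        do
        {
            yield return enumerator.Current;
        }
        while (++taken < size && enumerator.MoveNext());
    }
}

public static class StreamingBatchDemo
{
    // Supported usage: consume every batch completely before asking for the next one.
    public static List<int> SumEachBatch(IEnumerable<int> numbers, int size)
    {
        var sums = new List<int>();
        foreach (IEnumerable<int> batch in numbers.Batch(size))
            sums.Add(batch.Sum());
        return sums;
    }

    // Unsupported usage: First() reads one item and abandons the batch, so the
    // next "batch" starts at the very next item instead of size items later.
    public static List<int> FirstOfEachBatch(IEnumerable<int> numbers, int size)
    {
        return numbers.Batch(size).Select(batch => batch.First()).ToList();
    }
}
```

With `Enumerable.Range(1, 7)` and a size of 3, `SumEachBatch` returns 6, 15, 7, while `FirstOfEachBatch` returns all seven numbers rather than 1, 4, 7, because the shared enumerator only advances as far as each batch is actually read.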
If you start with sequence defined as an IEnumerable<T>, and you know that it can safely be enumerated multiple times (e.g. because it is an array or a list), you can just use this simple pattern to process the elements in batches:

So with a functional hat on, this appears trivial... but in C#, there are some significant downsides.
You'd probably view this as an unfold of IEnumerable (google it and you'll probably end up in some Haskell docs, but there may be some F# material using unfold; if you know F#, squint at the Haskell docs and it will make sense).
Unfold is related to fold ("aggregate"), except that rather than iterating through the input IEnumerable, it iterates through the output data structure (it's a similar relationship to the one between IEnumerable and IObservable; in fact I think IObservable does implement an "unfold" called Generate...).
Anyway, first you need an unfold method. I think this works;
This is a bit obtuse because C# doesn't implement some of the things functional languages take for granted... but it basically takes a seed and then generates a "Maybe" answer containing the next element in the IEnumerable and the next seed (Maybe doesn't exist in C#, so we've used IEnumerable to fake it), and concatenates the rest of the answer (I can't vouch for the "O(n?)" complexity of this).
Once you've done that, then;
It all looks quite clean... you take the first "n" elements as the "next" element in the IEnumerable, and the "tail" is the rest of the unprocessed list.
If there is nothing in the head, you're done and you return "Nothing" (faked as an empty IEnumerable); otherwise you return the head element and the tail still to be processed.
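Since the two snippets this answer refers to are missing, here is a hedged sketch of how the unfold and the batching built on it could fit together. It is not the original code: the names `UnfoldExtensions`, `Unfold` and `BatchByUnfold` are assumptions, and the unfold is written as a loop rather than recursively, which is a deliberate swap that sidesteps deep recursion. An empty IEnumerable stands in for "Nothing", as described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnfoldExtensions
{
    // Hypothetical unfold: the generator returns either an empty sequence
    // ("Nothing") or a single (element, next seed) pair ("Just").
    public static IEnumerable<T> Unfold<T, TSeed>(
        TSeed seed,
        Func<TSeed, IEnumerable<(T Element, TSeed NextSeed)>> generator)
    {
        if (generator == null) throw new ArgumentNullException(nameof(generator));
        return UnfoldIterator(seed, generator);
    }

    private static IEnumerable<T> UnfoldIterator<T, TSeed>(
        TSeed seed,
        Func<TSeed, IEnumerable<(T Element, TSeed NextSeed)>> generator)
    {
        TSeed current = seed;
        while (true)
        {
            // Only the first pair matters; an empty result ends the sequence.
            var step = generator(current).Take(1).ToList();
            if (step.Count == 0) yield break;
            yield return step[0].Element;
            current = step[0].NextSeed;
        }
    }

    // Batching expressed as an unfold: the seed is the unprocessed tail,
    // each element is the next "size" items taken from its head.
    // Assumes the source can safely be enumerated more than once.
    public static IEnumerable<IReadOnlyList<T>> BatchByUnfold<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        return Unfold<IReadOnlyList<T>, IEnumerable<T>>(source, remaining =>
        {
            List<T> head = remaining.Take(size).ToList();
            return head.Count == 0
                ? Enumerable.Empty<(IReadOnlyList<T>, IEnumerable<T>)>()
                : new[] { ((IReadOnlyList<T>)head, remaining.Skip(size)) };
        });
    }
}
```

Note that every step re-enumerates the source through a growing chain of Skip calls, so this sketch is only reasonable for small, re-enumerable inputs such as lists and arrays.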
You can probably do this using IObservable; there's probably a "Batch"-like method already there that you can use.
If the risk of stack overflows worries you (it probably should), then you should implement this in F# (and there's probably some F# library (FSharpX?) that already has it).
(I have only done some rudimentary tests of this, so there may be the odd bug in there.)
I'm joining this very late, but I found something more interesting.
So we can use Skip and Take here for better performance.

Next I checked with 100000 records. Only the looping takes more time in the case of Batch.
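The Skip and Take idea mentioned above could be sketched roughly as follows, assuming the source is a list or array that can safely be enumerated repeatedly and whose count is known. The names `SkipTakeBatching` and `InBatches` are hypothetical, not taken from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipTakeBatching
{
    // Hypothetical helper: walks the collection in steps of batchSize and
    // cuts out each batch with Skip and Take.
    public static IEnumerable<List<T>> InBatches<T>(IReadOnlyCollection<T> source, int batchSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        return InBatchesIterator(source, batchSize);
    }

    private static IEnumerable<List<T>> InBatchesIterator<T>(IReadOnlyCollection<T> source, int batchSize)
    {
        for (int offset = 0; offset < source.Count; offset += batchSize)
        {
            // ToList materializes the batch here, so the cost of Skip and Take
            // is paid when the batch is requested, not later.
            yield return source.Skip(offset).Take(batchSize).ToList();
        }
    }
}
```

Because Skip and Take are deferred, a timing comparison is only meaningful if the batches are actually enumerated; the ToList call in this sketch forces that.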
Code of the console application:
Time taken is like this:
First - 00:00:00.0708, 00:00:00.0660
Second (Take and Skip one) - 00:00:00.0008, 00:00:00.0008
Another way is to use the Rx Buffer operator.