Split results from DB into “chunks” of 10

2019-07-09 13:56发布

问题:

Afternoon,

I need a hand to split over 1200 results into "chunks" of 10 so i can process these results with the Amazon MWS API. Can anyone provide any guidance on how i would go about doing this please?

 List<string> prodASIN = dc.aboProducts.Select(a => a.asin).Take(10).ToList();

I currently have this, which works. But i have 1200+ results and need to loop through each 10 so i can process them and pass them over to the Amazon MWS API

回答1:

Why not try something like:

//Load all the database entries into Memory
List<string> prodASINs = dc.aboProducts.Select(a => a.asin).ToList();
var count = prodASINs.Count();
//Loop through passing 10 at a time to AWS
for (var i = 0; i < count; i++)
{
    var prodASINToSend = prodASINs.Skip(i * 10).Take(10).ToList(); 
    //Send to AWS
}

Or if you don't want to load them all into memory.

var count = dc.aboProducts.Count();
for (var i = 0; i < count; i++)
{
    List<string> prodASIN = dc.aboProducts.OrderBy(a => a.Id).Select(a => a.asin).Skip(i * 10).Take(10).ToList(); 
    //Send to AWS
}


回答2:

I know the question is answered but I can't withhold from you this little extension method I once made and that has served me well since.

You can do:

foreach(var list in prodASINs.ToChunks(10))
{
    // send list
}


回答3:

Sorry this isnt LINQ specific, but perhaps it will help...

One of the things I have done when working with data with MWS and ERP software is adding a control column to the database, something like "addedASIN'. In the database I define the control column as a boolean value ( or TINYINT(1) in MySQL ) and default the flag to 0 for all new entries and set it to 1 when the entry has been added.

If you are able to do that then you can do something like

SELECT asin FROM datasource WHERE addedASIN = 0 LIMIT 10;

Then once MWS returns successful for the additions update the flag using

UPDATE datasource SET addedASIN = 1 WHERE asin = 'asinnumber';

The benefit I have found with this is that your database will be able to stop and start with a minimal repetition of data - for instance in my case ( and what started this control column ) our network connection can be flaky, so I was finding during order imports I would lose connectivity resulting in lost orders, or orders being uploaded to our system twice.

This solution has mitigated that by having at most 1 order being added twice as a result of a connectivity loss, and in order for that order to be uploaded twice, connectivity needs to be lost between sending the data to our ERP system, our ERP system acknowledging it was added and the database being updated, which for a round trip takes approximately 30 seconds.



回答4:

Slice Extension (for Arrays):

    public static T[] Slice<T>(this T[] source, int index, int length)
    {
        T[] slice = new T[length];
        Array.Copy(source, index, slice, 0, length);
        return slice;
    }

Array.Copy is extremely fast, a lot faster than the Select/Skip/Take pattern. Although this method is not the fasted I've found, recents tests show that it's nearly 400 times faster than the Skip/Take pattern used to split Lists and Arrays.

To use it as is:

const int arraySize = 10;
List<string> listSource = whatever;
string[] source = listSource.ToArray();

for (int i = 0; i < source.Length; i += arraySize)
{
    List<string> buffer = source.Slice(i, arraySize).ToList();
    DoSomething(buffer);
}


回答5:

List<T> has a built-in function called GetRange() which was made specifically for what you're trying to do. It's extremely fast and doesn't need Linq, casting, etc...

List<string> prodASINs = dc.aboProducts.Select(a => a.asin).ToList(); 

for(int i = 0; i < prodASINs.Count; i += 10)
{

    List<string> buffer = prodASINs.GetRange(i, 10);
    // do something with buffer
}

That's it. Very simple.


Test results: GetRange vs. Slice vs. Linq with 5000 strings in List<string> As you can clearly see, the Skip/Take approach using Linq is over 383 times slower than Slice<T>() and 4,736 times slower than GetRange()

==================================================================================

GetRange took on average 168 ticks
Slice took on average 2073 ticks
Linq took on average 795643 ticks

Test method used (try it yourself):

private static void GetRangeVsSliceVsLinq()
{
    List<string> stringList = new List<string>();
    for (int i = 0; i < 5000; i++)
    {
        stringList.Add("This is a test string " + i.ToString());
    }

    Stopwatch sw = new Stopwatch();

    long m1 = 0, m2 = 0, m3 = 0;


    for (int x = 0; x < 10; x++)
    {
        Console.WriteLine("Iteration {0}", x + 1);
        Console.WriteLine();

        sw.Reset();
        sw.Start();

        for (int i = 0; i < stringList.Count; i += 10)
        {
            List<string> buffer = stringList.GetRange(i, 10);
        }
        sw.Stop();
        Console.WriteLine("GetRange took {0} msecs", sw.ElapsedMilliseconds);
        Console.WriteLine("GetRange took {0} ticks", sw.ElapsedTicks);
        m1 += sw.ElapsedTicks;

        sw.Reset();
        sw.Start();

        string[] sliceArray = stringList.ToArray();
        for (int i = 0; i < sliceArray.Length; i += 10)
        {
            List<string> buffer = sliceArray.Slice(i, 10).ToList();
        }
        sw.Stop();
        Console.WriteLine("Slice took {0} msecs", sw.ElapsedMilliseconds);
        Console.WriteLine("Slice took {0} ticks", sw.ElapsedTicks);
        m2 += sw.ElapsedTicks;

        sw.Reset();
        sw.Start();

        var count = stringList.Count();
        for (var i = 0; i < count; i++)
        {
            var buffer = stringList.Skip(i * 10).Take(10).ToList();
        }

        sw.Stop();
        Console.WriteLine("Skip/Take took {0} msecs", sw.ElapsedMilliseconds);
        Console.WriteLine("Skip/Take took {0} ticks", sw.ElapsedTicks);
        m3 += sw.ElapsedTicks;

        Console.WriteLine();
    }

    Console.WriteLine();
    Console.WriteLine("GetRange took on average {0} ticks", m1 / 10);
    Console.WriteLine("Slice took on average {0} ticks", m2 / 10);
    Console.WriteLine("Linq took on average {0} ticks", m3 / 10);

}