Need to optimise counting positive and negative va

2019-02-16 06:24发布

问题:

I need to optimise code that counts pos/neg values and remove non-qualified values by time.

I have queue of values with time-stamp attached.
I need to discard values which are 1ms old and count negative and positive values. here is pseudo code

list<val> l;
v = q.dequeue();
deleteold(l, v.time);
l.add(v);
negcount = l.count(i => i.value < 0);
poscount = l.count(i => i.value >= 0);
if(negcount == 10) return -1;
if(poscount == 10) return  1;

I need this code in c# working with max speed. No need to stick to the List. In fact arrays separated for neg and pos values are welcome.

edit: probably unsafe arrays will be the best. any hints?

EDIT: thanks for the heads up.. i quickly tested array version vs list (which i already have) and the list is faster: 35 vs 16 ms for 1 mil iterations...

Here is the code for fairness sake:

class Program
{
    static int LEN = 10;
    static int LEN1 = 9;

    static void Main(string[] args)
    {
        Var[] data = GenerateData();

        Stopwatch sw = new Stopwatch();

        for (int i = 0; i < 30; i++)
        {
            sw.Reset();
            ArraysMethod(data, sw);

            Console.Write("Array: {0:0.0000}ms     ", sw.ElapsedTicks / 10000.0);

            sw.Reset();
            ListMethod(data, sw);

            Console.WriteLine("List: {0:0.0000}ms", sw.ElapsedTicks / 10000.0);
        }

        Console.ReadLine();
    }

    private static void ArraysMethod(Var[] data, Stopwatch sw)
    {
        int signal = 0;
        int ni = 0, pi = 0;
        Var[] n = new Var[LEN];
        Var[] p = new Var[LEN];
        for (int i = 0; i < LEN; i++)
        {
            n[i] = new Var();
            p[i] = new Var();
        }

        sw.Start();
        for (int i = 0; i < DATALEN; i++)
        {
            Var v = data[i];

            if (v.val < 0)
            {
                int x = 0;
                ni = 0;
                // time is not sequential
                for (int j = 0; j < LEN; j++)
                {
                    long diff = v.time - n[j].time;
                    if (diff < 0)
                        diff = 0;

                    // too old
                    if (diff > 10000)
                        x = j;
                    else
                        ni++;

                }

                n[x] = v;

                if (ni >= LEN1)
                    signal = -1;

            }
            else
            {
                int x = 0;
                pi = 0;
                // time is not sequential
                for (int j = 0; j < LEN; j++)
                {
                    long diff = v.time - p[j].time;
                    if (diff < 0)
                        diff = 0;

                    // too old
                    if (diff > 10000)
                        x = j;
                    else
                        pi++;

                }

                p[x] = v;

                if (pi >= LEN1)
                    signal = 1;
            }
        }
        sw.Stop();
    }

    private static void ListMethod(Var[] data, Stopwatch sw)
    {
        int signal = 0;
        List<Var> d = new List<Var>();

        sw.Start();
        for (int i = 0; i < DATALEN; i++)
        {
            Var v = data[i];


            d.Add(new Var() { time = v.time, val = v.val < 0 ? -1 : 1 });

            // delete expired
            for (int j = 0; j < d.Count; j++)
            {
                if (v.time - d[j].time < 10000)
                    d.RemoveAt(j--);
                else
                    break;
            }

            int cnt = 0;
            int k = d.Count;
            for (int j = 0; j < k; j++)
            {
                cnt += d[j].val;
            }

            if ((cnt >= 0 ? cnt : -cnt) >= LEN)
                signal = 9;
        }
        sw.Stop();
    }

    static int DATALEN = 1000000;
    private static Var[] GenerateData()
    {
        Random r = new Random(DateTime.Now.Millisecond);

        Var[] data = new Var[DATALEN];

        Var prev = new Var() { val = 0, time = DateTime.Now.TimeOfDay.Ticks};
        for (int i = 0; i < DATALEN; i++)
        {
            int x = r.Next(20);
            data[i] = new Var() { val = x - 10, time = prev.time + x * 1000 };
        }

        return data;
    }

    class Var
    {
        public int val;
        public long time;
    }

}

回答1:

Some ideas:

  1. Only count until max(negcount,poscount) becomes 10, then quit (no need to count the rest). Only works if 10 is the maximum count.
  2. Count negative and positive items in 1 go.
  3. Calculate only negcount and infer poscount from count-negcount which is easier to do than counting them both.

Whether any of them are faster than what you have now, and which is fastest, depends among other things on what the data typically looks like. Is it long? Short?

Some more about 3:
You can use trickery to avoid branches here. You don't have to test whether the item is negative, you can add its negativity to a counter. Supposing the item is x and it is an int, x >> 31 is 0 for positive x and -1 for negative x. So counter -= x >> 31 will give negcount.

Edit: unsafe arrays can be faster, but shouldn't be in this case, because the loop would be of the form

for (int i = 0; i < array.Length; i++)
    do something with array[i];

Which is optimized by the JIT compiler.



回答2:

To get negcount and poscount, you are traversing the entire list twice. Instead, traverse it once (to compute negcount), and then poscount = l.Count - negcount.