make sure array is sequential in C#

2019-07-29 12:08发布

问题:

I've got an array of integers we're getting from a third party provider. These are meant to be sequential but for some reason they miss a number (something throws an exception, its eaten and the loop continues missing that index). This causes our system some grief and I'm trying to ensure that the array we're getting is indeed sequential.

The numbers start from varying offsets (sometimes 1000, sometimes 5820, others 0) but whatever the start, its meant to go from there.

What's the fastest method to verify the array is sequential? Even though its a required step it seems now, I also have to make sure it doesn't take too long to verify. I am currently starting at the first index, picking up the number and adding one and making sure the next index contains that etc.

EDIT: The reason why the system fails is because of the way people use the system it may not always be returning the tokens the way it was picked initially - long story. The data can't be corrected until it gets to our layer unfortunately.

回答1:

If you're sure that the array is sorted and has no duplicates, you can just check:

array[array.Length - 1] == array[0] + array.Length - 1


回答2:

I think it's worth addressing the bigger issue here: what are you going to do if the data doesn't meet your requriements (sequential, no gaps)?

If you're still going to process the data, then you should probably invest your time in making your system more resilient to gaps or missing entries in the data.

**If you need to process the data and it must be clean, you should work with the vendor to make sure they send you well-formed data.

If you're going to skip processing and report an error, then asserting the precondition of no gaps may be the way to go. In C# there's a number of different things you could do:

  1. If the data is sorted and has no dups, just check if LastValue == FirstValue + ArraySize - 1.
  2. If the data is not sorted but dup free, just sort it and do the above.
  3. If the data is not sorted, has dups and you actually want to detect the gaps, I would use LINQ.

List<int> gaps = Enumerable.Range(array.Min(), array.Length).Except(array).ToList();

or better yet (since the high-end value may be out of range):

int minVal = array.Min();
int maxVal = array.Max();
List<int> gaps = Enumerable.Range(minVal, maxVal-minVal+1).Except(array).ToList();

By the way, the whole concept of being passed a dense, gapless, array of integers is a bit odd for an interface between two parties, unless there's some additional data that associated with them. If there's no other data, why not just send a range {min,max} instead?



回答3:

for (int i = a.Length - 2; 0 <= i; --i)
{
    if (a[i] >= a[i+1]) return false; // not in sequence
}
return true; // in sequence


回答4:

Gabe's way is definitely the fastest if the array is sorted. If the array is not sorted, then it would probably be best to sort the array (with merge/shell sort (or something of similar speed)) and then use Gabe's way.