Best way to store multiple lists of true false val

Posted 2019-08-09 14:42

Question:

This is just to settle a curiosity - Suppose, in my C# project, I have a list containing millions of strings, each along the following lines:

"123Hi1234Howdy"
"Hi1Howdy23"
....

And all I need to know is, for each character in the string, whether it is a digit or a letter.

So, I was thinking the easiest way to store this would be as 0's and 1's or True / False. So, in the example above, assuming I could assign IsLetter = 1 and IsDigit = 0, I could transform each line to:

"123Hi1234Howdy"  >> 00011000011111
"Hi1Howdy23"      >> 1101111100
....

That seems to me to be the most efficient way to store the data I'm looking for (but please correct me if I'm wrong on this - I'm still pretty much a newbie with programming).

So, writing the code that loops through a line and checks for whether each character is a digit or a letter and converting it to true/false or 1/0 is easy enough. My question is what would be the best way to store each line's output?

Should I store each line's output as a bit array? Could it be stored as some other type (maybe, say, integer) that could then be converted back to a series of bits? Should it be stored as a boolean array? Any other thoughts on the best way to store this? When it's all said and done, I need to have a list where I can know, for example:

myList[0] = 00011000011111
myList[1] = 1101111100

And, then, therefore myList[0] <> myList[1]

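Answer 1:

You could use a BitArray for each word and set each bit to true if the character is a digit and false otherwise. Note that this is the inverse of the 1 = letter convention in the question; use char.IsLetter instead of char.IsDigit to match it. See this possible solution: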

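// LINQPad-style snippet; it needs these two namespaces.
using System.Collections;
using System.Linq;

void Main()
{
    string[] words =
    {
        "123Hi1234Howdy",
        "Hi1Howdy23"
    };

    // One BitArray per word; a bit is true where the character is a digit.
    // ToArray() materializes the lazy query into an actual BitArray[].
    var bArrays = words.Select(w => new BitArray(w.Select(c => char.IsDigit(c)).ToArray())).ToArray();

    // Alternatively, build a string of '1' (digit) and '0' (non-digit) per word.
    var strings = words.Select(w => new string(w.Select(c => char.IsDigit(c) ? '1' : '0').ToArray())).ToArray();
}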

This is not necessarily the fastest or most efficient. I guess it depends on what you intend to do with the strings, but at least it's simple!
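
One caveat for the comparison the question ends on: BitArray does not override Equals, so two BitArray instances holding identical bits are still not equal under Equals or ==. Below is a minimal sketch of a content comparison, assuming C# 9 top-level statements. SameBits and ToBits are hypothetical helper names, not framework methods, and the third sample word is made up for illustration.

```csharp
using System;
using System.Collections;
using System.Linq;

// Hypothetical helper: true when both BitArrays have the same length
// and the same bit at every position.
static bool SameBits(BitArray a, BitArray b)
{
    if (a.Length != b.Length) return false;
    for (int i = 0; i < a.Length; i++)
    {
        if (a[i] != b[i]) return false;
    }
    return true;
}

// Hypothetical helper: one bit per character, true = digit
// (the same convention as the code in this answer).
static BitArray ToBits(string word)
{
    return new BitArray(word.Select(c => char.IsDigit(c)).ToArray());
}

// The third word is an invented sample with the same digit/letter layout as the first.
string[] words = { "123Hi1234Howdy", "Hi1Howdy23", "456Yo7890Hello" };
BitArray[] myList = words.Select(w => ToBits(w)).ToArray();

Console.WriteLine(SameBits(myList[0], myList[1])); // False: different patterns
Console.WriteLine(SameBits(myList[0], myList[2])); // True: same pattern, different text
Console.WriteLine(myList[0].Equals(myList[2]));    // False: Equals compares references only
```

If equality checks are all you need, the '0'/'1' string form compares by value with == and needs no helper, at the cost of one char (two bytes) per flag instead of one bit.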