I need a very large array length (size) in C#

Posted 2019-02-15 07:16

Question:

public double[] result = new double[ ??? ];

I am storing results, and the total number of results is larger than 2,147,483,647, which is the maximum value of Int32.

I tried BigInteger, ulong, etc., but all of them gave me errors.

How can I extend the size of the array so that it can store more than 50,147,483,647 results (double)?

Thanks...

Answer 1:

An array of 2,147,483,648 doubles will occupy 16GB of memory. For some people, that's not a big deal. I've got servers that won't even bother to hit the page file if I allocate a few of those arrays. Doesn't mean it's a good idea.
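To make the arithmetic concrete, here is a small sketch of the size calculation. The class and method names are made up for illustration, and the figures cover only the raw 8-byte payload per double, not any runtime overhead.

```csharp
using System;

// Hypothetical helper, not part of the BCL.
public static class ArraySizeEstimate
{
    // Bytes needed to hold `count` doubles (payload only, no array header).
    public static long BytesForDoubles(long count) => checked(count * sizeof(double));

    // Convert a byte count to GiB (1 GiB = 2^30 bytes).
    public static double ToGiB(long bytes) => bytes / (double)(1L << 30);
}
```

BytesForDoubles(2147483648L) gives 17,179,869,184 bytes, which is exactly 16 GiB. For the 50,147,483,647 results mentioned in the question it gives 401,179,869,176 bytes, roughly 373.6 GiB.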

When you are dealing with huge amounts of data like that you should be looking to minimize the memory impact of the process. There are several ways to go with this, depending on how you're working with the data.


Sparse Arrays

If your array is sparsely populated - lots of default/empty values with a small percentage of actually valid/useful data - then a sparse array can drastically reduce the memory requirements. You can write various implementations to optimize for different distribution profiles: random distribution, grouped values, arbitrary contiguous groups, etc.

Works fine for any type of contained data, including complex classes. Has some overheads, so can actually be worse than naked arrays when the fill percentage is high. And of course you're still going to be using memory to store your actual data.
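As a rough illustration of the idea, a minimal sparse array for doubles could look like the sketch below. It assumes the "empty" value is 0.0 and uses a Dictionary as the backing store; SparseDoubleArray is a hypothetical name, and a real implementation would pick a structure suited to the actual distribution of the data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sparse array: only non-zero values occupy memory.
public class SparseDoubleArray
{
    private readonly Dictionary<long, double> _values = new Dictionary<long, double>();

    public SparseDoubleArray(long length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        Length = length;
    }

    // Logical length; may exceed Int32.MaxValue.
    public long Length { get; }

    // Number of entries actually stored.
    public int StoredCount => _values.Count;

    public double this[long index]
    {
        get
        {
            CheckIndex(index);
            // Missing entries read back as the default value 0.0.
            return _values.TryGetValue(index, out double value) ? value : 0.0;
        }
        set
        {
            CheckIndex(index);
            // Writing the default value frees the slot instead of storing it.
            if (value == 0.0) _values.Remove(index);
            else _values[index] = value;
        }
    }

    private void CheckIndex(long index)
    {
        if (index < 0 || index >= Length) throw new IndexOutOfRangeException();
    }
}
```

Each stored entry costs noticeably more than the 8 bytes of a plain double, and a single Dictionary cannot hold more entries than an int can count, so this only pays off when the fill percentage is low.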

Simple Flat File

Store the data on disk, create a read/write FileStream for the file, and enclose that in a wrapper that lets you access the file's contents as if it were an in-memory array. The simplest implementation of this will give you reasonable usefulness for sequential reads from the file. Random reads and writes can slow you down, but you can do some buffering in the background to help mitigate the speed issues.

This approach works for any type that has a static size, including structures that can be copied to/from a range of bytes in the file. Doesn't work for dynamically-sized data like strings.
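A bare-bones version of such a wrapper might look like the following. This is only a sketch: FileBackedDoubleArray is a hypothetical name, there is no buffering beyond what FileStream does on its own, and every access seeks to the element's byte offset.

```csharp
using System;
using System.IO;

// Hypothetical wrapper exposing a file of doubles through a long indexer.
public sealed class FileBackedDoubleArray : IDisposable
{
    private readonly FileStream _stream;
    private readonly byte[] _buffer = new byte[sizeof(double)];

    public FileBackedDoubleArray(string path, long length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        Length = length;
        _stream = new FileStream(path, FileMode.OpenOrCreate, FileAccess.ReadWrite);
        // Size the file so every element has a fixed 8-byte slot.
        _stream.SetLength(checked(length * sizeof(double)));
    }

    public long Length { get; }

    public double this[long index]
    {
        get
        {
            Seek(index);
            int read = 0;
            // Stream.Read may return fewer bytes than requested, so loop.
            while (read < _buffer.Length)
            {
                int n = _stream.Read(_buffer, read, _buffer.Length - read);
                if (n == 0) throw new EndOfStreamException();
                read += n;
            }
            return BitConverter.ToDouble(_buffer, 0);
        }
        set
        {
            Seek(index);
            byte[] bytes = BitConverter.GetBytes(value);
            _stream.Write(bytes, 0, bytes.Length);
        }
    }

    private void Seek(long index)
    {
        if (index < 0 || index >= Length) throw new IndexOutOfRangeException();
        _stream.Position = index * sizeof(double);
    }

    public void Dispose() => _stream.Dispose();
}
```

Usage is along the lines of: using (var results = new FileBackedDoubleArray(path, count)) { results[i] = value; }. The file still needs as much disk space as the array would need memory.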

Complex Flat File

If you need to handle dynamic-size records, sparse data, etc. then you might be able to design a file format that can handle it elegantly. Then again, a database is probably a better option at this point.

Memory Mapped File

Same as the other file options, but using a different mechanism to access the data. See System.IO.MemoryMappedFiles.MemoryMappedFile for more information on how to use memory-mapped files from .NET.
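For comparison, here is roughly what the memory-mapped variant could look like. MappedDoubleArray is a hypothetical name, and the sketch assumes a 64-bit process, since it maps the whole file into one view.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

// Hypothetical wrapper exposing a memory-mapped file of doubles through a long indexer.
public sealed class MappedDoubleArray : IDisposable
{
    private readonly MemoryMappedFile _file;
    private readonly MemoryMappedViewAccessor _view;

    public MappedDoubleArray(string path, long length)
    {
        if (length <= 0) throw new ArgumentOutOfRangeException(nameof(length));
        Length = length;
        // The capacity is in bytes; a smaller file is grown to this size.
        _file = MemoryMappedFile.CreateFromFile(
            path, FileMode.OpenOrCreate, null, checked(length * sizeof(double)));
        _view = _file.CreateViewAccessor();
    }

    public long Length { get; }

    public double this[long index]
    {
        get
        {
            CheckIndex(index);
            return _view.ReadDouble(index * sizeof(double));
        }
        set
        {
            CheckIndex(index);
            _view.Write(index * sizeof(double), value);
        }
    }

    // Explicit check, because the view can be rounded up past the logical length.
    private void CheckIndex(long index)
    {
        if (index < 0 || index >= Length) throw new IndexOutOfRangeException();
    }

    public void Dispose()
    {
        _view.Dispose();
        _file.Dispose();
    }
}
```

The operating system pages the data in and out as it is touched, so there is no explicit seek/read/write code, but the file on disk is still the full size of the data.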

Database Storage

Depending on the nature of the data, storing it in a database might work for you. For a large array of doubles this is unlikely to be a great option, however. There are the overheads of reading and writing data in the database, plus the storage overheads: each row will need at least a row identity, probably a BIGINT (8-byte integer) for a large recordset, doubling the size of the data right off the bat. Add in the overheads for indexing, row storage, etc. and you can very easily multiply the size of your data.

Databases are great for storing and manipulating complicated data. That's what they're for. If you have variable-width data - strings and the like - then a database is probably one of your best options. The flip-side is that they're generally not an optimal solution for working with large amounts of very simple data.


Whichever option you go with, you can create an IList<T>-compatible class that encapsulates your data. This lets you write code that doesn't have any need to know how the data is stored, only what it is.



Answer 2:

BCL arrays cannot do that.
Someone wrote a chunked BigArray<T> class that can.

However, that will not magically create enough memory to store it.



Answer 3:

You can't. Even with gcAllowVeryLargeObjects enabled, the maximum size of any single dimension of an array (of non-byte elements) is 2,146,435,071.

So you'll need to rethink your design, or use an alternative implementation such as a jagged array.



Answer 4:

Another possible approach is to implement your own BigList. First note that List<T> is backed by an array. Also, you can set the initial capacity of the List<T> in the constructor, so if you know it will be big, get a big chunk of memory up front.

Then

public class myBigList<T> : List<List<T>>
{

}

or, maybe more preferable, use a has-a approach:

public class myBigList<T>
{
   List<List<T>> theList;
}

In doing this you will need to re-implement the indexer so you can use division and modulo to find the correct indexes into your backing store. Then you can use a wider type such as long (or BigInteger) as the index. In your custom indexer you decompose that index into two legal-sized ints: one selecting the inner list and one selecting the element within it.
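The division-and-modulo indexer described above could be sketched like this. Note the assumptions: it uses a long index rather than a big-integer type, a fixed-length array of arrays rather than List<List<T>> to keep the example short, and the name ChunkedArray and the default chunk size are made up for illustration.

```csharp
using System;

// Hypothetical chunked array: one logical long-indexed array backed by many small arrays.
public class ChunkedArray<T>
{
    private readonly int _chunkSize;
    private readonly T[][] _chunks;

    public ChunkedArray(long length, int chunkSize = 1 << 20)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        Length = length;
        _chunkSize = chunkSize;

        // Round up so a partial final chunk is still allocated.
        long chunkCount = (length + chunkSize - 1) / chunkSize;
        _chunks = new T[chunkCount][];
        for (long i = 0; i < chunkCount; i++)
        {
            long remaining = length - i * chunkSize;
            _chunks[i] = new T[Math.Min(remaining, chunkSize)];
        }
    }

    public long Length { get; }

    public T this[long index]
    {
        get
        {
            CheckIndex(index);
            // Division picks the chunk, modulo picks the slot inside it.
            return _chunks[index / _chunkSize][index % _chunkSize];
        }
        set
        {
            CheckIndex(index);
            _chunks[index / _chunkSize][index % _chunkSize] = value;
        }
    }

    private void CheckIndex(long index)
    {
        if (index < 0 || index >= Length) throw new IndexOutOfRangeException();
    }
}
```

With this, new ChunkedArray<double>(50147483647L) is legal as far as indexing goes, but it still allocates every chunk up front, so it needs the full amount of memory to exist.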



Answer 5:

C# arrays are limited in size to System.Int32.MaxValue.

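For anything bigger than that, a single List<T> (where T is whatever you want to hold) will not help either, because it is backed by one array and indexed by int; the data has to be split across several lists or arrays.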

More here: What is the Maximum Size that an Array can hold?