Read Big TXT File, Out of Memory Exception

Posted 2019-01-15 12:58

I want to read a big TXT file, about 500 MB in size. First I used

var file = new StreamReader(_filePath).ReadToEnd();  
var lines = file.Split(new[] { '\n' });

but it throws an OutOfMemoryException. Then I tried to read the file line by line, but again, after reading around 1.5 million lines, it throws an OutOfMemoryException:

using (StreamReader r = new StreamReader(_filePath))
{
    while ((line = r.ReadLine()) != null)
        _lines.Add(line);
}

or I used

foreach (var l in File.ReadLines(_filePath))
{
    _lines.Add(l);
}

but again I received

An exception of type 'System.OutOfMemoryException' occurred in mscorlib.dll but was not handled in user code

My machine is a powerful one with 8 GB of RAM, so it shouldn't be a problem with my machine.

P.S.: I tried to open this file in Notepad++ and I received a "the file is too big to be opened" error.

4 Answers
#2 · 2019-01-15 13:15

The cause of the exception seems to be the growing _lines collection, not reading the big file itself. You read each line and add it to the _lines collection, which keeps consuming memory until it causes the OutOfMemoryException. You can apply a filter so that only the lines you actually need go into the _lines collection.
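For example, a minimal sketch of that idea, assuming you only need lines that match some condition (the "ERROR" keyword is made up for illustration):

using System.Collections.Generic;
using System.IO;
using System.Linq;

// Stream the file and keep only the lines you actually need.
// Replace the Contains("ERROR") check with your own filter.
List<string> _lines = File.ReadLines(_filePath)
                          .Where(line => line.Contains("ERROR"))
                          .ToList();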

唯我独甜
#3 · 2019-01-15 13:22

You have to count the lines first, so you can allocate an array of exactly the right size. It is slower because the file is read twice, but you can read up to 2,147,483,647 lines.

// First pass: count the lines without keeping them.
int intNoOfLines = 0;
using (StreamReader oReader = new StreamReader(MyFilePath))
{
    while (oReader.ReadLine() != null) intNoOfLines++;
}

// Second pass: fill a pre-sized array, one line at a time.
string[] strArrLines = new string[intNoOfLines];
int intIndex = 0;
using (StreamReader oReader = new StreamReader(MyFilePath))
{
    string strLine;
    while ((strLine = oReader.ReadLine()) != null)
    {
       strArrLines[intIndex++] = strLine;
    }
}
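A shorter way to do the same two passes with LINQ (just a sketch, not from the original answer):

using System.IO;
using System.Linq;

// Pass 1: count the lines; Pass 2: fill a pre-sized array.
// File.ReadLines streams, so neither pass loads the whole file at once,
// but the final array of strings still has to fit in memory.
int lineCount = File.ReadLines(MyFilePath).Count();
string[] lines = new string[lineCount];
int i = 0;
foreach (string line in File.ReadLines(MyFilePath))
    lines[i++] = line;

Note that on a 32-bit process this can still fail, because all the line data ends up in memory just as in the single-pass version.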
够拽才男人
#4 · 2019-01-15 13:28

Just use File.ReadLines, which returns an IEnumerable<string> and doesn't load all the lines into memory at once.

foreach (var line in File.ReadLines(_filePath))
{
    // Don't put "line" into a list or collection.
    // Just do your processing on it here.
}
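As a concrete example, a sketch that aggregates while streaming instead of storing anything (the Contains("ERROR") check is a stand-in for your real per-line logic):

using System;
using System.IO;

long total = 0;
long matches = 0;

// Only one line is held in memory at a time.
foreach (var line in File.ReadLines(_filePath))
{
    total++;
    if (line.Contains("ERROR"))
        matches++;
}

Console.WriteLine($"{matches} of {total} lines matched.");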
[account banned]
#5 · 2019-01-15 13:32

Edit:

Loading the whole file into memory makes objects grow very large, and .NET will throw an OutOfMemoryException if it cannot allocate enough contiguous memory for an object.

The answer is still the same: you need to stream the file, not read the entire contents. That may require rearchitecting your application, but using IEnumerable<T> methods you can chain business processing across different areas of the application and defer execution.
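For example, a sketch of that deferred, composable style (the method names and the filter are hypothetical, not part of the question):

using System.Collections.Generic;
using System.IO;
using System.Linq;

// Each stage takes and returns an IEnumerable<string>, so nothing is
// materialized; lines are pulled through one at a time by the final loop.
static IEnumerable<string> ReadRecords(string path) => File.ReadLines(path);

static IEnumerable<string> KeepInteresting(IEnumerable<string> lines) =>
    lines.Where(l => !string.IsNullOrWhiteSpace(l));   // example filter

static void Process(IEnumerable<string> lines)
{
    foreach (var line in lines)
    {
        // per-line business processing goes here
    }
}

// Usage: the whole pipeline streams; the file is never fully in memory.
Process(KeepInteresting(ReadRecords(_filePath)));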


A "powerful" machine with 8GB of RAM isn't going to be able to store a 500GB file in memory, as 500 is bigger than 8. (plus you don't get 8 as the operating system will be holding some, you can't allocate all memory in .Net, 32-bit has a 2GB limit, opening the file and storing the line will hold the data twice, there is an object size overhead....)

You can't load the whole thing into memory to process it; you have to stream the file through your processing.
