I want to read a big TXT file, about 500 MB in size. First I used
var file = new StreamReader(_filePath).ReadToEnd();
var lines = file.Split(new[] { '\n' });
but it throws an OutOfMemoryException. Then I tried to read the file line by line, but again, after reading around 1.5 million lines, it throws an OutOfMemoryException:
using (StreamReader r = new StreamReader(_filePath))
{
    string line;
    while ((line = r.ReadLine()) != null)
        _lines.Add(line);
}
or I used
foreach (var l in File.ReadLines(_filePath))
{
    _lines.Add(l);
}
but again I received
An exception of type 'System.OutOfMemoryException' occurred in mscorlib.dll but was not handled in user code
My machine is a powerful one with 8 GB of RAM, so it shouldn't be a problem with my machine.
P.S.: I tried to open this file in Notepad++ and got a "the file is too big to be opened" message.
The cause of the exception seems to be the growing _lines collection, not reading the big file itself. You read each line and add it to the _lines collection, which keeps consuming memory and eventually causes the OutOfMemoryException. You can apply filters so that only the required lines are put into the _lines collection. If you really do need every line, you have to count the lines first; it is slower, but you can read up to 2,147,483,647 lines.
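For example, a minimal sketch of filtering while streaming, assuming the _filePath field from the question (the Contains("ERROR") check is just a placeholder for whatever filter you actually need):

using System.Collections.Generic;
using System.IO;

// Stream the file and keep only the matching lines, so the collection stays small.
var requiredLines = new List<string>();
using (var reader = new StreamReader(_filePath))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        if (line.Contains("ERROR"))   // placeholder filter condition
            requiredLines.Add(line);
    }
}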
Just use File.ReadLines, which returns an IEnumerable<string> and doesn't load all the lines into memory at once.

Edit: Loading the whole file into memory causes objects to grow, and .NET throws an OutOfMemoryException when it cannot allocate enough contiguous memory for an object.
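For example, a minimal sketch that processes each line as it streams past instead of storing it (again assuming _filePath; the counter stands in for your real per-line work):

using System;
using System.IO;

// File.ReadLines is lazy: only the current line is held in memory.
long count = 0;
foreach (var line in File.ReadLines(_filePath))
{
    // do your per-line processing here instead of adding the line to a collection
    count++;
}
Console.WriteLine($"Processed {count} lines.");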
The answer is still the same: you need to stream the file, not read the entire contents. That may require a rearchitecture of your application; however, using IEnumerable<> methods you can stack up business processes in different areas of the application and defer processing, as sketched below.

A "powerful" machine with 8 GB of RAM is not guaranteed to hold all the lines of a 500 MB file in memory. You don't get the full 8 GB: the operating system keeps some of it, you can't allocate all of the rest from .NET, a 32-bit process has a 2 GB limit, opening the file and storing the lines holds the data twice, .NET strings are UTF-16 so ASCII text roughly doubles in size, and every string object carries its own overhead.

You can't load the whole thing into memory to process it; you will have to stream the file through your processing.
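A minimal sketch of stacking deferred IEnumerable<> stages, assuming a hypothetical Record type with ParseRecord and IsRelevant helpers; nothing is read from disk until the final foreach pulls data through the pipeline:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class Pipeline
{
    // Placeholder domain type and helpers for the sketch.
    record Record(string RawLine);
    static Record ParseRecord(string line) => new Record(line);
    static bool IsRelevant(Record r) => r.RawLine.Length > 0;

    static void Main()
    {
        IEnumerable<string> lines = File.ReadLines("input.txt"); // lazy, nothing read yet
        IEnumerable<Record> records = lines.Select(ParseRecord); // still lazy
        IEnumerable<Record> relevant = records.Where(IsRelevant); // still lazy

        // Only here does the file actually stream through the stages, one line at a time.
        foreach (var record in relevant)
            Console.WriteLine(record.RawLine);
    }
}

Because every stage is deferred, the parsing and filtering steps can live in different parts of the application while the process only ever pays the memory cost of one line at a time.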