I am curious why the following throws an error message (text reader closed exception) on the "last" assignment:
IEnumerable<string> textRows = File.ReadLines(sourceTextFileName);
IEnumerator<string> textEnumerator = textRows.GetEnumerator();
string first = textRows.First();
string last = textRows.Last();
However, the following executes fine:
IEnumerable<string> textRows = File.ReadLines(sourceTextFileName);
string first = textRows.First();
string last = textRows.Last();
IEnumerator<string> textEnumerator = textRows.GetEnumerator();
What is the reason for the different behavior?
You've discovered a bug in the framework, as far as I can tell. It's reasonably subtle, because of the interaction of a few things:
- When you call ReadLines(), the file is actually opened. Personally, I think of this as a bug in itself; I'd expect and hope that it would be lazy - only opening the file when you try to start iterating over it.
- When you call GetEnumerator() the first time on the return value of ReadLines, it will actually return the same reference.
- When First() calls GetEnumerator(), it will create a clone. This will share the same StreamReader as textEnumerator.
- When First() disposes its clone, it will dispose of the StreamReader, and set its variable to null. This doesn't affect the variable within the original, which now refers to a disposed StreamReader.
- When Last() calls GetEnumerator(), it will create a clone of the original object, complete with the disposed StreamReader. It then tries to read from that reader, and throws an exception.
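The sequence above can be reproduced with a small model. This is only a sketch of the behavior as described here, not the framework's source: SharedReaderLines and SharedReaderDemo are hypothetical names, and a StringReader stands in for the file's StreamReader.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical, simplified model of the iterator behavior described above.
// It is NOT the framework's implementation.
sealed class SharedReaderLines : IEnumerable<string>, IEnumerator<string>
{
    private readonly string _text;
    private TextReader _reader;   // opened eagerly, as ReadLines() is described to do
    private bool _handedOut;      // has GetEnumerator() already returned 'this'?

    public SharedReaderLines(string text) : this(text, new StringReader(text)) { }

    private SharedReaderLines(string text, TextReader reader)
    {
        _text = text;
        _reader = reader;
    }

    public string Current { get; private set; }
    object IEnumerator.Current => Current;

    public IEnumerator<string> GetEnumerator()
    {
        if (!_handedOut)
        {
            _handedOut = true;
            return this;                      // first call: the same reference
        }
        // Later calls: a clone that shares whatever reader this object still
        // holds. Only a null reader is replaced with a fresh one.
        var clone = new SharedReaderLines(_text, _reader ?? new StringReader(_text));
        clone._handedOut = true;
        return clone;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public bool MoveNext()
    {
        if (_reader == null) return false;    // this instance was already disposed
        Current = _reader.ReadLine();         // throws if the shared reader was disposed
        return Current != null;
    }

    public void Dispose()
    {
        _reader?.Dispose();
        _reader = null;                       // clears this instance's field only
    }

    public void Reset() => throw new NotSupportedException();
}

static class SharedReaderDemo
{
    // Mirrors the first snippet: returns true if Last() hits the disposed reader.
    public static bool LastThrowsAfterEarlyGetEnumerator()
    {
        var rows = new SharedReaderLines("one\ntwo\nthree");
        IEnumerator<string> held = rows.GetEnumerator(); // this is 'rows' itself
        string first = rows.First();   // the clone shares the reader, then disposes it
        try { rows.Last(); return false; }
        catch (ObjectDisposedException) { return true; }
    }
}
```

In this model, calling First() and Last() without the earlier GetEnumerator() call completes normally, because First() disposes the original object and clears its reader field, so the next clone opens a fresh reader.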
Now compare this with your second version:
- When First() calls GetEnumerator(), the original reference is returned, complete with open reader.
- When First() then calls Dispose(), the reader will be disposed and the variable set to null.
- When Last() calls GetEnumerator(), a clone will be created - but because the value it's cloning has a null reference, a new StreamReader is created, so it's able to read the file with no problems. It then disposes of the clone, which closes the reader.
- When GetEnumerator() is called, a second clone of the original object is created, opening yet another StreamReader - again, no problems there.
So basically, the problem in the first snippet is that you're calling GetEnumerator() a second time (in First()) without having disposed of the first object.
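If you do need the explicit enumerator, one possible workaround is to dispose of it before anything else enumerates the sequence. This is a minimal sketch that assumes the behavior described above, where a clone opens a new reader once the original's reader has been disposed and set to null. DisposeBeforeReuse and FirstAndLast are hypothetical names, and the file is assumed to be non-empty.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class DisposeBeforeReuse
{
    // Hypothetical helper: returns { first line, last line } of the file at 'path'.
    public static string[] FirstAndLast(string path)
    {
        IEnumerable<string> textRows = File.ReadLines(path);

        // Dispose of the explicitly obtained enumerator before First() and
        // Last() ask for their own enumerators.
        using (IEnumerator<string> textEnumerator = textRows.GetEnumerator())
        {
            textEnumerator.MoveNext();   // use the enumerator as needed here
        }

        return new[] { textRows.First(), textRows.Last() };
    }
}
```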
Here's another example of the same problem:
using System;
using System.IO;
using System.Linq;

class Test
{
    static void Main()
    {
        var lines = File.ReadLines("test.txt");
        // The nested "from" enumerates lines again while the outer
        // enumeration is still in progress.
        var query = from x in lines
                    from y in lines
                    select x + "/" + y;
        foreach (var line in query)
        {
            Console.WriteLine(line);
        }
    }
}
You could fix this by calling File.ReadLines twice - or by using a genuinely lazy implementation of ReadLines, like this:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class Test
{
    static void Main()
    {
        var lines = ReadLines("test.txt");
        var query = from x in lines
                    from y in lines
                    select x + "/" + y;
        foreach (var line in query)
        {
            Console.WriteLine(line);
        }
    }

    // Iterator block: the file is not opened until enumeration starts,
    // and each enumeration gets its own reader.
    static IEnumerable<string> ReadLines(string file)
    {
        using (var reader = File.OpenText(file))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }
}
In the latter code, a new StreamReader is opened each time GetEnumerator() is called - so the result is each pair of lines in test.txt.
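The "call File.ReadLines twice" option can be generalized in the same spirit. As a sketch (DeferredFile is a hypothetical name), wrapping the call in an iterator block means File.ReadLines is invoked afresh for every enumeration, so no two enumerations share a reader:

```csharp
using System.Collections.Generic;
using System.IO;

static class DeferredFile
{
    // Nothing runs until the result is enumerated, and each enumeration
    // makes its own File.ReadLines call.
    public static IEnumerable<string> ReadLines(string path)
    {
        foreach (string line in File.ReadLines(path))
        {
            yield return line;
        }
    }
}
```

Replacing File.ReadLines("test.txt") with DeferredFile.ReadLines("test.txt") in the failing query should then give each nested enumeration its own reader.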