Why does IEumerator affect the state of IEnumer

2019-03-18 23:03发布

I am curious why the following throws an error message (text reader closed exception) on the "last" assignment:

IEnumerable<string> textRows = File.ReadLines(sourceTextFileName);
IEnumerator<string> textEnumerator = textRows.GetEnumerator();

string first = textRows.First();
string last = textRows.Last();

However the following executes fine:

IEnumerable<string> textRows = File.ReadLines(sourceTextFileName);

string first = textRows.First();
string last = textRows.Last();

IEnumerator<string> textEnumerator = textRows.GetEnumerator();

What is the reason for the different behavior?

1条回答
戒情不戒烟
2楼-- · 2019-03-18 23:29

You've discovered a bug in the framework, as far as I can tell. It's reasonably subtle, because of the interaction of a few things:

  • When you call ReadLines(), the file is actually opened. Personally, I think of this as a bug in itself; I'd expect and hope that it would be lazy - only opening the file when you try to start iterating over it.
  • When you call GetEnumerator() the first time on the return value of ReadLines, it will actually return the same reference.
  • When First() calls GetEnumerator(), it will create a clone. This will share the same StreamReader as textEnumerator
  • When First() disposes its clone, it will dispose of the StreamReader, and set its variable to null. This doesn't affect the variable within the original, which now refers to a disposed StreamReader
  • When Last() calls GetEnumerator(), it will create a clone of the original object, complete with disposes StreamReader. It then tries to read from that reader, and throws an exception.

Now compare this with your second version:

  • When First() calls GetEnumerator(), the original reference is returned, complete with open reader.
  • When First() then calls Dispose(), the reader will be disposed and the variable set to null
  • When Last() calls GetEnumerator(), a clone will be created - but because the value it's cloning has a null reference, a new StreamReader is created, so it's able to read the file with no problems. It then disposes of the clone, which closes the reader
  • When GetEnumerator() is called, a second clone of the original object, opening yet another StreamReader - again, no problems there.

So basically, the problem in the first snippet is that you're calling GetEnumerator() a second time (in First()) without having disposed of the first object.

Here's another example of the same problem:

using System;
using System.IO;
using System.Linq;

class Test
{
    static void Main()
    {
        var lines = File.ReadLines("test.txt");
        var query = from x in lines
                    from y in lines
                    select x + "/" + y;
        foreach (var line in query)
        {
            Console.WriteLine(line);
        }
    }
}

You could fix this by calling File.ReadLines twice - or by using a genuinely lazy implementation of ReadLines, like this:

using System.IO;
using System.Linq;

class Test
{
    static void Main()
    {
        var lines = ReadLines("test.txt");
        var query = from x in lines
                    from y in lines
                    select x + "/" + y;
        foreach (var line in query)
        {
            Console.WriteLine(line);
        }
    }

    static IEnumerable<string> ReadLines(string file)
    {
        using (var reader = File.OpenText(file))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }
}

In the latter code, a new StreamReader is opened each time GetEnumerator() is called - so the result is each pair of lines in test.txt.

查看更多
登录 后发表回答