.NET C# - Random access in text files - no easy way?

Posted 2019-01-17 07:05

I've got a text file that contains several 'records' inside of it. Each record contains a name and a collection of numbers as data.

I'm trying to build a class that will read through the file, present only the names of all the records, and then allow the user to select which record data he/she wants.

The first time I go through the file, I only read header names, but I can keep track of the 'position' in the file where the header is. I need random access to the text file to seek to the beginning of each record after a user asks for it.

I have to do it this way because the file is too large to be read in completely in memory (1GB+) with the other memory demands of the application.

I've tried using the .NET StreamReader class to accomplish this (it provides very easy-to-use ReadLine functionality), but there is no way to capture the true position in the file: the Position reported by the BaseStream property is skewed by the buffer the class uses.

Is there no easy way to do this in .NET?

9 answers
小情绪 Triste *
#2 · 2019-01-17 07:28

FileStream has the Seek() method.
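
As a minimal sketch of what that could look like: the helper below seeks to a byte offset that was recorded earlier and reads one line from there. The names SeekExample and ReadLineAt are made up for illustration, and the code assumes single-byte ASCII text and an offset that really is the first byte of a line.

```csharp
using System;
using System.IO;
using System.Text;

public static class SeekExample
{
    // Hypothetical helper: reads the line that starts at byte 'offset'.
    // Assumes ASCII text and that 'offset' is the first byte of a line.
    public static string ReadLineAt(string path, long offset)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            // Jump straight to the stored position; nothing before it is read.
            fs.Seek(offset, SeekOrigin.Begin);

            // A fresh StreamReader starts decoding at the stream's current position.
            using (var reader = new StreamReader(fs, Encoding.ASCII))
            {
                return reader.ReadLine();
            }
        }
    }
}
```

Seeking is the easy half. The offset passed in still has to come from somewhere other than StreamReader's BaseStream.Position, for the buffering reason described in the question.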

倾城 Initia
#3 · 2019-01-17 07:31

This exact question was asked in 2006 here: http://www.devnewsgroups.net/group/microsoft.public.dotnet.framework/topic40275.aspx

Summary:

"The problem is that the StreamReader buffers data, so the value returned in BaseStream.Position property is always ahead of the actual processed line."

However, "if the file is encoded in a text encoding which is fixed-width, you could keep track of how much text has been read and multiply that by the width"

If not, you can use the FileStream directly and read one byte at a time, decoding the characters yourself; the FileStream's own Position property is then correct.
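
A rough sketch of reading directly from the FileStream, under stated assumptions: the file is UTF-8 or ASCII without a byte order mark (so the byte 0x0A only ever means a line feed), and a caller-supplied predicate decides which lines are record headers. RecordIndexer, ReadLine and BuildIndex are illustrative names, not part of .NET.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class RecordIndexer
{
    // Reads bytes up to the next '\n' and decodes them.
    // Returns null at end of file. A trailing '\r' is stripped.
    public static string ReadLine(Stream stream, Encoding encoding)
    {
        var bytes = new List<byte>();
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            if (b == '\n') return Decode(bytes, encoding);
            bytes.Add((byte)b);
        }
        // Last line without a terminator, or nothing left to read.
        return bytes.Count > 0 ? Decode(bytes, encoding) : null;
    }

    private static string Decode(List<byte> bytes, Encoding encoding)
    {
        return encoding.GetString(bytes.ToArray()).TrimEnd('\r');
    }

    // Maps each header line to the byte offset at which it starts.
    // 'isHeader' is an assumed callback that recognises header lines.
    public static Dictionary<string, long> BuildIndex(string path, Func<string, bool> isHeader)
    {
        var index = new Dictionary<string, long>();
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            while (true)
            {
                // FileStream.Position is the logical position, so it is
                // exact here even though FileStream buffers internally.
                long start = fs.Position;
                string line = ReadLine(fs, Encoding.UTF8);
                if (line == null) break;
                if (isHeader(line)) index[line] = start;
            }
        }
        return index;
    }
}
```

Only the header names and their offsets stay in memory. To fetch a record later, set the FileStream's Position to the stored offset and call ReadLine again. This does not work as written for UTF-16, where a 0x0A byte can occur inside another character.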

混吃等死
#4 · 2019-01-17 07:37

You can use a System.IO.FileStream instead of a StreamReader. If you know exactly what the file contains (the encoding, for example), you can perform all the same operations you would with a StreamReader.

Fickle 薄情
#5 · 2019-01-17 07:37

I think the runtime records feature of the FileHelpers library might help you: http://filehelpers.sourceforge.net/runtime_classes.html

放我归山
#6 · 2019-01-17 07:38

There are some good answers here, but I couldn't find any source code that would work in my very simplistic case. Here it is, in the hope that it saves someone else the hour I spent searching around.

The "very simplistic case" I refer to is this: the text encoding is fixed-width, and the line-ending characters are the same throughout the file. This code works well in my case, where I'm parsing a log file and sometimes have to seek ahead in the file and then come back. I implemented just enough to do what I needed (e.g. only one constructor, and only an override of ReadLine()), so you will most likely need to add code, but I think it's a reasonable starting point.

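using System;
using System.IO;

// A StreamReader that keeps its own position counter, usable for seeking.
// The counter is only a valid byte offset when every character occupies
// one byte, the file has no byte order mark, and every line uses the
// same line ending.
public class PositionableStreamReader : StreamReader
{
    public PositionableStreamReader(string path)
        : base(path)
    {
    }

    // Length of the line terminator used in the file
    // (2 for "\r\n", 1 for "\n"). Defaults to the platform's newline.
    private int myLineEndingCharacterLength = Environment.NewLine.Length;
    public int LineEndingCharacterLength
    {
        get { return myLineEndingCharacterLength; }
        set { myLineEndingCharacterLength = value; }
    }

    // Advances the tracked position by the line plus its terminator.
    public override string ReadLine()
    {
        string line = base.ReadLine();
        if (null != line)
            myStreamPosition += line.Length + myLineEndingCharacterLength;
        return line;
    }

    // Tracked position. Setting it seeks the underlying stream and
    // discards the reader's buffer so the next read starts there.
    private long myStreamPosition = 0;
    public long Position
    {
        get { return myStreamPosition; }
        set
        {
            myStreamPosition = value;
            this.BaseStream.Position = value;
            this.DiscardBufferedData();
        }
    }
}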

Here's an example of how to use the PositionableStreamReader:

PositionableStreamReader sr = new PositionableStreamReader("somepath.txt");

// read some lines
while (something)
    sr.ReadLine();

// bookmark the current position
long streamPosition = sr.Position;

// read some lines
while (something)
    sr.ReadLine();

// go back to the bookmarked position
sr.Position = streamPosition;

// read some lines
while (something)
    sr.ReadLine();
叛逆
#7 · 2019-01-17 07:39

Is the encoding a fixed-size one (e.g. ASCII or UCS-2)? If so, you could keep track of the character index (based on the number of characters you've seen) and find the binary index based on that.
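
To make the arithmetic concrete, here is a small sketch. FixedWidthOffsets and ByteOffset are hypothetical names; the character index is assumed to count every character read so far, line terminators included.

```csharp
using System;
using System.Text;

public static class FixedWidthOffsets
{
    // Byte offset of the character at 'charIndex' for an encoding in which
    // every character occupies 'bytesPerChar' bytes. 'preambleLength' is the
    // size of any byte order mark at the start of the file (0 if none).
    public static long ByteOffset(long charIndex, int bytesPerChar, int preambleLength)
    {
        return preambleLength + charIndex * bytesPerChar;
    }
}
```

For ASCII the factor is 1 and there is no preamble. For UTF-16 as written by Encoding.Unicode the factor is 2 and Encoding.Unicode.GetPreamble().Length gives the size of the byte order mark. Because string.Length in .NET counts UTF-16 code units, the factor of 2 holds even when the text contains surrogate pairs.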

Otherwise, no - you'd basically need to write your own StreamReader implementation which lets you peek at the binary index. It's a shame that StreamReader doesn't implement this, I agree.
