I've got a text file that contains several 'records' inside of it. Each record contains a name and a collection of numbers as data.
I'm trying to build a class that will read through the file, present only the names of all the records, and then allow the user to select which record data he/she wants.
The first time I go through the file, I only read header names, but I can keep track of the 'position' in the file where the header is. I need random access to the text file to seek to the beginning of each record after a user asks for it.
I have to do it this way because the file is too large to be read in completely in memory (1GB+) with the other memory demands of the application.
I've tried using the .NET StreamReader class to accomplish this (which provides very easy to use 'ReadLine' functionality, but there is no way to capture the true position of the file (the position in the BaseStream property is skewed due to the buffer the class uses).
Is there no easy way to do this in .NET?
A couple of items that may be of interest.
1) If the lines are a fixed set of characters in length, that is not of necessity useful information if the character set has variable sizes (like UTF-8). So check your character set.
2) You can ascertain the exact position of the file cursor from StreamReader by using the BaseStream.Position value IF you Flush() the buffers first (which will force the current position to be where the next read will begin - one byte after the last byte read).
3) If you know in advance that the exact length of each record will be the same number of characters, and the character set uses fixed-width characters (so each line is the same number of bytes long) the you can use FileStream with a fixed buffer size to match the size of a line and the position of the cursor at the end of each read will be, perforce, the beginning of the next line.
4) Is there any particular reason why, if the lines are the same length (assuming in bytes here) that you don't simply use line numbers and calculate the byte-offset in the file based on line size x line number?
If you're flexible with how the data file is written and don't mind it being a little less text editor-friendly, you could write your records with a BinaryWriter:
Then, initially reading each record is simple because you can use the BinaryReader's ReadString method:
The BinaryReader isn't buffered so you get the proper position to store and use later. The only hassle is parsing the name out of the line, which you may have to do with a StreamReader anyway.
Are you sure that the file is "too large"? Have you tried it that way and has it caused a problem?
If you allocate a large amount of memory, and you aren't using it right now, Windows will just swap it out to disk. Hence, by accessing it from "memory", you will have accomplished what you want -- random access to the file on disk.