Question:
I have a "Find Files" function in my program that will find text files with the .ged suffix that my program reads. I display the found results in an explorer-like window that looks like this:
I use the standard FindFirst / FindNext methods, and this works very quickly. The 584 files shown above are found and displayed within a couple of seconds.
What I'd now like to do is add two columns to the display that show the "Source" and "Version" contained in each of these files. This information is usually found within the first 10 lines of each file, on lines that look like:
1 SOUR FTM
2 VERS Family Tree Maker (20.0.0.368)
Now I have no problem parsing this very quickly myself, and that is not what I'm asking.
What I need help with is simply how to most quickly load the first 10 or so lines from these files so that I can parse them.
I have tried StringList.LoadFromFile, but it takes too long on large files, such as those above 1 MB.
Since I only need the first 10 lines or so, how would I best get them?
I'm using Delphi 2009, and my input files might or might not be Unicode, so this needs to work for any encoding.
Followup: Thanks Antonio,
I ended up doing this which works fine:
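var
  CurFileStream: TStream;
  Buffer: TBytes;
  Value: string;
  Encoding: TEncoding;
  BytesRead, BomLen: Integer;
begin
  // Create the stream before "try" so that Free never runs on an unassigned variable.
  CurFileStream := TFileStream.Create(folder + FileName, fmOpenRead);
  try
    SetLength(Buffer, 256);
    BytesRead := CurFileStream.Read(Buffer[0], 256);
    SetLength(Buffer, BytesRead); // the file may be shorter than 256 bytes
    Encoding := nil; // nil asks GetBufferEncoding to detect the encoding from the BOM
    BomLen := TEncoding.GetBufferEncoding(Buffer, Encoding);
    // Skip the BOM bytes so they do not end up in the decoded string.
    Value := Encoding.GetString(Buffer, BomLen, Length(Buffer) - BomLen);
    ...
    (parse through Value to get what I want)
    ...
  finally
    CurFileStream.Free;
  end;
end;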
Answer 1:
Use TFileStream and call its Read method to read only the number of bytes you need. Here is an example that reads bitmap info, which is also stored at the beginning of the file.
http://www.delphidabbler.com/tips/19
Answer 2:
Open the file yourself for block reading (not using the built-in TStringList functionality) and read the first block of the file. You can then load that block into a string list with strings.SetText() if you are using block functions, or with strings.LoadFromStream() if you are reading the block through a stream.
I would personally just go with the FileRead/FileWrite block functions and load the block into a buffer. You could also use the similar WinAPI functions, but that's just more code for no reason.
The OS reads files in blocks, which are at least 512 bytes on almost any platform or filesystem, so you can read 512 bytes first and hope that you got all 10 lines, which will be true if your lines are generally short enough. This will be (practically) as fast as reading 100 or 200 bytes.
If you then find that your strings object holds fewer than 10 lines, read the next 512-byte block and parse again. (Or just use 1024- or 2048-byte blocks and so on; on many systems that will probably be as fast as 512-byte blocks, since filesystem cluster sizes are generally larger than 512 bytes.)
PS. Using threads, or the asynchronous mode of the WinAPI file functions (CreateFile and such), you could also load the data from the files asynchronously while the rest of your application keeps working. Specifically, the interface will not freeze while large directories are being read.
This will make the loading of your information appear faster, since the file list shows up immediately and the rest of the information follows some milliseconds later, while not actually increasing the real reading speed.
Do this only if you have tried the other methods and still feel you need the extra boost.
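The block-by-block idea above could be sketched roughly as follows. This is only an illustration under a few assumptions: ReadFirstLines is a hypothetical helper name, the encoding is detected from the BOM only (files without a BOM are decoded with the default ANSI encoding), and line breaks are counted by looking for byte value 10, which can overcount in UTF-16 files that contain other characters with a 0x0A byte.

```pascal
uses
  Classes, SysUtils;

// Hypothetical helper: fills Lines with at most MaxLines lines from the start
// of FileName, reading 512-byte blocks and stopping once enough line breaks
// have been seen.
procedure ReadFirstLines(const FileName: string; MaxLines: Integer;
  Lines: TStrings);
const
  BlockSize = 512;
var
  Stream: TFileStream;
  Buffer: TBytes;
  Total, Got, I, LineBreaks, BomLen: Integer;
  Encoding: TEncoding;
begin
  Lines.Clear;
  Total := 0;
  LineBreaks := 0;
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    repeat
      // Grow the buffer by one block and append the next chunk of the file.
      SetLength(Buffer, Total + BlockSize);
      Got := Stream.Read(Buffer[Total], BlockSize);
      // Count LF bytes in the newly read part only.
      for I := Total to Total + Got - 1 do
        if Buffer[I] = 10 then
          Inc(LineBreaks);
      Inc(Total, Got);
    until (Got < BlockSize) or (LineBreaks >= MaxLines);
  finally
    Stream.Free;
  end;
  SetLength(Buffer, Total);
  Encoding := nil; // nil means: detect from the BOM, else use the default
  BomLen := TEncoding.GetBufferEncoding(Buffer, Encoding);
  if Total > BomLen then
    Lines.Text := Encoding.GetString(Buffer, BomLen, Total - BomLen);
  // Drop anything after the requested number of lines, including a
  // possibly incomplete last line.
  while Lines.Count > MaxLines do
    Lines.Delete(Lines.Count - 1);
end;
```

For most files this touches only the first block, so the cost per file stays roughly constant no matter how large the file is.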
Answer 3:
You can use a TStreamReader to read individual lines from any TStream object, such as a TFileStream. For even faster file I/O, you could use Memory-Mapped Views with TCustomMemoryStream.
Answer 4:
Okay, I deleted my first answer. Using Remy's first suggestion above, I tried again with built-in stuff. What I don't like here is that you have to create and free two objects. I think I would make my own class to wrap this up:
var
  fs: TFileStream;
  tr: TTextReader;
  filename: String;
begin
  filename := 'c:\temp\textFileUtf8.txt';
  fs := TFileStream.Create(filename, fmOpenRead);
  try
    // Nested try blocks: fs is still freed if creating the reader fails.
    tr := TStreamReader.Create(fs);
    try
      Memo1.Lines.Add(tr.ReadLine);
    finally
      tr.Free;
    end;
  finally
    fs.Free;
  end;
end;
If anybody is interested in what I had here before, it had the problem of not working with unicode files.
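A wrapper along the lines mentioned above might look something like this. It is a sketch only: ReadHeadLines is a made-up name, and it assumes the TStreamReader constructor overload that takes a DetectBOM flag and the EndOfStream property behave as documented in your Delphi version.

```pascal
uses
  Classes, SysUtils;

// Hypothetical wrapper: hides the two objects behind one call and reads
// at most MaxLines lines from the start of the file.
procedure ReadHeadLines(const FileName: string; MaxLines: Integer;
  Lines: TStrings);
var
  Stream: TFileStream;
  Reader: TStreamReader;
begin
  Lines.Clear;
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    // DetectBOM = True lets the reader pick the encoding from a leading BOM.
    Reader := TStreamReader.Create(Stream, True);
    try
      while (Lines.Count < MaxLines) and not Reader.EndOfStream do
        Lines.Add(Reader.ReadLine);
    finally
      Reader.Free;
    end;
  finally
    Stream.Free;
  end;
end;
```

A call such as ReadHeadLines(filename, 10, Memo1.Lines) would then replace the single ReadLine in the code above.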
Answer 5:
Sometimes old-school Pascal style is not that bad.
Even though non-OO file access doesn't seem to be very popular anymore, ReadLn(F,xxx) still works pretty well in situations like yours.
The code below loads the information (filename, source and version) into a TDictionary so that you can look it up easily. Alternatively, you can use a listview in virtual mode and look things up in this dictionary when the OnData event fires.
Warning: the code below does not work with Unicode.
program Project101;
{$APPTYPE CONSOLE}
uses
  IoUtils, Generics.Collections, SysUtils;

type
  TFileInfo = record
    FileName,
    Source,
    Version: String;
  end;

// Reads the first two lines of the file verbatim into Source and Version.
// Returns False if the file could not be opened.
function LoadFileInfo(var aFileInfo: TFileInfo): Boolean;
var
  F: TextFile;
begin
  Result := False;
  AssignFile(F, aFileInfo.FileName);
  {$I-}
  Reset(F);
  {$I+}
  if IOResult = 0 then
  begin
    ReadLn(F, aFileInfo.Source);
    ReadLn(F, aFileInfo.Version);
    CloseFile(F);
    Exit(True)
  end
  else
    WriteLn('Could not open ', aFileInfo.FileName);
end;

var
  FileInfo: TFileInfo;
  Files: TDictionary<string,TFileInfo>;
  S: String;
begin
  Files := TDictionary<string,TFileInfo>.Create;
  try
    // Collect the file information, keyed by file name.
    for S in TDirectory.GetFiles('h:\WINDOWS\system32', '*.xml') do
    begin
      WriteLn(S);
      FileInfo.FileName := S;
      if LoadFileInfo(FileInfo) then
        Files.Add(S, FileInfo);
    end;
    // showing file information...
    for FileInfo in Files.Values do
      WriteLn(FileInfo.Source, ' ', FileInfo.Version);
  finally
    Files.Free
  end;
  WriteLn;
  WriteLn('Done. Press any key to quit . . .');
  ReadLn;
end.