Consider a really huge file on disk (maybe more than 4 GB). I want to scan through this file and count the number of times a specific binary pattern occurs.
My thought is:
Use a memory-mapped file (CreateFileMapping or Boost's mapped_file) to map the file into virtual memory.
For each 100 MB of mapped memory, create one thread to scan it and tally the matches (see the sketch below for the mapped scan itself).
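To illustrate the mapped part (without the threading yet), here is a minimal single-threaded sketch using Boost's mapped_file_source. The filename and pattern bytes are placeholders, and it assumes a 64-bit build so the whole file fits in the address space:

```cpp
// Minimal sketch, assuming Boost.Iostreams (link with -lboost_iostreams)
// and a 64-bit process so a >4 GB file fits in the address space.
#include <boost/iostreams/device/mapped_file.hpp>
#include <algorithm>
#include <cstdint>
#include <iostream>

int main() {
    boost::iostreams::mapped_file_source file("huge.bin");  // placeholder name
    const char pattern[] = {0x12, 0x34, 0x56, 0x78};        // example pattern

    const char* begin = file.data();
    const char* end = begin + file.size();

    // Count occurrences (overlapping matches included).
    std::uint64_t count = 0;
    for (const char* p = begin;
         (p = std::search(p, end, pattern, pattern + sizeof(pattern))) != end;
         ++p)
        ++count;

    std::cout << "occurrences: " << count << '\n';
}
```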
Is this feasible? Is there a better method?
Update:
A memory-mapped file turned out to be a good choice: scanning through a 1.6 GB file completed within 11 s.
Thanks.
Tim Bray (and his readers) explored this in depth in his Wide Finder Project and Wide Finder 2. Benchmark results showed that multithreaded implementations can outperform a single-threaded solution on a massive Sun multicore server, but on typical PC hardware, multithreading won't gain you much, if anything.
Although you can use memory mapping, you don't have to. If you read the file sequentially in small chunks, say 1 MB each, the file will never be present in memory all at once.
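For example, a chunked scan might look like the sketch below (function and variable names are my own). One detail to get right: a match can straddle a chunk boundary, so the last pattern-size-minus-one bytes of each chunk are carried over and searched again with the next one:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Sketch: count occurrences of a non-empty pattern, reading 1 MB at a time.
// The last pattern.size() - 1 bytes of each chunk are kept so a match that
// straddles a chunk boundary is not missed; since the carried tail is
// shorter than the pattern, nothing gets counted twice either.
std::uint64_t count_pattern(const std::string& path,
                            const std::vector<char>& pattern) {
    std::ifstream in(path, std::ios::binary);
    const std::size_t chunk_size = 1 << 20;   // 1 MB per read
    const std::size_t overlap = pattern.size() - 1;
    std::vector<char> buf(overlap + chunk_size);
    std::size_t carried = 0;                  // bytes kept from the last chunk
    std::uint64_t count = 0;

    while (in.read(buf.data() + carried, chunk_size) || in.gcount() > 0) {
        const std::size_t len = carried + static_cast<std::size_t>(in.gcount());
        const char* begin = buf.data();
        const char* end = begin + len;
        for (const char* p = begin;
             (p = std::search(p, end, pattern.begin(), pattern.end())) != end;
             ++p)
            ++count;
        // Carry the tail over; memmove because the ranges may overlap.
        carried = std::min(overlap, len);
        std::memmove(buf.data(), end - carried, carried);
    }
    return count;
}
```

With this layout the memory footprint stays at roughly one chunk no matter how large the file is.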
If your search code is actually slower than your hard disk, you can still hand chunks off to worker threads if you like.
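One way to do that, sketched with the same placeholder names: keep the reading sequential and fan the searching out with std::async. The boundary overlap from the previous sketch is omitted here for brevity, and a real implementation would cap the number of in-flight tasks (or use a thread pool):

```cpp
#include <algorithm>
#include <cstdint>
#include <fstream>
#include <future>
#include <iostream>
#include <string>
#include <vector>

// Worker task: count (possibly overlapping) matches in one chunk.
std::uint64_t count_in_chunk(std::vector<char> chunk, std::string pattern) {
    std::uint64_t n = 0;
    for (auto it = chunk.cbegin();
         (it = std::search(it, chunk.cend(),
                           pattern.cbegin(), pattern.cend())) != chunk.cend();
         ++it)
        ++n;
    return n;
}

int main() {
    const std::string pattern = "\x12\x34\x56\x78";   // example pattern
    std::ifstream in("huge.bin", std::ios::binary);   // placeholder name
    const std::size_t chunk_size = 16 << 20;          // 16 MB per task

    std::vector<std::future<std::uint64_t>> results;
    std::vector<char> buf(chunk_size);
    while (in.read(buf.data(), chunk_size) || in.gcount() > 0) {
        buf.resize(static_cast<std::size_t>(in.gcount()));
        // The chunk is moved into the task: reading stays sequential,
        // searching runs in parallel.
        results.push_back(std::async(std::launch::async, count_in_chunk,
                                     std::move(buf), pattern));
        buf.assign(chunk_size, 0);                    // fresh buffer
    }

    std::uint64_t total = 0;
    for (auto& f : results) total += f.get();
    std::cout << "occurrences: " << total << '\n';
}
```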