Buffered RandomAccessFile java

RandomAccessFile is quite slow for random access to a file. You often read about implementing a buffered layer over it, but code doing this isn't possible to find online.

So my question is: does anyone know of an open-source implementation of such a class and could share a pointer to it, or share your own implementation?

It would be nice if this question turned into a collection of useful links and code for this problem, which I'm sure is shared by many and was never properly addressed by Sun.

Please, no reference to MemoryMapping, as files can be way bigger than Integer.MAX_VALUE.

Tags: java file-io io buffering random-access

4 answers

叛逆

Reply #2 · 2019-01-13 08:57

RandomAccessFile is quite slow for random access to a file. You often read about implementing a buffered layer over it, but code doing this isn't possible to find online.

Well, it is possible to find online.
For one, the JAI source code for jpeg2000 includes an implementation, and there is an even less encumbered implementation at: http://www.unidata.ucar.edu/software/netcdf-java/

javadocs:

http://www.unidata.ucar.edu/software/thredds/v4.3/netcdf-java/v4.0/javadoc/ucar/unidata/io/RandomAccessFile.html


女痞

Reply #3 · 2019-01-13 09:01

If you're running on a 64-bit machine, then memory-mapped files are your best approach. Simply map the entire file into an array of equal-sized buffers, then pick a buffer for each record as needed (i.e., edalorzo's answer, except that you want overlapping buffers so that no record spans a buffer boundary).

If you're running on a 32-bit JVM, then you're stuck with RandomAccessFile. However, you can use it to read a byte[] that contains your entire record, then use a ByteBuffer to retrieve individual values from that array. At worst you should need to make two file accesses: one to retrieve the position/size of the record, and one to retrieve the record itself.
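
As a rough illustration of the read-a-whole-record idea, here is a minimal sketch. The fixed record layout (an int id followed by a long timestamp) and the names RecordReader and readRecord are assumptions made up for this example; with fixed-size records the position is computed directly, so only one file access is needed.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

public class RecordReader {
    // Hypothetical record layout, assumed for illustration: int id + long timestamp.
    static final int RECORD_SIZE = Integer.BYTES + Long.BYTES;

    /** Fetches one whole record with a single seek + readFully, then decodes it in memory. */
    static long[] readRecord(RandomAccessFile raf, long recordIndex) throws IOException {
        byte[] record = new byte[RECORD_SIZE];
        raf.seek(recordIndex * RECORD_SIZE);
        raf.readFully(record);

        // ByteBuffer is big-endian by default, which matches writeInt/writeLong.
        ByteBuffer buf = ByteBuffer.wrap(record);
        int id = buf.getInt();
        long timestamp = buf.getLong();
        return new long[] { id, timestamp };
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("records", ".bin");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            // Write three sample records so there is something to read back.
            for (int i = 0; i < 3; i++) {
                raf.writeInt(i + 100);
                raf.writeLong(1000L * i);
            }
            long[] r = readRecord(raf, 2);
            System.out.println(r[0] + " " + r[1]); // prints "102 2000"
        }
    }
}
```

If many records are read in a loop, reusing one byte[] instead of allocating a new one per call would reduce the garbage-collector pressure.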

However, be aware that you can start stressing the garbage collector if you create lots of byte[]s, and you'll remain IO-bound if you bounce all over the file.


Rolldiameter

Reply #4 · 2019-01-13 09:21

You can make a BufferedInputStream from a RandomAccessFile with code like this:

 RandomAccessFile raf = ...
 FileInputStream fis = new FileInputStream(raf.getFD());
 BufferedInputStream bis = new BufferedInputStream(fis);

Some things to note:

Closing the FileInputStream will close the RandomAccessFile, and vice versa.
The RandomAccessFile and FileInputStream point to the same position, so reading from the FileInputStream will advance the file pointer for the RandomAccessFile, and vice versa.

You would probably want to use it something like this:

RandomAccessFile raf = ...
FileInputStream fis = new FileInputStream(raf.getFD());
BufferedInputStream bis = new BufferedInputStream(fis);

//do some reads with buffer
bis.read(...);
bis.read(...);

//seek to a different section of the file, so discard the previous buffer
raf.seek(...);
bis = new BufferedInputStream(fis);
bis.read(...);
bis.read(...);
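
For anyone who wants to try this, the pattern above could be filled in roughly as follows. This is only a sketch: the class name SeekableBufferedReads, the helper seekAndRebuffer, and the 256-byte test file are all made up for the example.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;

public class SeekableBufferedReads {
    /** Moves the shared file pointer, then returns a fresh buffer so stale bytes are never served. */
    static BufferedInputStream seekAndRebuffer(RandomAccessFile raf, FileInputStream fis, long pos)
            throws IOException {
        raf.seek(pos);
        return new BufferedInputStream(fis);
    }

    /** Reads the bytes at offsets 0 and 1, then seeks to offset 200 and reads one more byte. */
    static int[] demo(File file) throws IOException {
        // Closing raf also closes the file descriptor shared with fis.
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            FileInputStream fis = new FileInputStream(raf.getFD());
            BufferedInputStream bis = new BufferedInputStream(fis);
            int first = bis.read();   // fills the buffer, returns the byte at offset 0
            int second = bis.read();  // served from the buffer, no extra file access

            bis = seekAndRebuffer(raf, fis, 200);
            int afterSeek = bis.read(); // byte at offset 200
            return new int[] { first, second, afterSeek };
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("seekable", ".bin");
        file.deleteOnExit();
        byte[] data = new byte[256];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i; // each byte stores its own offset
        }
        Files.write(file.toPath(), data);

        int[] r = demo(file);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "0 1 200"
    }
}
```

Note that the BufferedInputStream reads ahead, so after a buffered read the position reported by raf.getFilePointer() is wherever the buffer fill ended, not the position of the last byte you consumed. Always seek explicitly before switching back to the RandomAccessFile.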


迷人小祖宗

Reply #5 · 2019-01-13 09:22

Well, I do not see a reason not to use java.nio.MappedByteBuffer even if the files are bigger than Integer.MAX_VALUE.

Evidently you will not be allowed to define a single MappedByteBuffer for the whole file. But you could have several MappedByteBuffers accessing different regions of the file.

The position and size parameters of FileChannel.map are of type long, which means you can pass positions beyond Integer.MAX_VALUE; the only thing you have to take care of is that the size of each buffer is not bigger than Integer.MAX_VALUE.

Therefore, you could define several maps like this:

buffer[0] = fileChannel.map(FileChannel.MapMode.READ_WRITE,0,2147483647L);
buffer[1] = fileChannel.map(FileChannel.MapMode.READ_WRITE,2147483647L, Integer.MAX_VALUE);
buffer[2] = fileChannel.map(FileChannel.MapMode.READ_WRITE, 4294967294L, Integer.MAX_VALUE);
...

In summary, the size cannot be bigger than Integer.MAX_VALUE, but the start position can be anywhere in your file.
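
To make the idea concrete, here is a minimal sketch of a wrapper that maps a file in fixed-size regions and addresses it with a long offset. The class name ChunkedMappedFile and its get method are hypothetical, and the example maps read-only with a tiny chunk size purely so it can run on a small temporary file; for a real large file you would pass a chunk size close to Integer.MAX_VALUE.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class ChunkedMappedFile {
    private final MappedByteBuffer[] chunks;
    private final long chunkSize;

    /** Maps the whole channel read-only as consecutive regions of at most chunkSize bytes. */
    ChunkedMappedFile(FileChannel channel, int chunkSize) throws IOException {
        this.chunkSize = chunkSize;
        long fileSize = channel.size();
        int count = (int) ((fileSize + chunkSize - 1) / chunkSize); // round up
        chunks = new MappedByteBuffer[count];
        for (int i = 0; i < count; i++) {
            long start = (long) i * chunkSize;
            long length = Math.min(chunkSize, fileSize - start); // last region may be shorter
            chunks[i] = channel.map(FileChannel.MapMode.READ_ONLY, start, length);
        }
    }

    /** Reads the byte at an absolute offset, which may exceed Integer.MAX_VALUE. */
    byte get(long position) {
        int chunk = (int) (position / chunkSize);
        int offset = (int) (position % chunkSize);
        return chunks[chunk].get(offset); // absolute get, does not move the buffer position
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("chunked", ".bin");
        file.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            for (int i = 0; i < 10; i++) {
                raf.writeByte(i); // each byte stores its own offset
            }
            // Chunk size 4 gives regions of 4, 4 and 2 bytes.
            ChunkedMappedFile mapped = new ChunkedMappedFile(raf.getChannel(), 4);
            System.out.println(mapped.get(5) + " " + mapped.get(9)); // prints "5 9"
        }
    }
}
```

This sketch only reads single bytes. A multi-byte value that straddles two regions would need either overlapping regions or a byte-by-byte read across the boundary.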

In the book Java NIO, the author Ron Hitchens states:

Accessing a file through the memory-mapping mechanism can be far more efficient than reading or writing data by conventional means, even when using channels. No explicit system calls need to be made, which can be time-consuming. More importantly, the virtual memory system of the operating system automatically caches memory pages. These pages will be cached using system memory and will not consume space from the JVM's memory heap.

Once a memory page has been made valid (brought in from disk), it can be accessed again at full hardware speed without the need to make another system call to get the data. Large, structured files that contain indexes or other sections that are referenced or updated frequently can benefit tremendously from memory mapping. When combined with file locking to protect critical sections and control transactional atomicity, you begin to see how memory mapped buffers can be put to good use.

I really doubt that you will find a third-party API doing something better than that. Perhaps you may find an API written on top of this architecture to simplify the work.

Don't you think that this approach ought to work for you?

