Incremental allocation of memory mapped buffer and

Posted 2019-09-11 01:22

Question:

I have created the following demo to see the MMF behaviour (I want to use it as a very large array of long values).

import java.nio._, java.io._, java.nio.channels.FileChannel

object Index extends App {

    val formatter = java.text.NumberFormat.getIntegerInstance
    def format(l: Number) = formatter.format(l)

    val raf = new RandomAccessFile("""C:\Users\...\Temp\96837624\mmf""", "rw")
    raf.setLength(20)
    def newBuf(capacity: Int) = {
      val bytes = 8L * capacity
      println("new buf " + format(capacity) + " words = " + format(bytes) + " bytes")

      // java.io.IOException: "Map failed" at the following line
      raf.getChannel.map(FileChannel.MapMode.READ_WRITE, 0, bytes).asLongBuffer()
    }

    (1 to 100 * 1000 * 1000).foldLeft(newBuf(2): LongBuffer){ case(buf, i) =>
        if (Math.random < 0.000009) println(format(buf.get(buf.position()/2)))
        (if (buf.position == buf.capacity) {
            val p = buf.position
            val b = newBuf(buf.capacity * 2)
            b.position(p) ; b
        } else buf).put(i)

    }

    raf.close
}

It fails with the output

16,692,145
16,741,940
new buf 67,108,864
[error] (run-main-1) java.io.IOException: Map failed
java.io.IOException: Map failed
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:907)

I see that a 512 MB file is created, and the system seems to fail to expand it to 1 GB.

If, however, instead of an initial size of 2 long words, foldLeft(newBuf(2)), I start with 64M long words, newBuf(64*1024*1027), the runtime succeeds in creating a 1 GB file and fails when it tries to create a 2 GB file with

new buf 268 435 458 words = 2 147 483 664 bytes
java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
        at sun.nio.ch.FileChannelImpl.map(Unknown Source)

I ran it with a 64-bit JVM.
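
For reference, the arithmetic behind the second error can be checked with a short sketch. It relies only on the documented precondition that the size passed to FileChannel.map must not exceed Integer.MAX_VALUE; the object and method names (MapLimitSketch, bytesFor, fitsInOneMapping) are hypothetical and exist only for this illustration.

```scala
object MapLimitSketch {
  // FileChannel.map requires 0 <= size <= Integer.MAX_VALUE (in bytes).
  val maxMapBytes: Long = Int.MaxValue.toLong

  // Number of whole 8-byte long words that fit into a single mapping.
  val maxMapWords: Long = maxMapBytes / 8

  // Size in bytes of a buffer holding the given number of long words.
  def bytesFor(words: Long): Long = words * 8

  // True if a buffer of that many long words can be mapped in one call.
  def fitsInOneMapping(words: Long): Boolean = bytesFor(words) <= maxMapBytes
}
```

The request of 268,435,458 words from the output above needs 2,147,483,664 bytes, which is 17 bytes over the limit, so the call is rejected regardless of how much memory the machine has.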

I am also not sure how to close the buffer so that the file is released for later runs of the application in sbt, and how to be sure that the data will ultimately appear in the file. The mechanism looks utterly unreliable.

Answer 1:

OK, one day of experiments has demonstrated that a 32-bit JVM fails with IOException: Map failed at 1 GB no matter what. To circumvent the Size exceeds Integer.MAX_VALUE error when mapping on 64-bit machines, one should use multiple buffers of an affordable size, e.g. 100 MB each is fine. That is because buffers are addressed by an integer.
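
The idea of several small buffers can be shown in isolation with a minimal sketch. It assumes a deliberately tiny chunk size (2^10 long words) and a temporary file; ChunkedMapSketch, locate and roundTrip are hypothetical names chosen for the illustration. The high bits of a word address select the chunk and the low bits select the element inside it.

```scala
import java.io.RandomAccessFile
import java.nio.LongBuffer
import java.nio.channels.FileChannel
import java.nio.file.Files

object ChunkedMapSketch {
  // Hypothetical chunk size for the demo: 2^10 longs = 8 KiB per mapping.
  val chunkBits = 10
  val chunkWords = 1 << chunkBits
  val chunkBytes = chunkWords.toLong * 8

  // Split a global word address into (chunk number, index inside that chunk).
  def locate(address: Long): (Int, Int) =
    ((address >>> chunkBits).toInt, (address & (chunkWords - 1)).toInt)

  // Write 0 until n through per-chunk mappings, then read everything back
  // and return the sum of the values read. Assumes n >= 1.
  def roundTrip(n: Int): Long = {
    val path = Files.createTempFile("mmf-sketch", ".bin")
    path.toFile.deleteOnExit()
    val raf = new RandomAccessFile(path.toFile, "rw")
    try {
      val fc = raf.getChannel
      val chunkCount = locate(n - 1L)._1 + 1
      // Every map call starts at its own file offset, so each mapping
      // stays far below the Integer.MAX_VALUE byte limit.
      val chunks: Vector[LongBuffer] = Vector.tabulate(chunkCount) { c =>
        fc.map(FileChannel.MapMode.READ_WRITE, c * chunkBytes, chunkBytes).asLongBuffer()
      }
      (0 until n).foreach { i =>
        val (c, off) = locate(i.toLong)
        chunks(c).put(off, i.toLong) // absolute put, buffer position untouched
      }
      (0 until n).map { i =>
        val (c, off) = locate(i.toLong)
        chunks(c).get(off)
      }.sum
    } finally raf.close()
  }
}
```

Calling ChunkedMapSketch.roundTrip(3000) spreads the values over three mappings. Mapping a region past the current end of the file in READ_WRITE mode grows the file, so no explicit setLength is needed here.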

As regards this question, you can keep all such buffers open in memory simultaneously, i.e. there is no need to close one buffer (set it to null) before you allocate the next one, which effectively increments the file size, as the following demo demonstrates

import Utils._, java.nio._, java.io._, java.nio.channels.FileChannel

object MmfDemo extends App {

    val bufAddrWidth = 25 /*in bits*/ // Every element of the buff addresses a long
    val BUF_SIZE_WORDS = 1 << bufAddrWidth ; val BUF_SIZE_BYTES = BUF_SIZE_WORDS << 3
    val bufBitMask = BUF_SIZE_WORDS - 1
    var buffers = Vector[LongBuffer]()
    var capacity = 0 ; var pos = 0
    def select(pos: Int) = {
        val bufn = pos >> bufAddrWidth // higher bits of address denote the buffer number
        //println(s"accessing $pos = " + (pos - buf * wordsPerBuf) + " in " + buf)
        while (buffers.length <= bufn) expand
        pass(buffers(bufn)){_.position(pos & bufBitMask)}
    }
    def get(address: Int = pos) = {
        pos = address +1
        select(address).get
    }
    def put(value: Long): Unit = {
        //println("writing " + value + " to " + pos)
        select(pos).put(value) ; pos += 1
    }
    def expand = {
        val fromByte = buffers.length.toLong  * BUF_SIZE_BYTES
        println("adding " + buffers.length + "th buffer, total size expected " + format(fromByte + BUF_SIZE_BYTES) + " bytes")

        // 32bit JVM: java.io.IOException: "Map failed" at the following line if buf size requested is larger than 512 mb
        // 64bit JVM: IllegalArgumentException: Size exceeds Integer.MAX_VALUE
        buffers :+= fc.map(FileChannel.MapMode.READ_WRITE, fromByte, BUF_SIZE_BYTES).asLongBuffer()
        capacity += BUF_SIZE_WORDS
    }

    def rdAll(get: Int => Long): Unit = {
        var firstMismatch = -1
        val failures = (0 until parse(args(1))).foldLeft(0) { case(failures, i) =>
            val got = get(i)
            if (got != i && firstMismatch == -1) {firstMismatch = i; println("first mismatch at " +format(i) + ", value = " + format(got))}
            failures + ?(got != i, 1, 0)
        } ; println(format(failures) + " mismatches")
    }

    val raf = new RandomAccessFile("""C:\Temp\mmf""", "rw")
    val fc = raf.getChannel
    try {

        if (args.length < 1) {
            // command names below match the cases handled in the match block
            println("usage1: buf_gen <len in long words>")
            println("usage2: raf_gen <len in long words>")
            println("example: buf_gen 30m")
            println("usage3: rd_raf <size in words>")
            println("usage4: rd_buf <size in words>")
        } else {
            val t1 = System.currentTimeMillis
            args(0) match {
                case "buf_gen" => raf.setLength(0)
                    (0 until parse(args(1))) foreach {i => put(i.toLong)}
                case "raf_gen" => raf.setLength(0)
                    (0 until parse(args(1))) foreach {i =>raf.writeLong(i.toLong)}
                        //fc.force(true)
                case "rd_raf" => rdAll{i => raf.seek(i.toLong * 8) ; raf.readLong()}
                case "rd_buf" => rdAll(get)
                case u =>println("unknown command " + u)
            } ; println("finished in " + (System.currentTimeMillis - t1) + " ms")
        }
    } finally {
        raf.close ; fc.close

        buffers = null ; System.gc /*GC needs to close the buffer*/}

}

object Utils {
    val formatter = java.text.NumberFormat.getIntegerInstance
    def format(l: Number) = formatter.format(l)

    def ?[T](sel: Boolean, a: => T, b: => T) = if (sel) a else b
    def parse(s: String) = {
        val lc = s.toLowerCase()
        lc.filter(_.isDigit).toInt *
            ?(lc.contains("k"), 1000, 1) *
            ?(lc.contains("m"), 1000*1000, 1)
    }
    def eqa[T](a: T, b: T) = assert(a == b, s"$a != $b")
    def pass[T](a: T)(code: T => Unit) = {code(a) ; a}
}

at least in Windows. Using this program, I have managed to create an mmf file larger than my machine's memory (not to mention the JVM's -Xmx, which plays no role at all in these matters). Just slow down the file generation by selecting some text in the Windows console with the mouse (the program will pause until you release the selection), because otherwise Windows will evict all other performance-critical stuff to the page file and your PC will die thrashing.
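
That -Xmx plays no role is consistent with mapped buffers being direct (off-heap) buffers, which a small sketch can confirm. The same sketch shows MappedByteBuffer.force(), the standard call for pushing pending changes to the storage device. ForceSketch and writeAndFlush are hypothetical names, and the 8-byte temporary file is only an assumption made to keep the example short.

```scala
import java.io.RandomAccessFile
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.Files

object ForceSketch {
  // Writes one long through a mapping, flushes it, and reads it back with
  // ordinary file I/O. Returns (is the buffer direct?, value read back).
  def writeAndFlush(value: Long): (Boolean, Long) = {
    val path = Files.createTempFile("mmf-force", ".bin")
    path.toFile.deleteOnExit()
    val raf = new RandomAccessFile(path.toFile, "rw")
    try {
      val mapped: MappedByteBuffer =
        raf.getChannel.map(FileChannel.MapMode.READ_WRITE, 0, 8)
      mapped.putLong(0, value) // big-endian by default, same as readLong
      mapped.force()           // ask the OS to write the dirty pages out
      raf.seek(0)
      (mapped.isDirect(), raf.readLong())
    } finally raf.close()
  }
}
```

Closing the channel does not invalidate a mapping; according to the FileChannel and MappedByteBuffer documentation it remains valid until the buffer itself is garbage-collected, which is why the demo above drops its references and calls System.gc.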

BTW, the PC dies thrashing even though I write only to the end of the file and Windows could evict my unused gigabyte blocks. Also, I have noticed that the block I am writing is actually read.

The following output

adding 38th buffer, total size expected 12,480,000,000 bytes
adding 39th buffer, total size expected 12,800,000,000 bytes

is accompanied by the following system requests

5:24,java,"QueryStandardInformationFile",mmf,"SUCCESS","AllocationSize: 12 480 000 000, EndOfFile: 12 480 000 000, NumberOfLinks: 1, DeletePending: False, Directory: False"
5:24,java,"SetEndOfFileInformationFile",mmf,"SUCCESS","EndOfFile: 12 800 000 000"
5:24,java,"SetAllocationInformationFile",mmf,"SUCCESS","AllocationSize: 12 800 000 000"
5:24,java,"CreateFileMapping",mmf,"FILE LOCKED WITH WRITERS","SyncType: SyncTypeCreateSection, PageProtection: "
5:24,java,"QueryStandardInformationFile",mmf,"SUCCESS","AllocationSize: 12 800 000 000, EndOfFile: 12 800 000 000, NumberOfLinks: 1, DeletePending: False, Directory: False"
5:24,java,"CreateFileMapping",mmf,"SUCCESS","SyncType: SyncTypeOther"
5:24,java,"ReadFile",mmf,"SUCCESS","Offset: 12 480 000 000, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:24,java,"ReadFile",mmf,"SUCCESS","Offset: 12 480 032 768, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:24,java,"ReadFile",mmf,"SUCCESS","Offset: 12 480 065 536, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:24,java,"ReadFile",mmf,"SUCCESS","Offset: 12 480 098 304, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:24,java,"ReadFile",mmf,"SUCCESS","Offset: 12 480 131 072, Length: 20 480, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"

skipped 9000 reads

5:25,java,"ReadFile",mmf,"SUCCESS","Offset: 12 799 836 160, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:25,java,"ReadFile",mmf,"SUCCESS","Offset: 12 799 868 928, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:25,java,"ReadFile",mmf,"SUCCESS","Offset: 12 799 901 696, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:25,java,"ReadFile",mmf,"SUCCESS","Offset: 12 799 934 464, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"
5:25,java,"ReadFile",mmf,"SUCCESS","Offset: 12 799 967 232, Length: 32 768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal"

but that is another story.

It turns out that this answer is a duplicate of Peter Lawrey's, except that my question is dedicated to 'Map failed' and 'Integer range exceeded' when mapping large buffers, whereas the original question is concerned with OutOfMem in the JVM, which has nothing to do with I/O.