How does Bitmap allocation work on Oreo, and how t

Posted 2019-01-08 21:39

Question:

Background

For years, you could check how much heap memory is available on Android, and how much of it your app uses, with something like:

import android.content.Context
import android.text.format.Formatter

// Declared inside `object MemHelper` (hence @JvmStatic and the MemHelper.* calls below).
@JvmStatic
fun getHeapMemStats(context: Context): String {
    val runtime = Runtime.getRuntime()
    // Upper limit the Java heap is allowed to grow to.
    val maxMemInBytes = runtime.maxMemory()
    // Used heap = heap currently claimed by the VM minus the free part of it.
    val usedMemInBytes = runtime.totalMemory() - runtime.freeMemory()
    val usedMemInPercentage = usedMemInBytes * 100 / maxMemInBytes
    return "used: " + Formatter.formatShortFileSize(context, usedMemInBytes) + " / " +
            Formatter.formatShortFileSize(context, maxMemInBytes) + " (" + usedMemInPercentage + "%)"
}

This means that the more memory you use, especially by keeping Bitmaps in memory, the closer you get to the maximum heap size your application is allowed to use. When you reach the maximum, your app crashes with an OutOfMemoryError (OOM).

The problem

I've noticed that on Android O (8.1 in my case, but it's probably on 8.0 too), the above code isn't affected by Bitmap allocations.

Digging further, I've noticed in the Android profiler that the more memory you use (saving large bitmaps in my POC), the more native memory is used.

To test how it works, I've created a simple loop as such:

    val list = ArrayList<Bitmap>()
    Log.d("AppLog", "memStats:" + MemHelper.getHeapMemStats(this))
    useMoreMemoryButton.setOnClickListener {
        AsyncTask.execute {
            for (i in 0..1000) {
                // list.add(Bitmap.createBitmap(20000, 20000, Bitmap.Config.ARGB_8888))
                list.add(BitmapFactory.decodeResource(resources, R.drawable.huge_image))
                Log.d("AppLog", "heapMemStats:" + MemHelper.getHeapMemStats(this) + " nativeMemStats:" + MemHelper.getNativeMemStats(this))
            }
        }
    }

In some runs I used a single iteration, and in others I only created a bitmap and added it to the list instead of decoding one (the commented-out line). More about this later...

This is the result of running the above:

As you can see from the graph, the app reached a huge memory usage, well above the allowed max heap memory that was reported to me (which is 201MB).

What I've found

I've found many weird behaviors. Because of this, I've decided to report on them, here.

  1. First, I tried an alternative to the above code that reports native memory stats at runtime:

     @JvmStatic
     fun getNativeMemStats(context: Context): String {
         val nativeHeapSize = Debug.getNativeHeapSize()
         val nativeHeapFreeSize = Debug.getNativeHeapFreeSize()
         val usedMemInBytes = nativeHeapSize - nativeHeapFreeSize
         val usedMemInPercentage = usedMemInBytes * 100 / nativeHeapSize
         return "used: " + Formatter.formatShortFileSize(context, usedMemInBytes) + " / " +
                 Formatter.formatShortFileSize(context, nativeHeapSize) + " (" + usedMemInPercentage + "%)"
     }
    

But, as opposed to the heap memory check, the maximum native memory seems to change over time, so I can't know its true maximum and therefore can't decide, in real apps, what a memory cache size should be. Here's the output of the code above:

heapMemStats:used: 2.0 MB / 201 MB (0%) nativeMemStats:used: 3.6 MB / 6.3 MB (57%)
heapMemStats:used: 1.8 MB / 201 MB (0%) nativeMemStats:used: 290 MB / 310 MB (93%)
heapMemStats:used: 1.8 MB / 201 MB (0%) nativeMemStats:used: 553 MB / 579 MB (95%)
heapMemStats:used: 1.8 MB / 201 MB (0%) nativeMemStats:used: 821 MB / 847 MB (96%)

  2. When I reach the point where the device can't store any more bitmaps (it stopped at 1.1GB or ~850MB on my Nexus 5x), instead of an OutOfMemoryError I get... nothing! The app just closes, without even a dialog saying it has crashed.

  3. If I just create a new Bitmap instead of decoding one (the commented-out line above), I get a weird log saying I use tons of GBs and have tons of GBs of native memory available:

Also, unlike when I decode bitmaps, I do get a crash here (including a dialog), but it's not an OOM. Instead, it's... an NPE!

01-04 10:12:36.936 30598-31301/com.example.user.myapplication E/AndroidRuntime: FATAL EXCEPTION: AsyncTask #1
    Process: com.example.user.myapplication, PID: 30598
    java.lang.NullPointerException: Attempt to invoke virtual method 'void android.graphics.Bitmap.setHasAlpha(boolean)' on a null object reference
        at android.graphics.Bitmap.createBitmap(Bitmap.java:1046)
        at android.graphics.Bitmap.createBitmap(Bitmap.java:980)
        at android.graphics.Bitmap.createBitmap(Bitmap.java:930)
        at android.graphics.Bitmap.createBitmap(Bitmap.java:891)
        at com.example.user.myapplication.MainActivity$onCreate$1$1.run(MainActivity.kt:21)
        at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:245)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)
        at java.lang.Thread.run(Thread.java:764)

Looking at the profiler graph, it gets even weirder. The memory usage doesn't seem to rise much at all, and at the crash point, it just drops:

If you look at the graph, you will see a lot of GC icons (the trash can). I think it might be doing some memory compression.

  4. If I do a memory dump (using the profiler), I can no longer see a preview of Bitmaps, unlike on previous versions of Android.

The questions

This new behavior raises a lot of questions. It could reduce the number of OOM crashes, but it can also make them very hard to detect, and make memory leaks harder to find and fix. Maybe some of what I've seen is just bugs, but still...

  1. What exactly has changed in memory usage on Android O? And why?

  2. How do Bitmaps get handled?

  3. Is it possible to still preview Bitmaps inside memory dump reports?

  4. What's the correct way to get the maximum native memory the app is allowed to use, print it in logs, and use it to decide on a maximum (such as a cache size)?

  5. Is there any video/article about this topic? I'm not talking about memory optimizations that were added, but more about how Bitmaps are allocated now, how to handle OOM now, etc...

  6. I guess this new behavior might affect some caching libraries, right? That's because they might depend on the heap memory size instead.

  7. How could I create so many bitmaps of 20,000x20,000 each (meaning ~1.6 GB each), yet only a few from a real image of 7,680x7,680 (meaning ~236 MB)? Does it really do memory compression, as I've guessed?

  8. How could the native memory functions return such huge values in the bitmap creation case, yet more reasonable ones when I decoded bitmaps? What do they mean?

  9. What's with the weird profiler graph on the Bitmap creation case? It barely rises in memory usage, and yet it reached a point that it can't create any more of them, eventually (after a lot of items being inserted).

  10. What's with the weird exception behavior? Why did I get no exception or even an error log from the app when decoding bitmaps, but an NPE when creating them?

  11. Will the Play Store detect OOMs and still report them when the app crashes because of one? Will it detect them in all cases? Can Crashlytics detect it? Is there a way to be informed of such a thing, whether by users or during development at the office?

Answer 1:

Looks like your app was killed by the Linux OOM killer. Game developers and others who actively use native memory see that happen all the time.

Enabling kernel overcommit together with lifting heap-based restrictions on Bitmap allocation may result in the picture you see. You can read a bit about overcommit here.

Personally I would love to see an OS API for learning about app deaths, but I won't be holding my breath.


  4. What's the correct way to get the maximum native memory the app is allowed to use, print it in logs, and use it to decide on a maximum (such as a cache size)?

Pick some arbitrary value (say, a quarter of the heap size) and stick with it. If you get a call to onTrimMemory (which is directly tied to the OOM killer and native memory pressure), try to reduce your consumption.
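
As a rough illustration of the "quarter of heap size" rule of thumb, the budget can be derived from Runtime.maxMemory(). This is only a sketch: the function name cacheBudgetBytes and its divisor parameter are made up for this example, and the right fraction depends on the app.

```kotlin
/**
 * Hypothetical helper: a fixed cache budget derived from the Java heap limit.
 * The default divisor of 4 mirrors the "quarter of heap size" suggestion above.
 */
fun cacheBudgetBytes(
    maxHeapBytes: Long = Runtime.getRuntime().maxMemory(),
    divisor: Int = 4
): Long {
    require(divisor > 0) { "divisor must be positive" }
    return maxHeapBytes / divisor
}
```

With the 201 MB heap limit reported in the question, this gives a budget of roughly 50 MB. The result could serve as the maximum size of whatever in-memory cache the app uses, to be reduced further when onTrimMemory is called.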

  6. I guess this new behavior might affect some caching libraries, right? That's because they might depend on the heap memory size instead.

It does not matter: the Android heap size is always smaller than total physical memory. Any caching library that used heap size as a guideline should continue to work either way.

  7. How could it be that I could create so many bitmaps, each of size 20,000x20,000

Magic.

I assume that the current version of Android Oreo allows memory overcommit: untouched memory isn't actually requested from hardware, so you can have as much of it as the OS addressable memory limit allows (a bit less than 2 gigabytes on x86, several terabytes on x64). All virtual memory consists of pages (usually 4 KB each). When you first use a page, it is paged in. If the kernel does not have enough physical memory to map a page for your process, the app receives a signal that kills it. In practice the app will be killed by the Linux OOM killer well before that happens.
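
The sizes quoted in the question can be checked with one line of arithmetic, assuming Bitmap.Config.ARGB_8888 stores 4 bytes per pixel. The helper name below is invented for this example. Note that under overcommit this is the size of the requested allocation, which may be far more than the physical memory actually in use.

```kotlin
/** Bytes needed for the pixel buffer of a width x height bitmap at 4 bytes per pixel (ARGB_8888). */
fun argb8888Bytes(width: Int, height: Int): Long =
    width.toLong() * height * 4L // widen to Long first so large bitmaps do not overflow Int
```

argb8888Bytes(20_000, 20_000) is 1,600,000,000 bytes (about 1.6 GB) and argb8888Bytes(7_680, 7_680) is 235,929,600 bytes (about 236 MB), matching the figures in the question.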

  8. How could the native memory functions return such huge values in the bitmap creation case, yet more reasonable ones when I decoded bitmaps? What do they mean?

  9. What's with the weird profiler graph on the Bitmap creation case? It barely rises in memory usage, and yet it reached a point that it can't create any more of them, eventually (after a lot of items being inserted).

The profiler graph shows heap memory usage. If the bitmaps do not count towards heap, that graph naturally won't show them.

Native memory functions appear to work as (originally) intended: they correctly track virtual allocations, but do not know how much physical memory the kernel has reserved for each virtual allocation (that is opaque to user space).

Also, unlike when I decode bitmaps, I do get a crash here (including a dialog), but it's not an OOM. Instead, it's... an NPE!

You haven't used any of those pages, so they are not mapped to physical memory, hence the OOM killer does not kill you (yet). The allocation might have failed because you have run out of virtual memory, which is more harmless than running out of physical memory, or because you hit some other kind of memory limit (such as cgroups-based ones), which is even more harmless.

  11. ...Can Crashlytics detect it? Is there a way to be informed of such a thing, whether by users or during development at the office?

The OOM killer destroys your app with SIGKILL (the same as when your process is terminated after going into the background). Your process cannot react to it. It is theoretically possible to observe process death from a child process, but the exact reason may be hard to learn. See "Who 'Killed' my process and why?". A well-written library may be able to periodically check memory usage and make an educated guess. An extremely well-written library may be able to detect memory allocations by hooking into the native malloc function (for example, by hot-patching the application import table or something like that).
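
One way such a periodic check might look is to read the process's resident set size (VmRSS) from /proc/self/status, which Linux exposes for every process. This is a sketch rather than anything a particular library is known to do: the function names are invented here, and it assumes the file is readable from the app process.

```kotlin
import java.io.File

/** Extracts the VmRSS value (in kB) from the text of /proc/<pid>/status, or null if absent. */
fun parseVmRssKb(statusText: String): Long? =
    statusText.lineSequence()
        .firstOrNull { it.startsWith("VmRSS:") }
        ?.removePrefix("VmRSS:")
        ?.trim()
        ?.removeSuffix("kB")
        ?.trim()
        ?.toLongOrNull()

/** Reads this process's resident set size in kB; returns null where /proc is unavailable. */
fun currentRssKb(): Long? =
    try {
        parseVmRssKb(File("/proc/self/status").readText())
    } catch (e: Exception) {
        null // e.g. not running on Linux, or the file could not be read
    }
```

Logging currentRssKb() at intervals, and persisting the last value, would let the next launch guess whether the previous run ended under heavy physical memory use. That remains a guess, since SIGKILL itself leaves no trace in the process.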


To better demonstrate how virtual memory management works, imagine allocating 1000 Bitmaps of 1 GB each, then changing a single pixel in each of them. The OS does not initially allocate physical memory for those Bitmaps, so they take around 0 bytes of physical memory in total. After you touch a single four-byte RGBA pixel of a Bitmap, the kernel allocates a single page for storing that pixel.

The OS does not know anything about Java objects and Bitmaps. It simply views all process memory as a continuous list of pages.

The commonly used memory page size is 4 KB. After touching 1000 pixels, one in each 1 GB Bitmap, you will still use less than 4 MB of real memory.
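
The same estimate as a small sketch, assuming 4096-byte pages and that each touched pixel lands in a different page (true here, since every pixel belongs to a different Bitmap). The function name is hypothetical, and the real page size is device-dependent.

```kotlin
/** Physical memory paged in for pixel data after touching [touchedPages] distinct pages. */
fun residentBytes(touchedPages: Int, pageSize: Long = 4096L): Long =
    touchedPages * pageSize
```

residentBytes(1000) is 4,096,000 bytes, which is below 4 MB when a megabyte is counted as 1024 * 1024 bytes, even though the 1000 Bitmaps reserve about 1000 GB of virtual address space between them.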