Why does host_statistics64() return inconsistent r

2020-08-09 09:21发布

问题:

Why does host_statistics64() in OS X 10.6.8 (I don't know if other versions have this problem) return counts for free, active, inactive, and wired memory that don't add up to the total amount of ram? And why is it missing an inconsistent number of pages?

The following output represents the number of pages not classified as free, active, inactive, or wired over ten seconds (sampled roughly once per second).

458
243
153
199
357
140
304
93
181
224

The code that produces the numbers above is:

#include <stdio.h>
#include <mach/mach.h>
#include <mach/vm_statistics.h>
#include <sys/types.h>
#include <sys/sysctl.h>
#include <unistd.h>
#include <string.h>

int main(int argc, char** argv) {
        struct vm_statistics64 stats;
        mach_port_t host    = mach_host_self();
        natural_t   count   = HOST_VM_INFO64_COUNT;
        natural_t   missing = 0;
        int         debug   = argc == 2 ? !strcmp(argv[1], "-v") : 0;
        kern_return_t ret;
        int           mib[2];
        long          ram;
        natural_t     pages;
        size_t        length;
        int           i;

        mib[0] = CTL_HW;
        mib[1] = HW_MEMSIZE;
        length = sizeof(long);
        sysctl(mib, 2, &ram, &length, NULL, 0);
        pages  = ram / getpagesize();

        for (i = 0; i < 10; i++) {
                if ((ret = host_statistics64(host, HOST_VM_INFO64, (host_info64_t)&stats, &count)) != KERN_SUCCESS) {
                        printf("oops\n");
                        return 1;
                }

                /* updated for 10.9 */
                missing = pages - (
                        stats.free_count     +
                        stats.active_count   +
                        stats.inactive_count +
                        stats.wire_count     +
                        stats.compressor_page_count
                );

                if (debug) {
                        printf(
                                "%11d pages (# of pages)\n"
                                "%11d free_count (# of pages free) \n"
                                "%11d active_count (# of pages active) \n"
                                "%11d inactive_count (# of pages inactive) \n"
                                "%11d wire_count (# of pages wired down) \n"
                                "%11lld zero_fill_count (# of zero fill pages) \n"
                                "%11lld reactivations (# of pages reactivated) \n"
                                "%11lld pageins (# of pageins) \n"
                                "%11lld pageouts (# of pageouts) \n"
                                "%11lld faults (# of faults) \n"
                                "%11lld cow_faults (# of copy-on-writes) \n"
                                "%11lld lookups (object cache lookups) \n"
                                "%11lld hits (object cache hits) \n"
                                "%11lld purges (# of pages purged) \n"
                                "%11d purgeable_count (# of pages purgeable) \n"
                                "%11d speculative_count (# of pages speculative (also counted in free_count)) \n"
                                "%11lld decompressions (# of pages decompressed) \n"
                                "%11lld compressions (# of pages compressed) \n"
                                "%11lld swapins (# of pages swapped in (via compression segments)) \n"
                                "%11lld swapouts (# of pages swapped out (via compression segments)) \n"
                                "%11d compressor_page_count (# of pages used by the compressed pager to hold all the compressed data) \n"
                                "%11d throttled_count (# of pages throttled) \n"
                                "%11d external_page_count (# of pages that are file-backed (non-swap)) \n"
                                "%11d internal_page_count (# of pages that are anonymous) \n"
                                "%11lld total_uncompressed_pages_in_compressor (# of pages (uncompressed) held within the compressor.) \n",
                                pages, stats.free_count, stats.active_count, stats.inactive_count,
                                stats.wire_count, stats.zero_fill_count, stats.reactivations,
                                stats.pageins, stats.pageouts, stats.faults, stats.cow_faults,
                                stats.lookups, stats.hits, stats.purges, stats.purgeable_count,
                                stats.speculative_count, stats.decompressions, stats.compressions,
                                stats.swapins, stats.swapouts, stats.compressor_page_count,
                                stats.throttled_count, stats.external_page_count,
                                stats.internal_page_count, stats.total_uncompressed_pages_in_compressor
                        );
                }

                printf("%i\n", missing);
                sleep(1);
        }

        return 0;
}

回答1:

TL;DR:

  • host_statistics64() get information from different sources which might cost time and could produce inconsistent results.
  • host_statistics64() gets some information by variables with names like vm_page_foo_count. But not all of these variables are taken into account, e.g. vm_page_stolen_count is not.
  • The well known /usr/bin/top adds stolen pages to the number of wired pages. This is an indicator that these pages should be taken into account when counting pages.

Notes

  • I'm working on a macOS 10.12 with Darwin Kernel Version 16.5.0 xnu-3789.51.2~3/RELEASE_X86_64 x86_64 but all behaviour is completly reproducable.
  • I'm going to link a lot a source code of the XNU Version I use on my machine. It can be found here: xnu-3789.51.2.
  • The program you have written is basically the same as /usr/bin/vm_stat which is just a wrapper for host_statistics64() (and host_statistics()). The corressponding source code can be found here: system_cmds-496/vm_stat.tproj/vm_stat.c.

How does host_statistics64() fit into XNU and how does it work?

As widley know the OS X kernel is called XNU (XNU IS NOT UNIX) and "is a hybrid kernel combining the Mach kernel developed at Carnegie Mellon University with components from FreeBSD and C++ API for writing drivers called IOKit." (https://github.com/opensource-apple/xnu/blob/10.12/README.md)

The virtual memory management (VM) is part of Mach therefore host_statistics64() is located here. Let's have a closer look at the its implementation which is contained in xnu-3789.51.2/osfmk/kern/host.c.

The function signature is

kern_return_t
host_statistics64(host_t host, host_flavor_t flavor, host_info64_t info, mach_msg_type_number_t * count);

The first relevant lines are

[...]
processor_t processor;
vm_statistics64_t stat;
vm_statistics64_data_t host_vm_stat;
mach_msg_type_number_t original_count;
unsigned int local_q_internal_count;
unsigned int local_q_external_count;
[...]
processor = processor_list;
stat = &PROCESSOR_DATA(processor, vm_stat);
host_vm_stat = *stat;

if (processor_count > 1) {
    simple_lock(&processor_list_lock);

    while ((processor = processor->processor_list) != NULL) {
        stat = &PROCESSOR_DATA(processor, vm_stat);

        host_vm_stat.zero_fill_count += stat->zero_fill_count;
        host_vm_stat.reactivations += stat->reactivations;
        host_vm_stat.pageins += stat->pageins;
        host_vm_stat.pageouts += stat->pageouts;
        host_vm_stat.faults += stat->faults;
        host_vm_stat.cow_faults += stat->cow_faults;
        host_vm_stat.lookups += stat->lookups;
        host_vm_stat.hits += stat->hits;
        host_vm_stat.compressions += stat->compressions;
        host_vm_stat.decompressions += stat->decompressions;
        host_vm_stat.swapins += stat->swapins;
        host_vm_stat.swapouts += stat->swapouts;
    }

    simple_unlock(&processor_list_lock);
}
[...]

We get host_vm_stat which is of type vm_statistics64_data_t. This is just a typedef struct vm_statistics64 as you can see in xnu-3789.51.2/osfmk/mach/vm_statistics.h. And we get processor information from the makro PROCESSOR_DATA() defined in xnu-3789.51.2/osfmk/kern/processor_data.h. We fill host_vm_stat while looping through all of our processors by simply adding up the relevant numbers.

As you can see we find some well known stats like zero_fill_count or compressions but not all covered by host_statistics64().

The next relevant lines are:

stat = (vm_statistics64_t)info;

stat->free_count = vm_page_free_count + vm_page_speculative_count;
stat->active_count = vm_page_active_count;
[...]
stat->inactive_count = vm_page_inactive_count;
stat->wire_count = vm_page_wire_count + vm_page_throttled_count + vm_lopage_free_count;
stat->zero_fill_count = host_vm_stat.zero_fill_count;
stat->reactivations = host_vm_stat.reactivations;
stat->pageins = host_vm_stat.pageins;
stat->pageouts = host_vm_stat.pageouts;
stat->faults = host_vm_stat.faults;
stat->cow_faults = host_vm_stat.cow_faults;
stat->lookups = host_vm_stat.lookups;
stat->hits = host_vm_stat.hits;

stat->purgeable_count = vm_page_purgeable_count;
stat->purges = vm_page_purged_count;

stat->speculative_count = vm_page_speculative_count;

We reuse stat and make it our output struct. We then fill free_count with the sum of two unsigned long called vm_page_free_count and vm_page_speculative_count. We collect the other remaining data in the same manner (by using variables named vm_page_foo_count) or by taking the stats from host_vm_stat which we filled up above.

1. Conclusion We collect data from different sources. Either from processor informations or from variables called vm_page_foo_count. This costs time and might end in some inconsitency matter the fact VM is a very fast and continous process.

Let's take a closer look at the already mentioned variables vm_page_foo_count. They are defined in xnu-3789.51.2/osfmk/vm/vm_page.h as follows:

extern
unsigned int    vm_page_free_count; /* How many pages are free? (sum of all colors) */
extern
unsigned int    vm_page_active_count;   /* How many pages are active? */
extern
unsigned int    vm_page_inactive_count; /* How many pages are inactive? */
#if CONFIG_SECLUDED_MEMORY
extern
unsigned int    vm_page_secluded_count; /* How many pages are secluded? */
extern
unsigned int    vm_page_secluded_count_free;
extern
unsigned int    vm_page_secluded_count_inuse;
#endif /* CONFIG_SECLUDED_MEMORY */
extern
unsigned int    vm_page_cleaned_count; /* How many pages are in the clean queue? */
extern
unsigned int    vm_page_throttled_count;/* How many inactives are throttled */
extern
unsigned int    vm_page_speculative_count;  /* How many speculative pages are unclaimed? */
extern unsigned int vm_page_pageable_internal_count;
extern unsigned int vm_page_pageable_external_count;
extern
unsigned int    vm_page_xpmapped_external_count;    /* How many pages are mapped executable? */
extern
unsigned int    vm_page_external_count; /* How many pages are file-backed? */
extern
unsigned int    vm_page_internal_count; /* How many pages are anonymous? */
extern
unsigned int    vm_page_wire_count;     /* How many pages are wired? */
extern
unsigned int    vm_page_wire_count_initial; /* How many pages wired at startup */
extern
unsigned int    vm_page_free_target;    /* How many do we want free? */
extern
unsigned int    vm_page_free_min;   /* When to wakeup pageout */
extern
unsigned int    vm_page_throttle_limit; /* When to throttle new page creation */
extern
uint32_t    vm_page_creation_throttle;  /* When to throttle new page creation */
extern
unsigned int    vm_page_inactive_target;/* How many do we want inactive? */
#if CONFIG_SECLUDED_MEMORY
extern
unsigned int    vm_page_secluded_target;/* How many do we want secluded? */
#endif /* CONFIG_SECLUDED_MEMORY */
extern
unsigned int    vm_page_anonymous_min;  /* When it's ok to pre-clean */
extern
unsigned int    vm_page_inactive_min;   /* When to wakeup pageout */
extern
unsigned int    vm_page_free_reserved;  /* How many pages reserved to do pageout */
extern
unsigned int    vm_page_throttle_count; /* Count of page allocations throttled */
extern
unsigned int    vm_page_gobble_count;
extern
unsigned int    vm_page_stolen_count;   /* Count of stolen pages not acccounted in zones */
[...]
extern
unsigned int    vm_page_purgeable_count;/* How many pages are purgeable now ? */
extern
unsigned int    vm_page_purgeable_wired_count;/* How many purgeable pages are wired now ? */
extern
uint64_t    vm_page_purged_count;   /* How many pages got purged so far ? */

That's a lot of statistics regarding we only get access to a very limited number using host_statistics64(). The most of these stats are updated in xnu-3789.51.2/osfmk/vm/vm_resident.c. For example this function releases pages to the list of free pages:

/*
*   vm_page_release:
*
*   Return a page to the free list.
*/

void
vm_page_release(
    vm_page_t   mem,
    boolean_t   page_queues_locked)
{
    [...]
    vm_page_free_count++;
    [...]
}

Very interesting is extern unsigned int vm_page_stolen_count; /* Count of stolen pages not acccounted in zones */. What are stolen pages? It seems like there are mechanisms to take a page out of some lists even though it wouldn't usually be paged out. One of these mechanisms is the age of a page in the speculative page list. xnu-3789.51.2/osfmk/vm/vm_page.h tells us

* VM_PAGE_MAX_SPECULATIVE_AGE_Q * VM_PAGE_SPECULATIVE_Q_AGE_MS
* defines the amount of time a speculative page is normally
* allowed to live in the 'protected' state (i.e. not available
* to be stolen if vm_pageout_scan is running and looking for
* pages)...  however, if the total number of speculative pages
* in the protected state exceeds our limit (defined in vm_pageout.c)
* and there are none available in VM_PAGE_SPECULATIVE_AGED_Q, then
* vm_pageout_scan is allowed to steal pages from the protected
* bucket even if they are underage.
*
* vm_pageout_scan is also allowed to pull pages from a protected
* bin if the bin has reached the "age of consent" we've set

It is indeed void vm_pageout_scan(void) that increments vm_page_stolen_count. You find the corresponding source code in xnu-3789.51.2/osfmk/vm/vm_pageout.c.

I think stolen pages are not taken into account while calculating VM stats a host_statistics64() does.

Evidence that I'm right

The best way to prove this would be to compile XNU with an customized version of host_statistics64() by hand. I had no opportunity do this but will try soon.

Fortunately we are not the only ones interested in correct VM statistics. Therefore we should have a look at the implementation of well know /usr/bin/top (not contained in XNU) which is completely available here: top-108 (I just picked the macOS 10.12.4 release).

Let's have a look at top-108/libtop.c where we find the following:

static int
libtop_tsamp_update_vm_stats(libtop_tsamp_t* tsamp) {
    kern_return_t kr;
    tsamp->p_vm_stat = tsamp->vm_stat;

    mach_msg_type_number_t count = sizeof(tsamp->vm_stat) / sizeof(natural_t);
    kr = host_statistics64(libtop_port, HOST_VM_INFO64, (host_info64_t)&tsamp->vm_stat, &count);
    if (kr != KERN_SUCCESS) {
        return kr;
    }

    if (tsamp->pages_stolen > 0) {
        tsamp->vm_stat.wire_count += tsamp->pages_stolen;
    }

    [...]

    return kr;
}

tsamp is of type libtop_tsamp_t which is a struct defined in top-108/libtop.h. It contains amongst other things vm_statistics64_data_t vm_stat and uint64_t pages_stolen.

As you can see, static int libtop_tsamp_update_vm_stats(libtop_tsamp_t* tsamp) gets tsamp->vm_stat filled by host_statistics64() as we know it. Afterwards it checks if tsamp->pages_stolen > 0 and adds it up to the wire_count field of tsamp->vm_stat.

2. Conclusion We won't get the number of these stolen pages if we just use host_statistics64() as in /usr/bin/vm_stat or your example code!

Why is host_statistics64() implemented as it is?

Honestly, I don't know. Paging is a complex process and therefore a real time observation a challenging task. We have to notice that there seems to be no bug in its implementation. I think that we wouldn't even get a 100% accurate number of pages if we could get access to vm_page_stolen_count. The implementation of /usr/bin/top doesn't count stolen pages if their number is not very big.

An additional interesting thing is a comment above the function static void update_pages_stolen(libtop_tsamp_t *tsamp) which is /* This is for <rdar://problem/6410098>. */. Open Radar is a bug reporting site for Apple software and usually classifies bugs in the format given in the comment. I was unable to find the related bug; maybe it was about missing pages.

I hope these information could help you a bit. If I manage to compile the latest (and customized) Version of XNU on my machine I will let you know. Maybe this brings interesting insights.



回答2:

Just noticed that if you add compressor_page_count into the mix you get much closer to the actual amount of RAM in the machine.

This is an observation, not an explanation, and links to where this was properly documented would be nice to have!