How can I confirm that a host is NUMA-aware? The Oracle doc says that NUMA-awareness starts at kernel 2.6.19, but the NUMA man page says that it was introduced with 2.6.14. I'd like to be sure that a Java process started with -XX:+UseNUMA is actually taking advantage of something.
Checking for the numa_maps, I see that I have them:
# find /proc -name numa_maps
/proc/1/task/1/numa_maps
/proc/1/numa_maps
/proc/2/task/2/numa_maps
/proc/2/numa_maps
/proc/3/task/3/numa_maps
Though my kernel is behind what Oracle states:
# uname -sr
Linux 2.6.18-92.el5
I'm currently using 64-bit jdk1.6.0_29 on RHEL5.1.
The presence of those /proc files indicates that your Linux kernel is NUMA-aware. Don't worry too much about comparing version numbers: vendors, particularly of Oracle and RHEL kernels, backport many features without bringing the version string "up to date".
Other ways of testing the same thing:
$ grep NUMA=y /boot/config-`uname -r`
CONFIG_NUMA=y
CONFIG_K8_NUMA=y
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_ACPI_NUMA=y
$ numactl --hardware
available: 2 nodes (0-1)
node 0 size: 18156 MB
node 0 free: 9053 MB
node 1 size: 18180 MB
node 1 free: 6853 MB
node distances:
node 0 1
0: 10 20
1: 20 10
The Oracle doc also states:
Note: There was a known bug in the Linux Kernel that may cause the JVM to crash when being run with -XX:UseNUMA. The bug was fixed in 2012, so this should not affect the latest versions of the Linux Kernel. To see if your Kernel has this bug, you can run the native reproducer.
Which I have reproduced here to demonstrate its simplicity:
http://docs.oracle.com/javase/7/docs/technotes/guides/vm/reproducer.c
To build the reproducer, you may need to install the numactl or numactl-devel packages depending on your distribution. See man numa_maps for details.
#include <numaif.h>
#include <numa.h>
#include <stddef.h>
#include <sys/mman.h>
#include <stdint.h>
#include <unistd.h>  /* getpagesize() */

int main(void) {
    if (numa_all_nodes_ptr == (void*)0) {
        return -1;
    }
    size_t pagesize = getpagesize();
    void* mapped_memory = mmap(NULL, 3 * pagesize, PROT_READ|PROT_WRITE,
                               MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    if (mapped_memory == MAP_FAILED) {
        return -2;
    }
    void* page0 = mapped_memory;
    void* page1 = (void*)((uintptr_t)page0 + pagesize);
    void* page2 = (void*)((uintptr_t)page1 + pagesize);
    // Set up the last page as interleaved.
    mbind(page2, pagesize, MPOL_INTERLEAVE, numa_all_nodes_ptr->maskp,
          numa_all_nodes_ptr->size, 0);
    // Set up the last two pages as interleaved.
    mbind(page1, 2 * pagesize, MPOL_INTERLEAVE,
          numa_all_nodes_ptr->maskp, numa_all_nodes_ptr->size, 0);
    *((char*)page2) = 2;
    *((char*)page1) = 1;
    *((char*)page0) = 0; // Crash here, when mbind_merge was broken.
    return 0;
}
So, I took the ambiguity to mean that 2.6.19 was the first safe version.