On the Kubernetes dashboard, there's a pod whose Memory Usage (bytes) is displayed as `904.38Mi`.
This pod holds a Java app that was run with `-Xms512m -Xmx1024m`, and in the Kubernetes deployment file -> `requests.memory = 512M`, `limits.memory = 1.5G`.
I've enabled GC logs and I'm seeing these in the pod logs:
[2020-04-29T15:41:32.051+0000] GC(1533) Phase 1: Mark live objects
[2020-04-29T15:41:32.133+0000] GC(1533) Phase 1: Mark live objects 81.782ms
[2020-04-29T15:41:32.133+0000] GC(1533) Phase 2: Compute new object addresses
[2020-04-29T15:41:32.145+0000] GC(1533) Phase 2: Compute new object addresses 11.235ms
[2020-04-29T15:41:32.145+0000] GC(1533) Phase 3: Adjust pointers
[2020-04-29T15:41:32.199+0000] GC(1533) Phase 3: Adjust pointers 54.559ms
[2020-04-29T15:41:32.199+0000] GC(1533) Phase 4: Move objects
[2020-04-29T15:41:32.222+0000] GC(1533) Phase 4: Move objects 22.406ms
[2020-04-29T15:41:32.222+0000] GC(1533) Pause Full (Allocation Failure) 510M->127M(680M) 171.359ms
[2020-04-29T15:41:32.222+0000] GC(1532) DefNew: 195639K->0K(195840K)
[2020-04-29T15:41:32.222+0000] GC(1532) Tenured: 422769K->130230K(500700K)
[2020-04-29T15:41:32.222+0000] GC(1532) Metaspace: 88938K->88938K(1130496K)
[2020-04-29T15:41:32.228+0000] GC(1532) Pause Young (Allocation Failure) 603M->127M(614M) 259.018ms
[2020-04-29T15:41:32.228+0000] GC(1532) User=0.22s Sys=0.05s Real=0.26s
How did Kubernetes arrive at `904.38Mi` usage? If I understood correctly, the current usages are only:
DefNew (young) - 0k
Tenured - 130230K
Metaspace - 88938K
Sum - 216168K
Running `ps` shows there are no other processes running in the pod aside from this Java app.
Can anyone shed some light on this?
(edited) When the pod first starts and is left running for a few minutes, the memory usage is displayed as only around 500 MB. Then, as requests come in, it bursts to 900 MB-1 GB, and once everything has been processed, the memory usage on the k8s dashboard doesn't go back down below 900 MB, even though the GC logs show the heap is being collected fine.
GC deals with only a subset of the memory used by the process. There are regions of JVM memory which are not subject to garbage collection.
Below are a few memory areas not included in heap / metaspace:
- thread stacks
- direct (off-heap) buffers
- memory-mapped files (JARs and native libraries)
- JIT-compiled code (the code cache)
The list above is not complete; these are just the largest memory consumers.
Here is a JVM memory hierarchy diagram with related config options.
In summary, the actual JVM memory appetite is always larger than the heap limit. How much larger depends on the nature of the application and can be established empirically.
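To make the largest of those concrete, here is a minimal sketch (class name and sizes are illustrative, not from the original answer): direct buffers grow the process footprint while the heap barely moves.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Allocates 128 MB of direct (off-heap) buffers. Heap usage stays small, but the
// process RSS grows by roughly that amount, because direct buffers live outside
// the GC-managed heap. (Direct memory is capped at the max heap size by default,
// so run this with a heap of at least 128 MB.)
public class OffHeapDemo {
    public static void main(String[] args) throws InterruptedException {
        List<ByteBuffer> buffers = new ArrayList<>();
        for (int i = 0; i < 128; i++) {
            buffers.add(ByteBuffer.allocateDirect(1024 * 1024)); // 1 MB each
        }
        long heapUsed = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        System.out.println("heap used (MB): " + heapUsed / (1024 * 1024));
        Thread.sleep(60_000); // stay alive so the RSS can be inspected from outside
    }
}
```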
UPDATE
Java Native Memory Tracking can be enabled in the JVM (start it with `-XX:NativeMemoryTracking=summary` and query it with `jcmd <pid> VM.native_memory summary`) to get detailed reports of memory usage across the different functional areas.
There are a lot of things going on here. Let's take them one at a time.
It seems you are using a single container per pod (though you could have many containers per pod). The `requests.memory` and `limits.memory` are specific to the container; Kubernetes computes `limits` and `requests` per pod as the sum of all the containers' limitations.
So think about it - you are saying that a `pod` shows `904.38Mi`, but you show `requests.memory` and `limits.memory`, which are per container. That is why I assume you have one container per pod. This is a general intro that does not answer your question - but we will get there.
Then comes the fact that a `pod` is started by `docker`, which is driven by the `kubelet`, and that reads `requests.memory` and `limits.memory`. To make this a bit simpler: what you set in `limits.memory` will be passed as `docker -m`. So, in your case, the total memory limit for the docker process is `1.5G`. Remember that is the limit for the entire process, not just the heap. A Java process is a lot more than the heap, which you specify with `-Xms512m -Xmx1024m`. So, to answer your question: that `904.38Mi` is what the entire process is currently taking, not just the heap. From the very short log files that you posted, your application is just fine.
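To see what that limit translates to from inside the container, here is a minimal sketch (assuming a cgroup v1 filesystem, which is what `docker -m` manipulates):

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Reads the memory limit that the container runtime set for this container
// (cgroup v1 path; the value is limits.memory expressed in bytes).
public class CgroupLimit {
    public static void main(String[] args) throws Exception {
        String limit = Files.readString(
                Path.of("/sys/fs/cgroup/memory/memory.limit_in_bytes")).trim();
        System.out.println("container memory limit (bytes): " + limit);
    }
}
```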
EDIT
I actually did not have the kubernetes dashboard in my environment to test this specifically, so I had to install it to really understand what is going on. I had a hunch about most of this, but to make sure, I did some tests.
First things first: what does that number in the dashboard mean? It took a while to find/understand, but it is the actual resident memory of the process, which actually is a very good thing.
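As a quick cross-check (a Linux-specific sketch of mine, not part of the original test), the same number can be read from inside the container:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Prints this JVM's own resident set size, taken from /proc (Linux only).
// It should be in the same ballpark as what the dashboard or `kubectl top pod` report.
public class Rss {
    public static void main(String[] args) throws IOException {
        for (String line : Files.readAllLines(Path.of("/proc/self/status"))) {
            if (line.startsWith("VmRSS:")) {
                System.out.println(line); // e.g. "VmRSS:  123456 kB"
            }
        }
    }
}
```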
Any sane `OS` knows that when someone requests memory from it, it rarely needs/utilizes it all; as such, the memory is handed out in a lazy fashion. This is easy to prove in `k8s`. Suppose I have a `jdk-13` JVM and start it in a pod whose spec sets `requests.memory=10Mi` and `limits.memory=100Mi`. Reading the answer from the beginning, you already know that this specific pod will be started with `docker -m 100m...` because `limits.memory=100Mi`.
That is easy to prove: just `sh` into the `pod` (for example `kubectl exec -it <pod-name> -- /bin/sh`) and find out what the `cgroup` says, e.g. via `cat /sys/fs/cgroup/memory/memory.limit_in_bytes`. Perfect! So the pod's memory limitation is `100 MB` max, but what is the current memory utilization, that is, what is the resident memory taken? `kubectl top pod` shows that the current memory utilization is only around `4MB`. You can "make sure" that this is indeed accurate if you `exec` into that pod and issue `top` inside it,
and notice that the `RES` memory is on par with the one reported via the dashboard or `kubectl top pod`.
And now let's do a test. Suppose I have this very simple code in that pod:
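Something along these lines (a minimal sketch that just follows the description below; class name and heap sizing are assumptions):

```java
import java.util.ArrayList;
import java.util.List;

// Allocate 1 MB every 3 seconds for about two minutes, then exit.
// Keeping the chunks referenced means the heap (and thus the RSS) really grows;
// assumes the heap is sized for ~40 MB of live data (e.g. -Xmx64m).
public class SlowAllocator {
    public static void main(String[] args) throws InterruptedException {
        List<byte[]> chunks = new ArrayList<>();
        for (int i = 0; i < 40; i++) {           // 40 iterations * 3 s ≈ 2 minutes
            chunks.add(new byte[1024 * 1024]);   // 1 MB each
            Thread.sleep(3000);
        }
    }
}
```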
I allocate `1MB` every 3 seconds for around 2 minutes. When I look at this process in the dashboard, I do see that at some point in time the memory grows. After the program ends, the dashboard reports the drop in memory back. Good! That means the memory is given back and the RSS memory drops, and you can watch exactly that happen in the dashboard graph.
Now let's change this code just a bit: let's add some GC in there and let's never finish the process (you know, like typical spring-boot applications do):
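A sketch of that change (again an assumption on my side, not the original code; the run flags in the comment are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Same allocation pattern, but afterwards the old chunks become garbage, the GC
// keeps running, and the process never finishes (like a long-running service).
// Assumed run flags: java -XX:+UseG1GC -Xmx64m -Xlog:gc SlowAllocatorForever
public class SlowAllocatorForever {
    public static void main(String[] args) throws InterruptedException {
        List<byte[]> chunks = new ArrayList<>();
        for (int i = 0; i < 40; i++) {
            chunks.add(new byte[1024 * 1024]);   // 1 MB every 3 seconds, as before
            Thread.sleep(3000);
        }
        chunks.clear();                          // everything above is garbage now
        while (true) {                           // never finish; keep some GC activity going
            byte[] garbage = new byte[1024 * 1024];
            Thread.sleep(3000);
        }
    }
}
```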
I run this with GC logging enabled (the assumed flags are in the comment above). On inspecting the logs (just like in your example), I do see that the heap is collected just fine. But when I look at the dashboard, the memory does not drop (unlike the previous example).
Once `G1GC` takes memory, it is not very eager to give it back to the OS. It can do that in rare cases (here is one example), or you can instruct it to do so (for instance via `-XX:MinHeapFreeRatio` / `-XX:MaxHeapFreeRatio`, or the periodic GC that JDK 12 added to G1). Both ways are rather painful; instead, there are `GC` algorithms that are smarter (and in general a lot better). My personal love goes to `Shenandoah`; let's see what it does. If I slightly change the code (so that I can prove my point better):
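A sketch of that tweak, with the assumed Shenandoah flags in the comment (sizes are illustrative and presume a pod limit large enough for this heap):

```java
import java.util.ArrayList;
import java.util.List;

// Build up garbage quickly, collect it, then idle forever and watch the RSS.
// Assumed run flags (Shenandoah is experimental in jdk-13):
//   java -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC \
//        -XX:ShenandoahUncommitDelay=1000 -Xlog:gc -Xmx256m ShenandoahDemo
public class ShenandoahDemo {
    public static void main(String[] args) throws InterruptedException {
        List<byte[]> chunks = new ArrayList<>();
        for (int i = 0; i < 150; i++) {
            chunks.add(new byte[1024 * 1024]);   // grab ~150 MB quickly
        }
        chunks.clear();                          // it all becomes garbage
        System.gc();                             // collect it...
        Thread.sleep(Long.MAX_VALUE);            // ...then idle; Shenandoah can uncommit the unused heap
    }
}
```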
Here is what you are going to see: the heap gets collected just as before, but this time Shenandoah also uncommits the memory it no longer needs and returns it to the OS, so the resident memory reported in the dashboard drops as well.
And you should, for a fact, care about this: memory that is returned to the OS is memory that other pods on the node can actually use.
P.S. I would also add the fact that other pods suffer too, because one pod decided to take as much memory as it could at a particular spike and never gave it back.