I have many apps running in containers on Mesos, managed via Marathon. When deploying each app through Marathon I set a CPU allocation, such as 1 or 0.5.
However, the CPU allocation in Marathon does not mean one CPU or half a CPU. It is only a time-sharing ratio, and each container can access all the CPUs on its host.
Now I want to measure the CPU efficiency of each container on the Mesos slaves, so that I can reduce or increase the CPU allocation for each app in Marathon and make resource utilisation more efficient.
I could use https://github.com/bobrik/collectd-mesos-tasks, but the problem is that its CPU utilisation metrics do not relate to the CPU allocation in Marathon.
In the Mesos web UI you can see how much CPU is used by your executor. Here is the code that collects statistics from the /monitor/statistics endpoint and calculates CPU usage.
You are interested in cpus_total_usage, so the following method should work for you.
Let's assume a and b are snapshots of the statistics taken at two points in time. To calculate cpus_total_usage, we need to calculate the time the executor spent in system and user space and divide it by the time elapsed between a and b.
cpus_total_usage = (
    (b.cpus_system_time_secs - a.cpus_system_time_secs) +
    (b.cpus_user_time_secs - a.cpus_user_time_secs)
) / (b.timestamp - a.timestamp)
cpu_percent = cpus_total_usage / cpu_limit * 100
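The calculation above can be sketched in Python as follows. This is only a minimal illustration, not the code the Mesos web UI uses: it assumes each snapshot is a dict holding the statistics of one executor, and that the limit is available under the key cpus_limit (an assumption about the field name; substitute whatever your statistics output provides). The function name cpu_usage and the sample values are made up for the example.

```python
def cpu_usage(a, b):
    """Return (cpus_total_usage, cpu_percent) between snapshots a and b.

    a and b are dicts for the same executor, with b taken after a.
    The key "cpus_limit" is an assumed name for the CPU limit.
    """
    elapsed = b["timestamp"] - a["timestamp"]
    if elapsed <= 0:
        raise ValueError("snapshot b must be taken after snapshot a")

    # CPU seconds consumed in system and user space between the snapshots.
    busy = ((b["cpus_system_time_secs"] - a["cpus_system_time_secs"]) +
            (b["cpus_user_time_secs"] - a["cpus_user_time_secs"]))

    cpus_total_usage = busy / elapsed          # average number of CPUs in use
    cpu_percent = cpus_total_usage / b["cpus_limit"] * 100
    return cpus_total_usage, cpu_percent


# Hypothetical snapshots taken 10 seconds apart for a task allocated 0.5 CPU.
a = {"timestamp": 100.0, "cpus_system_time_secs": 2.0,
     "cpus_user_time_secs": 8.0, "cpus_limit": 0.5}
b = {"timestamp": 110.0, "cpus_system_time_secs": 3.0,
     "cpus_user_time_secs": 10.0, "cpus_limit": 0.5}

usage, percent = cpu_usage(a, b)
print(f"cpus_total_usage={usage:.2f} cpu_percent={percent:.1f}")
```

Since the allocation is a time-sharing ratio and a container can use every CPU on its host, cpu_percent may well exceed 100 when the host has idle capacity. A value that stays far below 100 suggests the allocation could be reduced.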
Depending on how much work you want to invest yourself, you can use the Marathon Event Bus, and more generally the Marathon HTTP API (for example this endpoint), along with low-level tools like cAdvisor or cinf to do the maths yourself. If you don't want to code this yourself, I suggest you use Sysdig, Datadog or Prometheus to do the heavy lifting for you.