Lowering Linux kernel timer frequency

Posted 2019-03-09 11:28

Question:

When I run a virtual machine with Gentoo as the guest, I see considerable overhead coming from the tick_periodic function. (This is the function that runs on every timer interrupt.) It updates the global jiffies counter under write_seqlock, and that is where the overhead comes from.

Here's a grep of HZ and relevant stuff in my kernel config file.

sharan013@sitmac4:~$ cat /boot/config | egrep 'HZ|TIME'

# CONFIG_RCU_FAST_NO_HZ is not set
CONFIG_NO_HZ=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
# CONFIG_MACHZ_WDT is not set
CONFIG_TIMERFD=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_X86_CYCLONE_TIMER=y
CONFIG_HPET_TIMER=y

The configuration clearly sets HZ to 1000, but when I call sysconf(_SC_CLK_TCK) I get 100 as my timer frequency. So what is my system's actual timer frequency?

What I want to do is bring the frequency down to 100, or even lower if possible. Although this might affect interactivity, the precision of poll/select, and the scheduler's time slice, I am ready to sacrifice these things for fewer timer interrupts, as that will speed up the VM.

When I tried to find out what has to be done, I found conflicting advice. One source says to change the value in the kernel configuration file. Another says that adding divider=10 to the boot parameters does the job. Yet another says that neither is needed, because CONFIG_HIGH_RES_TIMERS gives you low-latency timers without increasing the timer frequency, and that the same is possible with a tickless system (CONFIG_NO_HZ).

I am extremely confused about which approach is right.

All I want is to bring the timer interrupt rate down as low as possible.

What is the right way to do this?

Answer 1:

Don't worry! Your confusion is entirely expected. Linux timer interrupts are very confusing and have had a long and quite exciting history.

CLK_TCK

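Linux has no sysconf system call, so glibc answers the query itself, and on most architectures the answer is simply the constant 100. That number is the unit in which the kernel reports times to userspace (USER_HZ), not the rate at which the timer interrupt fires, so it tells you nothing about CONFIG_HZ. Sorry.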

HZ <-- what you probably want

When configuring your kernel you can choose a timer frequency of 100Hz, 250Hz, 300Hz or 1000Hz. All of these are supported, and the value your kernel was built with (1000Hz in your case) is not always the best choice.

People will generally choose a high value when they value latency (a desktop or a webserver) and a low value when they value throughput (HPC).

CONFIG_HIGH_RES_TIMERS

This has nothing to do with how often the periodic timer interrupt fires. It's just a mechanism that allows you to have higher-resolution timers, which basically means that timeouts on calls like select can be more accurate than 1/HZ seconds.

divider

This command line option comes from a patch provided by Red Hat. You can probably use it (if you're using Red Hat or CentOS), but I'd be careful. It has caused lots of bugs, and you should probably just recompile with a different HZ value.

CONFIG_NO_HZ

This really doesn't do much for you. It's for power saving, and it means that the ticks will stop (or at least become less frequent) when nothing is executing. This is probably already enabled on your kernel. It doesn't make any difference when at least one task is runnable.

Frederic Weisbecker actually has a patch pending which generalizes this to cases where only a single task is running, but it's a little way off yet.