Linux Kernel: udelay() returns too early?

2020-07-10 10:21发布

I have a driver which requires microsecond delays. To create this delay, my driver is using the kernel's udelay function. Specifically, there is one call to udelay(90):

iowrite32(data, addr + DATA_OFFSET);
iowrite32(trig, addr + CONTROL_OFFSET);

udelay(30);

trig |= 1;
iowrite32(trig, addr + CONTROL_OFFSET);

udelay(90); // This is the problematic call

We had reliability issues with the device. After a lot of debugging, we traced the problem to the driver resuming before 90us has passed. (See "proof" below.)

I am running kernel version 2.6.38-11-generic SMP (Kubuntu 11.04, x86_64) on an Intel Pentium Dual Core (E5700).

As far as I know, the documentation states that udelay will delay execution for at least the specified delay, and is uninterruptible. Is there a bug is this version of the kernel, or did I misunderstand something about the use of udelay?


To convince ourselves that the problem was caused by udelay returning too early, we fed a 100kHz clock to one of the I/O ports and implemented our own delay as follows:

// Wait until n number of falling edges
// are observed
void clk100_delay(void *addr, u32 n) {
    int i;

    for (i = 0; i < n; i++) {
        u32 prev_clk = ioread32(addr);
        while (1) {
            u32 clk = ioread32(addr);
            if (prev_clk && !clk) {
                break;
            } else {
                prev_clk = clk;
            }
        }
    }
}

...and the driver now works flawlessly.


As a final note, I found a discussion indicating that frequency scaling could be causing the *delay() family of functions to misbehave, but this was on a ARM mailing list - I assuming such problems would be non-existent on a Linux x86 based PC.

2条回答
老娘就宠你
2楼-- · 2020-07-10 11:04

The E5700 has X86_FEATURE_CONSTANT_TSC but not X86_FEATURE_NONSTOP_TSC. The TSC is the likely clock source for the udelay. Unless bound to one of the cores with an affinity mask, your task may have been preempted and rescheduled to another CPU during the udelay. Or the TSC might not be stable during lower-power CPU modes.

Can you try disabling interrupts or disabling preemption during the udelay? Also, try reading the TSC before and after.

查看更多
唯我独甜
3楼-- · 2020-07-10 11:09

I don't know of any bug in that version of the kernel (but that doesn't mean that there isn't one).

udelay() isn't "uninterruptible" - it does not disable preemption, so your task can be preempted by a RT task during the delay. However the same is true of your alternate delay implementation, so that is unlikely to be the problem.

Could your actual problem be a DMA coherency / memory ordering issue? Your alternate delay implementation accesses the bus, so this might be hiding the real problem as a side-effect.

查看更多
登录 后发表回答