“unexpected IRQ trap at vector XX” on Beaglebone B

Posted 2019-08-16 19:19

Question:

I have a problem with interrupt processing on the BeagleBone Black. I have written my own combination of a kernel module and a user-space driver to access GPIOs (see also https://github.com/Terstegge/gpio-bbb). With older kernels, everything worked fine. With the most recent Debian image (kernel 4.14.71-ti-r80), I get errors in the kernel log:

[  461.028013] gpio_bbb: Device /dev/gpio_bbb registered
[  507.507335] gpio_bbb: Requesting GPIO #30
[  507.507370] Mode: f
[  507.507383] gpio_bbb: Requesting GPIO #49
[  507.507395] Mode: 37
[  507.507405] gpio_bbb: Requesting GPIO #15
[  507.507414] Mode: 37
[  507.507656] gpio_bbb: Using IRQ #77 for GPIO #49
[  507.507821] gpio_bbb: Using IRQ #78 for GPIO #15
[  571.511409] irq 77, desc: db1ad800, depth: 0, count: 0, unhandled: 0
[  571.511429] ->handle_irq():  c01ab7b0, 
[  571.511458] handle_bad_irq+0x0/0x2c0
[  571.511463] ->irq_data.chip(): dc122910, 
[  571.511476] 0xdc122910
[  571.511481] ->action(): dc454600
[  571.511487] ->action->handler(): bf4c904c, 
[  571.511514] gpio_irq_handler+0x0/0x34 [gpio_bbb]
[  571.511524]    IRQ_NOPROBE set
[  571.511532] unexpected IRQ trap at vector 4d
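
A side note on reading this log: the vector in the last line appears to be printed in hexadecimal, so 4d is decimal 77, the same IRQ that was assigned to GPIO #49 a few lines earlier. A minimal sketch of that conversion, where vector_to_irq is a hypothetical helper name and not part of the module:

```c
#include <stdlib.h>

/* Hypothetical helper (not part of gpio_bbb): convert the hex vector from
 * "unexpected IRQ trap at vector 4d" into a decimal IRQ number.
 * Returns -1 if the text is not a plain non-negative hex number. */
int vector_to_irq(const char *hex_vector)
{
    char *end = NULL;
    long value = strtol(hex_vector, &end, 16);

    /* Reject empty input, trailing garbage, and negative values. */
    if (end == hex_vector || *end != '\0' || value < 0)
        return -1;
    return (int)value;
}
```

Calling vector_to_irq("4d") gives 77, so the trap refers to the IRQ that gpio_bbb had just requested, not to some unrelated interrupt.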

What I do is the following: in the module code I call gpio_to_irq() to get the IRQ number and then call request_irq(). Both calls seem to succeed, since no error is reported (see the log above):

  /* request the irq and install handler */
  if (!irq_enabled[gpio_num]) {
    irq = gpio_to_irq(gpio_num);
    ret = request_irq(irq, gpio_irq_handler, IRQF_SHARED, "gpio_bbb", &gpio_data);
    if (ret != 0) {
      printk(KERN_ERR MOD_STR"Failed to request IRQ %i (error %i)\n", irq, ret);
      return ret;
    }
    printk(KERN_INFO MOD_STR"Using IRQ #%i for GPIO #%i\n", irq, gpio_num);
    irq_enabled[gpio_num] = irq;
  }

When I start a test program, I can see in /proc/interrupts that my module (gpio_bbb) is registered for the interrupts:

           CPU0       
...
 62:          0  tps65217   2 Edge      tps65217_pwr_but
 63:       5822  44e07000.gpio  29 Edge      wl18xx
 77:          0  4804c000.gpio  17 Edge      gpio_bbb
 78:          0  44e07000.gpio  15 Edge      gpio_bbb
IPI0:          0  CPU wakeup interrupts
IPI1:          0  Timer broadcast interrupts
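
The number after the controller name is the line offset within that GPIO bank. Assuming the usual AM335x layout of 32 lines per bank, with 44e07000.gpio as bank 0 and 4804c000.gpio as bank 1, GPIO #49 maps to bank 1, offset 17, which matches the entry for IRQ 77. A small sketch of that arithmetic (gpio_locate and struct gpio_location are hypothetical names used only for illustration):

```c
/* Assumption: every GPIO bank on this SoC exposes 32 lines. */
#define GPIOS_PER_BANK 32

/* Hypothetical type: where a global GPIO number lives in hardware. */
struct gpio_location {
    int bank;   /* index of the GPIO controller (0, 1, 2, ...) */
    int offset; /* line within that controller (0..31) */
};

/* Split a global GPIO number into bank index and line offset. */
struct gpio_location gpio_locate(int gpio_num)
{
    struct gpio_location loc;

    loc.bank = gpio_num / GPIOS_PER_BANK;
    loc.offset = gpio_num % GPIOS_PER_BANK;
    return loc;
}
```

For the two lines requested here, gpio_locate(49) yields bank 1, offset 17 and gpio_locate(15) yields bank 0, offset 15, so the registrations shown in /proc/interrupts point at the intended pins.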

When I trigger some interrupts (by changing a GPIO input value), even with an empty interrupt handler that does nothing:

irqreturn_t gpio_irq_handler(int irq, void *dev_id) {
  return IRQ_HANDLED;
}

I get the error messages shown above in the kernel log, and my interrupts are not processed. I have noticed that the interrupt numbers have changed (formerly, the IRQ number of a GPIO was the GPIO number + 128). I am also aware of the new libgpiod interface, which is evidently available (I see the /dev/gpiochip[0..3] devices). Are my problems related to these changes? I am still a bit confused, because every function I call seems to succeed, yet my interrupts are handled as 'unexpected'. What am I doing wrong?

Answer 1:

I have investigated the problem in more depth. The simple solution was to add a call to irq_set_irq_type(irq, IRQ_TYPE_NONE) after calling request_any_context_irq() (this function should now be used here instead of request_irq(), since it selects either hard-IRQ or nested threaded handling depending on the interrupt):

ret = request_any_context_irq (irq, _gpio_irq_handler, IRQF_SHARED, "gpio_bbb", &gpio_data);
if (ret < 0) {
  printk(KERN_ERR MOD_STR"Failed to request IRQ %i (error %i)\n", irq, ret);
  return ret;
}
// Set IRQ type
irq_set_irq_type(irq, IRQ_TYPE_NONE);

These changes seem to be necessary because of the new generic IRQ subsystem in Linux: https://www.kernel.org/doc/html/v4.12/core-api/genericirq.html. I have not yet fully understood why and how the IRQ type influences which internal handlers are called, but the gpio module is working again as expected.