Linux Interrupt Subsystem, Part 7 (based on Linux 6.6): The Driver IRQ Request API

First published 2024-12-11 17:42:14, last modified 2024-12-11 17:44:08. License: CC 4.0 BY-SA.

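I. Overview

In embedded systems and operating systems, developers manage interrupts through a set of interrupt-related APIs. The list below gives an overview of the ones commonly met in interrupt-controller and device drivers. Some entries use descriptive names; where the mainline Linux symbol differs, the mainline name is given.

1. Interrupt controller initialization API

irqchip_init(): initializes the interrupt controllers. It sets up the initial state of the interrupt controller and prepares the hardware for later interrupt operations.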

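2. Registering and unregistering interrupts

request_irq(): registers an interrupt request (IRQ). It records the IRQ number, the handler, the flags and related information with the kernel and installs the handler for that interrupt. Commonly used parameters:
- irq: the IRQ number.
- handler: the interrupt handler function.
- flags: interrupt flags (for example edge-triggered, level-triggered or shared).
- name: the name shown for this handler, for example in /proc/interrupts.
- dev_id: a pointer that identifies the device.
free_irq(): unregisters an interrupt. It removes the interrupt service routine (ISR) that was registered for the given IRQ number and dev_id.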

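3. Enabling and disabling interrupts

enable_irq(): enables the given interrupt, so that the interrupt controller delivers requests for this IRQ number again.
disable_irq(): disables the given interrupt. Requests for this IRQ number are held back so that the interrupt service routine is not invoked; the call also waits for a handler that is currently running to finish.
enable_irq_wake(): enables the wakeup capability of the given interrupt, so that it can wake the system from a low-power state.
disable_irq_wake(): disables the wakeup capability, so that the interrupt no longer wakes the system.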

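4. Handling interrupts

irq_handler_t: the type of an interrupt handler function. When an interrupt occurs, the kernel calls the handler (ISR, Interrupt Service Routine) registered for it. A handler has the following signature: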

```
irqreturn_t my_irq_handler(int irq, void *dev_id);
```
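- irq: the IRQ number.
- dev_id: the device identifier, usually a pointer to the driver's per-device data structure.
irqreturn_t: the return type of an interrupt handler. Common return values:
- IRQ_HANDLED: the interrupt was handled by this handler.
- IRQ_NONE: the interrupt did not come from this device and was not handled.
- IRQ_WAKE_THREAD: the primary handler asks the kernel to wake the handler thread.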

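5. Interrupt context

in_irq(): checks whether the code is currently running in hard interrupt context. This matters when writing handlers, because operations that may block are not allowed there. Newer kernels provide in_hardirq() as the preferred name and treat in_irq() as obsolete.

6. Interrupt priority

set_irq_priority(): sets the priority of an interrupt, so that some interrupts are served before others. The generic Linux IRQ layer does not offer drivers a function of this name; an API like this belongs to RTOS-style or controller-specific code.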

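7. Masking and trigger type

mask_irq(): masks an interrupt so that it no longer fires. In Linux this is an internal helper of the generic IRQ core that invokes the irq_chip's irq_mask callback; drivers use disable_irq() instead.
unmask_irq(): unmasks an interrupt so that it can fire again; like mask_irq(), an internal helper of the IRQ core.
set_irq_trigger_type(): sets how the interrupt is triggered, usually edge or level. In mainline Linux the corresponding function is irq_set_irq_type().

8. Special functions

irq_set_affinity(): sets the affinity of an interrupt, that is, which CPUs may handle it. This is used to balance interrupt load on multi-core processors.
irq_set_type(): sets the interrupt type, for example edge triggered or level triggered. In mainline Linux, irq_set_type is the name of the irq_chip callback that irq_set_irq_type() invokes.
irq_clear_status(): clears interrupt status, typically to reset an interrupt flag in the controller by hand. This is not a mainline symbol; in the generic IRQ layer a pending interrupt is acknowledged through the irq_chip's irq_ack callback.

9. Debugging and management

irq_dump(): prints the state of the interrupt controller for debugging interrupt problems. Mainline Linux has no function of this name; interrupt counts and handler names are visible in /proc/interrupts.

10. Other helper APIs

irq_get_irq_data(): returns the struct irq_data that belongs to an IRQ number.
irq_get_chip(): returns the struct irq_chip of an IRQ, which gives access to the controller-specific implementation.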
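II. Real-Time Analysis and Interrupts

2.1 Real-time behavior of the preemptible Linux kernel

With the CONFIG_PREEMPT option enabled, the Linux kernel supports preemption of kernel code (except inside critical sections). The original post illustrates the resulting behavior with a timing diagram, which is not reproduced here.

A preemptible kernel shortens the average task response time. Real-time behavior, however, is about determinism: the task response time must be bounded under any load. What matters is therefore the worst-case response time, and two factors affect the worst-case latency:

(1) For synchronization, some kernel code must hold a spinlock or explicitly call preempt_disable(); preemption is not allowed while it does.

(2) Interrupt context (not only the interrupt handler itself, but also softirqs, timers and tasklets) can always preempt process context.

So even with PREEMPT enabled, the task response time of a Linux system is still not deterministic.

On the one hand, the kernel contains a great many critical sections, and we would have to find the longest interval during which a lock is held or preemption is disabled.

On the other hand, at point T4 of that timing diagram, getting the high-priority task scheduled is not easy: a softirq may have been raised, a new interrupt may arrive, or some driver's tasklet may be waiting to run. The scheduler only gets to run once no bottom-half work is left.

2.2 Improving the real-time behavior of the Linux kernel

As the previous subsection shows, interrupts are the greatest enemy of real-time behavior in Linux.

In the Linux kernel, the interrupt handling of a peripheral is split into a top half and a bottom half. The top half does the most critical, most basic work, while time-consuming operations are deferred to the bottom half (softirq, tasklet). Although the bottom half is deferred, it always runs before any process. Why not let these time-consuming bottom halves compete fairly with ordinary processes? For this reason the Linux kernel borrowed some features of an RTOS and threads the time-consuming driver interrupt handlers: at the kernel's preemption points, all threads (kernel threads, threads created from user space, and driver interrupt threads) compete for the CPU on the same stage.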

III. request_threaded_irq

 int request_threaded_irq(unsigned int irq, irq_handler_t handler,
			 irq_handler_t thread_fn, unsigned long irqflags,
			 const char *devname, void *dev_id);

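Parameters:

irq:
- The interrupt number to register. This is the Linux IRQ number that the irq domain maps the hardware interrupt to, not the raw hardware interrupt number.
handler:
- The primary interrupt handler (top half). When the interrupt occurs, the kernel calls this function first. It should run as quickly as possible; it executes in interrupt context and must not block (for example, it must not sleep).
thread_fn:
- The threaded interrupt handler (bottom half). When the primary handler returns IRQ_WAKE_THREAD, thread_fn runs in the context of a dedicated kernel thread. It is the place for time-consuming work such as data processing or buffer management, and because it runs in a thread it may sleep.
irqflags:
- Interrupt flags. They control the behavior of the interrupt, such as the trigger mode and sharing. Common flags:
- IRQF_SHARED: the IRQ number may be shared by several devices.
- IRQF_DISABLED: a historical flag that asked for the handler to run with interrupts disabled. It has been removed from mainline and does not exist in Linux 6.6. To keep an interrupt from being enabled automatically at request time, recent kernels provide IRQF_NO_AUTOEN.
- IRQF_TRIGGER_*: the trigger mode (rising or falling edge, high or low level).
- IRQF_ONESHOT: the interrupt line stays masked after the primary handler returns and is unmasked only when the threaded handler has finished.
devname:
- The device name, normally the name of the device that owns the interrupt; used for logging and debugging.
dev_id:
- A pointer to device-private data. It is passed back to the handlers, which use it to reach device-specific data, and on a shared line it distinguishes the devices.

Return value:

0: the interrupt was registered successfully.
Negative value: failure. The value is a negative errno code; for example, -EBUSY means the IRQ number is already in use by another device in an incompatible way.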

IV. Code Analysis of request_threaded_irq

4.1 Main flow of request_threaded_irq

kernel/irq/manage.c


int request_threaded_irq(unsigned int irq, irq_handler_t handler,
			 irq_handler_t thread_fn, unsigned long irqflags,
			 const char *devname, void *dev_id)
{
	struct irqaction *action;
	struct irq_desc *desc;
	int retval;
 
	if (irq == IRQ_NOTCONNECTED)                /* see analysis 1 below */
		return -ENOTCONN;
 
	/*
	 * Sanity-check: shared interrupts must pass in a real dev-ID,
	 * otherwise we'll have trouble later trying to figure out
	 * which interrupt is which (messes up the interrupt freeing
	 * logic etc).
	 *
	 * Also IRQF_COND_SUSPEND only makes sense for shared interrupts and
	 * it cannot be set along with IRQF_NO_SUSPEND.
	 */
	/* see analysis 2 below */
	if (((irqflags & IRQF_SHARED) && !dev_id) ||
	    (!(irqflags & IRQF_SHARED) && (irqflags & IRQF_COND_SUSPEND)) ||
	    ((irqflags & IRQF_NO_SUSPEND) && (irqflags & IRQF_COND_SUSPEND)))
		return -EINVAL;
 
	desc = irq_to_desc(irq);        /* see analysis 3 below */
	if (!desc)
		return -EINVAL;
 
	if (!irq_settings_can_request(desc) ||    /* see analysis 4 below */
	    WARN_ON(irq_settings_is_per_cpu_devid(desc)))
		return -EINVAL;
 
	if (!handler) {                  /* see analysis 5 below */
		if (!thread_fn)
			return -EINVAL;
		handler = irq_default_primary_handler;
	}
 
	/* see analysis 6 below */
 
	action = kzalloc(sizeof(struct irqaction), GFP_KERNEL);
	if (!action)
		return -ENOMEM;
 
	action->handler = handler;
	action->thread_fn = thread_fn;
	action->flags = irqflags;
	action->name = devname;
	action->dev_id = dev_id;
 
	retval = irq_chip_pm_get(&desc->irq_data);
	if (retval < 0) {
		kfree(action);
		return retval;
	}
 
	retval = __setup_irq(irq, desc, action);    /* see analysis 7 below */
 
	if (retval) {
		irq_chip_pm_put(&desc->irq_data);
		kfree(action->secondary);
		kfree(action);
	}
 
#ifdef CONFIG_DEBUG_SHIRQ_FIXME
	if (!retval && (irqflags & IRQF_SHARED)) {
		/*
		 * It's a shared IRQ -- the driver ought to be prepared for it
		 * to happen immediately, so let's make sure....
		 * We disable the irq to make sure that a 'real' IRQ doesn't
		 * run in parallel with our fake.
		 */
		unsigned long flags;
 
		disable_irq(irq);
		local_irq_save(flags);
 
		handler(irq, dev_id);
 
		local_irq_restore(flags);
		enable_irq(irq);
	}
#endif
	return retval;
}
EXPORT_SYMBOL(request_threaded_irq);

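Analysis 1:

#define IRQ_NOTCONNECTED	(1U << 31)

As the definition shows, this macro has only bit 31 set, a value far larger than any real IRQ number. It marks an interrupt line that is not connected, and a request for it fails immediately with -ENOTCONN.

Analysis 2:

An interrupt that is to be shared must be requested with a dev_id, otherwise the request fails. Why is dev_id mandatory for IRQF_SHARED interrupts? When a line is shared, one IRQ number corresponds to several irqactions, so the IRQ number alone is no longer enough to identify a particular irqaction; an additional ID is needed, and that is the role of dev_id.

extern const void *free_irq(unsigned int, void *);

When releasing an IRQ, the caller must pass the device ID as well as the IRQ number. Only then can the right irqaction be removed from the IRQ's action list. Does dev_id also play a role in interrupt handling? Consider the following code.

kernel/irq/handle.c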

  
irqreturn_t __handle_irq_event_percpu(struct irq_desc *desc, unsigned int *flags)
{
	irqreturn_t retval = IRQ_NONE;
	unsigned int irq = desc->irq_data.irq;
	struct irqaction *action;
 
	record_irq_time(desc);
 
	for_each_action_of_desc(desc, action) {
		irqreturn_t res;
 
		trace_irq_handler_entry(irq, action);
		res = action->handler(irq, action->dev_id);    /* call the handler each driver registered */
		trace_irq_handler_exit(irq, action, res);
 
		if (WARN_ONCE(!irqs_disabled(),"irq %u handler %pF enabled interrupts\n",
			      irq, action->handler))
			local_irq_disable();
 
		switch (res) {
		case IRQ_WAKE_THREAD:
			/*
			 * Catch drivers which return WAKE_THREAD but
			 * did not set up a thread function
			 */
			if (unlikely(!action->thread_fn)) {
				warn_no_thread(irq, action);
				break;
			}
 
			__irq_wake_thread(desc, action);
 
			/* Fall through to add to randomness */
		case IRQ_HANDLED:
			*flags |= action->flags;
			break;
 
		default:
			break;
		}
 
		retval |= res;
	}
 
	return retval;
}

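The Linux interrupt framework supports shared interrupts, but it does not help to identify the source. It simply walks the irqactions registered on the IRQ number and calls each callback, so even though only one peripheral raised the interrupt, the kernel calls every handler that shares the line, one after another. To keep system performance unaffected, each irqaction callback must check at its very beginning whether its own device raised the interrupt (by reading a hardware register) and return as quickly as possible if it did not.

Note that dev_id is not used at interrupt time to select which irqaction callback to call. As the code above shows, dev_id works more like an argument: a driver can gather its hardware information in a structure, and that structure is handed to the handler when the interrupt fires.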
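The two roles of dev_id described above (an argument handed to the handler, and the key that free_irq uses to find one irqaction) can be sketched in a small userspace model. This is only an illustration under simplifying assumptions, not kernel code: struct demo_dev, its pending field (standing in for an interrupt status register), demo_handler, dispatch and remove_action are names invented for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model: the names mirror the kernel's, but this is not kernel code. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;
typedef irqreturn_t (*irq_handler_t)(int irq, void *dev_id);

struct irqaction {
	irq_handler_t handler;
	void *dev_id;
	struct irqaction *next;
};

/* Hypothetical device: 'pending' stands in for an interrupt status register. */
struct demo_dev {
	bool pending;
	int handled;
};

static irqreturn_t demo_handler(int irq, void *dev_id)
{
	struct demo_dev *dev = dev_id;

	(void)irq;
	if (!dev->pending)	/* not our device: leave as quickly as possible */
		return IRQ_NONE;
	dev->pending = false;	/* acknowledge the device */
	dev->handled++;
	return IRQ_HANDLED;
}

/* Call every action on the line; dev_id is passed along, never used to select. */
static irqreturn_t dispatch(int irq, struct irqaction *head)
{
	irqreturn_t retval = IRQ_NONE;
	struct irqaction *action;

	for (action = head; action; action = action->next)
		retval |= action->handler(irq, action->dev_id);
	return retval;
}

/* Unlink the action whose dev_id matches, the way free_irq() identifies it. */
static struct irqaction *remove_action(struct irqaction **head, void *dev_id)
{
	struct irqaction **pp, *action;

	for (pp = head; (action = *pp) != NULL; pp = &action->next) {
		if (action->dev_id == dev_id) {
			*pp = action->next;
			action->next = NULL;
			return action;
		}
	}
	return NULL;
}
```

With two devices on one line and only the second one pending, dispatch() still calls both handlers; the first returns IRQ_NONE at once and the second returns IRQ_HANDLED.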

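Analysis 3:

The interrupt descriptor is looked up from the IRQ number. Since the introduction of CONFIG_SPARSE_IRQ this conversion is no longer trivial. In the past the IRQ number was used as an index into the global irq_desc array. With CONFIG_SPARSE_IRQ the descriptor has to be searched for in a tree: older kernels used a radix tree, while the version shown below uses a maple tree (mtree_load).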

kernel/irq/irqdesc.c


struct irq_desc *irq_to_desc(unsigned int irq)
{
	return mtree_load(&sparse_irqs, irq);
}

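Analysis 4:

Not every IRQ number in the system can be requested. Some descriptors are marked IRQ_NOREQUEST, which means the IRQ number cannot be requested by drivers. Such IRQ numbers usually have a special purpose; the IRQ used for cascading, for example, cannot be requested. irq_settings_can_request checks whether an IRQ may be requested.

irq_settings_is_per_cpu_devid checks whether a descriptor requires a per-CPU device ID. Per-CPU interrupts were covered earlier in this series and are not described again here. If the interrupt behind a descriptor is per-CPU, its handler can be requested in two ways: with one dev_id for all CPUs (the last argument of request_threaded_irq), or with a different dev_id for each CPU. In the second case request_percpu_irq must be used instead of request_threaded_irq.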

 extern int __must_check
request_threaded_irq(unsigned int irq, irq_handler_t handler,
		     irq_handler_t thread_fn,
		     unsigned long flags, const char *name, void *dev);
 
 
 
 
extern int __must_check
__request_percpu_irq(unsigned int irq, irq_handler_t handler,
		     unsigned long flags, const char *devname,
		     void __percpu *percpu_dev_id);
 
static inline int __must_check
request_percpu_irq(unsigned int irq, irq_handler_t handler,
		   const char *devname, void __percpu *percpu_dev_id)
{
	return __request_percpu_irq(irq, handler, 0,
				    devname, percpu_dev_id);
}

Note that the release function must match the registration function that was used:

 extern const void *free_irq(unsigned int, void *);
 
extern void free_percpu_irq(unsigned int, void __percpu *);

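Analysis 5:

The primary handler and threaded handler arguments of request_threaded_irq can be combined in four ways:

| primary handler | threaded handler | Description |
| --- | --- | --- |
| NULL | NULL | Error; the function returns -EINVAL. |
| set | set | The normal case. The work is divided sensibly between the primary handler and the threaded handler. |
| set | NULL | All interrupt handling is done in the primary handler. |
| NULL | set | The kernel installs a default primary handler, irq_default_primary_handler, which wakes the thread of the threaded handler. |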
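The argument checks from analyses 2 and 5 can be summarized in a small userspace model. It is a sketch, not the kernel's code: the flag values are placeholders, and check_request, default_primary_handler, demo_isr and demo_thread_fn are names invented here. The conditions themselves are copied from the request_threaded_irq listing above.

```c
#include <errno.h>
#include <stddef.h>

/* Placeholder flag values for this sketch, not the kernel's definitions. */
#define IRQF_SHARED		0x1UL
#define IRQF_NO_SUSPEND		0x2UL
#define IRQF_COND_SUSPEND	0x4UL

#define IRQ_HANDLED		1	/* placeholder value */
#define IRQ_WAKE_THREAD		2	/* placeholder value */

typedef int (*irq_handler_t)(int irq, void *dev_id);

/* Stand-in for irq_default_primary_handler: only wakes the thread. */
static int default_primary_handler(int irq, void *dev_id)
{
	(void)irq;
	(void)dev_id;
	return IRQ_WAKE_THREAD;
}

/* Hypothetical driver handlers used to exercise the checks. */
static int demo_isr(int irq, void *dev_id)
{
	(void)irq;
	(void)dev_id;
	return IRQ_WAKE_THREAD;
}

static int demo_thread_fn(int irq, void *dev_id)
{
	(void)irq;
	(void)dev_id;
	return IRQ_HANDLED;
}

/*
 * Model of the argument checks. Returns 0 and stores the effective primary
 * handler in *primary, or returns -EINVAL.
 */
static int check_request(irq_handler_t handler, irq_handler_t thread_fn,
			 unsigned long flags, void *dev_id,
			 irq_handler_t *primary)
{
	/* Shared lines need a dev_id; COND_SUSPEND needs SHARED and excludes NO_SUSPEND. */
	if (((flags & IRQF_SHARED) && !dev_id) ||
	    (!(flags & IRQF_SHARED) && (flags & IRQF_COND_SUSPEND)) ||
	    ((flags & IRQF_NO_SUSPEND) && (flags & IRQF_COND_SUSPEND)))
		return -EINVAL;

	if (!handler) {
		if (!thread_fn)		/* NULL / NULL: nothing to call */
			return -EINVAL;
		handler = default_primary_handler;	/* NULL / set */
	}
	*primary = handler;
	return 0;
}
```

Passing only a thread_fn yields the default primary handler, while passing neither, or requesting a shared line without a dev_id, yields -EINVAL.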

Analysis 6:

	action = kzalloc(sizeof(struct irqaction), GFP_KERNEL);
	if (!action)
		return -ENOMEM;

	action->handler = handler;
	action->thread_fn = thread_fn;
	action->flags = irqflags;
	action->name = devname;
	action->dev_id = dev_id;

	retval = irq_chip_pm_get(&desc->irq_data);
	if (retval < 0) {
		kfree(action);
		return retval;
	}

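Every interrupt handler is managed by a struct irqaction, which is linked into the action list of the corresponding interrupt descriptor.

When the interrupt occurs later, all functions on this list are called in turn. The code above allocates and initializes one struct irqaction.

include/linux/interrupt.h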

struct irqaction {
	irq_handler_t		handler;
	void			*dev_id;
	void __percpu		*percpu_dev_id;
	struct irqaction	*next;
	irq_handler_t		thread_fn;
	struct task_struct	*thread;
	struct irqaction	*secondary;
	unsigned int		irq;
	unsigned int		flags;
	unsigned long		thread_flags;
	unsigned long		thread_mask;
	const char		*name;
	struct proc_dir_entry	*dir;
} ____cacheline_internodealigned_in_smp;

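Analysis 7:

This part of the code is simple: allocate a struct irqaction, fill it in, and call __setup_irq to do the actual registration. The kernel follows a common naming convention for functions that exist in two variants: they share the same base name, and the variant that relies on its caller for locking or checking gets a __ prefix. Older kernels, for example, had both setup_irq and __setup_irq. In the version listed below, __setup_irq itself takes desc->request_mutex, the chip bus lock (chip_bus_lock) and desc->lock around the parts that need them.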

kernel/irq/manage.c


static int
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
{
	struct irqaction *old, **old_ptr;
	unsigned long flags, thread_mask = 0;
	int ret, nested, shared = 0;
 
	if (!desc)
		return -EINVAL;
 
	if (desc->irq_data.chip == &no_irq_chip)    /* the IRQ must have a real irq_chip (mask, unmask, ...) */
		return -ENOSYS;
	if (!try_module_get(desc->owner))
		return -ENODEV;
 
	new->irq = irq;            /* record the IRQ number in the new irqaction */
 
	/*
	 * If the trigger type is not specified by the caller,
	 * then use the default for this interrupt.
	 */
	if (!(new->flags & IRQF_TRIGGER_MASK))
		new->flags |= irqd_get_trigger_type(&desc->irq_data);    
 
	/*
	 * Check whether the interrupt nests into another interrupt
	 * thread.
	 */
	/* A nested interrupt runs inside the interrupt thread of its parent IRQ. */
	nested = irq_settings_is_nested_thread(desc);
	if (nested) {
		if (!new->thread_fn) {
			ret = -EINVAL;        /* a nested IRQ must supply a thread_fn */
			goto out_mput;
		}
		/*
		 * Replace the primary handler which was provided from
		 * the driver for non nested interrupt handling by the
		 * dummy function which warns when called.
		 */
		/* The driver's primary handler is replaced by a dummy that only warns. */
		new->handler = irq_nested_primary_handler;
	} else {
		if (irq_settings_can_thread(desc)) {       /* the descriptor allows threading */
			ret = irq_setup_forced_threading(new);    /* force-thread it if forced threading is on */
			if (ret)
				goto out_mput;
		}
	}
 
	/*
	 * Create a handler thread when a thread function is supplied
	 * and the interrupt does not nest into another interrupt
	 * thread.
	 */
	if (new->thread_fn && !nested) {
		ret = setup_irq_thread(new, irq, false);    
		if (ret)
			goto out_mput;
		if (new->secondary) {
			ret = setup_irq_thread(new->secondary, irq, true);
			if (ret)
				goto out_thread;
		}
	}
 
	/*
	 * Drivers are often written to work w/o knowledge about the
	 * underlying irq chip implementation, so a request for a
	 * threaded irq without a primary hard irq context handler
	 * requires the ONESHOT flag to be set. Some irq chips like
	 * MSI based interrupts are per se one shot safe. Check the
	 * chip flags, so we can avoid the unmask dance at the end of
	 * the threaded handler for those.
	 */
	if (desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)
		new->flags &= ~IRQF_ONESHOT;
 
	/*
	 * Protects against a concurrent __free_irq() call which might wait
	 * for synchronize_hardirq() to complete without holding the optional
	 * chip bus lock and desc->lock. Also protects against handing out
	 * a recycled oneshot thread_mask bit while it's still in use by
	 * its previous owner.
	 */
	mutex_lock(&desc->request_mutex);
 
	/*
	 * Acquire bus lock as the irq_request_resources() callback below
	 * might rely on the serialization or the magic power management
	 * functions which are abusing the irq_bus_lock() callback,
	 */
	chip_bus_lock(desc);
 
	/* First installed action requests resources. */
	if (!desc->action) {
		ret = irq_request_resources(desc);
		if (ret) {
			pr_err("Failed to request resources for %s (irq %d) on irqchip %s\n",
			       new->name, irq, desc->irq_data.chip->name);
			goto out_bus_unlock;
		}
	}
 
	/*
	 * The following block of code has to be executed atomically
	 * protected against a concurrent interrupt and any of the other
	 * management calls which are not serialized via
	 * desc->request_mutex or the optional bus lock.
	 */
	raw_spin_lock_irqsave(&desc->lock, flags);
	old_ptr = &desc->action;
	old = *old_ptr;
	if (old) {
		/*
		 * Can't share interrupts unless both agree to and are
		 * the same type (level, edge, polarity). So both flag
		 * fields must have IRQF_SHARED set and the bits which
		 * set the trigger type must match. Also all must
		 * agree on ONESHOT.
		 */
		unsigned int oldtype;
 
		/*
		 * If nobody did set the configuration before, inherit
		 * the one provided by the requester.
		 */
		if (irqd_trigger_type_was_set(&desc->irq_data)) {
			oldtype = irqd_get_trigger_type(&desc->irq_data);
		} else {
			oldtype = new->flags & IRQF_TRIGGER_MASK;
			irqd_set_trigger_type(&desc->irq_data, oldtype);
		}
 
		if (!((old->flags & new->flags) & IRQF_SHARED) ||
		    (oldtype != (new->flags & IRQF_TRIGGER_MASK)) ||
		    ((old->flags ^ new->flags) & IRQF_ONESHOT))
			goto mismatch;
 
		/* All handlers must agree on per-cpuness */
		if ((old->flags & IRQF_PERCPU) !=
		    (new->flags & IRQF_PERCPU))
			goto mismatch;
 
		/* add new interrupt at end of irq queue */
		do {
			/*
			 * Or all existing action->thread_mask bits,
			 * so we can find the next zero bit for this
			 * new action.
			 */
			thread_mask |= old->thread_mask;
			old_ptr = &old->next;
			old = *old_ptr;
		} while (old);
		shared = 1;
	}
 
	/*
	 * Setup the thread mask for this irqaction for ONESHOT. For
	 * !ONESHOT irqs the thread mask is 0 so we can avoid a
	 * conditional in irq_wake_thread().
	 */
	if (new->flags & IRQF_ONESHOT) {
		/*
		 * Unlikely to have 32 resp 64 irqs sharing one line,
		 * but who knows.
		 */
		if (thread_mask == ~0UL) {
			ret = -EBUSY;
			goto out_unlock;
		}
		/*
		 * The thread_mask for the action is or'ed to
		 * desc->thread_active to indicate that the
		 * IRQF_ONESHOT thread handler has been woken, but not
		 * yet finished. The bit is cleared when a thread
		 * completes. When all threads of a shared interrupt
		 * line have completed desc->threads_active becomes
		 * zero and the interrupt line is unmasked. See
		 * handle.c:irq_wake_thread() for further information.
		 *
		 * If no thread is woken by primary (hard irq context)
		 * interrupt handlers, then desc->threads_active is
		 * also checked for zero to unmask the irq line in the
		 * affected hard irq flow handlers
		 * (handle_[fasteoi|level]_irq).
		 *
		 * The new action gets the first zero bit of
		 * thread_mask assigned. See the loop above which or's
		 * all existing action->thread_mask bits.
		 */
		new->thread_mask = 1UL << ffz(thread_mask);
 
	} else if (new->handler == irq_default_primary_handler &&
		   !(desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)) {
		/*
		 * The interrupt was requested with handler = NULL, so
		 * we use the default primary handler for it. But it
		 * does not have the oneshot flag set. In combination
		 * with level interrupts this is deadly, because the
		 * default primary handler just wakes the thread, then
		 * the irq lines is reenabled, but the device still
		 * has the level irq asserted. Rinse and repeat....
		 *
		 * While this works for edge type interrupts, we play
		 * it safe and reject unconditionally because we can't
		 * say for sure which type this interrupt really
		 * has. The type flags are unreliable as the
		 * underlying chip implementation can override them.
		 */
		pr_err("Threaded irq requested with handler=NULL and !ONESHOT for irq %d\n",
		       irq);
		ret = -EINVAL;
		goto out_unlock;
	}
 
	if (!shared) {
		init_waitqueue_head(&desc->wait_for_threads);
 
		/* Setup the type (level, edge polarity) if configured: */
		if (new->flags & IRQF_TRIGGER_MASK) {
			ret = __irq_set_trigger(desc,
						new->flags & IRQF_TRIGGER_MASK);
 
			if (ret)
				goto out_unlock;
		}
 
		/*
		 * Activate the interrupt. That activation must happen
		 * independently of IRQ_NOAUTOEN. request_irq() can fail
		 * and the callers are supposed to handle
		 * that. enable_irq() of an interrupt requested with
		 * IRQ_NOAUTOEN is not supposed to fail. The activation
		 * keeps it in shutdown mode, it merily associates
		 * resources if necessary and if that's not possible it
		 * fails. Interrupts which are in managed shutdown mode
		 * will simply ignore that activation request.
		 */
		ret = irq_activate(desc);
		if (ret)
			goto out_unlock;
 
		desc->istate &= ~(IRQS_AUTODETECT | IRQS_SPURIOUS_DISABLED | \
				  IRQS_ONESHOT | IRQS_WAITING);
		irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
 
		if (new->flags & IRQF_PERCPU) {
			irqd_set(&desc->irq_data, IRQD_PER_CPU);
			irq_settings_set_per_cpu(desc);
		}
 
		if (new->flags & IRQF_ONESHOT)
			desc->istate |= IRQS_ONESHOT;
 
		/* Exclude IRQ from balancing if requested */
		if (new->flags & IRQF_NOBALANCING) {
			irq_settings_set_no_balancing(desc);
			irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
		}
 
		if (irq_settings_can_autoenable(desc)) {
			irq_startup(desc, IRQ_RESEND, IRQ_START_COND);
		} else {
			/*
			 * Shared interrupts do not go well with disabling
			 * auto enable. The sharing interrupt might request
			 * it while it's still disabled and then wait for
			 * interrupts forever.
			 */
			WARN_ON_ONCE(new->flags & IRQF_SHARED);
			/* Undo nested disables: */
			desc->depth = 1;
		}
 
	} else if (new->flags & IRQF_TRIGGER_MASK) {
		unsigned int nmsk = new->flags & IRQF_TRIGGER_MASK;
		unsigned int omsk = irqd_get_trigger_type(&desc->irq_data);
 
		if (nmsk != omsk)
			/* hope the handler works with current  trigger mode */
			pr_warn("irq %d uses trigger mode %u; requested %u\n",
				irq, omsk, nmsk);
	}
 
	*old_ptr = new;
 
	irq_pm_install_action(desc, new);
 
	/* Reset broken irq detection when installing new handler */
	desc->irq_count = 0;
	desc->irqs_unhandled = 0;
 
	/*
	 * Check whether we disabled the irq via the spurious handler
	 * before. Reenable it and give it another chance.
	 */
	if (shared && (desc->istate & IRQS_SPURIOUS_DISABLED)) {
		desc->istate &= ~IRQS_SPURIOUS_DISABLED;
		__enable_irq(desc);
	}
 
	raw_spin_unlock_irqrestore(&desc->lock, flags);
	chip_bus_sync_unlock(desc);
	mutex_unlock(&desc->request_mutex);
 
	irq_setup_timings(desc, new);
 
	/*
	 * Strictly no need to wake it up, but hung_task complains
	 * when no hard interrupt wakes the thread up.
	 */
	if (new->thread)
		wake_up_process(new->thread);
	if (new->secondary)
		wake_up_process(new->secondary->thread);
 
	register_irq_proc(irq, desc);
	new->dir = NULL;
	register_handler_proc(irq, new);
	return 0;
 
mismatch:
	if (!(new->flags & IRQF_PROBE_SHARED)) {
		pr_err("Flags mismatch irq %d. %08x (%s) vs. %08x (%s)\n",
		       irq, new->flags, new->name, old->flags, old->name);
#ifdef CONFIG_DEBUG_SHIRQ
		dump_stack();
#endif
	}
	ret = -EBUSY;
 
out_unlock:
	raw_spin_unlock_irqrestore(&desc->lock, flags);
 
	if (!desc->action)
		irq_release_resources(desc);
out_bus_unlock:
	chip_bus_sync_unlock(desc);
	mutex_unlock(&desc->request_mutex);
 
out_thread:
	if (new->thread) {
		struct task_struct *t = new->thread;
 
		new->thread = NULL;
		kthread_stop(t);
		put_task_struct(t);
	}
	if (new->secondary && new->secondary->thread) {
		struct task_struct *t = new->secondary->thread;
 
		new->secondary->thread = NULL;
		kthread_stop(t);
		put_task_struct(t);
	}
out_mput:
	module_put(desc->owner);
	return ret;
}

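4.2 Registering the irqaction

(1) Handling nested IRQs

Before looking at the code, we need to understand what a nested IRQ is. A nested IRQ is not a cascaded IRQ. The cascaded IRQ concept was described in earlier articles of this series; it mainly arises when interrupt controllers are cascaded. As a concrete example (the hardware block diagram of the original post is not reproduced here), consider two cascaded GICs with all hardware blocks integrated in one SoC.

For ease of description, let the IRQ number of the secondary GIC be A, that of peripheral 1 be B, and that of peripheral 2 be C. The Linux kernel handles this hardware as follows:

(a) The descriptor of IRQ A is marked so that no irqaction can be registered on it (no specific interrupt handler, also called interrupt service routine).

(b) The high-level irq-events handler of IRQ A (which implements the interrupt flow control) performs the IRQ number mapping: it translates the hardware interrupt into the IRQ number of the peripheral within its irq domain and redirects to the high-level irq-events handler of that IRQ number.

(c) All peripheral drivers request their IRQs normally and are free to use a threaded handler or to register only a primary handler.

Note that accesses to the registers of the root GIC and the secondary GIC are very fast, so ack, mask, EOI and similar operations are fast as well.

Now consider another case of cascaded interrupt controllers (again shown as a figure in the original post):

An IO expander hardware block provides GPIOs with interrupt capability, so it is an interrupt controller too, with its own irq domain and irq chip. Peripheral 1 and peripheral 2 use interrupt-capable GPIOs of the IO expander and have their own IRQ numbers and interrupt descriptors.

The IRQ line of the IO expander is connected to the interrupt controller inside the SoC, so this is also a case of cascaded interrupt controllers. Can it be handled in the same way as the cascaded GICs above?

No. In the GIC case, if peripheral 1 on the secondary GIC raises an interrupt, the total time with interrupts disabled is the time taken by the high-level irq-events handler of IRQ A, plus that of the high-level irq-events handler of IRQ B, plus that of the primary handler of peripheral 1. Fortunately, accesses to the root GIC and secondary GIC registers are very fast, so interrupts are not disabled for very long. If the IO expander were handled in the same way, however, interrupts would be disabled for a very long time.

Take an interrupt from peripheral 1 as an example. The high-level irq-events handler of IRQ B would involve I2C operations and therefore take a very long time, which would be a fatal blow to the real-time behavior of the whole system.

For this kind of hardware, the Linux kernel proceeds as follows:

(a) The high-level irq-events handler of IRQ A is chosen according to the actual hardware, and irqactions may be registered on IRQ A. Peripherals attached to an IO expander have no real-time requirements (otherwise they would not be attached there), so the interrupt is normally threaded. Because the threaded handler involves I2C operations, the IRQF_ONESHOT flag must be set.

(b) The threaded interrupt handler of IRQ A performs the IRQ number mapping: it translates to the IRQ number of the peripheral within the IO expander's irq domain and calls handle_nested_irq directly to handle that IRQ.

(c) The descriptors of the peripherals carry the IRQ_NESTED_THREAD flag, which marks them as nested IRQs. A nested IRQ has no high-level irq-events handler and no primary handler; its threaded interrupt handler is attached to the threaded handler of its parent IRQ.

The code that handles nested IRQs is as follows: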

static int
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
{
...
 	/*
	 * Check whether the interrupt nests into another interrupt
	 * thread.
	 */
	nested = irq_settings_is_nested_thread(desc);
	if (nested) {
		if (!new->thread_fn) {
			ret = -EINVAL;
			goto out_mput;
		}
		/*
		 * Replace the primary handler which was provided from
		 * the driver for non nested interrupt handling by the
		 * dummy function which warns when called.
		 */
		new->handler = irq_nested_primary_handler;
	} else {
		 .....
	}
...
}

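If a descriptor is of the nested-thread type, the request must supply a threaded interrupt handler; otherwise __setup_irq returns an error. (The kernel does not create a separate thread for it; the handler runs in the interrupt thread of its parent IRQ.)

The primary handler should never be called. For debugging purposes the kernel nevertheless sets it to irq_nested_primary_handler, which prints a message if it is ever invoked, so that it is clear what happened.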
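The nested-IRQ rules above can be modeled in a few lines of userspace C. This is a sketch under simplifying assumptions, not kernel code: setup_nested, handle_nested, nested_primary_handler and button_thread_fn are invented names, and the counter warned merely stands in for the warning that irq_nested_primary_handler prints.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*irq_handler_t)(int irq, void *dev_id);

struct irqaction {
	irq_handler_t handler;		/* primary handler */
	irq_handler_t thread_fn;	/* threaded handler */
	void *dev_id;
};

static int warned;	/* counts calls that should never happen */

/* Stand-in for irq_nested_primary_handler: it should never run. */
static int nested_primary_handler(int irq, void *dev_id)
{
	(void)irq;
	(void)dev_id;
	warned++;
	return 0;
}

/* Model of the nested branch in __setup_irq(). */
static int setup_nested(struct irqaction *new)
{
	if (!new->thread_fn)
		return -EINVAL;		/* a nested IRQ needs a thread_fn */
	new->handler = nested_primary_handler;
	return 0;
}

/*
 * Model of handle_nested_irq(): the parent's threaded handler calls this,
 * so the child's thread_fn runs in the parent's thread, not in its own.
 */
static int handle_nested(int irq, struct irqaction *child)
{
	return child->thread_fn(irq, child->dev_id);
}

/* Hypothetical handler of a device behind the IO expander. */
static int button_thread_fn(int irq, void *dev_id)
{
	(void)irq;
	(*(int *)dev_id)++;
	return 1;
}
```

A request without thread_fn is rejected, and a valid child is run directly by the caller of handle_nested() without its primary handler ever being invoked.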

(2) Forced IRQ threading

The code that handles forced IRQ threading is as follows:

static int
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
{
...  
	/*
	 * Check whether the interrupt nests into another interrupt
	 * thread.
	 */
	nested = irq_settings_is_nested_thread(desc);
	if (nested) {
	    .....
	} else {
		if (irq_settings_can_thread(desc)) {
			ret = irq_setup_forced_threading(new);
			if (ret)
				goto out_mput;
		}
	}
...
}

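Forced IRQ threading threads every interrupt handler in the system that can be threaded, even if only a primary handler and no threaded handler was supplied when the IRQ was requested. Interrupts that cannot be threaded (those marked IRQF_NO_THREAD, such as the system timer) are of course excluded.

irq_settings_can_thread checks whether an interrupt can be threaded. If it can, irq_setup_forced_threading is called while the IRQ is being set up to force the threading. The code is as follows: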

kernel/irq/manage.c

  
static int irq_setup_forced_threading(struct irqaction *new)
{
	if (!force_irqthreads)            /* see (a) below */
		return 0;
	if (new->flags & (IRQF_NO_THREAD | IRQF_PERCPU | IRQF_ONESHOT))     /* see (b) below */
		return 0;

	/*
	 * No further action required for interrupts which are requested as
	 * threaded interrupts already
	 */
	if (new->handler == irq_default_primary_handler)
		return 0;

	new->flags |= IRQF_ONESHOT;         /* see (c) below */

	/*
	 * Handle the case where we have a real primary handler and a
	 * thread handler. We force thread them as well by creating a
	 * secondary action.
	 */
	if (new->handler && new->thread_fn) {     /* see (d) below */
		/* Allocate the secondary action */
		new->secondary = kzalloc(sizeof(struct irqaction), GFP_KERNEL);
		if (!new->secondary)
			return -ENOMEM;
		new->secondary->handler = irq_forced_secondary_handler;
		new->secondary->thread_fn = new->thread_fn;    /* the original thread_fn moves to the secondary action */
		new->secondary->dev_id = new->dev_id;
		new->secondary->irq = new->irq;
		new->secondary->name = new->name;
	}
	/* Deal with the primary handler */
	set_bit(IRQTF_FORCED_THREAD, &new->thread_flags);
	new->thread_fn = new->handler;                     /* the original primary handler now runs in the thread */
	new->handler = irq_default_primary_handler;        /* the default primary handler only wakes the thread */
	return 0;
}

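(a) The kernel has a forced-threading option, CONFIG_IRQ_FORCED_THREADING. If it is not enabled, force_irqthreads is always false, so irq_setup_forced_threading does nothing and returns immediately. If it is enabled, the kernel supports forced threading, but whether interrupts are actually force-threaded still depends on the threadirqs command-line parameter.

If the kernel is booted without that parameter, irq_setup_forced_threading likewise does nothing and returns immediately. Only when the bootloader passes threadirqs on the kernel command line does the kernel actually force-thread the interrupts.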

kernel/irq/manage.c


#if defined(CONFIG_IRQ_FORCED_THREADING) && !defined(CONFIG_PREEMPT_RT)
DEFINE_STATIC_KEY_FALSE(force_irqthreads_key);

static int __init setup_forced_irqthreads(char *arg)
{
	static_branch_enable(&force_irqthreads_key);
	return 0;
}
early_param("threadirqs", setup_forced_irqthreads);
#endif

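(b) The IRQF_NO_THREAD check may look odd here: irq_settings_can_thread has already been checked, so why check again?

Whether an interrupt can be threaded can be looked at from two levels. One is the low level: the interrupt descriptor, the actual interrupt hardware topology and so on. The other is the level of the users of the interrupt subsystem: whether a peripheral driver wants threading when it registers its handler.

All IRQF_XXX flags are flags at the user level. If a user tells the kernel through IRQF_NO_THREAD that an interrupt must not be threaded, the forced-threading mechanism respects that choice.

Per-CPU interrupts are rather special and are not peripheral interrupts in the usual sense, so they are not force-threaded. IRQF_ONESHOT indicates that the interrupt is already threaded (and of the special one-shot kind), so the function returns directly in that case as well.

(c) Forced threading is aimed at interrupts that were requested without a thread_fn and do all their work in the primary interrupt handler (possibly with a bottom half if the work is lengthy). Because the primary handler runs with CPU interrupts disabled throughout, it can hurt the real-time behavior of the system, which is why it is forcibly threaded. thread_flags in struct irqaction holds thread-related flags; setting IRQTF_FORCED_THREAD marks the threaded handler as force-threaded. The assignment new->thread_fn = new->handler moves everything the original primary handler did into the threaded handler, and the new primary handler is set to the default handler. In the version listed above, an action that supplies both a primary handler and a thread_fn is covered too: its original thread_fn moves to a secondary irqaction.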

kernel/irq/manage.c

 /*
 * Default primary interrupt handler for threaded interrupts. Is
 * assigned as primary handler when request_threaded_irq is called
 * with handler == NULL. Useful for oneshot interrupts.
 */
static irqreturn_t irq_default_primary_handler(int irq, void *dev_id)
{
	return IRQ_WAKE_THREAD;
}

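(d) Forced threading is an option related to real-time behavior. It moves a primary handler that used to run in interrupt context into process context, which can introduce bugs that are hard to trace and locate.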
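The handler swap performed by forced threading can be followed in a userspace model of irq_setup_forced_threading. This is a sketch, not the kernel function: the flag values are placeholders, the boolean forced stands in for the IRQTF_FORCED_THREAD bit, force_irqthreads is passed in as a parameter, and demo_isr and demo_thread are hypothetical driver handlers.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Placeholder flag values for this sketch, not the kernel's definitions. */
#define IRQF_PERCPU	0x1UL
#define IRQF_ONESHOT	0x2UL
#define IRQF_NO_THREAD	0x4UL

typedef int (*irq_handler_t)(int irq, void *dev_id);

struct irqaction {
	irq_handler_t handler;
	irq_handler_t thread_fn;
	struct irqaction *secondary;
	unsigned long flags;
	bool forced;		/* stands in for IRQTF_FORCED_THREAD */
};

static int default_primary(int irq, void *dev_id)
{
	(void)irq; (void)dev_id;
	return 2;	/* placeholder for IRQ_WAKE_THREAD */
}

static int forced_secondary(int irq, void *dev_id)
{
	(void)irq; (void)dev_id;
	return 0;	/* should never be called */
}

/* Hypothetical driver handlers. */
static int demo_isr(int irq, void *dev_id)    { (void)irq; (void)dev_id; return 1; }
static int demo_thread(int irq, void *dev_id) { (void)irq; (void)dev_id; return 1; }

static int setup_forced_threading(struct irqaction *new, bool force_irqthreads)
{
	if (!force_irqthreads)
		return 0;
	/* The requester's own flags are respected. */
	if (new->flags & (IRQF_NO_THREAD | IRQF_PERCPU | IRQF_ONESHOT))
		return 0;
	/* Already a purely threaded request. */
	if (new->handler == default_primary)
		return 0;

	new->flags |= IRQF_ONESHOT;

	if (new->handler && new->thread_fn) {
		/* Keep the original thread_fn in a secondary action. */
		new->secondary = calloc(1, sizeof(*new->secondary));
		if (!new->secondary)
			return -ENOMEM;
		new->secondary->handler = forced_secondary;
		new->secondary->thread_fn = new->thread_fn;
	}
	new->forced = true;
	new->thread_fn = new->handler;	/* old primary handler runs in a thread */
	new->handler = default_primary;	/* hard IRQ context only wakes the thread */
	return 0;
}
```

A request with only demo_isr ends up with default_primary as its primary handler and demo_isr as its thread_fn, while a request flagged IRQF_NO_THREAD is left untouched.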

(3) Creating the interrupt thread. The code is as follows:

kernel/irq/manage.c

static int
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
{
... 	
/*
	 * Create a handler thread when a thread function is supplied
	 * and the interrupt does not nest into another interrupt
	 * thread.
	 */
	if (new->thread_fn && !nested) {
		ret = setup_irq_thread(new, irq, false);
		if (ret)
			goto out_mput;
		if (new->secondary) {
			ret = setup_irq_thread(new->secondary, irq, true);
			if (ret)
				goto out_thread;
		}
	}
 
	/*
	 * Drivers are often written to work w/o knowledge about the
	 * underlying irq chip implementation, so a request for a
	 * threaded irq without a primary hard irq context handler
	 * requires the ONESHOT flag to be set. Some irq chips like
	 * MSI based interrupts are per se one shot safe. Check the
	 * chip flags, so we can avoid the unmask dance at the end of
	 * the threaded handler for those.
	 */
	if (desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)
		new->flags &= ~IRQF_ONESHOT;
 
...
}
 
 
 
 
 
static int
setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
{
	struct task_struct *t;
	struct sched_param param = {
		.sched_priority = MAX_USER_RT_PRIO/2,
	};
 
	if (!secondary) {
		t = kthread_create(irq_thread, new, "irq/%d-%s", irq,
				   new->name);
	} else {
		t = kthread_create(irq_thread, new, "irq/%d-s-%s", irq,
				   new->name);
		param.sched_priority -= 1;
	}
 
	if (IS_ERR(t))
		return PTR_ERR(t);
 
	sched_setscheduler_nocheck(t, SCHED_FIFO, &param);
 
	/*
	 * We keep the reference to the task struct even if
	 * the thread dies to avoid that the interrupt code
	 * references an already freed task_struct.
	 */
	get_task_struct(t);
	new->thread = t;
	/*
	 * Tell the thread to set its affinity. This is
	 * important for shared interrupt handlers as we do
	 * not invoke setup_affinity() for the secondary
	 * handlers as everything is already set up. Even for
	 * interrupts marked with IRQF_NO_BALANCE this is
	 * correct as we want the thread to move to the cpu(s)
	 * on which the requesting code placed the interrupt.
	 */
	set_bit(IRQTF_AFFINITY, &new->thread_flags);
	return 0;
}

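(a) kthread_create creates a kernel thread, and sched_setscheduler_nocheck sets the scheduling policy and priority of this interrupt thread.

(b) get_task_struct takes an additional reference on the task struct of the threaded handler, so that the task struct is not freed even if the thread exits abnormally. This guarantees that the interrupt code never accesses freed memory. The thread member of the irqaction is set to the task that was just created, so that the primary handler knows which interrupt thread to wake.

(c) The IRQTF_AFFINITY flag is set; the threaded handler checks this flag and sets up its IRQ affinity.

(d) In older kernels a cpumask variable is allocated at this point for later use; the listing above contains no such step.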
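The thread names and priorities chosen by setup_irq_thread can be reproduced with a short userspace sketch. The helper describe_irq_thread and struct irq_thread_info are invented for this illustration, and MAX_USER_RT_PRIO is assumed to be 100, the value used by kernels that define this macro; no thread is actually created.

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_USER_RT_PRIO 100	/* assumption: value in kernels that define it */

struct irq_thread_info {
	char name[32];
	int sched_priority;	/* priority under SCHED_FIFO */
};

/*
 * Fill in the name and priority that the listing above would use for the
 * thread of an action; returns the length of the name.
 */
static int describe_irq_thread(struct irq_thread_info *info, unsigned int irq,
			       const char *devname, bool secondary)
{
	info->sched_priority = MAX_USER_RT_PRIO / 2;
	if (!secondary)
		return snprintf(info->name, sizeof(info->name),
				"irq/%u-%s", irq, devname);

	/* The secondary thread of a force-threaded action runs one step lower. */
	info->sched_priority -= 1;
	return snprintf(info->name, sizeof(info->name),
			"irq/%u-s-%s", irq, devname);
}
```

For IRQ 45 requested with the name eth0, this gives irq/45-eth0 at priority 50 and, for a secondary thread, irq/45-s-eth0 at priority 49.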

(4) Checking shared interrupts. The code is as follows:

static int
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
{
...  
	/*
	 * The following block of code has to be executed atomically
	 * protected against a concurrent interrupt and any of the other
	 * management calls which are not serialized via
	 * desc->request_mutex or the optional bus lock.
	 */
	raw_spin_lock_irqsave(&desc->lock, flags);
	old_ptr = &desc->action;
	old = *old_ptr;
	if (old) {
		/*
		 * Can't share interrupts unless both agree to and are
		 * the same type (level, edge, polarity). So both flag
		 * fields must have IRQF_SHARED set and the bits which
		 * set the trigger type must match. Also all must
		 * agree on ONESHOT.
		 */
		unsigned int oldtype;
 
		/*
		 * If nobody did set the configuration before, inherit
		 * the one provided by the requester.
		 */
		if (irqd_trigger_type_was_set(&desc->irq_data)) {
			oldtype = irqd_get_trigger_type(&desc->irq_data);
		} else {
			oldtype = new->flags & IRQF_TRIGGER_MASK;
			irqd_set_trigger_type(&desc->irq_data, oldtype);
		}
 
		if (!((old->flags & new->flags) & IRQF_SHARED) ||
		    (oldtype != (new->flags & IRQF_TRIGGER_MASK)) ||
		    ((old->flags ^ new->flags) & IRQF_ONESHOT))
			goto mismatch;
 
		/* All handlers must agree on per-cpuness */
		if ((old->flags & IRQF_PERCPU) !=
		    (new->flags & IRQF_PERCPU))
			goto mismatch;
 
		/* add new interrupt at end of irq queue */
		do {
			/*
			 * Or all existing action->thread_mask bits,
			 * so we can find the next zero bit for this
			 * new action.
			 */
			thread_mask |= old->thread_mask;
			old_ptr = &old->next;
			old = *old_ptr;
		} while (old);
		shared = 1;
	}
...
}

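(a) old points to the action list as it was before this registration. If it is not NULL, the interrupt line has to be shared. Sharing requires that every irqaction agrees to share (IRQF_SHARED), that all irqactions use the same trigger type (all level triggered or all edge triggered), that they agree on one-shot (all one-shot or none), and that they agree on per-CPU (all per-CPU or none).

(b) The new irqaction is appended to the end of the list.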
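The sharing conditions in (a) amount to a small predicate. The following userspace sketch mirrors the comparisons from the listing; can_share is an invented name and the flag values are placeholders, not the kernel's definitions.

```c
#include <stdbool.h>

/* Placeholder flag values for this sketch. */
#define IRQF_TRIGGER_RISING	0x01UL
#define IRQF_TRIGGER_FALLING	0x02UL
#define IRQF_TRIGGER_HIGH	0x04UL
#define IRQF_TRIGGER_LOW	0x08UL
#define IRQF_TRIGGER_MASK	0x0fUL
#define IRQF_SHARED		0x10UL
#define IRQF_PERCPU		0x20UL
#define IRQF_ONESHOT		0x40UL

/*
 * old_flags: flags of an action already on the line
 * new_flags: flags of the action being registered
 * line_type: trigger type currently configured for the line
 */
static bool can_share(unsigned long old_flags, unsigned long new_flags,
		      unsigned long line_type)
{
	if (!((old_flags & new_flags) & IRQF_SHARED))
		return false;	/* both sides must agree to share */
	if (line_type != (new_flags & IRQF_TRIGGER_MASK))
		return false;	/* trigger type must match */
	if ((old_flags ^ new_flags) & IRQF_ONESHOT)
		return false;	/* all one-shot or none */
	if ((old_flags & IRQF_PERCPU) != (new_flags & IRQF_PERCPU))
		return false;	/* all per-CPU or none */
	return true;
}
```

In the kernel a failed check leads to the mismatch label and the request returns -EBUSY.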

(5) Setting the thread mask. The code is as follows:

static int
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
{
...
	if (new->flags & IRQF_ONESHOT) {
		/*
		 * Unlikely to have 32 resp 64 irqs sharing one line,
		 * but who knows.
		 */
		if (thread_mask == ~0UL) {
			ret = -EBUSY;
			goto out_unlock;
		}

		new->thread_mask = 1UL << ffz(thread_mask);
 
	} else if (new->handler == irq_default_primary_handler &&
		   !(desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)) {

		pr_err("Threaded irq requested with handler=NULL and !ONESHOT for irq %d\n",
		       irq);
		ret = -EINVAL;
		goto out_unlock;
	}
...
}

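For one-shot interrupts the thread mask must be set as well. If a one-shot interrupt has only one threaded handler (no sharing), things are simple: the temporary variable thread_mask is 0, and the thread_mask member of the irqaction uses the first bit to identify this irqaction. With sharing it becomes a little more complicated. Suppose a one-shot IRQ has three irqactions A, B and C. Each of them identifies itself with a different bit in its thread_mask member: for example 0x01 for A, 0x02 for B and 0x04 for C. Further shared irqactions (which must also be one-shot) get 0x08, 0x10 and so on.

(a) In the "checking shared interrupts" section above, the thread_mask variable collected the thread_mask of every irqaction on this interrupt line. If the variable is all ones, the action list already holds as many irqactions as unsigned long has bits (32 or 64, depending on the system and compiler), and the request fails. Otherwise ffz finds the first zero bit, which becomes the thread bit mask of the new irqaction.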
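The bit allocation can be tried out in userspace. The sketch below re-implements ffz with a plain loop (the kernel uses an architecture-optimized version) and wraps the allocation in a helper named alloc_thread_mask, which is invented for this illustration.

```c
#include <errno.h>

/* Index of the first zero bit; the caller guarantees word != ~0UL. */
static unsigned int ffz(unsigned long word)
{
	unsigned int bit = 0;

	while (word & 1UL) {
		word >>= 1;
		bit++;
	}
	return bit;
}

/*
 * used_mask is the OR of the thread_mask of every action already on the
 * line. On success the bit for the new action is stored in *new_mask.
 */
static int alloc_thread_mask(unsigned long used_mask, unsigned long *new_mask)
{
	if (used_mask == ~0UL)
		return -EBUSY;		/* every bit is taken */
	*new_mask = 1UL << ffz(used_mask);
	return 0;
}
```

Starting from an empty line, three successive allocations return 0x01, 0x02 and 0x04; if an action in the middle has been freed, its bit is the first zero bit and is handed out again.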

 /*
 * Default primary interrupt handler for threaded interrupts. Is
 * assigned as primary handler when request_threaded_irq is called
 * with handler == NULL. Useful for oneshot interrupts.
 */
static irqreturn_t irq_default_primary_handler(int irq, void *dev_id)
{
	return IRQ_WAKE_THREAD;
}

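The code is very simple: it returns IRQ_WAKE_THREAD so that the kernel wakes the threaded handler. Simple as irq_default_primary_handler is, it carries a risk. With a level-triggered interrupt, the asserted level only goes away after the peripheral's registers have been operated on; until then it stays asserted. Normally the primary handler operates on the peripheral's registers directly and clears the interrupt as early as possible (this is not possible for devices behind a slow bus). irq_default_primary_handler, however, only wakes the threaded interrupt handler and does not clear the interrupt. When the primary handler has finished, the peripheral's interrupt is still asserted, and as soon as CPU interrupts are enabled again the next interrupt fires immediately, over and over. For this reason the kernel reports an error if an interrupt is registered without a primary interrupt handler and without IRQF_ONESHOT. There is one exemption: when the underlying irq chip is one-shot safe (IRQCHIP_ONESHOT_SAFE).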
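The interrupt storm described above can be made visible with a deliberately simplified simulation. It is an assumption-laden sketch, not a model of the real flow handlers: struct sim, primary_handler, threaded_handler and count_primary_calls are invented names, the interrupt thread is assumed to run only once hard interrupts have stopped, and budget merely bounds the simulation.

```c
#include <stdbool.h>

struct sim {
	bool line_asserted;	/* level-triggered line held by the device */
	bool masked;		/* line masked at the interrupt controller */
	int primary_calls;	/* how often the primary handler ran */
};

/* Default primary handler: wakes the thread, never touches the device. */
static void primary_handler(struct sim *s)
{
	s->primary_calls++;
}

/* Threaded handler: clears the device, after which the line is unmasked. */
static void threaded_handler(struct sim *s)
{
	s->line_asserted = false;
	s->masked = false;
}

/* Count primary handler invocations for one device interrupt. */
static int count_primary_calls(bool oneshot, int budget)
{
	struct sim s = { .line_asserted = true };

	/* The hard interrupt fires as long as the line is asserted and unmasked. */
	while (s.line_asserted && !s.masked && s.primary_calls < budget) {
		primary_handler(&s);
		if (oneshot)
			s.masked = true;	/* stays masked until the thread is done */
	}
	if (s.masked)
		threaded_handler(&s);	/* the line is quiet, so the thread can run */
	return s.primary_calls;
}
```

With oneshot set, the primary handler runs exactly once per device interrupt; without it, the simulated handler is re-entered until the budget is exhausted, which is the endless loop the kernel guards against.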