Chapter 5: Concurrency and Race Conditions - 0: Concurrency and Race Conditions

Continuing from the previous post, "Chapter 3: Character Device Drivers - 12: read and write", let's go ahead.

Concurrency and Race Conditions

Thus far, we have paid little attention to the problem of concurrency, i.e., what happens when the system tries to do more than one thing at once. The management of concurrency is, however, one of the core problems in operating systems programming. Concurrency-related bugs are some of the easiest to create and some of the hardest to find. Even expert Linux kernel programmers end up creating concurrency-related bugs on occasion.

In early Linux kernels, there were relatively few sources of concurrency. Symmetric multiprocessing (SMP) systems were not supported by the kernel, and the only cause of concurrent execution was the servicing of hardware interrupts. That approach offers simplicity, but it no longer works in a world that prizes performance on systems with more and more processors, and that insists that the system respond to events quickly. In response to the demands of modern hardware and applications, the Linux kernel has evolved to a point where many more things are going on simultaneously. This evolution has resulted in far greater performance and scalability. It has also, however, significantly complicated the task of kernel programming. Device driver programmers must now factor concurrency into their designs from the beginning, and they must have a strong understanding of the facilities provided by the kernel for concurrency management.

The purpose of this chapter is to begin the process of creating that understanding. To that end, we introduce facilities that are immediately applied to the scull driver from Chapter 3. Other facilities presented here are not put to use for some time yet. But first, we take a look at what could go wrong with our simple scull driver and how to avoid these potential problems. 

Pitfalls in scull

Let us take a quick look at a fragment of the scull memory management code. Deep down inside the write logic, scull must decide whether the memory it requires has been allocated yet or not. One piece of the code that handles this task is:

if (!dptr->data[s_pos]) {    /* has this quantum been allocated yet? */
    dptr->data[s_pos] = kmalloc(quantum, GFP_KERNEL);    /* may sleep */
    if (!dptr->data[s_pos])
        goto out;    /* allocation failed */
}

Suppose for a moment that two processes (we’ll call them “A” and “B”) are independently attempting to write to the same offset within the same scull device. Each process reaches the if test in the first line of the fragment above at the same time. If the pointer in question is NULL, each process will decide to allocate memory, and each will assign the resulting pointer to dptr->data[s_pos]. Since both processes are assigning to the same location, clearly only one of the assignments will prevail.

What will happen, of course, is that the process that completes the assignment second will “win.” If process A assigns first, its assignment will be overwritten by process B. At that point, scull will forget entirely about the memory that A allocated; it only has a pointer to B’s memory. The memory allocated by A, thus, will be dropped and never returned to the system. 

This sequence of events is a demonstration of a race condition. Race conditions are a result of uncontrolled access to shared data. When the wrong access pattern happens, something unexpected results. For the race condition discussed here, the result is a memory leak. That is bad enough, but race conditions can often lead to system crashes, corrupted data, or security problems as well. Programmers can be tempted to disregard race conditions as extremely low probability events. But, in the computing world, one-in-a-million events can happen every few seconds, and the consequences can be grave.

We will eliminate race conditions from scull shortly, but first we need to take a more general view of concurrency.

