Linux Audio Subsystem Analysis 7

Originally published 2024-12-25 14:54:05

License: CC 4.0 BY-SA

Linux Audio Subsystem Analysis 7 (based on Linux 6.6): Introduction to ALSA PCM

I. PCM

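On Linux, PCM (Pulse Code Modulation) is the standard format for moving audio data through the system. It is the digital representation of an audio signal and the most basic form used in audio processing, storage, and transmission. In essence, PCM discretizes an analog audio signal and stores it as a sequence of digital samples.

1.1 Basic PCM concepts

Pulse Code Modulation (PCM) represents an analog sound as a digital signal. It does so by sampling the audio signal (discretizing it in time) and quantizing it (mapping each amplitude to a digital value).

Sample rate: the number of samples taken per second. For example, 44.1 kHz means the audio signal is sampled 44,100 times per second. Common sample rates are 44.1 kHz, 48 kHz, and 96 kHz.
Bit depth: the numeric precision of each sample. Common bit depths are 16, 24, and 32 bits. A higher bit depth gives a larger dynamic range (roughly 6 dB per bit, so about 96 dB at 16 bits) and lower quantization noise.
Channels: the number of audio channels. Mono and stereo are the most common, but more channels are possible, such as 5.1 or 7.1 surround.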
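The three parameters above fix the raw data rate of an uncompressed stream. The following is a minimal sketch of that arithmetic; the function names are made up for illustration and are not part of ALSA.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes occupied by one frame, i.e. one sample for every channel. */
uint32_t pcm_frame_bytes(uint32_t bits_per_sample, uint32_t channels)
{
	/* This sketch only handles byte-aligned sample widths. */
	assert(bits_per_sample % 8 == 0);
	return (bits_per_sample / 8) * channels;
}

/* Raw data rate of an uncompressed stream, in bytes per second. */
uint32_t pcm_bytes_per_second(uint32_t rate_hz, uint32_t bits_per_sample,
			      uint32_t channels)
{
	return rate_hz * pcm_frame_bytes(bits_per_sample, channels);
}
```

For CD-quality audio (44.1 kHz, 16-bit, stereo) one frame is 4 bytes, so the stream needs 176,400 bytes per second.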

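1.2 PCM on Linux

On Linux, PCM is used throughout audio input and output (I/O) and is tightly integrated with ALSA (Advanced Linux Sound Architecture). ALSA provides a high-level audio interface through which PCM data is played back and recorded, and managing PCM data streams is one of the most fundamental parts of the audio system.

1. Basic architecture of the PCM flow

Audio sampling: an ADC (analog-to-digital converter) first converts the analog audio signal into a digital one. It samples the signal periodically and assigns each sample point a specific numeric value (quantization).
PCM encoding: the sample values are stored in a defined format. The most common is linear PCM (LPCM); compressed variants such as ADPCM also exist.
PCM playback and recording: during playback, a DAC (digital-to-analog converter) converts PCM data back into an analog signal; during recording, an ADC converts the analog signal into PCM data.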
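The quantization step can be sketched in a few lines of C. This is only an illustration of what an ADC does conceptually, assuming the input has already been normalized to the range [-1.0, 1.0]; the function name quantize_s16 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Map a normalized sample in [-1.0, 1.0] to a signed 16-bit value. */
int16_t quantize_s16(double x)
{
	assert(x == x);		/* NaN is not supported in this sketch */

	/* Clip so that out-of-range input saturates instead of wrapping. */
	if (x > 1.0)
		x = 1.0;
	if (x < -1.0)
		x = -1.0;

	double scaled = x * 32767.0;

	/* Round to nearest, halves away from zero, without needing libm. */
	return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}
```

Scaling by 32767 keeps the mapping symmetric around zero, at the cost of never producing -32768.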

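2. PCM devices in ALSA

In ALSA, a PCM device represents an input or output path on the audio hardware. PCM data streams are handled through a set of hardware-abstraction interfaces, and a PCM device usually interacts with user space through pcm device files.

PCM devices are usually exposed under paths such as /dev/snd/pcmC0D0p, meaning PCM device 0 (D0) on sound card 0 (C0); the trailing p marks the playback direction, while c marks capture. User-space applications use these device files to play back or record audio data.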
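The naming scheme is regular enough to build in code. The helper below is a hypothetical illustration of the pattern, not an ALSA function; applications normally open devices through alsa-lib names such as hw:0,0 rather than by path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "/dev/snd/pcmC<card>D<device><p|c>" into buf.
 * Returns 0 on success, -1 if buf is too small.
 */
int pcm_node_path(char *buf, size_t len, unsigned int card,
		  unsigned int device, int is_playback)
{
	assert(buf != NULL);

	int n = snprintf(buf, len, "/dev/snd/pcmC%uD%u%c",
			 card, device, is_playback ? 'p' : 'c');

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```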

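1.3 PCM structure in ALSA

ALSA uses PCM devices to control audio playback and recording. The main concepts are:

PCM substream: each PCM device can support multiple substreams in the playback and capture directions. Each substream is an independent audio data stream, either input (capture) or output (playback).
PCM hardware parameters: these define the sample rate, sample format (16-bit, 24-bit, and so on), channel count, and the buffer and period sizes of the PCM device. They are set after the device has been opened and before streaming starts.
PCM software parameters: these control how the stream is buffered and scheduled, for example the start threshold, the stop threshold, and the minimum number of available frames (avail_min) at which the application is woken up.

A simple PCM data flow looks like this:

Playing PCM: a user-space application writes PCM data to a PCM output device, and the audio hardware plays the signal.
Recording PCM: the audio hardware captures the signal, converts it to PCM data, and passes it to a user-space application for processing.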

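1.4 Configuring and controlling a PCM stream

In ALSA, a PCM stream is usually configured and controlled in the following steps:

Open the PCM device: call snd_pcm_open to open the device in playback or capture mode.
Set hardware parameters: use the snd_pcm_hw_params_set_* functions to choose the sample rate, sample format, channel count, and so on, then apply them with snd_pcm_hw_params.
Set software parameters: use the snd_pcm_sw_params_set_* functions to choose values such as the start threshold and avail_min, which influence latency, then apply them with snd_pcm_sw_params.
Read and write data: use snd_pcm_readi to read from a PCM capture device or snd_pcm_writei to write to a PCM playback device.
Close the PCM device: when finished, call snd_pcm_close to close the device.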
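When choosing hardware parameters, an application typically also picks a period size and a number of periods, both measured in frames. The sketch below shows how those values translate into time; the helper names are hypothetical and the code does not call alsa-lib.

```c
#include <assert.h>
#include <stdint.h>

/* Duration of `frames` frames at `rate_hz`, in microseconds (rounded down). */
uint64_t frames_to_us(uint64_t frames, uint32_t rate_hz)
{
	assert(rate_hz > 0);
	return frames * 1000000ULL / rate_hz;
}

/* Ring buffer size in frames for a given period size and period count. */
uint64_t buffer_frames(uint64_t period_frames, unsigned int periods)
{
	return period_frames * periods;
}
```

With 1024-frame periods at 48 kHz, one period lasts about 21.3 ms, and a buffer of four periods holds about 85.3 ms of audio, which bounds the worst-case playback latency added by the buffer.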

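1.5 PCM formats and types

Common PCM formats on Linux include:

S16_LE (Signed 16-bit Little Endian): signed 16-bit PCM data stored in little-endian byte order.
S32_LE (Signed 32-bit Little Endian): signed 32-bit PCM data stored in little-endian byte order.
U8 (Unsigned 8-bit): unsigned 8-bit PCM data.
Float: floating-point PCM data (for example FLOAT_LE), often used for high-precision audio processing.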
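To make the byte layouts concrete, here is a small sketch that stores a sample as S16_LE independently of host endianness and reduces it to U8. The helper names are illustrative only; the U8 conversion simply keeps the top 8 bits and shifts the zero level to 128.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store one signed 16-bit sample as S16_LE: low byte first. */
void put_s16_le(uint8_t out[2], int16_t s)
{
	assert(out != NULL);

	uint16_t u = (uint16_t)s;

	out[0] = (uint8_t)(u & 0xff);
	out[1] = (uint8_t)(u >> 8);
}

/* Read an S16_LE sample back, regardless of host byte order. */
int16_t get_s16_le(const uint8_t in[2])
{
	assert(in != NULL);

	uint16_t u = (uint16_t)(in[0] | (in[1] << 8));

	return (int16_t)u;
}

/* Reduce a signed 16-bit sample to U8, whose silence level is 128. */
uint8_t s16_to_u8(int16_t s)
{
	return (uint8_t)(((int32_t)s + 32768) >> 8);
}
```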

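1.6 Transport protocols for PCM audio

PCM data is usually carried over one of the following interfaces:

I2S (Inter-IC Sound): a common serial bus protocol that connects an audio codec to a processor (CPU). It carries synchronous digital audio data and is widely used in embedded systems.
PCM (Pulse Code Modulation): audio data is carried directly over a PCM bus.
TDM (Time Division Multiplexing): audio data is time-division multiplexed, so several channels share the same bus.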
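On these serial links the frame clock runs at the sample rate, and the bit clock must be fast enough to shift out every slot of every frame. A minimal sketch of that relationship, assuming equal-width slots (the function name is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Bit clock in Hz: each frame carries `slots` slots of `slot_bits` bits. */
uint32_t bclk_hz(uint32_t rate_hz, uint32_t slots, uint32_t slot_bits)
{
	assert(slots > 0 && slot_bits > 0);
	return rate_hz * slots * slot_bits;
}
```

A stereo I2S link has two slots per frame, so 48 kHz audio in 32-bit slots needs a 3.072 MHz bit clock; an 8-slot TDM link with the same slot width needs 12.288 MHz.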

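1.7 PCM in practice

Audio playback: PCM data is used to play audio files (such as WAV or raw PCM files), which usually contain uncompressed audio data. The aplay command plays PCM audio files.
Audio recording: a capture device such as a microphone converts the analog signal into PCM data; the arecord command is commonly used to record it.
Real-time audio processing: in some real-time applications, PCM data is streamed and processed on the fly (mixing, effects, and so on).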

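II. The PCM middle layer

ALSA already implements a powerful PCM middle layer, so a driver only has to implement the low-level functions that access the hardware.

To use the PCM middle layer, first include the header <sound/pcm.h>. If you need the hw_params helper functions, you may also have to include <sound/pcm_params.h>.

Each sound card can contain up to four PCM instances, and each PCM instance corresponds to one PCM device file. This limit comes from the number of bits available in Linux device numbers; with 64-bit device numbers, more PCM instances could be created. In most embedded devices, though, one PCM instance is enough.

A PCM instance consists of one playback stream and one capture stream, and each of these streams consists of one or more substreams.

In embedded systems the usual arrangement is one sound card with one PCM instance; the PCM has one playback stream and one capture stream, each with a single substream.

A PCM instance (for example pcm0) is a logical device under the card, and it creates two device nodes in user space.
A PCM instance contains two streams, playback and capture, and each stream corresponds to one device node.
Each stream can contain multiple substreams.

In the kernel, each substream has its own buffer for exchanging audio data with user space. Seen this way, substreams appear to exist so that the underlying audio hardware can be time-shared.

In user space, each device node can be opened multiple times. On every open the kernel looks for a free substream to bind to it; if all substreams are in use, the open fails (a blocking open may instead wait until one is released). Reads, writes, and control operations from user space therefore all act on a substream, which again suggests that substreams are the mechanism for time-sharing the underlying audio hardware.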

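III. Data structures

snd_pcm is an snd_device attached to an snd_card.

The streams[2] field of snd_pcm holds two snd_pcm_str structures, one for the playback stream and one for the capture stream.
The substream field of snd_pcm_str points to a snd_pcm_substream structure.
snd_pcm_substream is the core of the PCM middle layer; most of the work is done at the substream level. Its ops field (snd_pcm_ops) is especially important: many requests that user-space applications make to the driver through alsa-lib are handled by the functions in that structure. Its runtime field points to a snd_pcm_runtime structure, which records the substream's key software and hardware runtime state and parameters.

3.1 struct snd_pcm (one PCM instance, that is, one PCM logical device)

In the ALSA architecture a PCM is also called a device, specifically a logical device. The kernel represents a PCM device with the snd_pcm structure.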

include/sound/pcm.h

struct snd_pcm {
	struct snd_card *card;
	struct list_head list;
	int device; /* device number */
	unsigned int info_flags;
	unsigned short dev_class;
	unsigned short dev_subclass;
	char id[64];
	char name[80];
	struct snd_pcm_str streams[2];
	struct mutex open_mutex;
	wait_queue_head_t open_wait;
	void *private_data;
	void (*private_free) (struct snd_pcm *pcm);
	bool internal; /* pcm is for internal use only */
	bool nonatomic; /* whole PCM operations are in non-atomic context */
	bool no_device_suspend; /* don't invoke device PM suspend */
#if IS_ENABLED(CONFIG_SND_PCM_OSS)
	struct snd_pcm_oss oss;
#endif
};

3.2 struct snd_pcm_str (one PCM stream)

include/sound/pcm.h

struct snd_pcm_str {
	int stream;				/* stream (direction) */
	struct snd_pcm *pcm;
	/* -- substreams -- */
	unsigned int substream_count;
	unsigned int substream_opened;
	struct snd_pcm_substream *substream;
#if IS_ENABLED(CONFIG_SND_PCM_OSS)
	/* -- OSS things -- */
	struct snd_pcm_oss_stream oss;
#endif
#ifdef CONFIG_SND_VERBOSE_PROCFS
	struct snd_info_entry *proc_root;
#ifdef CONFIG_SND_PCM_XRUN_DEBUG
	unsigned int xrun_debug;	/* 0 = disabled, 1 = verbose, 2 = stacktrace */
#endif
#endif
	struct snd_kcontrol *chmap_kctl; /* channel-mapping controls */
	struct device *dev;
};

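dev: each stream corresponds to one character device node; dev is used when that node is created.
substream_count: the number of substreams that belong to this stream.
substream_opened: the number of substreams currently opened from user space. User space can open the same device node several times, and each open is bound to a free substream chosen by the kernel. If all substreams are already open, a new open fails.
substream: head of a linked list that chains the substreams together.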
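The bookkeeping described by these fields can be modeled in user space. The sketch below is a deliberately simplified stand-in, not kernel code: the model_* types and functions are hypothetical, locking is omitted, and a free substream is recognized simply by a zero reference count.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for snd_pcm_substream and snd_pcm_str. */
struct model_substream {
	int number;
	int ref_count;			/* 0 means free */
	struct model_substream *next;	/* singly linked list */
};

struct model_stream {
	unsigned int substream_count;
	unsigned int substream_opened;
	struct model_substream *substream;	/* list head */
};

/* Chain `count` caller-provided substreams into the stream's list. */
void model_stream_init(struct model_stream *s, struct model_substream *arr,
		       unsigned int count)
{
	s->substream_count = count;
	s->substream_opened = 0;
	s->substream = count ? &arr[0] : NULL;
	for (unsigned int i = 0; i < count; i++) {
		arr[i].number = (int)i;
		arr[i].ref_count = 0;
		arr[i].next = (i + 1 < count) ? &arr[i + 1] : NULL;
	}
}

/* Claim the first free substream; NULL when all are in use. */
struct model_substream *model_open(struct model_stream *s)
{
	for (struct model_substream *sub = s->substream; sub; sub = sub->next) {
		if (sub->ref_count == 0) {
			sub->ref_count = 1;
			s->substream_opened++;
			return sub;
		}
	}
	return NULL;
}

/* Give a substream back so a later open can reuse it. */
void model_release(struct model_stream *s, struct model_substream *sub)
{
	assert(sub != NULL && sub->ref_count == 1);
	sub->ref_count = 0;
	s->substream_opened--;
}
```

With two substreams, the first two opens succeed, the third returns NULL, and releasing one makes it available again.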

3.3 struct snd_pcm_file

include/sound/pcm.h

Each opened substream has a corresponding snd_pcm_file.

struct snd_pcm_file {
	struct snd_pcm_substream *substream;
	int no_compat_mmap;
	unsigned int user_pversion;	/* supported protocol version */
};

3.4 struct snd_pcm_substream

Represents one PCM substream. One important job of a substream is to provide a DMA buffer through which data is exchanged with user space.

include/sound/pcm.h

struct snd_pcm_substream {
	struct snd_pcm *pcm;
	struct snd_pcm_str *pstr;
	void *private_data;		/* copied from pcm->private_data */
	int number;
	char name[32];			/* substream name */
	int stream;			/* stream (direction) */
	struct pm_qos_request latency_pm_qos_req; /* pm_qos request */
	size_t buffer_bytes_max;	/* limit ring buffer size */
	struct snd_dma_buffer dma_buffer;
	size_t dma_max;
	/* -- hardware operations -- */
	const struct snd_pcm_ops *ops;
	/* -- runtime information -- */
	struct snd_pcm_runtime *runtime;
        /* -- timer section -- */
	struct snd_timer *timer;		/* timer */
	unsigned timer_running: 1;	/* time is running */
	long wait_time;	/* time in ms for R/W to wait for avail */
	/* -- next substream -- */
	struct snd_pcm_substream *next;
	/* -- linked substreams -- */
	struct list_head link_list;	/* linked list member */
	struct snd_pcm_group self_group;	/* fake group for non linked substream (with substream lock inside) */
	struct snd_pcm_group *group;		/* pointer to current group */
	/* -- assigned files -- */
	int ref_count;
	atomic_t mmap_count;
	unsigned int f_flags;
	void (*pcm_release)(struct snd_pcm_substream *);
	struct pid *pid;
#if IS_ENABLED(CONFIG_SND_PCM_OSS)
	/* -- OSS things -- */
	struct snd_pcm_oss_substream oss;
#endif
#ifdef CONFIG_SND_VERBOSE_PROCFS
	struct snd_info_entry *proc_root;
#endif /* CONFIG_SND_VERBOSE_PROCFS */
	/* misc flags */
	unsigned int hw_opened: 1;
	unsigned int managed_buffer_alloc:1;
};

3.5 struct snd_dma_buffer

Describes a DMA buffer.

include/sound/memalloc.h

struct snd_dma_buffer {
	struct snd_dma_device dev;	/* device type */
	unsigned char *area;	/* virtual pointer */
	dma_addr_t addr;	/* physical address */
	size_t bytes;		/* buffer size in bytes */
	void *private_data;	/* private for allocator; don't touch */
};

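area: the buffer's virtual address, used when the CPU accesses the buffer.
addr: the buffer's physical (DMA) address, used when the DMA engine accesses the buffer.
bytes: the size of the buffer in bytes.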
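The memory behind area is used as a ring: data is copied in at the current write position and wraps to the start when it reaches the end. The sketch below models that wraparound with a plain byte array standing in for area; ring_write is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy n bytes from src into a ring of `size` bytes, starting at *pos.
 * The copy wraps at the end of the area and *pos is advanced.
 */
void ring_write(unsigned char *area, size_t size, size_t *pos,
		const void *src, size_t n)
{
	assert(area != NULL && src != NULL && pos != NULL && size > 0);
	assert(*pos < size && n <= size);

	size_t first = size - *pos;	/* room left before the end */

	if (first > n)
		first = n;
	memcpy(area + *pos, src, first);
	/* Whatever did not fit continues at the start of the area. */
	memcpy(area, (const unsigned char *)src + first, n - first);
	*pos = (*pos + n) % size;
}
```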

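3.6 struct snd_pcm_ops

These are the callbacks that the PCM middle layer defines and the low-level driver implements, in effect an interface. The middle layer invokes them at the appropriate times. From the driver's point of view, most of the work consists of implementing the functions in this ops table and then registering it with the PCM middle layer. Usually only some of them need to be implemented, because the middle layer provides defaults for the rest; a driver supplies its own version only when the default does not work on its hardware.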

include/sound/pcm.h

struct snd_pcm_ops {
	int (*open)(struct snd_pcm_substream *substream);
	int (*close)(struct snd_pcm_substream *substream);
	int (*ioctl)(struct snd_pcm_substream * substream,
		     unsigned int cmd, void *arg);
	int (*hw_params)(struct snd_pcm_substream *substream,
			 struct snd_pcm_hw_params *params);
	int (*hw_free)(struct snd_pcm_substream *substream);
	int (*prepare)(struct snd_pcm_substream *substream);
	int (*trigger)(struct snd_pcm_substream *substream, int cmd);
	int (*sync_stop)(struct snd_pcm_substream *substream);
	snd_pcm_uframes_t (*pointer)(struct snd_pcm_substream *substream);
	int (*get_time_info)(struct snd_pcm_substream *substream,
			struct timespec64 *system_ts, struct timespec64 *audio_ts,
			struct snd_pcm_audio_tstamp_config *audio_tstamp_config,
			struct snd_pcm_audio_tstamp_report *audio_tstamp_report);
	int (*fill_silence)(struct snd_pcm_substream *substream, int channel,
			    unsigned long pos, unsigned long bytes);
	int (*copy)(struct snd_pcm_substream *substream, int channel,
		    unsigned long pos, struct iov_iter *iter, unsigned long bytes);
	struct page *(*page)(struct snd_pcm_substream *substream,
			     unsigned long offset);
	int (*mmap)(struct snd_pcm_substream *substream, struct vm_area_struct *vma);
	int (*ack)(struct snd_pcm_substream *substream);
};
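The "implement only what you need" idea can be illustrated with a small user-space model: a table of function pointers in which an unset entry falls back to a generic handler. Everything here (demo_ops, demo_call_ioctl, and so on) is hypothetical and only mirrors the pattern, not the kernel's actual dispatch code.

```c
#include <assert.h>
#include <stddef.h>

/* A cut-down ops table; ioctl is optional in this model. */
struct demo_ops {
	int (*open)(void *substream);
	int (*ioctl)(void *substream, unsigned int cmd);
};

/* Stand-in for the generic handler supplied by the middle layer. */
int demo_default_ioctl(void *substream, unsigned int cmd)
{
	(void)substream;
	(void)cmd;
	return 0;
}

/* Example of a driver-specific handler. */
int demo_driver_ioctl(void *substream, unsigned int cmd)
{
	(void)substream;
	return (int)cmd + 100;
}

/* Use the driver's callback when present, otherwise the generic one. */
int demo_call_ioctl(const struct demo_ops *ops, void *substream,
		    unsigned int cmd)
{
	assert(ops != NULL);
	if (ops->ioctl)
		return ops->ioctl(substream, cmd);
	return demo_default_ioctl(substream, cmd);
}
```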

3.7 snd_pcm_new

int snd_pcm_new(struct snd_card *card, const char *id, int device,
		int playback_count, int capture_count,
		struct snd_pcm **rpcm);

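What the function does:

It allocates a struct snd_pcm to represent one PCM instance.
It calls snd_pcm_new_stream(pcm, SNDRV_PCM_STREAM_PLAYBACK, playback_count) to build the playback stream and create playback_count substreams.
It calls snd_pcm_new_stream(pcm, SNDRV_PCM_STREAM_CAPTURE, capture_count) to build the capture stream and create capture_count substreams.
Finally, it calls snd_device_new to add the instance to the card->devices list as a logical device.

Before the card is registered, the driver must call snd_pcm_set_ops to set the callbacks for this PCM instance, because the PCM middle layer calls back into the driver's ops functions whenever user space interacts with it through the device nodes.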

3.8 Setting the PCM operations

void snd_pcm_set_ops(struct snd_pcm * pcm, int direction,
		     const struct snd_pcm_ops *ops);
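Conceptually, this call walks the substream list of one direction and points every substream at the same ops table. The following user-space sketch mirrors that loop with hypothetical toy_* types; it is a model for illustration, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_PLAYBACK = 0, TOY_CAPTURE = 1 };

struct toy_ops {
	int (*open)(void *substream);
};

struct toy_substream {
	const struct toy_ops *ops;
	struct toy_substream *next;
};

struct toy_pcm {
	struct toy_substream *streams[2];	/* list head per direction */
};

/* Point every substream of one direction at the same ops table. */
void toy_set_ops(struct toy_pcm *pcm, int direction,
		 const struct toy_ops *ops)
{
	assert(pcm != NULL);
	assert(direction == TOY_PLAYBACK || direction == TOY_CAPTURE);

	for (struct toy_substream *s = pcm->streams[direction]; s; s = s->next)
		s->ops = ops;
}
```

Because the ops pointer is stored per substream, playback and capture can use different tables, which is why the function takes a direction argument.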

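3.9 Creating the PCM character devices

When the card is registered, it walks all of its logical devices and registers each one, including the PCM logical device created here. When the PCM logical device is registered, the ALSA core calls the dev_register callback of its snd_device_ops, which is snd_pcm_dev_register. That callback calls snd_register_device for each stream, which creates the corresponding device node in user space.

The middle layer's snd_pcm_f_ops handles the interaction with user space. Its main functions are:

open / release: open or close a substream.
read / write / mmap: exchange audio data between user space and the PCM middle layer.
ioctl: provides the various control interfaces.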

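IV. Flow diagram once the PCM device has been created

+--------------------------------------------------+
| Initialize the ALSA system                       |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Open the PCM device (snd_pcm_open)               |
|   - Device name and direction (capture/playback) |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Set hardware parameters (snd_pcm_hw_params)      |
|   - Sample rate                                  |
|   - Sample format                                |
|   - Channels                                     |
|   - Buffer size and period size                  |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Set software parameters (snd_pcm_sw_params)      |
|   - Start and stop thresholds                    |
|   - avail_min (wakeup point)                     |
|   - Other read/write timing settings             |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Allocate the PCM data buffer                     |
|   - Allocate buffer memory                       |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Play back / record PCM data                      |
|   - Transfer frames with snd_pcm_writei or       |
|     snd_pcm_readi                                |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Close the PCM device (snd_pcm_close)             |
+--------------------------------------------------+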

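V. The whole flow

5.1 Overall ALSA flow diagram

+--------------------------------------------------+
| Initialize the ALSA system                       |
|   - Load drivers, initialize the library         |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Open the audio device (snd_pcm_open)             |
|   - Choose direction: capture or playback        |
|   - Open the PCM device                          |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Set hardware parameters (snd_pcm_hw_params)      |
|   - Sample rate, sample format, channels,        |
|     buffer size, and so on                       |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Set software parameters (snd_pcm_sw_params)      |
|   - Start/stop thresholds, avail_min, etc.       |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Allocate the PCM buffer                          |
|   - Allocate memory, initialize the buffer       |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Play back / record audio data                    |
|   - Write/read PCM data                          |
|   - Use snd_pcm_writei / snd_pcm_readi           |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
| Close the audio device (snd_pcm_close)           |
|   - Release resources, close the device          |
+--------------------------------------------------+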

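5.2 The steps in detail

Initialize the ALSA system:
- The kernel's ALSA drivers are loaded when the sound hardware is probed, and the ALSA library (alsa-lib) is loaded when the application starts. Together they provide the interfaces used to talk to the hardware.
Open the audio device (snd_pcm_open):
- Call snd_pcm_open to open an audio device, giving the device name (such as hw:0) and the direction (capture or playback). This selects the device and establishes the handle used by all later calls.
- The device name is a string such as hw:0 that identifies the hardware interface.
Set hardware parameters (snd_pcm_hw_params):
- snd_pcm_hw_params applies the hardware settings:
  - Sample rate (44.1 kHz, 48 kHz, and so on).
  - Sample format (16-bit, 24-bit, and so on).
  - Channel count (mono, stereo, and so on).
  - Buffer size and period size.
- The hardware parameters matter because they directly determine the quality and performance of the audio stream.
Set software parameters (snd_pcm_sw_params):
- Software parameters are configured in a similar way, but they govern how the stream is driven rather than the hardware format:
  - When the stream starts, and therefore its input/output latency (start threshold).
  - When the application is woken up to read or write more data (avail_min), so that the stream does not run dry.
  - Other control parameters, such as synchronization and transfer timing.
Allocate the PCM buffer:
- In this step memory is set aside as the PCM buffer that holds the audio data. The size of the ring buffer and the number of periods follow from the parameters chosen above; the application usually also allocates its own buffer for the frames it reads or writes.
- The goal is to let audio data flow smoothly from the application to the hardware, or from the hardware back to the application.
Play back / record audio data:
- Once the device is open and the hardware and software parameters are set, audio data is transferred with snd_pcm_writei (write audio data to the device) or snd_pcm_readi (read audio data from the device).
- Playback writes data to the PCM device; recording reads data from it.
- The data is normally processed in a loop that reads or writes in real time.
Close the audio device (snd_pcm_close):
- When the audio work is finished, call snd_pcm_close to close the device. This releases all resources associated with the device and cleans up its state.
- This step is necessary to avoid memory leaks and other resource problems.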
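As the steps above run, the stream moves through a small set of states, which alsa-lib reports as SND_PCM_STATE_OPEN, SND_PCM_STATE_SETUP, SND_PCM_STATE_PREPARED, SND_PCM_STATE_RUNNING, SND_PCM_STATE_XRUN, and a few others. The sketch below is a simplified model of the main transitions under stated assumptions: the enum and function names are hypothetical, and the pause, drain, suspend, and disconnect states are left out.

```c
#include <assert.h>

enum pcm_state { ST_OPEN, ST_SETUP, ST_PREPARED, ST_RUNNING, ST_XRUN };
enum pcm_event { EV_HW_PARAMS, EV_PREPARE, EV_START, EV_XRUN, EV_DROP };

/* Return the next state, or the current one if the event is not valid here. */
enum pcm_state pcm_next(enum pcm_state st, enum pcm_event ev)
{
	assert(st >= ST_OPEN && st <= ST_XRUN);

	switch (ev) {
	case EV_HW_PARAMS:	/* hardware parameters installed */
		if (st == ST_OPEN || st == ST_SETUP || st == ST_PREPARED)
			return ST_SETUP;
		break;
	case EV_PREPARE:	/* also the usual recovery path after an xrun */
		if (st == ST_SETUP || st == ST_PREPARED || st == ST_XRUN)
			return ST_PREPARED;
		break;
	case EV_START:		/* explicit start or start threshold reached */
		if (st == ST_PREPARED)
			return ST_RUNNING;
		break;
	case EV_XRUN:		/* underrun (playback) or overrun (capture) */
		if (st == ST_RUNNING)
			return ST_XRUN;
		break;
	case EV_DROP:		/* stop immediately and discard pending frames */
		if (st != ST_OPEN)
			return ST_SETUP;
		break;
	}
	return st;
}
```

A playback loop that hits an underrun therefore has to prepare the stream again before writing more data.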