WAVE PCM soundfile format

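This article walks through the basic structure of the WAVE PCM format, including the RIFF header, the format identifier, and the subchunk descriptions, and explains how 8-bit and 16-bit samples are stored.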


WAVE PCM format analysis:



The canonical WAVE format starts with the RIFF header:

Offset  Size  Name             Description
0         4   ChunkID          Contains the letters "RIFF" in ASCII form
                               (0x52494646 big-endian form).
4         4   ChunkSize        36 + SubChunk2Size, or more precisely:
                               4 + (8 + SubChunk1Size) + (8 + SubChunk2Size)
                               This is the size of the rest of the chunk
                               following this number, i.e. the size of the
                               entire file in bytes minus 8 bytes for the
                               two fields not included in this count:
                               ChunkID and ChunkSize.
8         4   Format           Contains the letters "WAVE"
                               (0x57415645 big-endian form).
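
A minimal sketch of checking these three fields in C++, assuming the first 12 bytes of the file are already in memory. The names `read_u32_le`, `RiffHeader`, and `parse_riff_header` are illustrative and not part of any library. ChunkSize is decoded as little-endian, which is the default byte order for RIFF files.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Decode a 32-bit unsigned integer stored least-significant byte first.
inline std::uint32_t read_u32_le(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical holder for the one numeric field of the RIFF header.
struct RiffHeader {
    std::uint32_t chunk_size;  // file size in bytes minus 8
};

// Returns true and fills `out` when `bytes` starts with a RIFF/WAVE header.
inline bool parse_riff_header(const unsigned char* bytes, std::size_t len,
                              RiffHeader& out) {
    assert(bytes != nullptr);
    if (len < 12) return false;                                // header is 12 bytes
    if (std::memcmp(bytes, "RIFF", 4) != 0) return false;      // ChunkID at offset 0
    if (std::memcmp(bytes + 8, "WAVE", 4) != 0) return false;  // Format at offset 8
    out.chunk_size = read_u32_le(bytes + 4);                   // ChunkSize at offset 4
    return true;
}
```

For the bytes `52 49 46 46 24 08 00 00 57 41 56 45`, `chunk_size` comes out as 2084 (0x00000824), so the whole file would be 2092 bytes long.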

The "WAVE" format consists of two subchunks: "fmt " and "data":
The "fmt " subchunk describes the sound data's format:

12        4   Subchunk1ID      Contains the letters "fmt "
                               (0x666d7420 big-endian form).
16        4   Subchunk1Size    16 for PCM.  This is the size of the
                               rest of the Subchunk which follows this number.
20        2   AudioFormat      PCM = 1 (i.e. Linear quantization)
                               Values other than 1 indicate some 
                               form of compression.
22        2   NumChannels      Mono = 1, Stereo = 2, etc.
24        4   SampleRate       8000, 44100, etc.
28        4   ByteRate         == SampleRate * NumChannels * BitsPerSample/8
32        2   BlockAlign       == NumChannels * BitsPerSample/8
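                               The number of bytes for one sample including
                               all channels. When BitsPerSample is not a
                               multiple of 8, each sample is stored in a
                               whole number of bytes (BitsPerSample/8
                               rounded up), so this value is always an
                               integer.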
34        2   BitsPerSample    8 bits = 8, 16 bits = 16, etc.
          2   ExtraParamSize   if PCM, then doesn't exist
          X   ExtraParams      space for extra parameters
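
ByteRate and BlockAlign are fully determined by the other fields, so a writer can compute them instead of storing them separately. The sketch below does that in C++. `PcmFormat`, `block_align`, and `byte_rate` are hypothetical names, and the code assumes BitsPerSample is a multiple of 8.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical struct holding the independent "fmt " fields for PCM.
struct PcmFormat {
    std::uint16_t num_channels;
    std::uint32_t sample_rate;
    std::uint16_t bits_per_sample;
};

// BlockAlign == NumChannels * BitsPerSample/8
inline std::uint16_t block_align(const PcmFormat& f) {
    assert(f.bits_per_sample % 8 == 0);  // assumption: whole-byte samples only
    return static_cast<std::uint16_t>(f.num_channels * (f.bits_per_sample / 8));
}

// ByteRate == SampleRate * NumChannels * BitsPerSample/8
inline std::uint32_t byte_rate(const PcmFormat& f) {
    return f.sample_rate * block_align(f);
}
```

For 16-bit stereo at 44100 Hz this gives a BlockAlign of 4 and a ByteRate of 176400.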

The "data" subchunk contains the size of the data and the actual sound:

36        4   Subchunk2ID      Contains the letters "data"
                               (0x64617461 big-endian form).
40        4   Subchunk2Size    == NumSamples * NumChannels * BitsPerSample/8
                               This is the number of bytes in the data.
                               You can also think of this as the size
                               of the rest of the subchunk following this
                               number.
44        *   Data             The actual sound data.
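
Putting the three tables together, the canonical header is 44 bytes long. The following C++ sketch fills one in, field by field, at the offsets listed above. The function and helper names are illustrative only. It assumes PCM with whole-byte samples, and `num_samples` counts samples per channel as in the Subchunk2Size formula.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Store integers least-significant byte first (little-endian).
inline void put_u16_le(unsigned char* p, std::uint16_t v) {
    p[0] = static_cast<unsigned char>(v & 0xFF);
    p[1] = static_cast<unsigned char>((v >> 8) & 0xFF);
}
inline void put_u32_le(unsigned char* p, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        p[i] = static_cast<unsigned char>((v >> (8 * i)) & 0xFF);
}

// Build the canonical 44-byte PCM header (hypothetical helper).
inline std::array<unsigned char, 44> make_wav_header(std::uint16_t channels,
                                                     std::uint32_t sample_rate,
                                                     std::uint16_t bits,
                                                     std::uint32_t num_samples) {
    assert(bits % 8 == 0);  // assumption: whole-byte samples only
    const std::uint16_t block_align =
        static_cast<std::uint16_t>(channels * (bits / 8));
    const std::uint32_t data_size = num_samples * block_align;  // Subchunk2Size

    std::array<unsigned char, 44> h{};
    unsigned char* p = h.data();
    std::memcpy(p + 0, "RIFF", 4);               // ChunkID
    put_u32_le(p + 4, 36 + data_size);           // ChunkSize
    std::memcpy(p + 8, "WAVE", 4);               // Format
    std::memcpy(p + 12, "fmt ", 4);              // Subchunk1ID
    put_u32_le(p + 16, 16);                      // Subchunk1Size, 16 for PCM
    put_u16_le(p + 20, 1);                       // AudioFormat, PCM = 1
    put_u16_le(p + 22, channels);                // NumChannels
    put_u32_le(p + 24, sample_rate);             // SampleRate
    put_u32_le(p + 28, sample_rate * block_align);  // ByteRate
    put_u16_le(p + 32, block_align);             // BlockAlign
    put_u16_le(p + 34, bits);                    // BitsPerSample
    std::memcpy(p + 36, "data", 4);              // Subchunk2ID
    put_u32_le(p + 40, data_size);               // Subchunk2Size
    return h;
}
```

Calling `make_wav_header(2, 22050, 16, 512)` yields a Subchunk2Size of 2048 and a ChunkSize of 2084. The sample data would be written immediately after these 44 bytes.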
 

[Figure: example analysis of a WAVE file; image not preserved]

 
 
 

Notes:

  • The default byte ordering assumed for WAVE data files is little-endian. Files written using the big-endian byte ordering scheme have the identifier RIFX instead of RIFF.
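  • The sample data must end on an even byte boundary: RIFF chunks are word-aligned, so if a chunk's data has an odd number of bytes, a single pad byte follows it. The pad byte is not counted in the chunk's size field.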
  • 8-bit samples are stored as unsigned bytes, ranging from 0 to 255. 16-bit samples are stored as 2's-complement signed integers, ranging from -32768 to 32767.
  • There may be additional subchunks in a Wave data stream. If so, each will have a char[4] SubChunkID, an unsigned 32-bit SubChunkSize, and SubChunkSize bytes of data.
  • RIFF stands for Resource Interchange File Format.
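
Two of these notes affect reader code directly: extra subchunks have to be skipped, and 8-bit samples use a different numeric range from 16-bit ones. The C++ sketch below illustrates both, assuming the whole file is already loaded into a byte vector. `ChunkRef`, `find_subchunk`, and `u8_to_s16` are made-up names for this example. Scaling by 256 is one common way to widen 8-bit samples, not the only one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Decode a 32-bit unsigned integer stored least-significant byte first.
inline std::uint32_t read_u32_le(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Location of a subchunk's payload inside the file buffer.
struct ChunkRef {
    std::size_t offset;  // index of the first payload byte
    std::uint32_t size;  // SubChunkSize as stored in the file
    bool found;
};

// Walk the subchunks after the 12-byte RIFF header until `id` matches.
inline ChunkRef find_subchunk(const std::vector<unsigned char>& file,
                              const char* id) {
    assert(id != nullptr);
    std::size_t pos = 12;  // skip ChunkID, ChunkSize, Format
    while (pos + 8 <= file.size()) {
        const std::uint32_t size = read_u32_le(&file[pos + 4]);
        if (std::memcmp(&file[pos], id, 4) == 0) return {pos + 8, size, true};
        // Skip ID, size field, payload, and the pad byte after odd-sized data.
        pos += 8 + static_cast<std::size_t>(size) + (size & 1u);
    }
    return {0, 0, false};
}

// Map an unsigned 8-bit sample (0..255, silence at 128) to signed 16-bit.
inline std::int16_t u8_to_s16(std::uint8_t s) {
    return static_cast<std::int16_t>((static_cast<int>(s) - 128) * 256);
}
```

A production reader would also check that `offset + size` stays within the buffer before touching the payload. The sketch leaves that out for brevity.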

General discussion of RIFF files:

Multimedia applications require the storage and management of a wide variety of data, including bitmaps, audio data, video data, and peripheral device control information. RIFF provides a way to store all these varied types of data. The type of data a RIFF file contains is indicated by the file extension. Examples of data that may be stored in RIFF files are:
  • Audio/visual interleaved data (.AVI)
  • Waveform data (.WAV)
  • Bitmapped data (.RDI)
  • MIDI information (.RMI)
  • Color palette (.PAL)
  • Multimedia movie (.RMN)
  • Animated cursor (.ANI)
  • A bundle of other RIFF files (.BND)
NOTE: At this point, AVI files are the only type of RIFF files that have been fully implemented using the current RIFF specification. Although WAV files have been implemented, these files are very simple, and their developers typically use an older specification in constructing them.

For more info see http://www.ora.com/centers/gff/formats/micriff/index.htm

References:

  1. http://netghost.narod.ru/gff/graphics/summary/micriff.htm RIFF Format Reference (good).
  2. http://www.lightlink.com/tjweber/StripWav/WAVE.html

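To convert audio samples stored in a text file into PCM format in C++, first read the text file, which may hold a series of sample values in some format (for example comma-separated numbers or a custom layout). Then proceed as follows:

1. **Open and read the text file**: Use the `fstream` library to open the file and read the data into an array or another suitable data structure.

```cpp
#include <cstdint>
#include <fstream>
#include <vector>

std::vector<int16_t> audio_samples;
std::ifstream file("audio_data.txt");  // text input, so std::ios::binary is not needed
```

2. **Parse the data**: Following the layout of the text file, parse each line into one or more sample values and store them in the array. For example, if the file holds comma-separated 16-bit integers:

```cpp
int sample;
while (file >> sample) {
    audio_samples.push_back(static_cast<int16_t>(sample));
    if (file.peek() == ',') file.ignore();  // skip the comma separator
}
```

3. **Process the data**: If necessary, convert the values to match the target bit depth (for example 16 bits). For 16-bit PCM, samples are two's-complement signed integers, so the input may need to be scaled or clamped into range.

4. **Save as a PCM file**: Once converted, create the output file, such as a `.wav` file, and write the audio data with a third-party library such as `libsndfile`. This usually involves packing the data into frames and setting the channel count (mono or stereo) and the sample rate.

```cpp
#include <sndfile.h>  // libsndfile header

SF_INFO info = {};  // fill in the real audio properties; the values below are assumed
info.samplerate = 44100;
info.channels = 1;
info.format = SF_FORMAT_WAV | SF_FORMAT_PCM_16;
SNDFILE* sf = sf_open("output.wav", SFM_WRITE, &info);
sf_writef_short(sf, audio_samples.data(), audio_samples.size());  // frames == samples for mono
sf_close(sf);
```

5. **Close the files**: Finally, remember to close both the text file opened earlier and the newly created audio file.

Note: this process assumes the input file is well-formed and maps directly onto the target PCM format. The actual steps may vary with the file format.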