Getting started with programmatic audio

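This article takes an in-depth look at the basic principles of audio processing, including how PCM audio works and how to change the volume, produce silence, adjust the playback speed, and perform spectrum analysis programmatically. By using the FFT to work in the frequency domain, low frequencies can be boosted or high frequencies filtered out to produce a rich range of audio effects.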


As some have pointed out in the comments, what you want to look into is PCM audio.

In a nutshell, sound is a wave that travels through air. In order to capture that sound, we use a microphone, which contains a membrane that vibrates as the sound waves hit it. This vibration is translated into an electric signal, where the voltage goes up and down. This change in voltage is then turned into a digital signal by an analog-to-digital converter (ADC), which samples it a certain number of times per second (the "sampling rate", e.g. 44.1 kHz, or 44,100 samples per second) and, in this case, stores it as pulse-code modulated (PCM) audio data.

A speaker works the opposite way: the PCM signal is converted to analog by a digital-to-analog converter (DAC), then the analog signal goes to the speaker, where it vibrates a membrane, which produces vibrations in the air, which results in sound.
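To make the sampling step a bit more concrete, here is a minimal Python sketch that "samples" a pure sine tone the way an ADC would and quantizes it to 16-bit PCM values. The function names and the 441 Hz test tone are my own choices for illustration, not part of any library.

```python
import math

SAMPLE_RATE = 44100  # samples per second, as mentioned above

def sample_sine(frequency_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Sample a sine wave, returning floats in the range -1 to 1."""
    count = round(duration_s * sample_rate)
    return [math.sin(2 * math.pi * frequency_hz * n / sample_rate)
            for n in range(count)]

def to_pcm16(samples):
    """Quantize floats in [-1, 1] to signed 16-bit integers."""
    return [int(round(s * 32767)) for s in samples]

tone = sample_sine(441, 0.01)   # 10 ms of a 441 Hz tone (illustrative values)
pcm = to_pcm16(tone)            # integers between -32767 and 32767
```

At 44,100 samples per second, 10 ms of audio is 441 samples, and one cycle of the 441 Hz tone spans 100 of them.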

Manipulating Audio

There are many libraries out there for many languages that you can manipulate audio with. However, since you've marked this question as "language-agnostic", I'll mention a few simple ways (as that's all I know!) to manipulate audio in your preferred language.

I'll present the code samples in pseudocode.

In the pseudocode, each audio sample has an amplitude in the range of -1 to 1. In practice, the range depends on the data type you are using to store each sample. (I haven't dealt with 32-bit floats before, so this may be different.)
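As an example of how the data type matters, here is a rough Python sketch that converts between signed 16-bit little-endian PCM bytes (a common storage format) and floats in the -1 to 1 range. The helper names are hypothetical, and dividing by 32768 is one common convention rather than the only one.

```python
import struct

def pcm16_bytes_to_floats(raw):
    """Decode little-endian signed 16-bit PCM bytes into floats in [-1, 1)."""
    count = len(raw) // 2
    ints = struct.unpack("<%dh" % count, raw[:count * 2])
    return [i / 32768.0 for i in ints]

def floats_to_pcm16_bytes(samples):
    """Encode floats as little-endian signed 16-bit PCM, clamping to range."""
    ints = [max(-32768, min(32767, int(round(s * 32768)))) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)

raw = floats_to_pcm16_bytes([0.0, 0.5, -0.5, 1.0, -1.0])
decoded = pcm16_bytes_to_floats(raw)
```

Note that +1.0 cannot be represented exactly in 16 bits (the largest value is 32767), so it comes back slightly below 1 after a round trip.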

Amplification

In order to amplify the audio (and therefore increase the volume of the sound), you'll want to make the speaker vibrate more so that the magnitude of the sound wave is increased.

In order to make that speaker move more, you'll have to increase the value of each sample:

original_samples = [0, 0.5, 0, -0.5, 0]

def amplify(samples):
    new_samples = []
    foreach s in samples:
        new_samples.add(s * 2)
    return new_samples

amplified_samples = amplify(original_samples)

// result: amplified_samples == [0, 1, 0, -1, 0]

The resulting samples are now amplified by 2, and on playback, it should sound much louder than it did before.
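One thing to watch for in practice: samples cannot go beyond the -1 to 1 range, so too much gain will flatten the peaks ("clipping"), which sounds distorted. Here is a small Python sketch of amplification with clamping; the gain parameter is my own addition.

```python
def amplify(samples, gain):
    """Scale each sample by `gain`, clamping the result to [-1, 1]."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]

original_samples = [0, 0.5, 0, -0.5, 0]
amplified_samples = amplify(original_samples, 2)   # peaks reach exactly 1 and -1
clipped_samples = amplify(original_samples, 4)     # peaks would be 2 and -2, clamped
```

With a gain of 4 the result is the same as with a gain of 2, because everything past full scale is cut off.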

Silence

When there are no vibrations, there is no sound. Silence can be achieved by setting each sample to 0, or to any other constant value, so that the amplitude does not change between samples:

original_samples = [0, 0.5, 0, -0.5, 0]

def silence(samples):
    new_samples = []
    foreach s in samples:
        new_samples.add(0)
    return new_samples

silent_samples = silence(original_samples)

// result: silent_samples == [0, 0, 0, 0, 0]

Playing back the above should result in no sound, as the membrane on the speaker is not moving at all, due to a lack of change in amplitude in the samples.

Speed Up and Down

Speeding things up and down can be achieved in two ways: (1) changing the playback sampling rate or (2) changing the samples themselves.

Changing the playback sampling rate from 44100 Hz to 22050 Hz will halve the speed of playback. This will make the sound slower and lower in pitch. Going the other way, taking a 22 kHz source and playing it back at 44 kHz, the sound will be very fast and high-pitched, like birds chirping.
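As a rough illustration using Python's standard wave module, the sketch below writes the same sample data into two in-memory WAV files that differ only in the sampling rate stored in the header. The helper names are mine; this is just one way to show the effect.

```python
import io
import wave

def write_wav(samples_pcm16, sample_rate):
    """Return an in-memory mono 16-bit WAV file holding the given samples."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)  # the playback sampling rate
        wav.writeframes(b"".join(
            s.to_bytes(2, "little", signed=True) for s in samples_pcm16))
    return buffer.getvalue()

def duration_seconds(wav_bytes):
    """Playback duration implied by the frame count and sampling rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()

samples = [0] * 44100                    # one second of data at 44100 Hz
normal = write_wav(samples, 44100)
half_rate = write_wav(samples, 22050)    # same samples, half the playback rate
```

The samples are identical in both files, but the second one takes twice as long to play.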

Changing the samples themselves (and keeping a constant playback sampling rate) means that samples either (a) get thrown out or (b) are added in.

To speed up the playback of the audio, throw out samples:

original_samples = [0, 0.1, 0.2, 0.3, 0.4, 0.5]

def faster(samples):
    new_samples = []
    for i = 0 to samples.length:
        if i is even:
            new_samples.add(samples[i])
    return new_samples

faster_samples = faster(original_samples)

// result: faster_samples == [0, 0.2, 0.4]

The result of the above program is that the audio will speed up by a factor of 2, similar to taking audio that was sampled at 22 kHz and playing it back at 44 kHz.
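In Python, dropping every other sample is a one-line slice. This is only a sketch; the factor parameter is my own generalization.

```python
def faster(samples, factor=2):
    """Keep every `factor`-th sample and drop the rest."""
    return samples[::factor]

original_samples = [0, 0.1, 0.2, 0.3, 0.4, 0.5]
faster_samples = faster(original_samples)   # [0, 0.2, 0.4]
```

Be aware that simply throwing samples away can introduce aliasing artifacts when the audio contains high frequencies; real resamplers usually low-pass filter the audio first.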

To slow down the playback of the audio, throw in a few samples:

original_samples = [0, 0.1, 0.2, 0.3]

def slower(samples):
    new_samples = []
    for i = 0 to samples.length - 1:
        new_samples.add(samples[i])
        if i + 1 < samples.length:
            new_samples.add(interpolate(samples[i], samples[i + 1]))
    return new_samples

slower_samples = slower(original_samples)

// result: slower_samples == [0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3]

Here, extra samples were added, thereby slowing down the playback. The interpolate function makes a "guess" as to how to fill in the extra space that must be added.
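Here is a runnable Python sketch of the same idea, assuming the simplest possible "guess": linear interpolation, i.e. the average of the two neighbouring samples. Other interpolation schemes are possible.

```python
def interpolate(a, b):
    """Linear interpolation: the midpoint between two neighbouring samples."""
    return (a + b) / 2

def slower(samples):
    """Insert one interpolated sample between each pair of neighbours."""
    new_samples = []
    for i, current in enumerate(samples):
        new_samples.append(current)
        if i + 1 < len(samples):          # the last sample has no neighbour
            new_samples.append(interpolate(current, samples[i + 1]))
    return new_samples

slower_samples = slower([0, 0.1, 0.2, 0.3])
```

Four input samples become seven output samples, with the new values landing (up to floating-point rounding) halfway between their neighbours.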

Spectrum Analysis and Sound Modification by FFT

Using a technique called the fast Fourier transform (FFT), the sound data in the time domain (amplitude over time) can be mapped to the frequency domain (amplitude per frequency) to find out the frequency components of the audio. This can be used to produce the spectrum analyzers that you might see on your favorite audio player.
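To give a feel for this, here is a small pure-Python sketch of a recursive radix-2 FFT used to find the dominant frequency of a test tone. In real code you would normally use an FFT library; the tiny sampling rate and the 8 Hz tone are made-up values to keep the example small.

```python
import cmath
import math

def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result

SAMPLE_RATE = 64   # hypothetical, tiny rate so the example stays readable
N = 64             # number of samples (one second of audio at this rate)
samples = [math.sin(2 * math.pi * 8 * t / SAMPLE_RATE) for t in range(N)]

spectrum = fft(samples)
# Magnitudes of the bins from 0 Hz up to the Nyquist frequency
magnitudes = [abs(c) for c in spectrum[:N // 2 + 1]]
peak_bin = max(range(len(magnitudes)), key=lambda k: magnitudes[k])
peak_frequency = peak_bin * SAMPLE_RATE / N   # bin index converted to Hz
```

The magnitudes list is what a spectrum analyzer would draw as bars; here nearly all the energy sits in the bin for 8 Hz.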

Not only that, since now you have the frequency components of the audio, if you change the amount of

If you want to cut off certain frequencies, you can use the FFT to transform the sound data into the frequency domain and zero out the frequency components that are not desired. This is called filtering.

Making a low-pass filter, which lets through only frequencies below a certain cutoff frequency, can be done like this:

data = fft(original_samples)

for i = (data.length / 2) to data.length:
    data[i] = 0

new_samples = inverse_fft(data)

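In the above example, all frequencies over the half-way mark are cut off. (This assumes data holds only the frequency components from 0 up to the maximum frequency; many FFT routines also return a mirrored upper half, which would need the same treatment.) So, if the audio could produce 22 kHz as the maximum frequency, any frequency above 11 kHz will be cut out. (For audio sampled at 44 kHz, the maximum theoretical frequency that can be represented is 22 kHz. See the Nyquist–Shannon sampling theorem.)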
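Here is a self-contained Python sketch of zeroing out the upper frequency components. Since it removes the high frequencies and keeps the low ones, it behaves as a low-pass filter. It uses a small pure-Python FFT so it runs without any libraries; the helper names, the cutoff and the test tones are my own illustrative choices. Note that a full FFT of real samples contains a mirrored copy of each frequency in the upper half of the array, so both copies have to be zeroed.

```python
import cmath
import math

def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even, odd = fft(values[0::2]), fft(values[1::2])
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result

def inverse_fft(spectrum):
    """Inverse FFT via the conjugate trick: conj(fft(conj(X))) / n."""
    n = len(spectrum)
    transformed = fft([c.conjugate() for c in spectrum])
    return [c.conjugate() / n for c in transformed]

def low_pass(samples, cutoff_bin):
    """Zero every frequency bin above cutoff_bin, including mirrored bins."""
    n = len(samples)
    data = fft(samples)
    for k in range(n):
        frequency_index = min(k, n - k)   # bins above n/2 mirror those below
        if frequency_index > cutoff_bin:
            data[k] = 0j
    return [c.real for c in inverse_fft(data)]

N = 64
low = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]    # slow tone
high = [math.sin(2 * math.pi * 20 * t / N) for t in range(N)]  # fast tone
mixed = [a + b for a, b in zip(low, high)]

# Cut everything above the half-way mark of the 0..N/2 frequency range
filtered = low_pass(mixed, cutoff_bin=N // 4)
```

After filtering, only the slow tone remains. Abruptly zeroing bins like this works for a demonstration, but on real audio it tends to cause ringing artifacts, so treat it as a starting point for experiments.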

If you want to do something like increase the low-frequency range (similar to the bass boost effect), take the lower-end of the FFT-transformed data and increase its magnitude:

data = fft(original_samples)

for i = 0 to (data.length / 4):
    increase(data[i])

new_samples = inverse_fft(data)

This example increases the lower quarter of the frequency components of the audio, which makes the low frequencies louder.
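As a sketch of what increase could look like, the Python below multiplies the low-frequency components by a gain factor before transforming back. The gain of 2, the cutoff and the test tones are assumptions of mine, and the small pure-Python FFT is only there to keep the example self-contained.

```python
import cmath
import math

def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even, odd = fft(values[0::2]), fft(values[1::2])
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result

def inverse_fft(spectrum):
    """Inverse FFT via the conjugate trick: conj(fft(conj(X))) / n."""
    n = len(spectrum)
    transformed = fft([c.conjugate() for c in spectrum])
    return [c.conjugate() / n for c in transformed]

def bass_boost(samples, cutoff_bin, gain):
    """Multiply every bin at or below cutoff_bin (and its mirror) by gain."""
    n = len(samples)
    data = fft(samples)
    for k in range(n):
        if min(k, n - k) <= cutoff_bin:   # mirrored bins share an index
            data[k] *= gain
    return [c.real for c in inverse_fft(data)]

N = 64
low = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]    # "bass" tone
high = [math.sin(2 * math.pi * 20 * t / N) for t in range(N)]  # "treble" tone
mixed = [a + b for a, b in zip(low, high)]

# Boost the lower quarter of the 0..N/2 frequency range
boosted = bass_boost(mixed, cutoff_bin=N // 8, gain=2.0)
```

In the result, the bass tone is twice as loud while the treble tone is left untouched.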


There are quite a few things that can be done to the samples to manipulate the audio. Just go ahead and experiment! It's the most exciting way to learn.

Good luck!
