Description of FIR Filters


FIR filters are often used because they are simple and easy to understand [6]. Diagrams of the operation of an FIR filter are shown in Figure S1. An FIR filter works by multiplying an array of the most recent n + 1 data samples by an array of constants (called the tap coefficients) and summing the elements of the resulting array; this operation is commonly called a dot product. The filter then reads in another sample of data (which causes the oldest sample to be discarded) and repeats the process.

[Figure showing filter operation a) in the time domain and b) in the frequency domain]
Figure S1 - (a) FIR filter operation in the time domain. Given the h[k]'s, we are implementing an FIR filter as shown in the diagram. (b) FIR filter operation in the frequency domain. This diagram shows the frequency response of the filter in part (a). Notice that the tap coefficients (the h[k]'s) are simply the Fourier series coefficients of the frequency response.
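
To make the dot-product operation concrete, here is a minimal sketch in Python/NumPy (not part of the original article). The function name fir_filter is a hypothetical helper; a production implementation would typically use an optimized routine such as np.convolve instead of an explicit loop.

```python
import numpy as np

def fir_filter(x, h):
    """Apply an FIR filter by sliding a dot product along the signal.

    For each output sample, the most recent len(h) input samples are
    multiplied element-wise by the tap coefficients and summed
    (equivalent to np.convolve(x, h)[:len(x)]).
    """
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    n_taps = len(h)
    # Prepend zeros so the filter has "past" samples at start-up.
    padded = np.concatenate([np.zeros(n_taps - 1), x])
    y = np.empty(len(x))
    for i in range(len(x)):
        window = padded[i:i + n_taps]   # the n_taps most recent samples
        y[i] = np.dot(window, h[::-1])  # dot product with the taps
    return y
```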

The interesting part of designing FIR filters is translating the desired frequency response into filter tap coefficients. As can be seen from Figure S1a, the equation for the output of an FIR filter in the time domain is:

y(t) = \sum_{k=0}^{n} h[k]\, x(t - k/f_s) (Eq. S1)

where f_s is the sampling frequency and k is the filter tap number. Since a delay of k/f_s in time corresponds to multiplication by e^{-j 2\pi f k/f_s} in the frequency domain, the corresponding equation in the frequency domain (from Figure S1b) is:

Y(f)=H(f)X(f) (Eq. S2a)

where

H(f) = \sum_{k=0}^{n} h[k]\, e^{-j 2\pi f k/f_s} (Eq. S2b)

is the frequency response of the FIR filter. It can be seen that the filter tap coefficients h[k] are precisely the Fourier series coefficients of H(f), which is periodic with period f_s. Therefore the tap coefficients may be calculated from the following equation:

h[k] = \frac{1}{f_s} \int_0^{f_s} H(f)\, e^{j 2\pi k f/f_s}\, df (Eq. S3)
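
As an illustration of Eq. S3, the sketch below numerically approximates the integral for an assumed ideal low-pass response (this example, including the values of f_s, f_c, and the tap count, is not from the original text). The index k is centered on zero because the Fourier series coefficients of H(f) naturally run over negative as well as positive k; the symmetric filter would then be delayed by half its length to make it causal.

```python
import numpy as np

fs = 8000.0   # sampling frequency in Hz (illustrative value)
fc = 1000.0   # desired low-pass cutoff in Hz (illustrative value)
n_taps = 51   # odd tap count, so the filter can be symmetric about k = 0

def desired_response(f):
    """Assumed ideal low-pass H(f), periodic with period fs.

    The band just below fs also passes, because the negative
    frequencies of the low-pass alias into it.
    """
    f = np.mod(f, fs)
    return np.where((f < fc) | (f > fs - fc), 1.0, 0.0)

# Approximate Eq. S3 with a Riemann sum:
#   h[k] = (1/fs) * integral from 0 to fs of H(f) e^{j 2 pi k f / fs} df
freqs = np.linspace(0.0, fs, 4096, endpoint=False)
df = fs / len(freqs)
ks = np.arange(n_taps) - n_taps // 2   # center the tap index on k = 0
h = np.array([
    np.sum(desired_response(freqs) * np.exp(2j * np.pi * k * freqs / fs)) * df / fs
    for k in ks
]).real                                # symmetry of H(f) makes h[k] real
```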

An additional option for improving the performance of the filter is to apply a window function to the filter tap coefficients [2]. The coefficients toward the ends of the filter (those at the largest delays) are scaled down, which reduces the ripple in the stop band. This can be done as follows:

h_w[k] = h[k]\, w[k] (Eq. S4)

where w[k] are the window scale factors, h[k] are the original tap coefficients, and h_w[k] are the windowed tap coefficients. The resulting coefficients may now be used in place of the original ones. An unwanted side effect of windowing, however, is the sacrifice of a sharp roll-off at the cutoff frequency.
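
In the same NumPy style, Eq. S4 is a single element-wise multiplication. The sketch below assumes h holds the coefficients from Eq. S3 (for example, from the previous sketch) and uses np.hamming as one common choice of window; the text does not prescribe a particular window function.

```python
import numpy as np

# Eq. S4: scale each tap by the corresponding window value.
w = np.hamming(len(h))   # window scale factors, tapering toward the ends
h_w = h * w              # windowed taps, used in place of the originals
```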

So, the design process is to pick the desired frequency response H(f) and then calculate the tap coefficients from Eq. S3. The actual frequency response will only approximate the desired response because the number of filter taps is finite. Optionally, a window function may then be applied to the tap coefficients. The final step is to plot the actual frequency response, using Eq. S2b, to make sure the resulting response is acceptable. Lousy responses can be tweaked by trying different windowing schemes.
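
As a sketch of this final verification step (again assuming h_w and fs from the earlier sketches, with matplotlib for the plot), the code below evaluates Eq. S2b on a grid of frequencies up to f_s/2 and plots the magnitude of the actual response:

```python
import numpy as np
import matplotlib.pyplot as plt

# Evaluate Eq. S2b: H(f) = sum over k of h[k] e^{-j 2 pi f k / fs}
f = np.linspace(0.0, fs / 2, 512)   # from 0 up to the Nyquist frequency
k = np.arange(len(h_w))
H = np.array([np.sum(h_w * np.exp(-2j * np.pi * fi * k / fs)) for fi in f])

plt.plot(f, 20 * np.log10(np.maximum(np.abs(H), 1e-12)))  # magnitude in dB
plt.xlabel("Frequency (Hz)")
plt.ylabel("|H(f)| (dB)")
plt.title("Actual frequency response from Eq. S2b")
plt.show()
```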

 

Reprinted from: https://www.cnblogs.com/nickchan/archive/2012/03/30/3104433.html
