Ternary Networks -- Trained Ternary Quantization

This post introduces a new ternary network. Traditional binary networks quantize weights to +1/−1, while the earlier ternary weight network (TWN) quantizes weights to {−W_l, 0, +W_l} with a closed-form threshold. In Trained Ternary Quantization (TTQ), the positive and negative weight scales are instead learned by the network. The post gives the threshold formulas, the threshold choice used on CIFAR-10 and ImageNet, and the concrete steps of quantized training.

Trained Ternary Quantization
ICLR 2017
https://github.com/TropComplique/trained-ternary-quantization (PyTorch)
https://github.com/buaabai/Ternary-Weights-Network (PyTorch)

Traditional binary networks quantize the weights W to +1/−1, while the ternary weight network (TWN) quantizes the weights W to {−W_l, 0, +W_l}:

    w_l^t = +W_l  if w_l > Δ_l
    w_l^t = 0     if |w_l| ≤ Δ_l
    w_l^t = −W_l  if w_l < −Δ_l

In TWN the threshold and scale are computed from the weight statistics of each layer: the threshold is approximately Δ_l ≈ 0.7 · E(|w_l|) = (0.7/n) · Σ_i |w_i|, and the scale W_l is the mean magnitude of the weights above the threshold.
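As a concrete illustration of the TWN rule above, here is a minimal PyTorch sketch (the function name `twn_quantize` is ours, not from the paper):

```python
import torch

def twn_quantize(w):
    """Ternarize a weight tensor with the TWN heuristic: {-W_l, 0, +W_l}."""
    delta = 0.7 * w.abs().mean()          # threshold: ~0.7 * E(|w|)
    mask = w.abs() > delta                # weights that stay non-zero
    scale = w.abs()[mask].mean()          # W_l: mean magnitude above the threshold
    w_t = torch.zeros_like(w)
    w_t[w > delta] = scale                # +W_l
    w_t[w < -delta] = -scale              # -W_l
    return w_t, scale, delta

w = torch.randn(256)
w_t, scale, delta = twn_quantize(w)
```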
This paper proposes a new ternary network (TTQ) that still uses three values per layer, but the positive and negative scaling weights W_l^p and W_l^n are learned by the network during training:

    w_l^t = +W_l^p  if w̃_l > Δ_l
    w_l^t = 0       if |w̃_l| ≤ Δ_l
    w_l^t = −W_l^n  if w̃_l < −Δ_l

where w̃_l are the latent full-precision weights of layer l.
The corresponding gradients are computed as follows. Each learned scale accumulates the gradients of the quantized weights in its group,

    ∂L/∂W_l^p = Σ_{i∈I_l^p} ∂L/∂w_l^t(i),    ∂L/∂W_l^n = Σ_{i∈I_l^n} ∂L/∂w_l^t(i)

and the gradient passed back to the latent full-precision weights is scaled by the factor applied in the forward pass:

    ∂L/∂w̃_l = W_l^p · ∂L/∂w_l^t  if w̃_l > Δ_l
    ∂L/∂w̃_l = 1 · ∂L/∂w_l^t      if |w̃_l| ≤ Δ_l
    ∂L/∂w̃_l = W_l^n · ∂L/∂w_l^t  if w̃_l < −Δ_l
For the threshold, this paper uses a fraction of the maximum latent weight magnitude in each layer:

    Δ_l = t · max(|w̃_l|)

with t set to 0.05 in the experiments on CIFAR-10 and ImageNet.
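Putting the quantization rule, the threshold, and the gradient routing together, the following is a minimal PyTorch-style sketch (the names `ttq_quantize` and `ttq_backward` are ours; note that a literal chain-rule implementation picks up a minus sign in the W^n gradient because w^t = −W^n on the negative group, and reference implementations may handle this sign convention differently):

```python
import torch

def ttq_quantize(w_fp, w_p, w_n, t=0.05):
    """Quantize latent full-precision weights w_fp to {-w_n, 0, +w_p}."""
    delta = t * w_fp.abs().max()          # threshold: Delta_l = t * max|w~_l|
    pos = w_fp > delta
    neg = w_fp < -delta
    w_t = torch.zeros_like(w_fp)
    w_t[pos] = w_p                        # +W_l^p
    w_t[neg] = -w_n                       # -W_l^n
    return w_t, pos, neg

def ttq_backward(grad_wt, w_p, w_n, pos, neg):
    """Route dL/dw^t to the learned scales and to the latent weights."""
    grad_wp = grad_wt[pos].sum()          # dL/dW^p: sum over the positive group
    grad_wn = -grad_wt[neg].sum()         # dL/dW^n: chain rule through w^t = -W^n
    grad_wfp = grad_wt.clone()            # gradient to the latent weights,
    grad_wfp[pos] *= w_p                  # scaled by W^p in the positive region,
    grad_wfp[neg] *= w_n                  # by W^n in the negative region, 1 elsewhere
    return grad_wfp, grad_wp, grad_wn

# usage: one conv layer's latent weights with example scale values
w = torch.randn(64, 3, 3, 3)
w_t, pos, neg = ttq_quantize(w, w_p=1.2, w_n=0.8)
```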

The quantization roughly proceeds as follows.

  1. Train a model of your choice as usual (or take a trained model).

  2. Copy all full precision weights that you want to quantize. Then do the initial quantization:
    in the model replace them by ternary values {-1, 0, +1} using some heuristic.

  3. Repeat until convergence:
    1). Make the forward pass with the quantized model.
    2). Compute gradients for the quantized model.
    3). Preprocess the gradients and apply them to the copy of full precision weights.
    4). Requantize the model using the changed full precision weights.

  4. Throw away the copy of full precision weights and use the quantized model.
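To make the loop concrete, here is a self-contained toy sketch for a single weight tensor with a quadratic stand-in loss; the names, the toy loss, and plain SGD are illustrative assumptions, not taken from the paper or the linked repositories:

```python
import torch

torch.manual_seed(0)
w_full = torch.randn(32)        # step 2: full-precision copy that keeps being updated
w_p, w_n = 1.0, 1.0             # learned positive / negative scales
target = torch.randn(32)        # toy regression target standing in for a real loss
lr, t = 0.01, 0.05

for step in range(200):
    # requantize from the full-precision copy (initially step 2, later step 3.4)
    delta = t * w_full.abs().max()
    pos, neg = w_full > delta, w_full < -delta
    w_q = torch.zeros_like(w_full)
    w_q[pos], w_q[neg] = w_p, -w_n

    # forward pass with the quantized weights (3.1) and its gradient (3.2)
    loss = 0.5 * ((w_q - target) ** 2).sum()
    grad_wq = w_q - target                     # dL/dw_q for this toy loss

    # preprocess gradients and apply them (3.3)
    grad_wfull = grad_wq.clone()
    grad_wfull[pos] *= w_p                     # gradient to latent weights scaled by W^p
    grad_wfull[neg] *= w_n                     # ... or W^n; factor 1 in the zero region
    w_p -= lr * grad_wq[pos].sum().item()      # dL/dW^p
    w_n -= lr * (-grad_wq[neg].sum()).item()   # dL/dW^n (w_q = -W^n on the negative group)
    w_full -= lr * grad_wfull

# step 4: discard w_full and deploy the ternary tensor w_q with the scales (w_p, w_n)
```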

