INT8 Quantization of a YOLOv8s Model

Original post, published 2025-09-16 10:16:27 · 1.3k views


License: CC 4.0 BY-SA

YOLOv8s INT8 quantization workflow (Linux, Python)

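Step 1: Install the required libraries

You will likely need the following libraries:

   PyTorch: YOLOv8 is built on PyTorch.

   Ultralytics (YOLOv8): provides the YOLOv8 export functionality.

   ONNX: for loading and handling ONNX models.

   onnx-tf: converts an ONNX model to a TensorFlow SavedModel.

   TensorFlow: loads the SavedModel and converts it to TFLite.

   Numpy: general-purpose array processing.

   OpenCV (cv2): image preprocessing.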

pip install torch torchvision ultralytics onnx onnx-tf tensorflow numpy opencv-python

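Step 2: Prepare a calibration dataset

INT8 quantization (whether with OpenVINO or TFLite) needs a small, representative dataset to calibrate the quantization parameters. Only the raw images are needed; labels are not.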

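1. Create a directory for the calibration images, for example calibration_data/images.
2. Copy some images from your training or validation set into it. Make sure they are representative of the data you will actually run inference on. Suggested size: 100-500 images.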
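
The two steps above can be scripted. The following is a minimal sketch using only the standard library; the function name build_calibration_set and the source path in the usage note are hypothetical, and it assumes that a uniform random sample of your validation images is representative enough.

```python
import random
import shutil
from pathlib import Path

# Only the extensions that the calibration generator later globs for
IMAGE_SUFFIXES = {".jpg", ".png"}


def build_calibration_set(source_dir, target_dir, num_images=300, seed=0):
    """Copy a random sample of images from source_dir into target_dir."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    candidates = sorted(
        p for p in source_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not candidates:
        raise FileNotFoundError(f"no images found under {source_dir}")

    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    chosen = rng.sample(candidates, min(num_images, len(candidates)))

    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for index, src in enumerate(chosen):
        # Index prefix avoids collisions between equally named files from
        # different subfolders; the suffix is lower-cased for a plain glob.
        dst = target_dir / f"{index:04d}_{src.stem}{src.suffix.lower()}"
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

A call such as build_calibration_set("datasets/my_data/images/val", "calibration_data/images") would fill the directory used in the later steps; the source path here is only a placeholder.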

Step 3: Export the YOLOv8 model to ONNX

The export method built into the Ultralytics library makes it straightforward to export your best.pt model to ONNX.

import torch
from ultralytics import YOLO
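# Load your trained model
model = YOLO('best.pt')  # replace with the path to your model
# Export to ONNX.
# opset=12 or higher is recommended for better compatibility; if the ONNX -> TF
# conversion fails later, try a different opset version.
# dynamic=False together with imgsz=640 fixes the input shape at [1, 3, 640, 640].
# If you trained at a different resolution, pass that value as imgsz instead.
model.export(format='onnx', opset=12, simplify=True, dynamic=False, imgsz=640)

print("Model exported as best.onnx")

After the export you will find best.onnx next to best.pt (the working directory in this example).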

Step 4: Convert the ONNX model to a TensorFlow SavedModel

Use onnx-tf to convert best.onnx to the TensorFlow SavedModel format.

import onnx
from onnx_tf.backend import prepare

# Load the ONNX model
onnx_model = onnx.load("best.onnx")

# Build the TensorFlow representation
tf_rep = prepare(onnx_model)

# Save it as a SavedModel
tf_rep.export_graph("yolov8_saved_model")

print("ONNX model converted to a TensorFlow SavedModel in the yolov8_saved_model directory.")

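Step 5: Convert the TensorFlow SavedModel to an INT8-quantized TFLite model

This step uses the TensorFlow Lite Converter. For INT8 quantization we have to supply a generator function that yields a representative dataset.

5.1. Create the calibration data generator

Write a Python script that loads and preprocesses the calibration images. The generator supplies input samples to the converter during quantization.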

import tensorflow as tf
import numpy as np
import cv2
from pathlib import Path

# YOLOv8 input size; must match the imgsz used for the ONNX export
IMG_SIZE = 640

def preprocess_image(image_path):
    img = cv2.imread(str(image_path))
    if img is None:
        print(f"Warning: could not load image {image_path}")
        return None

    # Letterbox resize (same as YOLOv8 training/inference preprocessing)
    shape = img.shape[:2]  # current shape [height, width]
    r = min(IMG_SIZE / shape[0], IMG_SIZE / shape[1])
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = IMG_SIZE - new_unpad[0], IMG_SIZE - new_unpad[1]  # width, height paddings
    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=(114, 114, 114))  # add border

    # BGR to RGB, HWC to CHW, normalize to [0, 1]
    img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
    img = np.ascontiguousarray(img)
    img = img.astype(np.float32) / 255.0

    # The converter feeds each sample straight into the model, so the array must
    # match the SavedModel input shape, batch dimension included.
    # This assumes the SavedModel keeps the ONNX layout [1, 3, 640, 640] (CHW).
    # If your SavedModel expects [1, 640, 640, 3], skip the HWC-to-CHW transpose above.
    img = np.expand_dims(img, axis=0)  # (1, C, H, W)
    return img

def representative_data_gen():
    calibration_data_path = Path("calibration_data/images")
    image_files = list(calibration_data_path.glob('*.jpg')) + \
                  list(calibration_data_path.glob('*.png'))

    # Cap the number of calibration images so calibration does not take too long
    num_calibration_samples = min(len(image_files), 300)  # 100-500 images recommended

    for i in range(num_calibration_samples):
        img_path = image_files[i]
        preprocessed_img = preprocess_image(img_path)
        if preprocessed_img is not None:
            # Yield a list of tensors, one per model input.
            # YOLOv8 has a single input, so the list has one element.
            yield [preprocessed_img.astype(np.float32)]  # must be float32
        else:
            print(f"Skipping image: {img_path}")

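Important notes:

representative_data_gen must be a generator function.

It should yield a list in which each element is one input tensor, shaped exactly like the model input (batch dimension included). YOLOv8 normally has a single input.

The input tensor's dtype must be tf.float32 (or np.float32) even though the target is INT8. The converter handles the type conversion itself.

The logic in preprocess_image must match the input YOLOv8 expects exactly (normalization to [0,1], RGB channel order, HWC or CHW layout). The YOLOv8 ONNX export normally uses [batch, channels, height, width] (CHW).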
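
To make the letterbox arithmetic in preprocess_image easier to check, here is a small sketch that computes only the geometry (resized size and padding) without touching any image. The helper name letterbox_params is hypothetical; the formulas mirror the ones used in the preprocessing code.

```python
def letterbox_params(height, width, target=640):
    """Return (new_w, new_h, top, bottom, left, right) for a letterbox resize."""
    # Scale so that the longer side fits the target while keeping aspect ratio
    r = min(target / height, target / width)
    new_w, new_h = int(round(width * r)), int(round(height * r))

    # Remaining space is split between the two sides
    dw = (target - new_w) / 2
    dh = (target - new_h) / 2

    # The -0.1 / +0.1 offsets send an odd leftover pixel to the bottom/right
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    return new_w, new_h, top, bottom, left, right


for h, w in [(480, 640), (720, 1280), (335, 500)]:
    new_w, new_h, top, bottom, left, right = letterbox_params(h, w)
    # Padded size must always come out as target x target
    assert new_w + left + right == 640 and new_h + top + bottom == 640
    print((h, w), "->", (new_w, new_h), "pad", (top, bottom, left, right))
```

Whatever the input size, the resized image plus its padding should add up to exactly 640 x 640, which is what the assertion in the loop checks.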

5.2. Run the TFLite conversion with INT8 quantization

import tensorflow as tf
import numpy as np
import cv2
from pathlib import Path

# Make sure preprocess_image and representative_data_gen from above are defined
# in this script or imported.
# IMG_SIZE must match the imgsz used for the ONNX export
IMG_SIZE = 640

# Load the SavedModel
converter = tf.lite.TFLiteConverter.from_saved_model("yolov8_saved_model")

# Enable quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Set the representative dataset
converter.representative_dataset = representative_data_gen

# Force full-integer quantization, including the model input and output.
# This can give the best performance on some hardware, at some cost in accuracy.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # input is int8
converter.inference_output_type = tf.int8  # output is int8

# Convert the model
tflite_model_int8 = converter.convert()

# Save the TFLite model
with open("yolov8s_int8.tflite", "wb") as f:
    f.write(tflite_model_int8)

print("YOLOv8 INT8 TFLite model saved as yolov8s_int8.tflite")

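Key parameters:

converter.optimizations = [tf.lite.Optimize.DEFAULT]: enables the default optimizations, which include quantization.

converter.representative_dataset = representative_data_gen: supplies the dataset used for quantization calibration.

converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]: restricts the converter to TFLite's built-in INT8 operators; conversion fails if an operator has no INT8 implementation.

converter.inference_input_type = tf.int8: the model expects INT8 input, so your inference code has to quantize the input data to INT8.

converter.inference_output_type = tf.int8: the model produces INT8 output, so your inference code has to dequantize the output back to floating point.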
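
The quantize and dequantize steps that the last two parameters push into your inference code follow the affine mapping real = (q - zero_point) * scale. The sketch below shows that arithmetic on plain Python floats; the scale and zero point are illustrative values, not ones read from a real model (at inference time they come from the interpreter's input and output details).

```python
def quantize(value, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to int8: round to nearest, shift by zero_point, clip."""
    q = int(round(value / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an int8 value back to an approximate float."""
    return (q - zero_point) * scale


# Illustrative parameters for an input normalized to [0, 1]
scale, zero_point = 1.0 / 255.0, -128

for x in (0.0, 0.5, 1.0, 2.0):
    q = quantize(x, scale, zero_point)
    print(f"{x} -> q={q} -> {dequantize(q, scale, zero_point):.4f}")
```

Values outside the representable range (2.0 in the loop) saturate at the int8 limits, which is why clipping before the cast matters.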

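If you want INT8 quantization inside the model while keeping FP32 input and output:
This mode is integer quantization with float fallback. Because a representative dataset is supplied, both weights and activations are quantized to INT8, the model input and output stay FP32, and operators without an INT8 kernel fall back to float. (Strictly weight-only quantization is the dynamic range mode, which needs no representative dataset.) It usually gives a good speedup with a smaller accuracy loss.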

import tensorflow as tf
import numpy as np
import cv2
from pathlib import Path

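print(f"TensorFlow version: {tf.__version__}")
# Compare numerically; a plain string comparison would rank '2.10.0' below '2.5.0'
tf_major_minor = tuple(int(part) for part in tf.__version__.split(".")[:2])
if tf_major_minor < (2, 5):
    print("Warning: TensorFlow is older than 2.5.0; some features may be incompatible. Consider upgrading.")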

# --- Configuration ---
SAVED_MODEL_DIR = "yolov8_saved_model"
CALIBRATION_DATA_PATH = Path("calibration_data/images")
OUTPUT_TFLITE_PATH = "yolov8s_weights_int8_fp32io.tflite"

# YOLOv8 input size; must match the imgsz used for the ONNX export.
# The default YOLOv8s input is 640x640.
IMG_SIZE = 640

# --- 1. Calibration image preprocessing ---
# Loads and preprocesses one image so that it matches the YOLOv8 model input.
# img_size is (height, width).
def preprocess_image_for_yolov8(image_path, img_size=(IMG_SIZE, IMG_SIZE)):
    img = cv2.imread(str(image_path))
    if img is None:
        print(f"Warning: could not load image {image_path}")
        return None

    # Letterbox resize (same as YOLOv8 training/inference preprocessing)
    shape = img.shape[:2]  # current shape [height, width]
    r = min(img_size[0] / shape[0], img_size[1] / shape[1])
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = img_size[1] - new_unpad[0], img_size[0] - new_unpad[1]  # width, height paddings
    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=(114, 114, 114))  # add border

    # BGR to RGB, HWC to CHW, normalize to [0, 1]
    img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
    img = np.ascontiguousarray(img)
    img = img.astype(np.float32) / 255.0

    # The converter expects each sample as an FP32 array shaped like the model input.
    # Add the batch dimension: (1, C, H, W)
    img = np.expand_dims(img, axis=0)
    return img

# --- 2. Representative dataset generator ---
# The TensorFlow Lite Converter calls this generator during quantization
def representative_data_gen():
    image_files = list(CALIBRATION_DATA_PATH.glob('*.jpg')) + \
                  list(CALIBRATION_DATA_PATH.glob('*.png'))

    # Cap the number of calibration images so calibration stays fast while
    # still collecting enough statistics; 100-500 images is usually enough.
    num_calibration_samples = min(len(image_files), 300)
    print(f"Using {num_calibration_samples} images for quantization calibration...")

    for i in range(num_calibration_samples):
        img_path = image_files[i]
        preprocessed_img = preprocess_image_for_yolov8(img_path)
        if preprocessed_img is not None:
            # Yield a list of tensors, one per model input.
            # YOLOv8 has a single input, so the list has one element.
            # The input must be tf.float32.
            yield [tf.constant(preprocessed_img, dtype=tf.float32)]
        else:
            print(f"Skipping image: {img_path}")

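# --- 3. Create the TFLite Converter and set the quantization options ---
print(f"\nConverting SavedModel ({SAVED_MODEL_DIR}) to TFLite (INT8 internals, FP32 input/output)...")
converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)

# Enable the default optimizations; with a representative dataset this
# quantizes both weights and activations
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Representative dataset for quantization calibration
converter.representative_dataset = representative_data_gen


# Key point: inference_input_type and inference_output_type are NOT set.
# The model then accepts FP32 input and produces FP32 output, while internal
# operations and weights are quantized to INT8 where possible.
# converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# The line above is usually unnecessary here: without it, operators that have
# no INT8 kernel simply stay in float. Add it only if the target device has
# INT8-only acceleration and you want conversion to fail on unsupported ops.

# --- 4. Run the conversion ---
try:
    tflite_model_quant = converter.convert()
    print("Model conversion succeeded.")
except Exception as e:
    print(f"Model conversion failed: {e}")
    # Add more detailed logging or a traceback here if needed
    raise SystemExit(1)

# --- 5. Save the quantized TFLite model ---
with open(OUTPUT_TFLITE_PATH, "wb") as f:
    f.write(tflite_model_quant)

print(f"Quantized TFLite model saved to: {OUTPUT_TFLITE_PATH}")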

# --- 6. Verify the TFLite model (optional) ---
print("\nVerifying the generated TFLite model...")
try:
    interpreter = tf.lite.Interpreter(model_path=OUTPUT_TFLITE_PATH)
    interpreter.allocate_tensors()

    # Input and output tensor details
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    print("--- TFLite model input details ---")
    for detail in input_details:
        print(f"  name: {detail['name']}, shape: {detail['shape']}, dtype: {detail['dtype']}")
        print(f"  quantization (scale, zero_point): {detail['quantization']}")  # (0.0, 0) with FP32 I/O

    print("--- TFLite model output details ---")
    for detail in output_details:
        print(f"  name: {detail['name']}, shape: {detail['shape']}, dtype: {detail['dtype']}")
        print(f"  quantization (scale, zero_point): {detail['quantization']}")  # (0.0, 0) with FP32 I/O

    # Quick inference smoke test with random data
    input_shape = input_details[0]['shape']
    dummy_input = np.random.rand(*input_shape).astype(np.float32)

    interpreter.set_tensor(input_details[0]['index'], dummy_input)
    interpreter.invoke()
    dummy_output = interpreter.get_tensor(output_details[0]['index'])

    print(f"Dummy inference succeeded, output shape: {dummy_output.shape}, dtype: {dummy_output.dtype}")

    # Check whether the model actually contains INT8 tensors.
    # This is not a rigorous check, but it helps as a sanity test.
    has_int8_tensors = False
    for tensor_detail in interpreter.get_tensor_details():
        if tensor_detail['dtype'] == np.int8:
            has_int8_tensors = True
            break
    print(f"TFLite model contains INT8 tensors: {has_int8_tensors}")


except Exception as e:
    print(f"TFLite model verification failed: {e}")

print("\nConversion and verification finished.")

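I recommend trying this variant first: it is usually easier to get working and loses less accuracy.

Step 6: Load the INT8 model and run inference with the TFLite interpreter

When loading an INT8 TFLite model for inference, pay attention to the input and output data types.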

import tensorflow as tf
import numpy as np
import cv2

# Load the TFLite model
interpreter = tf.lite.Interpreter(model_path="yolov8s_int8.tflite")
interpreter.allocate_tensors()

# Input and output tensor details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

print("Input details:", input_details)
print("Output details:", output_details)

# Shape and dtype of the input tensor
input_shape = input_details[0]['shape']
input_dtype = input_details[0]['dtype']
print(f"Model expects input shape: {input_shape}, dtype: {input_dtype}")

# Assumes the model expects [1, 3, 640, 640] (NCHW); adjust if yours is NHWC
_, C, H, W = input_shape

# Load a test image (preprocessed the same way as during calibration)
test_image_path = "path/to/your/test_image.jpg"  # replace with your test image path
img = cv2.imread(test_image_path)
if img is None:
    print(f"Could not load test image: {test_image_path}")
    raise SystemExit(1)

def preprocess_for_tflite(image, img_size=(H, W), input_dtype=np.float32):
    # Same logic as preprocess_image used by representative_data_gen.
    # img_size is (height, width).
    shape = image.shape[:2]
    r = min(img_size[0] / shape[0], img_size[1] / shape[1])
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = img_size[1] - new_unpad[0], img_size[0] - new_unpad[1]
    dw /= 2
    dh /= 2

    if shape[::-1] != new_unpad:
        image = cv2.resize(image, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    image = cv2.copyMakeBorder(image, top, bottom, left, right, cv2.BORDER_CONSTANT, value=(114, 114, 114))

    image = image[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
    image = np.ascontiguousarray(image)
    image = image.astype(np.float32) / 255.0  # normalize to [0, 1]

    # Add the batch dimension
    input_data = np.expand_dims(image, axis=0)

    # An int8 input has to be quantized first
    if input_dtype == np.int8:
        # Quantization parameters of the input tensor
        input_scale, input_zero_point = input_details[0]['quantization']
        # Round to nearest and clip to the int8 range before casting;
        # a bare astype would truncate and could wrap around
        input_data = np.round(input_data / input_scale + input_zero_point)
        input_data = np.clip(input_data, -128, 127).astype(np.int8)

    return input_data

input_data = preprocess_for_tflite(img, img_size=(H, W), input_dtype=input_dtype)

# Set the input tensor
interpreter.set_tensor(input_details[0]['index'], input_data)

# Run inference
interpreter.invoke()

# Fetch the output tensor
output_data = interpreter.get_tensor(output_details[0]['index'])

# An int8 output has to be dequantized
if output_details[0]['dtype'] == np.int8:
    output_scale, output_zero_point = output_details[0]['quantization']
    output_data = (output_data.astype(np.float32) - output_zero_point) * output_scale

print("TFLite INT8 inference finished.")
print("Output shape:", output_data.shape)


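# Post-processing (box decoding, NMS, etc.)
# You have to implement this yourself based on the YOLOv8 output format.
# The raw Ultralytics detection output is typically [1, 4 + num_classes, num_predictions]
# (for example [1, 84, 8400] for 80 classes at 640x640). Check output_data.shape,
# because a converter may transpose it to [1, num_predictions, 4 + num_classes].
# The raw output still has to be parsed to obtain the final boxes and scores.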
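
One part of box decoding that is easy to get wrong is mapping boxes from the letterboxed 640 x 640 input back to the original image. The sketch below is one way to do it, assuming boxes are already in (x1, y1, x2, y2) pixel coordinates of the letterboxed image and that the same letterbox formulas as in the preprocessing code were used; the helper name scale_box_to_original is hypothetical.

```python
def scale_box_to_original(box, orig_h, orig_w, target=640):
    """Map an (x1, y1, x2, y2) box from letterboxed to original pixel coordinates."""
    # Recompute the scale and the top/left padding used during preprocessing
    r = min(target / orig_h, target / orig_w)
    new_w, new_h = int(round(orig_w * r)), int(round(orig_h * r))
    pad_left = int(round((target - new_w) / 2 - 0.1))
    pad_top = int(round((target - new_h) / 2 - 0.1))

    def clamp(value, upper):
        # Keep coordinates inside the original image
        return max(0.0, min(float(upper), value))

    x1, y1, x2, y2 = box
    return (
        clamp((x1 - pad_left) / r, orig_w),
        clamp((y1 - pad_top) / r, orig_h),
        clamp((x2 - pad_left) / r, orig_w),
        clamp((y2 - pad_top) / r, orig_h),
    )


# A 1280x720 image is scaled by 0.5 and padded by 140 px at the top and bottom
print(scale_box_to_original((100, 240, 300, 440), orig_h=720, orig_w=1280))
```

Removing the padding first and dividing by the scale afterwards is the inverse of the resize-then-pad order used in preprocessing.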

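Troubleshooting and notes:

ONNX to TensorFlow conversion errors

YOLOv8 graphs (the detection head in particular, plus any embedded post-processing) often cause trouble in the onnx-tf conversion, because onnx-tf may not support every ONNX operator that YOLOv8 uses.

   Try exporting the ONNX model with a different opset version.

   Check the onnx-tf GitHub issues to see whether someone has hit the same problem and found a fix.

Alternatives: if onnx-tf cannot convert the model, you may have to take a more involved route, for example:

   Export via TorchScript instead (the Ultralytics exporter supports it).

   Use another tool such as the MNN converter.

   Rebuild the TensorFlow graph by hand: the most laborious option, but workable. Re-implement the YOLOv8 inference logic in TensorFlow.

   Split off the post-processing: if the problem is in the post-processing (NMS), convert only the network itself to TFLite and run NMS on the CPU with Numpy or OpenCV.

Note: whether the exported ONNX graph contains an NMS (Non-Maximum Suppression) operator depends on the export settings. With the export call in Step 3 the graph normally outputs raw predictions without NMS. If your export does embed NMS (some Ultralytics versions expose an nms export argument), that operator may be unsupported or convert poorly in onnx-tf or TFLite; in that case export without it and run NMS manually after TFLite inference.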
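
TFLite quantization accuracy loss

INT8 quantization always costs some accuracy. A sufficiently large and representative calibration set keeps that loss as small as possible.

If the accuracy loss is too large, consider the following:

Use the variant with FP32 input/output and INT8 internals (the second script in Step 5).

Try OpenVINO quantization, which may do better on some models and is optimized for Intel hardware.

Consider Quantization Aware Training (QAT); it is more involved and requires changes to the training code.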
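
Input shape and data layout

Check carefully which input shape and data layout the TFLite model expects ([N, C, H, W] vs [N, H, W, C]). The YOLOv8 ONNX export normally uses CHW. Make sure the Numpy arrays produced by your preprocessing and by representative_data_gen match it.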
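
A quick way to act on this is to inspect the shape reported by the interpreter before writing any preprocessing. The helper below is a minimal sketch; the name infer_layout is hypothetical, and it assumes a 4-D, 3-channel image input whose height and width are not themselves 3.

```python
def infer_layout(input_shape):
    """Guess 'NCHW' or 'NHWC' from a 4-D, 3-channel image input shape."""
    shape = [int(d) for d in input_shape]
    if len(shape) != 4:
        raise ValueError(f"expected a 4-D input shape, got {shape}")
    if shape[1] == 3 and shape[3] != 3:
        return "NCHW"
    if shape[3] == 3 and shape[1] != 3:
        return "NHWC"
    raise ValueError(f"cannot tell the layout from shape {shape}")


print(infer_layout([1, 3, 640, 640]))  # channels right after the batch dimension
print(infer_layout([1, 640, 640, 3]))  # channels last
```

In the inference script this could be called as infer_layout(input_details[0]['shape']), skipping the HWC-to-CHW transpose in the preprocessing when the result is NHWC.
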
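
TFLite output format

The raw YOLOv8 output (in both ONNX and TFLite) is a single dense tensor that has to be post-processed before it yields usable bounding boxes, confidences, and class labels. You still have to implement this post-processing yourself after TFLite inference. It usually means parsing a tensor of shape [batch, 4 + num_classes, num_predictions] (or its transpose [batch, num_predictions, 4 + num_classes], depending on the converter), then applying a confidence threshold and NMS.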
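
As a rough illustration of that post-processing, here is a pure-Python sketch of thresholding followed by greedy per-class NMS. It assumes each prediction row is [cx, cy, w, h, score_0, ..., score_k] in pixel units with no separate objectness score; the function names and the threshold values are illustrative choices, not values taken from the model.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def decode_predictions(rows, conf_thres=0.25):
    """Turn rows of [cx, cy, w, h, scores...] into (box, score, class_id) tuples."""
    detections = []
    for row in rows:
        cx, cy, w, h = row[:4]
        scores = row[4:]
        class_id = max(range(len(scores)), key=scores.__getitem__)
        score = scores[class_id]
        if score < conf_thres:
            continue  # drop low-confidence predictions
        box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)  # center to corners
        detections.append((box, score, class_id))
    return detections


def nms(detections, iou_thres=0.45):
    """Greedy NMS: keep the best box, drop same-class boxes that overlap it."""
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(det[2] != k[2] or iou(det[0], k[0]) <= iou_thres for k in kept):
            kept.append(det)
    return kept


rows = [
    [100, 100, 50, 50, 0.90, 0.10],  # class 0
    [102, 101, 50, 50, 0.80, 0.05],  # near-duplicate of the first box
    [300, 300, 40, 40, 0.20, 0.70],  # class 1
    [400, 400, 40, 40, 0.10, 0.10],  # below the confidence threshold
]
for box, score, class_id in nms(decode_predictions(rows)):
    print(class_id, score, box)
```

With a real model, the rows would come from the dequantized output, for example output_data[0].T.tolist() when the output is [1, 4 + num_classes, num_predictions]. For thousands of predictions a vectorized Numpy version or OpenCV's NMS would be much faster than these loops.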