
Journal of Graphics ›› 2025, Vol. 46 ›› Issue (4): 709-718. DOI: 10.11996/JG.j.2095-302X.2025040709

• Image Processing and Computer Vision •

A post-training quantization method for lightweight convolutional neural networks

YANG Jie1, LI Cong1, HU Qinghao2, CHEN Xianda1, WANG Yunpeng1, LIU Xiaojing1

  1. State Grid Jinan Power Supply Company, Jinan, Shandong 250012, China
    2. The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2024-10-05  Revised: 2024-12-13  Published: 2025-08-30  Online: 2025-08-11
  • Corresponding author: HU Qinghao (1992-), male, associate researcher, Ph.D. His main research interest covers lightweight deep neural networks. E-mail: huqinghao2014@ia.ac.cn
  • First author: YANG Jie (1989-), male, senior engineer, master. His main research interests cover equipment operation and inspection and edge intelligence. E-mail: 18753137902@139.com
  • Supported by:
    The Science and Technology Project of State Grid Shandong Electric Power Company (52060122000Q)

A post-training quantization method for lightweight CNNs

YANG Jie1, LI Cong1, HU Qinghao2, CHEN Xianda1, WANG Yunpeng1, LIU Xiaojing1

  1. State Grid Jinan Power Supply Company, Jinan, Shandong 250012, China
    2. The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2024-10-05  Revised: 2024-12-13  Published: 2025-08-30  Online: 2025-08-11
  • First author: YANG Jie (1989-), senior engineer, master. His main research interests cover equipment operation and inspection and edge intelligence. E-mail: 18753137902@139.com
  • Supported by:
    The Science and Technology Project of State Grid Shandong Electric Power Company (52060122000Q)

Abstract:

Current post-training quantization (PTQ) methods can achieve nearly lossless accuracy at high quantization bit-widths, but for lightweight convolutional neural networks (CNNs) the quantization error remains non-negligible, especially in the case of low bit-width (<4-bit) quantization. To address this problem, a post-training quantization method for lightweight CNNs, the block-level BatchNorm learning (BBL) method, was proposed. Unlike current post-training quantization methods, which fold the batch normalization layers away, this method kept the batch normalization weights on a per-block basis, learned the model's quantization parameters and the batch normalization parameters from a block-level feature-map reconstruction loss, and updated the batch normalization statistics such as the mean and variance, thereby alleviating, in a simple and effective way, the distribution shift that arises when lightweight CNNs are quantized to low bit-widths. In addition, to reduce the overfitting of post-training quantization to the calibration dataset, a block-level data augmentation scheme was constructed so that different model blocks do not learn from the same batch of calibration data. Experiments on the ImageNet dataset show that, compared with current post-training quantization algorithms, the BBL method improves recognition accuracy by up to 7.72 percentage points and effectively reduces the quantization error incurred when lightweight CNNs undergo low-bit post-training quantization.
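To make the block-level procedure above concrete, the following PyTorch sketch keeps a block's BatchNorm layer unfused, learns the weight quantization step size together with the BN affine parameters against the full-precision block's output, and lets the BN running statistics update during calibration. The helper names (fake_quant, QuantConvBNBlock, calibrate_block), the 4-bit setting, and the optimizer choice are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of block-level BatchNorm learning for post-training quantization.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quant(x: torch.Tensor, scale: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    """Uniform symmetric fake quantization with a straight-through estimator on rounding."""
    qmax = 2 ** (num_bits - 1) - 1
    q = torch.clamp(x / scale, -qmax - 1, qmax)
    q = q + (torch.round(q) - q).detach()  # STE: round in forward, identity in backward
    return q * scale


class QuantConvBNBlock(nn.Module):
    """A quantized block that keeps its BatchNorm layer instead of folding it into the conv."""

    def __init__(self, conv: nn.Conv2d, bn: nn.BatchNorm2d, num_bits: int = 4):
        super().__init__()
        self.conv, self.bn, self.num_bits = conv, bn, num_bits
        # Learnable weight quantization step size, initialized from the weight range.
        init_scale = conv.weight.detach().abs().max() / (2 ** (num_bits - 1) - 1)
        self.w_scale = nn.Parameter(init_scale)

    def forward(self, x):
        w_q = fake_quant(self.conv.weight, self.w_scale, self.num_bits)
        y = F.conv2d(x, w_q, self.conv.bias, self.conv.stride,
                     self.conv.padding, self.conv.dilation, self.conv.groups)
        # BN stays live: in train mode its running mean/var are re-estimated
        # on the quantized activations.
        return self.bn(y)


def calibrate_block(fp_block: nn.Module, q_block: QuantConvBNBlock, calib_batches,
                    steps: int = 200, lr: float = 1e-3):
    """Learns the quantization step size and BN parameters from a block-level
    feature-map reconstruction (MSE) loss against the full-precision block."""
    fp_block.eval()
    q_block.train()  # keep BN running statistics updating during calibration
    optimizer = torch.optim.Adam([q_block.w_scale, *q_block.bn.parameters()], lr=lr)
    for step in range(steps):
        x = calib_batches[step % len(calib_batches)]
        with torch.no_grad():
            target = fp_block(x)               # full-precision block output
        loss = F.mse_loss(q_block(x), target)  # block-level reconstruction loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return q_block
```

Here fp_block would be a frozen copy of the original convolution-plus-BN block and calib_batches a small list of calibration tensors; keeping q_block in train mode is what re-estimates the BN mean and variance on quantized activations, which is the distribution-shift mitigation the abstract describes.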

Key words: deep neural network compression, post-training quantization, low-bit quantization, lightweight convolutional neural networks, lightweight intelligence

Abstract:

Current post-training quantization methods can achieve nearly lossless quantization at high bit-widths; however, for lightweight convolutional neural networks (CNNs), the quantization error remains non-negligible, especially in the case of low bit-width quantization (<4 bits). To address this, a post-training quantization method for lightweight CNNs, called the block-level BatchNorm learning (BBL) method, was proposed. Unlike current post-training quantization methods that merge the batch normalization layers, this method retained the weights of the batch normalization layers on a per-block basis and learned the quantization parameters and the batch normalization parameters based on a block-level feature map reconstruction loss. It also updated the mean and variance statistics of the batch normalization layers, mitigating the distribution shift caused by low-bit quantization of lightweight CNNs in a simple and effective manner. Furthermore, to reduce overfitting of the post-training quantization method to the calibration dataset, a block-level data augmentation approach was constructed, ensuring that different model blocks did not learn from the same batch of calibration data. Extensive experiments on the ImageNet dataset demonstrated that, compared with current post-training quantization algorithms, the BBL method improved accuracy by up to 7.72 percentage points and effectively reduced the quantization error caused by low-bit post-training quantization of lightweight CNNs.
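The block-level data augmentation idea can be pictured with a small torchvision-based sketch in which each block is calibrated on its own independently augmented and shuffled view of the calibration pool, so no two blocks reconstruct from an identical batch. The transform choices, the per-block seeding, and the helper names (CalibrationSet, calibration_loader_for_block) are assumptions for illustration rather than the paper's exact recipe.

```python
# Hypothetical sketch of block-level data augmentation for the calibration set.
import torch
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms


class CalibrationSet(Dataset):
    """Wraps raw calibration images (e.g., PIL images) with a given transform."""

    def __init__(self, images, transform):
        self.images = images
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.transform(self.images[idx])


def calibration_loader_for_block(images, block_index: int, batch_size: int = 32):
    """Builds a loader whose augmentation and shuffle order depend on the block index,
    so different blocks are never calibrated on an identical batch of data."""
    torch.manual_seed(block_index)  # torchvision's random transforms draw from torch's RNG
    augment = transforms.Compose([
        transforms.RandomResizedCrop(224, scale=(0.5, 1.0)),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    dataset = CalibrationSet(images, augment)
    shuffle_rng = torch.Generator().manual_seed(block_index)
    return DataLoader(dataset, batch_size=batch_size, shuffle=True, generator=shuffle_rng)
```

Combined with the block-wise calibration loop sketched earlier, each block would receive its own loader, which is one simple way to keep a small calibration set from being memorized by every block.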

Key words: deep neural network compression, post-training quantization, low-bit quantization, lightweight convolutional neural networks, lightweight intelligence

CLC number: