Journal of Graphics ›› 2025, Vol. 46 ›› Issue (4): 709-718.DOI: 10.11996/JG.j.2095-302X.2025040709
• Image Processing and Computer Vision •
YANG Jie1, LI Cong1, HU Qinghao2, CHEN Xianda1, WANG Yunpeng1, LIU Xiaojing1
Received: 2024-10-05
Revised: 2024-12-13
Online: 2025-08-30
Published: 2025-08-11
Contact: HU Qinghao
About author: YANG Jie (1989-), senior engineer, master. His main research interests cover equipment operation and inspection. E-mail: 18753137902@139.com
YANG Jie, LI Cong, HU Qinghao, CHEN Xianda, WANG Yunpeng, LIU Xiaojing. A post-training quantization method for lightweight CNNs[J]. Journal of Graphics, 2025, 46(4): 709-718.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2025040709
| Method | Bit-width (W/A) | MBV2 | SFV2 | Reg600 |
|---|---|---|---|---|
| Full-precision model | 32/32 | 72.49 | 69.36 | 73.71 |
| AdaRound | 4/4 | 61.52 | - | 68.20 |
| BRECQ | 4/4 | 67.51 | - | 70.44 |
| QDrop | 4/4 | 67.89 | 63.45 | 70.62 |
| Ours+QDrop | 4/4 | 68.13 | 64.80 | 70.74 |
| MRECG | 4/4 | 68.84 | - | 71.22 |
| Ours+MRECG | 4/4 | 68.97 | - | 71.09 |
| AdaRound | 2/4 | 36.31 | - | 57.00 |
| BRECQ | 2/4 | 52.30 | - | 61.77 |
| QDrop | 2/4 | 52.92 | 46.36 | 63.10 |
| Ours+QDrop | 2/4 | 55.16 | 52.11 | 63.72 |
| MRECG | 2/4 | 57.85 | - | 65.16 |
| Ours+MRECG | 2/4 | 58.84 | - | 65.62 |
| AdaRound | 3/3 | 34.55 | - | 58.29 |
| BRECQ | 3/3 | 52.03 | - | 62.61 |
| QDrop | 3/3 | 54.27 | 49.22 | 64.53 |
| Ours+QDrop | 3/3 | 56.33 | 52.88 | 64.58 |
| MRECG | 3/3 | 58.40 | - | 66.08 |
| Ours+MRECG | 3/3 | 60.51 | - | 66.22 |
| BRECQ | 2/2 | 7.03 | - | 28.89 |
| QDrop | 2/2 | 8.46 | 6.33 | 38.90 |
| Ours+QDrop | 2/2 | 16.18 | 10.50 | 41.29 |
| MRECG | 2/2 | 14.44 | - | 43.67 |
| Ours+MRECG | 2/2 | 22.03 | - | 45.27 |

Table 1  Accuracy of different PTQ methods under different bit-width settings
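The bit-width column (W/A) gives the number of bits used for weights and for activations. As a minimal sketch of what these settings mean, the snippet below simulates uniform ("fake") quantization at a given bit-width; it is an illustrative helper under simple min/max calibration, not the specific rounding or reconstruction scheme compared in Table 1:

```python
import numpy as np

def uniform_quantize(x, num_bits):
    """Simulated uniform quantization: snap float values to a num_bits
    integer grid (min/max calibration) and dequantize back, as applied
    to both weights (W) and activations (A) in post-training quantization."""
    qmax = 2 ** num_bits - 1
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = np.round(-lo / scale)
    q = np.clip(np.round(x / scale) + zero_point, 0, qmax)
    return (q - zero_point) * scale  # dequantized values

x = np.linspace(-1.0, 1.0, 9)
x4 = uniform_quantize(x, 4)   # the "4" in a 4/4 (W/A) setting
x2 = uniform_quantize(x, 2)   # the "2" in a 2/2 setting
# Lower bit-width -> coarser grid -> larger reconstruction error,
# matching the accuracy drop from 4/4 to 2/2 rows in Table 1.
assert np.abs(x - x2).max() >= np.abs(x - x4).max()
```

This illustrates why 2/2 rows degrade so sharply: with only four quantization levels, the rounding error per value is an order of magnitude larger than at 4 bits.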
| Method | Bit-width (W/A) | MBV2 Top-1 Acc/% | MBV2 Top-5 Acc/% | ResNet18 Top-1 Acc/% | ResNet18 Top-5 Acc/% |
|---|---|---|---|---|---|
| FlexRound | 4/4 | 66.66 | 87.21 | 69.26 | 88.81 |
| Ours* | 4/4 | 68.72 | 88.50 | 69.35 | 88.96 |
| FlexRound | 3/3 | 51.49 | 76.90 | 65.43 | 86.60 |
| Ours* | 3/3 | 56.60 | 80.48 | 66.05 | 87.08 |

Table 2  Accuracy comparison with FlexRound
| Method | Bit-width (W/A) | Block-level BN parameter learning | Block-level data augmentation | MBV2/% |
|---|---|---|---|---|
| Baseline | 4/4 |  |  | 67.89 |
| Baseline + BN parameter learning | 4/4 | √ |  | 68.08 |
| Ours | 4/4 | √ | √ | 68.13 |
| Baseline | 2/4 |  |  | 52.92 |
| Baseline + BN parameter learning | 2/4 | √ |  | 54.86 |
| Ours | 2/4 | √ | √ | 55.16 |
| Baseline | 3/3 |  |  | 54.27 |
| Baseline + BN parameter learning | 3/3 | √ |  | 56.28 |
| Ours | 3/3 | √ | √ | 56.33 |
| Baseline | 2/2 |  |  | 8.46 |
| Baseline + BN parameter learning | 2/2 | √ |  | 15.79 |
| Ours | 2/2 | √ | √ | 16.18 |

Table 3  Ablation study results
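The "block-level BN parameter learning" column refers to treating the batch-normalization affine parameters as additional trainable quantities when reconstructing each quantized block against its full-precision output. A toy sketch of why this helps follows; the data is synthetic and a closed-form least-squares fit stands in for the gradient-based tuning used in block reconstruction, so every name here is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: a full-precision block output and its quantized counterpart
# (in the real setup these come from a calibration set).
fp_out = rng.normal(size=1000)
quant_out = fp_out + rng.normal(scale=0.3, size=1000)  # quantization noise

def bn_affine(x, gamma, beta):
    """BN affine transform y = gamma * x + beta (single channel here)."""
    return gamma * x + beta

# Baseline: BN affine parameters frozen at their pretrained values (1, 0).
err_frozen = np.mean((bn_affine(quant_out, 1.0, 0.0) - fp_out) ** 2)

# BN parameter learning: fit gamma, beta to minimize the block
# reconstruction error (least squares in place of gradient descent).
A = np.stack([quant_out, np.ones_like(quant_out)], axis=1)
(gamma, beta), *_ = np.linalg.lstsq(A, fp_out, rcond=None)
err_learned = np.mean((bn_affine(quant_out, gamma, beta) - fp_out) ** 2)

# The frozen setting is one candidate in the search space, so the
# learned affine parameters can never reconstruct the block worse.
assert err_learned <= err_frozen
```

Since the pretrained (gamma, beta) is itself a feasible point, learning them can only shrink the block reconstruction error, which is consistent with the monotone gains in the "+BN parameter learning" rows of Table 3.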
[1] | KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90. |
[2] | REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 779-788. |
[3] | LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 3431-3440. |
[4] | ZHANG M, ZHANG F H, ZONG J P, et al. Face detection and embedded implementation of lightweight network[J]. Journal of Graphics, 2022, 43(2): 239-246 (in Chinese). |
[5] | PI J, LIU Y H, LI J H. Research on lightweight forest fire detection algorithm based on YOLOv5s[J]. Journal of Graphics, 2023, 44(1): 26-32 (in Chinese). |
[6] | COURBARIAUX M, BENGIO Y, DAVID J P. BinaryConnect: training deep neural networks with binary weights during propagations[C]// The 29th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015: 3123-3131. |
[7] | RASTEGARI M, ORDONEZ V, REDMON J, et al. XNOR-Net: ImageNet classification using binary convolutional neural networks[C]// The 14th European Conference on Computer Vision. Cham: Springer, 2016: 525-542. |
[8] | CHOI J, WANG Z, VENKATARAMANI S, et al. PACT: parameterized clipping activation for quantized neural networks[EB/OL]. [2024-07-05]. https://arxiv.org/abs/1805.06085. |
[9] | ZHANG D Q, YANG J L, YE D Q Z, et al. LQ-Nets: learned quantization for highly accurate and compact deep neural networks[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 373-390. |
[10] | ESSER S K, MCKINSTRY J L, BABLANI D, et al. Learned step size quantization[EB/OL]. [2024-06-05]. https://arxiv.org/abs/1902.08153. |
[11] | HU Q H, WANG P S, CHENG J. From hashing to CNNs: training binary weight networks via Hashing[EB/OL]. [2024-07-05]. https://ojs.aaai.org/index.php/AAAI/article/view/11660. |
[12] | JACOB B, KLIGYS S, CHEN B, et al. Quantization and training of neural networks for efficient integer-arithmetic- only inference[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 2704-2713. |
[13] | MIGACZ S. 8-bit inference with TensorRT[EB/OL]. [2024-06-05]. https://www.cse.iitd.ernet.in/~rijurekha/course/tensorrt.pdf. |
[14] | BANNER R, NAHSHAN Y, SOUDRY D. Post training 4-bit quantization of convolutional networks for rapid-deployment[C]// The 33rd International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2019: 7950-7958. |
[15] | WU D, TANG Q, ZHAO Y L, et al. EasyQuant: post-training quantization via scale optimization[EB/OL]. [2024-06-05]. https://arxiv.org/abs/2006.16669. |
[16] | WANG P S, CHEN Q, HE X Y, et al. Towards accurate post-training network quantization via bit-split and stitching[EB/OL]. [2024-06-05]. https://dl.acm.org/doi/10.5555/3524938.3525851. |
[17] | NAGEL M, AMJAD R, VAN BAALEN M, et al. Up or down? adaptive rounding for post-training quantization[EB/OL]. [2024-06-05]. https://dl.acm.org/doi/10.5555/3524938.3525605. |
[18] | LI Y H, GONG R H, TAN X, et al. BRECQ: pushing the limit of post-training quantization by block reconstruction[EB/OL]. [2024-07-05]. https://arxiv.org/abs/2102.05426. |
[19] | WEI X Y, GONG R H, LI Y H, et al. QDrop: randomly dropping quantization for extremely low-bit post-training quantization[EB/OL]. [2024-07-05]. https://arxiv.org/abs/2203.05740. |
[20] | MA Y X, LI H X, ZHENG X W, et al. Solving oscillation problem in post-training quantization through a theoretical perspective[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2023: 7950-7959. |
[21] | LEE J H, KIM J, KWON S J, et al. FlexRound: learnable rounding based on element-wise division for post-training quantization[EB/OL]. [2024-06-05]. https://dl.acm.org/doi/10.5555/3618408.3619189. |
[22] | HOWARD A G, ZHU M L, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[EB/OL]. [2024-07-05]. |
[23] | SANDLER M, HOWARD A, ZHU M L, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 4510-4520. |
[24] | ZHANG X Y, ZHOU X Y, LIN M X, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]// The IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 6848-6856. |
[25] | MA N N, ZHANG X Y, ZHENG H T, et al. ShuffleNet V2: practical guidelines for efficient CNN architecture design[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 122-138. |
[26] | RADOSAVOVIC I, KOSARAJU R P, GIRSHICK R, et al. Designing network design spaces[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 10425-10433. |
[27] | DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database[C]// 2009 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2009: 248-255. |