Journal of Graphics ›› 2023, Vol. 44 ›› Issue (6): 1173-1182.DOI: 10.11996/JG.j.2095-302X.2023061173
Future frame prediction based on multi-branch aggregation for lightweight video anomaly detection
HUANG Shao-nian1, WEN Pei-ran1, QUAN Qi1, CHEN Rong-yuan2
Received: 2023-06-30
Accepted: 2023-10-08
Online: 2023-12-31
Published: 2023-12-17
Contact: CHEN Rong-yuan (1977-), professor, Ph.D. His main research interests cover graphic image processing, etc.
About author: HUANG Shao-nian (1977-), associate professor, Ph.D. Her main research interest covers video content analysis. E-mail: snhuang@hutb.edu.cn
HUANG Shao-nian, WEN Pei-ran, QUAN Qi, CHEN Rong-yuan. Future frame prediction based on multi-branch aggregation for lightweight video anomaly detection[J]. Journal of Graphics, 2023, 44(6): 1173-1182.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2023061173
| Method | Ped2 AUC (%) | Avenue AUC (%) | ShanghaiTech AUC (%) | Params (MB) | FLOPs (G) |
|---|---|---|---|---|---|
| ConvLSTM-AE | 88.1 | 77.0 | - | 6.79 | 89.20 |
| MemAE | 94.1 | 83.4 | 71.2 | 15.10 | 38.42 |
| MNAD-P | 97.0 | 88.5 | 70.5 | 15.65 | 43.99 |
| Conv-VRNN | 96.1 | 85.8 | - | 93.40 | 275.60 |
| GADNet | 96.1 | 86.2 | 73.2 | - | - |
| VEC | 97.3 | 90.2 | 74.8 | 21.47 | 1.99 |
| MCP | 98.0 | 92.1 | 75.3 | - | - |
| HF2-VAD | 99.3 | 91.1 | 76.2 | 11.80 | 1.84 |
| LFP-MBA | 99.4 | 89.8 | 74.2 | 5.40 | 0.60 |

Table 1 Performance comparison on three datasets
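The AUC values in Table 1 are frame-level ROC AUC scores, the usual protocol for prediction-based detectors [10]: the quality of each predicted frame is turned into a per-frame anomaly score and compared against the frame-level labels. The sketch below illustrates that protocol with a PSNR-based score on hypothetical prediction and ground-truth arrays; it is an assumption-level illustration, not the authors' evaluation code.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def psnr(pred, gt, max_val=1.0):
    """Peak signal-to-noise ratio between a predicted frame and its ground truth."""
    mse = np.mean((pred - gt) ** 2)
    return 10.0 * np.log10(max_val ** 2 / (mse + 1e-8))

def frame_level_auc(pred_frames, gt_frames, labels):
    """pred_frames, gt_frames: (N, H, W, C) arrays in [0, 1]; labels: (N,) with 1 = anomalous frame.
    Scores are min-max normalized over the clip, as in the common frame-level protocol."""
    psnrs = np.array([psnr(p, g) for p, g in zip(pred_frames, gt_frames)])
    # Higher PSNR means a better prediction, i.e. a more normal frame, so invert it.
    scores = 1.0 - (psnrs - psnrs.min()) / (psnrs.max() - psnrs.min() + 1e-8)
    return roc_auc_score(labels, scores)
```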
| Encoder unit | AUC (%) | Params (MB) | FLOPs (G) |
|---|---|---|---|
| Original Transformer | 94.2 | 13.2 | 1.5 |
| K-MB-Transformers | 90.6 | 4.1 | 0.4 |
| MB-Transformers | 99.4 | 5.4 | 0.6 |

Table 2 Performance of multi-branch Transformer encoders
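The MB-Transformer rows in Tables 2 and 3 indicate that replacing a single wide Transformer block with several narrow branches whose outputs are aggregated cuts parameters and FLOPs substantially. The PyTorch block below is only a rough illustration of that multi-branch idea under assumed settings (channel-wise split into num_branches parallel attention/MLP branches, aggregation by concatenation, dim=256); it is not the authors' MB-Transformer or K-MB-Transformer design.

```python
import torch
import torch.nn as nn

class MultiBranchEncoderBlock(nn.Module):
    """Illustrative multi-branch Transformer encoder block (assumption-level sketch)."""
    def __init__(self, dim=256, num_branches=4, heads_per_branch=1, mlp_ratio=2):
        super().__init__()
        assert dim % num_branches == 0
        d = dim // num_branches
        self.num_branches = num_branches
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.ModuleList([
            nn.MultiheadAttention(d, heads_per_branch, batch_first=True)
            for _ in range(num_branches)])
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.ModuleList([
            nn.Sequential(nn.Linear(d, d * mlp_ratio), nn.GELU(), nn.Linear(d * mlp_ratio, d))
            for _ in range(num_branches)])

    def forward(self, x):                      # x: (B, N, dim) token sequence
        chunks = self.norm1(x).chunk(self.num_branches, dim=-1)
        attn_out = [a(c, c, c, need_weights=False)[0] for a, c in zip(self.attn, chunks)]
        x = x + torch.cat(attn_out, dim=-1)    # aggregate branches, residual connection
        chunks = self.norm2(x).chunk(self.num_branches, dim=-1)
        x = x + torch.cat([m(c) for m, c in zip(self.mlp, chunks)], dim=-1)
        return x
```

Because each branch attends within a dim/num_branches slice, the attention and MLP weight matrices shrink roughly by the branch count, which is the kind of saving reflected in the Params and FLOPs columns.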
| Decoder module | AUC (%) | Params (MB) | FLOPs (G) |
|---|---|---|---|
| Original Transformer | 95.0 | 15.4 | 1.7 |
| MB-Transformers | 97.3 | 10.1 | 1.1 |
| K-MB-Transformers | 99.4 | 5.4 | 0.6 |

Table 3 Performance of multi-branch Transformer decoders
| Operation | AUC |
|---|---|
| No connection | 98.5 |
| Connection | 98.9 |
| Branch connection | 99.4 |

Table 4 Performance of the branch connection operation (%)
| Preprocessing | AUC |
|---|---|
| Without sliding window | 94.8 |
| 2×2 sliding window | 99.4 |
| 32×32 feature map | 99.4 |
| 64×64 feature map | 96.4 |
| Without foreground patches | 95.7 |
| With foreground patches | 99.4 |

Table 5 Performance of data preprocessing (%)
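Table 5 shows that cropping foreground patches and using a sliding-window partition both matter for the final AUC. The paper's exact preprocessing pipeline (including its 2×2 sliding window on feature maps) is not reproduced here; the snippet below is only a generic frame-differencing illustration of cropping foreground patches with a sliding window, with the patch size, stride, and motion threshold chosen arbitrarily.

```python
import cv2
import numpy as np

def foreground_patches(frame, prev_frame, patch=32, stride=16, diff_thresh=15):
    """Crop fixed-size patches from regions that changed between consecutive frames.
    Generic illustration only, not the authors' preprocessing."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    prev = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    motion = cv2.absdiff(gray, prev)                      # simple frame-difference foreground cue
    patches = []
    h, w = gray.shape
    for y in range(0, h - patch + 1, stride):             # sliding window over the frame
        for x in range(0, w - patch + 1, stride):
            window = motion[y:y + patch, x:x + patch]
            if window.mean() > diff_thresh:               # keep windows with enough motion
                patches.append(frame[y:y + patch, x:x + patch])
    return np.stack(patches) if patches else np.empty((0, patch, patch, 3), dtype=frame.dtype)
```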
| Number of branches L | AUC |
|---|---|
| 2 | 96.0 |
| 4 | 99.4 |
| 8 | 99.4 |

Table 6 Performance with different branch numbers (%)
| Number of clusters k | AUC |
|---|---|
| 64 | 96.1 |
| 100 | 99.4 |
| 120 | 97.2 |
| 400 | 96.8 |

Table 7 Performance with different cluster numbers (%)
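Table 7 sweeps the number of clusters k used to group the learned features, with k = 100 performing best; the reference list points to Lloyd's k-means [21]. A minimal scikit-learn sketch of such clustering on a hypothetical feature matrix (placeholder data, assumed 128-d embeddings) is shown below.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical feature matrix: one 128-d embedding per foreground patch (placeholder data).
features = np.random.randn(10000, 128).astype(np.float32)

# k = 100 gave the best AUC in Table 7; the clustering itself is standard k-means [21].
kmeans = KMeans(n_clusters=100, n_init=10, random_state=0).fit(features)
centers = kmeans.cluster_centers_   # (100, 128) cluster prototypes
assignments = kmeans.labels_        # cluster index assigned to each feature
```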
[1] NAYAK R, PATI U C, DAS S K. A comprehensive review on deep learning-based methods for video anomaly detection[J]. Image and Vision Computing, 2021, 106: 104078.
[2] YANG F, XIAO B, YU Z W. Anomaly detection and modeling of surveillance video[J]. Journal of Computer Research and Development, 2021, 58(12): 2708-2723 (in Chinese).
[3] CHEN Y D, CHEN L R, YU W B, et al. Knowledge distillation anomaly detection with multi-scale feature fusion[J]. Journal of Computer-Aided Design & Computer Graphics, 2022, 34(10): 1542-1549 (in Chinese).
[4] GONG D, LIU L Q, LE V, et al. Memorizing normality to detect anomaly: memory-augmented deep autoencoder for unsupervised anomaly detection[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2019: 1705-1714.
[5] CHANG Y P, TU Z G, XIE W, et al. Clustering driven deep autoencoder for video anomaly detection[C]// European Conference on Computer Vision. Cham: Springer, 2020: 329-345.
[6] OUYANG Y Q, SANCHEZ V. Video anomaly detection by estimating likelihood of representations[C]// 2020 25th International Conference on Pattern Recognition. New York: IEEE Press, 2021: 8984-8991.
[7] ASTRID M, ZAHEER M Z, LEE S I. Synthetic temporal anomaly guided end-to-end video anomaly detection[C]// 2021 IEEE/CVF International Conference on Computer Vision Workshops. New York: IEEE Press, 2021: 207-214.
[8] CHEN D Y, YUE L Y, CHANG X Y, et al. NM-GAN: noise-modulated generative adversarial network for video anomaly detection[J]. Pattern Recognition, 2021, 116: 107969.
[9] BERGAOUI K, NAJI Y, SETKOV A, et al. Object-centric and memory-guided normality reconstruction for video anomaly detection[C]// 2022 IEEE International Conference on Image Processing. New York: IEEE Press, 2022: 2691-2695.
[10] LIU W, LUO W X, LIAN D Z, et al. Future frame prediction for anomaly detection — a new baseline[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 6536-6545.
[11] WANG X Z, CHE Z P, JIANG B, et al. Robust unsupervised video anomaly detection by multipath frame prediction[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 33(6): 2301-2312.
[12] LIU W, LUO W X, LIAN D Z, et al. Future frame prediction for anomaly detection — a new baseline[EB/OL]. [2023-01-12]. https://www.doc88.com/p-6037800762403.html.
[13] LI S, FANG J W, XU H K, et al. Video frame prediction by deep multi-branch mask network[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 31(4): 1283-1295.
[14] SHI X J, CHEN Z R, WANG H, et al. Convolutional LSTM network: a machine learning approach for precipitation nowcasting[C]// The 28th International Conference on Neural Information Processing Systems - Volume 1. New York: ACM, 2015: 802-810.
[15] KWON Y H, PARK M G. Predicting future frames using retrospective cycle GAN[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 1811-1820.
[16] MONIRUZZAMAN M D, RASSAU A, CHAI D, et al. Long future frame prediction using optical flow-informed deep neural networks for enhancement of robotic teleoperation in high latency environments[J]. Journal of Field Robotics, 2023, 40(2): 393-425.
[17] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale[EB/OL]. [2023-01-12]. https://arxiv.org/abs/2010.11929.pdf.
[18] LEE J, NAM W J, LEE S W. Multi-contextual predictions with vision transformer for video anomaly detection[C]// 2022 26th International Conference on Pattern Recognition. New York: IEEE Press, 2022: 1012-1018.
[19] FENG X Y, SONG D J, CHEN Y C, et al. Convolutional transformer based dual discriminator generative adversarial networks for video anomaly detection[C]// The 29th ACM International Conference on Multimedia. New York: ACM, 2021: 5546-5554.
[20] ULLAH W, HUSSAIN T, ULLAH F U M, et al. TransCNN: hybrid CNN and transformer mechanism for surveillance anomaly detection[J]. Engineering Applications of Artificial Intelligence, 2023, 123: 106173.
[21] LLOYD S. Least squares quantization in PCM[J]. IEEE Transactions on Information Theory, 1982, 28(2): 129-137.
[22] JANG E, GU S X, POOLE B. Categorical reparameterization with gumbel-softmax[EB/OL]. [2023-01-12]. https://arxiv.org/abs/1611.01144v4.
[23] LUO W X, LIU W, GAO S H. Remembering history with convolutional LSTM for anomaly detection[C]// 2017 IEEE International Conference on Multimedia and Expo. New York: IEEE Press, 2017: 439-444.
[24] PARK H, NOH J, HAM B. Learning memory-guided normality for anomaly detection[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 14360-14369.
[25] LU Y W, KUMAR K M, NABAVI S S, et al. Future frame prediction using convolutional VRNN for anomaly detection[C]// 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance. New York: IEEE Press, 2019: 1-8.
[26] LI C B, LI H J, ZHANG G A. Future frame prediction based on generative assistant discriminative network for anomaly detection[J]. Applied Intelligence, 2023, 53(1): 542-559.
[27] YU G, WANG S Q, CAI Z P, et al. Cloze test helps: effective video anomaly detection via learning to complete video events[C]// The 28th ACM International Conference on Multimedia. New York: ACM, 2020: 583-591.
[28] LEE J, NAM W J, LEE S W. Multi-contextual predictions with vision transformer for video anomaly detection[C]// 2022 26th International Conference on Pattern Recognition. New York: IEEE Press, 2022: 1012-1018.
[29] LIU Z A, NIE Y W, LONG C J, et al. A hybrid video anomaly detection framework via memory-augmented flow reconstruction and flow-guided frame prediction[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 13588-13597.