
Journal of Graphics ›› 2023, Vol. 44 ›› Issue (6): 1173-1182. DOI: 10.11996/JG.j.2095-302X.2023061173


Future frame prediction based on multi-branch aggregation for lightweight video anomaly detection

HUANG Shao-nian1, WEN Pei-ran1, QUAN Qi1, CHEN Rong-yuan2

  1. School of Computer Science, Hunan University of Technology and Business, Changsha, Hunan 410205, China
    2. School of Resource and Environment, Hunan University of Technology and Business, Changsha, Hunan 410205, China
  • Received: 2023-06-30 Accepted: 2023-10-08 Online: 2023-12-31 Published: 2023-12-17
  • Contact: Chen Rongyuan (1977-), professor, Ph.D. His main research interests include graphic image processing. E-mail: chry@hutb.edu.cn
  • About author:

    Huang Shaonian (1977-), associate professor, Ph.D. Her main research interests include video content analysis. E-mail: snhuang@hutb.edu.cn

  • Supported by:
    The National Social Science Foundation of China (21BTJ026); The Scientific Research Fund of Hunan Provincial Education Department (19A270, 21A0370); Funds for Creative Research of China Universities on the Integration of Industry, Education and Research (2020ITA09005, 2021ITA05049)

Abstract:

Video anomaly detection in complex scenes holds significant research value and has wide practical applications. Despite the remarkable performance of current prediction-based methods, they face challenges such as large numbers of model parameters. To address these problems, we proposed a lightweight frame-prediction model based on multi-branch aggregation. The proposed model leveraged Transformer units as its basic structure and employed multi-branch aggregation, significantly reducing the number of model parameters. This design not only lowered computational cost but also enhanced detection accuracy. Building on this foundation, we designed a multi-branch Transformer fusion encoder to extract the temporal motion features of normal events. The encoder used a multi-branch connection operation to achieve multi-layer feature fusion, improving its feature optimization ability. Moreover, a multi-branch clustering decoder based on K-means was developed to mitigate the impact of the diversity of normal features on anomaly detection performance. Experiments were conducted on three public datasets: UCSD Ped2, CUHK Avenue, and ShanghaiTech. The results demonstrated that the proposed model outperformed current mainstream algorithms, achieving better detection performance at lower computational cost.
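To illustrate the general scheme that frame-prediction anomaly detectors follow, the sketch below shows how a predicted future frame can be scored against the ground-truth frame via PSNR and converted into a per-frame anomaly score. This is a minimal, hedged example of the prediction-error scoring idea only; the TinyPredictor class is a placeholder standing in for the paper's multi-branch Transformer fusion encoder and clustering decoder, and all names, shapes, and parameters are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of prediction-error anomaly scoring (PSNR-based).
# The predictor is a placeholder, NOT the paper's multi-branch Transformer model.
import torch
import torch.nn as nn


class TinyPredictor(nn.Module):
    """Placeholder future-frame predictor: maps t past frames
    (B, t*C, H, W) to one predicted frame (B, C, H, W)."""
    def __init__(self, in_frames=4, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_frames * channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, past_frames):
        return self.net(past_frames)


def psnr(pred, target, max_val=2.0):
    """Per-sample PSNR between predicted and ground-truth frames in [-1, 1]."""
    mse = torch.mean((pred - target) ** 2, dim=(1, 2, 3))
    return 10.0 * torch.log10(max_val ** 2 / (mse + 1e-8))


def anomaly_scores(psnr_values):
    """Min-max normalize PSNR over a video; lower PSNR (worse prediction)
    yields a higher anomaly score."""
    p_min, p_max = psnr_values.min(), psnr_values.max()
    regularity = (psnr_values - p_min) / (p_max - p_min + 1e-8)
    return 1.0 - regularity


if __name__ == "__main__":
    model = TinyPredictor()
    past = torch.randn(8, 4 * 3, 64, 64)   # 8 clips, 4 past RGB frames each
    target = torch.randn(8, 3, 64, 64)      # ground-truth future frames
    with torch.no_grad():
        pred = model(past)
        scores = anomaly_scores(psnr(pred, target))
    print(scores)  # one score per clip; values near 1 suggest anomalous frames
```

In this scheme, normal events are predicted well because the model is trained only on normal data, so abnormal frames produce larger prediction errors and lower PSNR; the paper's lightweight multi-branch design addresses how the predictor itself is built, not how the score is computed.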

Key words: frame prediction, video anomaly detection, multi-branch fusion, Transformer
