Journal of Graphics, 2025, Vol. 46, Issue (1): 104-113. DOI: 10.11996/JG.j.2095-302X.2025010104

• Image Processing and Computer Vision •

Deepfake detection method based on multi-feature fusion of frequency domain and spatial domain

DONG Jiale1, DENG Zhengjie1,2, LI Xiyan1, WANG Shiyun1

  1. College of Information Science and Technology, Hainan Normal University, Haikou, Hainan 571127, China
  2. Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Nanning, Guangxi 541004, China
  • Received: 2024-08-22 Accepted: 2024-11-21 Published: 2025-02-28 Online: 2025-02-14
  • Contact: DENG Zhengjie (1980-), associate professor, Ph.D. His main research interests include artificial intelligence, cyberspace security, virtual reality, and computer education. E-mail: jet_dunn@qq.com
  • First author: DONG Jiale (1999-), master's student. Her main research interest is the security of artificial intelligence systems. E-mail: dongjiale1107@163.com
  • Supported by:
    Hainan Provincial Natural Science Foundation (623QN236); Haikou Science and Technology Plan Project (2022-007); Open Funds from Guilin University of Electronic Technology, Guangxi Key Laboratory of Image and Graphic Intelligent Processing (GIIP2012)

Abstract:

The rapid advancement of facial forgery technology poses a substantial challenge to social security, particularly now that deep learning techniques are widely used to generate realistic fake videos. Such high-quality forged content not only threatens personal privacy but can also be exploited for illegal activities. Traditional forgery detection methods based on a single feature can no longer meet detection demands. To address this issue, a deepfake detection method based on multi-feature fusion of the frequency and spatial domains was proposed to improve the detection accuracy and generalization capability for facial forgeries. In the frequency domain, the spectrum was dynamically divided into three bands to extract forgery artifacts that cannot be mined in the spatial domain. In the spatial domain, the EfficientNet_b4 network and a Transformer architecture were used to partition the image into blocks at multiple scales, compute the differences between blocks, perform detection based on the consistency information between upper and lower image blocks, and capture finer-grained forgery features. Finally, a fusion block based on a query-key-value mechanism fused the frequency-domain and spatial-domain branches, mining the feature information of both domains more comprehensively and enhancing the accuracy and transferability of forgery detection. Extensive experimental results confirmed the effectiveness of the proposed method, whose performance was clearly superior to that of traditional deepfake detection methods.

Key words: deepfake detection, EfficientNet_b4 network, frequency domain features, spatial domain features, feature fusion
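
The abstract names a query-key-value fusion block but gives no implementation details. The following is only a minimal sketch of that general idea, assuming single-head scaled dot-product attention in which spatial-domain tokens form the queries and frequency-domain tokens supply the keys and values, followed by a residual connection. The function names (`matmul`, `softmax`, `attention_weights`, `qkv_fuse`), the toy token values, and the choice of which branch queries the other are all illustrative assumptions, not the authors' design.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(score - peak) for score in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(spatial, freq, w_q, w_k):
    """Attention of each spatial token (query) over the frequency tokens (keys)."""
    q = matmul(spatial, w_q)
    k = matmul(freq, w_k)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    return [softmax([s / scale for s in row]) for row in scores]


def qkv_fuse(spatial, freq, w_q, w_k, w_v):
    """Add attention-weighted frequency values to the spatial tokens.

    Assumption: w_v maps to the same width as the spatial tokens so that
    the residual sum is well defined.
    """
    weights = attention_weights(spatial, freq, w_q, w_k)
    values = matmul(freq, w_v)
    attended = matmul(weights, values)
    return [[s + a for s, a in zip(s_row, a_row)]
            for s_row, a_row in zip(spatial, attended)]


# Toy usage: identity projections stand in for learned weight matrices.
identity = [[1.0, 0.0], [0.0, 1.0]]
spatial = [[1.0, 0.0], [0.0, 1.0]]           # two hypothetical spatial tokens
freq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # e.g. one token per frequency band
fused = qkv_fuse(spatial, freq, identity, identity, identity)
print(len(fused), len(fused[0]))
```

In this sketch each fused token keeps the width of the spatial token it came from, so a classifier head could consume the output unchanged. A trained model would learn the projection matrices and would typically use several attention heads.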

CLC number: