
Journal of Graphics, 2025, Vol. 46, Issue (1): 47-58. DOI: 10.11996/JG.j.2095-302X.2025010047

• Image Processing and Computer Vision •

A deepfake face detection method that enhances focus on forgery regions

ZHANG Wenxiang, WANG Xiali, WANG Xinyi, YANG Zongbao

  1. School of Information Engineering, Chang'an University, Xi'an, Shaanxi 710064, China
  • Received: 2024-07-10 Accepted: 2024-10-11 Published: 2025-02-28 Online: 2025-02-14
  • Corresponding author: WANG Xiali (1965-), associate professor, Ph.D. His main research interests include graphics and image processing, and computer vision. E-mail: xlwang@chd.edu.cn
  • First author: ZHANG Wenxiang (2001-), master's student. His main research interests include graphics and image processing, and computer vision. E-mail: 2495898570@qq.com
  • Supported by:
    National Natural Science Foundation of China (51678061)

Abstract:

Deepfake face technology has developed rapidly and is widely misused, making the detection of manipulated facial images and videos an important research topic. Existing convolutional neural networks suffer from overfitting and poor generalization, and perform poorly on unseen synthetic face data. To address this limitation, a deepfake face detection method that enhances focus on forgery regions was proposed. First, an attention mechanism was introduced to process the feature map used for classification; the learned attention map highlighted the manipulated facial regions, improving the generalization capability of the model. Second, a forgery region detection module was attached after the backbone network. By detecting whether forgery traces were present in multi-scale anchor boxes, it reduced the interference of global face information and further strengthened the model's attention to local forgery regions. Finally, a consistency representation learning framework was introduced: by explicitly constraining the consistency between different representations of the same input, it made the model focus on intrinsic forgery evidence and helped avoid overfitting. Experiments were conducted on three datasets, FaceForensics++, Celeb-DF-v2, and DFDC, using EfficientNet-b4 and Xception in turn as the backbone network. The results showed that the proposed method achieved good performance in intra-dataset evaluation and outperformed the original networks and other advanced methods in cross-dataset evaluation.

Keywords: deepfake face detection, attention mechanism, forgery region detection, multi-scale anchors, consistency representation
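
The consistency constraint mentioned in the abstract can be illustrated with a small sketch. The abstract does not state the loss that was used, so everything below is an assumption made for illustration: the representations are taken to be plain feature vectors obtained from two differently augmented views of the same input, agreement is measured with cosine similarity, and the names `consistency_loss`, `total_loss`, and `weight` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Small epsilon guards against division by zero for all-zero vectors.
    return dot / (norm_u * norm_v + 1e-12)


def consistency_loss(reps_a, reps_b):
    """Assumed penalty: mean of (1 - cos)^2 over paired representations.

    reps_a[i] and reps_b[i] are the representations of two views of the
    same input i. The loss is 0 when every pair points in the same
    direction and grows as the pairs disagree.
    """
    if len(reps_a) != len(reps_b) or not reps_a:
        raise ValueError("expected two non-empty batches of equal size")
    terms = [(1.0 - cosine_similarity(u, v)) ** 2
             for u, v in zip(reps_a, reps_b)]
    return sum(terms) / len(terms)


def total_loss(classification_loss, reps_a, reps_b, weight=1.0):
    """Hypothetical combination: classification loss plus weighted consistency term."""
    return classification_loss + weight * consistency_loss(reps_a, reps_b)
```

In a sketch like this, the consistency term only rewards agreement between the two views; the classification loss is still needed so that the representations stay discriminative rather than collapsing to a constant.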

CLC number: