
Journal of Graphics ›› 2025, Vol. 46 ›› Issue (1): 114-125.DOI: 10.11996/JG.j.2095-302X.2025010114

• Image Processing and Computer Vision •

Consistent and unbiased teacher model research for domain adaptive object detection

CHENG Xudong1,2, SHI Caijuan1,2, GAO Weixiang1,2, WANG Sen1,2, DUAN Changyu1,2, YAN Xiaodong1,2

  1. College of Artificial Intelligence, North China University of Science and Technology, Tangshan Hebei 063210, China
  2. Hebei Key Laboratory of Industrial Intelligent Perception, Tangshan Hebei 063210, China
  • Received:2024-07-01 Accepted:2024-10-19 Online:2025-02-28 Published:2025-02-14
  • Contact: SHI Caijuan
  • About author:First author contact:

    CHENG Xudong (1999-), master's student. His main research interests cover transfer learning and object detection. E-mail: chengxd99@163.com

  • Supported by:
    Open Project of Beijing Key Laboratory of Modern Information Science and Network Technology (XDXX2301); Distinguished Youth Foundation of North China University of Science and Technology (JQ201715); Talent Foundation of Tangshan (A202110011)

Abstract:

Self-training has substantially improved the performance of domain adaptive object detection. In self-training, a teacher network predicts on target domain data, and its high-confidence predictions are selected as pseudo-labels to guide the training of a student network. However, because of the large gap between the source and target domains, the pseudo-labels generated by the teacher network are often of poor quality, which harms student network training and lowers detection performance. To address this challenge, a consistent and unbiased teacher (CUT) model for domain adaptive object detection was proposed. Firstly, an adaptive threshold generation (ATG) module was designed within the teacher network. During training, the ATG module used a Gaussian mixture model (GMM) to generate an adaptive threshold for each image, keeping the number of pseudo-labels consistent over time and thereby improving their quality. Secondly, a prediction-guided sample selection (PSS) strategy was introduced, which used the predictions of the region proposal network in the teacher network to guide the selection of positive and negative samples for the student network. The PSS strategy aligned the selected samples with the real outcomes, thereby mitigating the impact of low-quality pseudo-labels on the student network. Furthermore, to improve detection of small objects and of challenging objects with few instances, a mixed domain augmentation (MDA) module was devised to generate mixed-domain images, containing random information from both the source and target-like domains, to supervise student network training. Extensive experiments on datasets covering three scenarios demonstrated the effectiveness of the proposed CUT, with performance improvements of 4.0%, 5.8%, and 3.7%, respectively.
Notably, the proposed CUT model was the first to apply the self-training method to the large domain gap between visible-light images and infrared images.
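
The idea of a per-image adaptive threshold can be illustrated with a small sketch. The code below is not the authors' ATG module: it assumes a two-component one-dimensional GMM fitted by plain EM to the teacher's confidence scores for a single image, and it takes the threshold to be the lowest score assigned to the higher-mean component. The function names (fit_two_gaussians, adaptive_threshold), the initialisation, and the variance floor are all hypothetical choices made for illustration, and the sketch does not model consistency across training iterations.

```python
import math


def _density(s, mean, var, weight):
    """Weighted Gaussian density of score s (illustrative helper)."""
    return weight * math.exp(-((s - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_gaussians(scores, iters=50):
    """Fit a two-component 1D Gaussian mixture to scores with plain EM.

    Assumption: component 0 starts at the lowest score and component 1
    at the highest, which suits bimodal confidence distributions.
    """
    means = [min(scores), max(scores)]
    variances = [0.01, 0.01]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each score.
        resp = []
        for s in scores:
            p = [_density(s, means[k], variances[k], weights[k]) for k in range(2)]
            total = sum(p) or 1e-12
            resp.append([pk / total for pk in p])
        # M-step: re-estimate mean, variance, and weight per component.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            means[k] = sum(r[k] * s for r, s in zip(resp, scores)) / nk
            var = sum(r[k] * (s - means[k]) ** 2 for r, s in zip(resp, scores)) / nk
            variances[k] = max(var, 1e-4)  # floor avoids a collapsed component
            weights[k] = nk / len(scores)
    return means, variances, weights


def adaptive_threshold(scores):
    """Return the lowest score that is more likely to be a true detection.

    "True detection" here means the higher-mean mixture component,
    which is an assumption of this sketch.
    """
    means, variances, weights = fit_two_gaussians(scores)
    high = 1 if means[1] >= means[0] else 0
    low = 1 - high
    kept = [
        s for s in scores
        if _density(s, means[high], variances[high], weights[high])
        > _density(s, means[low], variances[low], weights[low])
    ]
    return min(kept) if kept else max(scores)


# Each image gets its own threshold instead of one fixed global value.
image_a = [0.05, 0.08, 0.10, 0.12, 0.15, 0.85, 0.90, 0.95]
image_b = [0.20, 0.25, 0.30, 0.60, 0.65, 0.70]
thr_a = adaptive_threshold(image_a)
thr_b = adaptive_threshold(image_b)
pseudo_labels_a = [s for s in image_a if s >= thr_a]
pseudo_labels_b = [s for s in image_b if s >= thr_b]
```

With a fixed global threshold such as 0.8, the second image would yield no pseudo-labels at all, whereas the per-image fit keeps its three higher-scoring detections. This is the kind of behaviour an adaptive threshold is meant to provide, under the assumptions stated above.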

Key words: domain adaptation, object detection, self-training, pseudo-labels, consistency
