Journal of Graphics ›› 2023, Vol. 44 ›› Issue (5): 907-917. DOI: 10.11996/JG.j.2095-302X.2023050907

• Image Processing and Computer Vision •

Domain adaptive urban scene semantic segmentation based on dual-source discriminator

ZHANG Gui-mei, TAO Hui, LU Fei-fei, PENG Kun

  1. Institute of Computer Vision, Nanchang Hangkong University, Nanchang, Jiangxi 330063, China
  • Received: 2023-04-27 Accepted: 2023-08-07 Online: 2023-10-31 Published: 2023-10-31
  • About author: ZHANG Gui-mei (1970-), Professor, Ph.D. Her main research interests cover image processing and computer vision. E-mail: guimei.zh@163.com
  • Supported by:
    National Natural Science Foundation of China (62162045)

Abstract:

Domain adaptive segmentation networks are an effective approach to cross-domain semantic segmentation of urban scenes. However, the differing appearance distributions of cross-domain datasets produce a domain gap, and such networks segment small targets with unsatisfactory accuracy. To address these problems, a domain adaptive segmentation method based on a dual-source discriminator is proposed. First, the style transfer method FastPhotoStyle is applied to the source domain S to obtain a new source domain S', reducing the domain gap at the image level. Next, the generator extracts segmentation feature maps from the source domain S, the new source domain S', and the target domain T. The feature map of the new source domain serves as an intermediate bridge: it is fused along the channel dimension with the source feature map and, separately, with the target feature map. The two fused feature maps are fed into the dual-source discriminator, and the dual-source discriminator and the generator are trained adversarially in alternating iterations. The discriminator is called dual-source because its input consists of features from two sources. The two features in each dual-source input carry similar feature information, which further reduces the domain gap at the feature level. To further improve segmentation accuracy, self-training with pseudo-labels is introduced. In addition, to address the class imbalance that arises during training, a class balance factor is incorporated into the loss function of the target domain, improving the network's ability to segment small targets. Experiments on two segmentation tasks, GTA5→Cityscapes and SYNTHIA→Cityscapes, demonstrate the advanced performance and effectiveness of the proposed method.
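The channel-wise fusion described in the abstract can be illustrated with a minimal NumPy sketch. It rests on several assumptions that the abstract does not confirm: fusion is taken to be plain concatenation along the channel axis, the bridge feature is placed first, and the function names `fuse_channels` and `dual_source_inputs`, the 19-channel feature maps, and the random stand-in features are all hypothetical.

```python
import numpy as np


def fuse_channels(bridge, other):
    """Concatenate two (C, H, W) feature maps along the channel axis.

    Assumption: "fusion" means concatenation; the paper may use another operator.
    """
    if bridge.shape[1:] != other.shape[1:]:
        raise ValueError("feature maps must share the same spatial size")
    return np.concatenate([bridge, other], axis=0)


def dual_source_inputs(feat_s, feat_s_new, feat_t):
    """Build the two discriminator inputs, using the S' feature as the bridge."""
    source_pair = fuse_channels(feat_s_new, feat_s)  # (S', S)
    target_pair = fuse_channels(feat_s_new, feat_t)  # (S', T)
    return source_pair, target_pair


# Random stand-ins for the generator's segmentation feature maps (illustrative only).
rng = np.random.default_rng(0)
num_classes, height, width = 19, 4, 8
feat_s, feat_s_new, feat_t = (
    rng.random((num_classes, height, width)) for _ in range(3)
)

source_pair, target_pair = dual_source_inputs(feat_s, feat_s_new, feat_t)
print(source_pair.shape, target_pair.shape)  # (38, 4, 8) (38, 4, 8)
```

Under this reading, both discriminator inputs share the S' half, so the discriminator has to tell them apart from the remaining S or T half alone.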

Key words: dual-source discriminator, adversarial learning, domain adaptation, semantic segmentation, self-training
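The self-training step with a class balance factor might look roughly like the sketch below. The abstract does not give the form of the factor, so this example substitutes a commonly used log-inverse-frequency weight, w_c = 1 / ln(1.02 + f_c); that formula, the confidence threshold of 0.9, the ignore label 255, and all function names are assumptions made for illustration, not the authors' definitions.

```python
import numpy as np

IGNORE = 255  # hypothetical label for pixels without a confident pseudo-label


def pseudo_labels(probs, threshold=0.9):
    """probs: (C, H, W) class probabilities. Keep the argmax where it is confident."""
    labels = probs.argmax(axis=0)
    labels[probs.max(axis=0) < threshold] = IGNORE
    return labels


def class_balance_weights(labels, num_classes):
    """Assumed factor: w_c = 1 / ln(1.02 + f_c), so rarer classes weigh more."""
    valid = labels[labels != IGNORE]
    counts = np.bincount(valid, minlength=num_classes).astype(float)
    freq = counts / max(counts.sum(), 1.0)
    return 1.0 / np.log(1.02 + freq)


def balanced_cross_entropy(probs, labels, weights, eps=1e-12):
    """Weighted cross-entropy over pixels that carry a pseudo-label."""
    mask = labels != IGNORE
    if not mask.any():
        return 0.0
    cls = labels[mask]
    # Probability assigned to the pseudo-label class at each kept pixel.
    p = probs[:, mask][cls, np.arange(cls.size)]
    w = weights[cls]
    return float(-(w * np.log(p + eps)).sum() / w.sum())


# Toy 2x2 prediction: three pixels of class 0, one pixel of class 1, class 2 absent.
target = np.array([[0, 0], [0, 1]])
probs = np.full((3, 2, 2), 0.025)
for (i, j), c in np.ndenumerate(target):
    probs[c, i, j] = 0.95

labels = pseudo_labels(probs)
weights = class_balance_weights(labels, num_classes=3)
loss = balanced_cross_entropy(probs, labels, weights)
print(labels, weights, loss)
```

In the toy example the rarer class 1 receives a larger weight than class 0, which is the behavior a class balance factor is meant to provide for small targets.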

CLC number: