Journal of Graphics, 2023, Vol. 44, Issue (5): 907-917. DOI: 10.11996/JG.j.2095-302X.2023050907
• Image Processing and Computer Vision •
ZHANG Gui-mei, TAO Hui, LU Fei-fei, PENG Kun
Received: 2023-04-27
Accepted: 2023-08-07
Online: 2023-10-31
Published: 2023-10-31
About author: ZHANG Gui-mei (1970-), Professor, Ph.D. Her main research interests cover image processing and computer vision. E-mail: guimei.zh@163.com
ZHANG Gui-mei, TAO Hui, LU Fei-fei, PENG Kun. Domain adaptive urban scene semantic segmentation based on dual-source discriminator[J]. Journal of Graphics, 2023, 44(5): 907-917.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2023050907
| Image No. | Dataset | Color histogram | SSIM |
|---|---|---|---|
| Image 1 | DS-T | 0.2479 | 0.1178 |
| Image 1 | DS'-T | 0.3757 | 0.1579 |
| Image 2 | DS-T | 0.1883 | 0.0618 |
| Image 2 | DS'-T | 0.2950 | 0.3124 |
| Image 3 | DS-T | 0.2148 | 0.0786 |
| Image 3 | DS'-T | 0.2943 | 0.1509 |
Table 1 Comparison of experimental results before and after style translation
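Table 1 compares translated and untranslated source images against the target domain using color-histogram similarity and SSIM. The paper's exact metric implementations are not given here; a minimal pure-Python sketch of plausible versions (Pearson correlation between intensity histograms, and a global single-window SSIM rather than the usual sliding-window form) might look like:

```python
# Hypothetical sketch; the paper does not specify its metric implementations.
# Both functions take equal-length flat lists of grayscale pixel values.

def hist_correlation(a, b, bins=32, max_val=255):
    """Pearson correlation between the intensity histograms of two images."""
    def hist(img):
        h = [0] * bins
        for v in img:
            h[min(v * bins // (max_val + 1), bins - 1)] += 1
        return h
    ha, hb = hist(a), hist(b)
    ma, mb = sum(ha) / bins, sum(hb) / bins
    num = sum((x - ma) * (y - mb) for x, y in zip(ha, hb))
    den = (sum((x - ma) ** 2 for x in ha) *
           sum((y - mb) ** 2 for y in hb)) ** 0.5
    return num / den if den else 0.0

def global_ssim(a, b, max_val=255):
    """SSIM over the whole image as one window (no sliding windows)."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2  # standard constants
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))
```

Production code would normally use a windowed SSIM (e.g. `skimage.metrics.structural_similarity`) over 2-D arrays; the sketch above only illustrates the two quantities being compared in the table.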
| Domain adaptation dataset | Method | mIoU (%) |
|---|---|---|
| GTA5→Cityscapes | AT(S+T) | 35.0 |
| GTA5→Cityscapes | AT(S'+T) | 42.0 |
| SYNTHIA→Cityscapes | AT(S+T) | 37.6 |
| SYNTHIA→Cityscapes | AT(S'+T) | 44.6 |
Table 2 Comparison of segmentation accuracy before and after style translation
| Class | AdaptSegNet | AdvEnt | CLAN | Cycada | SEGL | Ours |
|---|---|---|---|---|---|---|
| Road | 87.3 | 86.9 | 88.0 | 87.3 | 2.1 | 92.4 |
| Sidewalk | 29.8 | 28.7 | 30.6 | 33.5 | 53.9 | 54.5 |
| Building | 78.6 | 78.7 | 79.2 | 77.9 | 81.4 | 83.2 |
| Wall | 21.1 | 28.5 | 23.4 | 20.9 | 27.3 | 30.8 |
| Fence | 18.2 | 25.2 | 20.5 | 17.9 | 25.1 | 24.8 |
| Pole | 22.5 | 17.1 | 26.1 | - | 33.2 | 34.0 |
| Light | 21.5 | 20.3 | 23.0 | 33.4 | 38.8 | 39.1 |
| Sign | 11.0 | 10.9 | 14.8 | 19.7 | 23.0 | 24.5 |
| Vegetation | 79.7 | 80.0 | 81.6 | 83.2 | 83.5 | 84.1 |
| Terrain | 29.6 | 26.4 | 34.5 | - | 34.1 | 34.9 |
| Sky | 71.3 | 70.2 | 72.0 | 70.1 | 70.7 | 78.7 |
| Person | 46.8 | 47.1 | 45.8 | 43.3 | 58.5 | 51.9 |
| Rider | 6.5 | 8.4 | 7.9 | - | 29.4 | 19.2 |
| Car | 80.1 | 81.5 | 80.5 | 77.4 | 84.2 | 84.3 |
| Truck | 23.0 | 26.0 | 26.6 | - | 27.8 | 28.3 |
| Bus | 26.9 | 17.2 | 29.9 | 22.5 | 34.8 | 38.3 |
| Train | 0.01 | 18.9 | 0.01 | 3.4 | 4.8 | 3.6 |
| Motorbike | 10.6 | 11.7 | 10.7 | 11.3 | 25.1 | 13.2 |
| Bike | 0.3 | 1.6 | 0.0 | 12.9 | 19.4 | 20.4 |
| mIoU | 35.0 | 36.1 | 36.6 | 37.2 | 44.8 | 45.8 |
Table 3 Comparison with typical domain adaptive segmentation methods on GTA5→Cityscapes (%)
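The mIoU scores reported in Tables 2-6 are the mean of per-class intersection-over-union. As a reference for how such a score is conventionally computed from flattened prediction and ground-truth label maps (a generic sketch, not the authors' evaluation code):

```python
# Generic mIoU evaluation sketch; not the authors' code.

def miou(pred, gt, num_classes):
    """Mean IoU over classes, skipping classes absent from both maps."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        union = sum(1 for p, g in zip(pred, gt) if p == c or g == c)
        if union:  # class appears in prediction or ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

Benchmarks such as Cityscapes accumulate these counts in a confusion matrix over the whole validation set before dividing, rather than averaging per-image.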
| Class | AdaptSegNet | AdvEnt | CLAN | Cycada | SEGL | Ours |
|---|---|---|---|---|---|---|
| Road | 78.9 | 67.9 | 80.4 | 84.4 | 83.2 | 78.9 |
| Sidewalk | 29.2 | 29.4 | 30.7 | 29.6 | 40.6 | 31.8 |
| Building | 75.5 | 71.9 | 74.7 | 74.1 | 80.3 | 78.8 |
| Light | 0.1 | 0.6 | 1.4 | 12.6 | 7.9 | 9.1 |
| Sign | 4.8 | 2.6 | 8.0 | 14.3 | 11.2 | 8.7 |
| Vegetation | 72.6 | 74.9 | 77.1 | 79.2 | 79.4 | 79.4 |
| Sky | 76.7 | 74.9 | 79.0 | 80.8 | 84.6 | 74.1 |
| Person | 43.4 | 35.4 | 46.5 | 44.9 | 54.1 | 45.0 |
| Rider | 8.8 | 9.6 | 8.9 | 7.9 | 20.9 | 18.1 |
| Car | 71.1 | 67.8 | 73.8 | 73.6 | 73.4 | 72.4 |
| Bus | 16.0 | 21.4 | 18.2 | 21.4 | 33.2 | 14.6 |
| Motorbike | 3.6 | 4.1 | 2.2 | 3.4 | 18.1 | 15.1 |
| Bike | 8.4 | 15.5 | 9.9 | 27.2 | 27.3 | 37.9 |
| mIoU | 37.6 | 36.6 | 39.3 | 41.6 | 47.2 | 48.5 |
Table 4 Comparison with typical domain adaptive segmentation methods on SYNTHIA→Cityscapes (%)
| Method | Style translation | Dual-source discriminator | Class-balance factor | mIoU (%) |
|---|---|---|---|---|
| Adversarial learning | √ | - | - | 42.0 |
| Adversarial learning | √ | √ | - | 43.2 |
| Self-training + adversarial learning | √ | - | - | 43.3 |
| Self-training + adversarial learning | √ | √ | - | 44.9 |
| Self-training + adversarial learning | √ | √ | √ | 45.8 |
Table 5 GTA5→Cityscapes cross-domain segmentation comparison experiment
| Method | Style translation | Dual-source discriminator | Class-balance factor | mIoU (%) |
|---|---|---|---|---|
| Adversarial learning | √ | - | - | 44.6 |
| Adversarial learning | √ | √ | - | 45.8 |
| Self-training + adversarial learning | √ | - | - | 46.2 |
| Self-training + adversarial learning | √ | √ | - | 46.9 |
| Self-training + adversarial learning | √ | √ | √ | 48.5 |
Table 6 SYNTHIA→Cityscapes cross-domain segmentation comparison experiment
| Method | α | β | mIoU (%) |
|---|---|---|---|
| Standard | 1 | 0 | 35.0 |
| Standard | 0 | 1 | 41.0 |
| Dual-source discriminator (S' as intermediate bridge) | 0.1 | 0.9 | 41.8 |
| Dual-source discriminator (S' as intermediate bridge) | 0.9 | 0.1 | 41.6 |
| Dual-source discriminator (S' as intermediate bridge) | 0.5 | 0.5 | 42.0 |
| Dual-source discriminator (S as intermediate bridge) | 0.5 | 0.5 | 41.7 |
Table 7 Comparison experiment on segmentation weighting coefficients
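Table 7 sweeps the coefficients α and β that weight the dual-source discriminator's two adversarial terms (original source S vs. style-translated source S'). The paper's exact loss is defined in its method section and not reproduced here; a heavily simplified sketch of such a weighted combination, with illustrative function names that are assumptions rather than the authors' API, could be:

```python
import math

# Hypothetical sketch of weighting two adversarial alignment terms,
# as swept in Table 7. Names and structure are illustrative only.

def adv_loss(d_scores):
    """Non-saturating adversarial loss on discriminator outputs in (0, 1]."""
    return -sum(math.log(s) for s in d_scores) / len(d_scores)

def dual_source_adv_loss(d_scores_s, d_scores_s2, alpha=0.5, beta=0.5):
    """Blend the S->T and S'->T adversarial terms with weights alpha, beta."""
    return alpha * adv_loss(d_scores_s) + beta * adv_loss(d_scores_s2)
```

Setting `alpha=1, beta=0` (or the reverse) recovers the single-source rows of Table 7; the balanced `alpha=beta=0.5` setting corresponds to its best-performing configuration.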
[1] QING C, YU J, XIAO C B, et al. Deep convolutional neural network for semantic image segmentation[J]. Journal of Image and Graphics, 2020, 25(6): 1069-1090. (in Chinese)
[2] LECUN Y, BENGIO Y, HINTON G. Deep learning[J]. Nature, 2015, 521(7553): 436-444.
[3] FAN C N, LIU P, XIAO T, et al. A review of deep domain adaptation: general situation and complex situation[J]. Acta Automatica Sinica, 2021, 47(3): 515-548. (in Chinese)
[4] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[5] SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 1-9.
[6] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
[7] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[EB/OL]. [2023-01-19]. http://de.arxiv.org/pdf/1411.4038.
[8] BADRINARAYANAN V, KENDALL A, CIPOLLA R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495.
[9] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848.
[10] TSAI Y H, HUNG W C, SCHULTER S, et al. Learning to adapt structured output space for semantic segmentation[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 7472-7481.
[11] GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]// Neural Information Processing Systems. Cambridge: MIT Press, 2014: 2672-2680.
[12] GONG R, LI W, CHEN Y H, et al. DLOW: domain flow for adaptation and generalization[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 2472-2481.
[13] ZHU J Y, PARK T, ISOLA P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 2242-2251.
[14] MATHUR A, ISOPOUSSU A, KAWSAR F, et al. FlexAdapt: flexible cycle-consistent adversarial domain adaptation[C]// 2019 18th IEEE International Conference on Machine Learning and Applications. New York: IEEE Press, 2020: 896-901.
[15] LI Y S, YUAN L, VASCONCELOS N. Bidirectional learning for domain adaptation of semantic segmentation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 6929-6938.
[16] VU T H, JAIN H, BUCHER M, et al. ADVENT: adversarial entropy minimization for domain adaptation in semantic segmentation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 2512-2521.
[17] HUANG J X, LU S J, GUAN D Y, et al. Contextual-relation consistent domain adaptation for semantic segmentation[M]// Computer Vision - ECCV 2020. Cham: Springer International Publishing, 2020: 705-722.
[18] WANG Y C, WANG H C, SHEN Y J, et al. Semi-supervised semantic segmentation using unreliable pseudo-labels[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 4238-4247.
[19] SHAO W B, LIU Y J, SUN X R, et al. Cross modality person re-identification based on residual enhanced attention[J]. Journal of Graphics, 2023, 44(1): 33-40. (in Chinese)
[20] CHEN S J, JIA X, HE J Z, et al. Semi-supervised domain adaptation based on dual-level domain mixing for semantic segmentation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 11013-11022.
[21] LI Y J, LIU M Y, LI X T, et al. A closed-form solution to photorealistic image stylization[M]// Computer Vision - ECCV 2018. Cham: Springer International Publishing, 2018: 468-483.
[22] LIN Y X, STANLEY TAN D, CHENG W H, et al. Spatially-aware domain adaptation for semantic segmentation of urban scenes[C]// 2019 IEEE International Conference on Image Processing. New York: IEEE Press, 2019: 1870-1874.
[23] LIN Y X, TAN D S, CHENG W H, et al. Adapting semantic segmentation of urban scenes via mask-aware gated discriminator[C]// 2019 IEEE International Conference on Multimedia and Expo. New York: IEEE Press, 2019: 218-223.
[24] ZOU Y, YU Z D, VIJAYA KUMAR B K, et al. Unsupervised domain adaptation for semantic segmentation via class-balanced self-training[C]// Computer Vision - ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part III. New York: ACM, 2018: 297-313.
[25] LUO Y W, ZHENG L, GUAN T, et al. Taking a closer look at domain shift: category-level adversaries for semantics consistent domain adaptation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 2502-2511.
[26] ZHANG G M, LU F F, LONG B Y, et al. Domain adaptation semantic segmentation for urban scene combining self-ensembling and adversarial learning[J]. Pattern Recognition and Artificial Intelligence, 2021, 34(1): 58-67. (in Chinese)
[27] HUANG J X, LU S J, GUAN D Y, et al. Contextual-relation consistent domain adaptation for semantic segmentation[M]// Computer Vision - ECCV 2020. Cham: Springer International Publishing, 2020: 705-722.
[28] ZHANG G M, PAN G F, LIU J X. Domain adaptation for semantic segmentation based on adaption learning rate[J]. Journal of Image and Graphics, 2020, 25(5): 913-925. (in Chinese)
[29] HOYER L, DAI D X, VAN GOOL L. DAFormer: improving network architectures and training strategies for domain-adaptive semantic segmentation[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 9914-9925.
[30] ZHANG P, ZHANG B, ZHANG T, et al. Prototypical pseudo label denoising and target structure learning for domain adaptive semantic segmentation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 12409-12419.
[31] PENG D, LEI Y J, HAYAT M, et al. Semantic-aware domain generalized segmentation[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 2584-2595.
[32] LIN T Y, GOYAL P, GIRSHICK R B, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 42(2): 318-327.