Journal of Graphics ›› 2023, Vol. 44 ›› Issue (5): 907-917.DOI: 10.11996/JG.j.2095-302X.2023050907
• Image Processing and Computer Vision •
ZHANG Gui-mei(), TAO Hui, LU Fei-fei, PENG Kun
Received: 2023-04-27
Accepted: 2023-08-07
Online: 2023-10-31
Published: 2023-10-31
About author: ZHANG Gui-mei (1970-), Professor, Ph.D. Her main research interests cover image processing and computer vision. E-mail: guimei.zh@163.com
ZHANG Gui-mei, TAO Hui, LU Fei-fei, PENG Kun. Domain adaptive urban scene semantic segmentation based on dual-source discriminator[J]. Journal of Graphics, 2023, 44(5): 907-917.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2023050907
| No. | Dataset | Color histogram | SSIM |
|---|---|---|---|
| Image 1 | DS-T | 0.2479 | 0.1178 |
| | DS'-T | 0.3757 | 0.1579 |
| Image 2 | DS-T | 0.1883 | 0.0618 |
| | DS'-T | 0.2950 | 0.3124 |
| Image 3 | DS-T | 0.2148 | 0.0786 |
| | DS'-T | 0.2943 | 0.1509 |
Table 1 Comparison of experimental results before and after style translation
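Table 1's color-histogram and SSIM columns measure how close the (style-translated) source images are to the target domain; higher means more similar. The exact histogram metric is not specified in this excerpt, but a common choice is per-channel histogram correlation. A minimal sketch, in which the function name and bin count are illustrative assumptions rather than the paper's implementation:

```python
import numpy as np

def color_hist_similarity(img_a, img_b, bins=32):
    """Mean Pearson correlation between per-channel color histograms.

    img_a, img_b: uint8 arrays of shape (H, W, 3).
    Returns a value in [-1, 1]; higher means more similar color statistics.
    """
    scores = []
    for c in range(3):
        ha, _ = np.histogram(img_a[..., c], bins=bins, range=(0, 256), density=True)
        hb, _ = np.histogram(img_b[..., c], bins=bins, range=(0, 256), density=True)
        # Correlate the two normalized histograms for this channel
        scores.append(np.corrcoef(ha, hb)[0, 1])
    return float(np.mean(scores))
```

SSIM, by contrast, is a structural metric and would typically be computed with a library routine such as scikit-image's `structural_similarity` rather than reimplemented.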
| Domain adaptation dataset | Method | mIoU (%) |
|---|---|---|
| GTA5→Cityscapes | AT(S+T) | 35.0 |
| | AT(S'+T) | 42.0 |
| SYNTHIA→Cityscapes | AT(S+T) | 37.6 |
| | AT(S'+T) | 44.6 |
Table 2 Comparison of segmentation accuracy before and after style translation
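The mIoU scores in Table 2 (and in the tables that follow) are the standard mean intersection-over-union over classes, computed from the confusion matrix between predicted and ground-truth labels. A minimal sketch of that computation:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union from flat label arrays.

    pred, gt: integer label arrays of the same shape.
    Classes absent from both prediction and ground truth are ignored.
    """
    pred = np.asarray(pred).ravel()
    gt = np.asarray(gt).ravel()
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(conf, (gt, pred), 1)          # rows: ground truth, cols: prediction
    inter = np.diag(conf).astype(np.float64)
    union = conf.sum(0) + conf.sum(1) - inter
    valid = union > 0
    return float((inter[valid] / union[valid]).mean())
```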
| Class | AdaptSegNet | AdvEnt | CLAN | Cycada | SEGL | Ours |
|---|---|---|---|---|---|---|
| Road | 87.30 | 86.9 | 88.00 | 87.30 | 2.1 | 92.4 |
| Sidewalk | 29.80 | 28.7 | 30.60 | 33.50 | 53.9 | 54.5 |
| Building | 78.60 | 78.7 | 79.20 | 77.90 | 81.4 | 83.2 |
| Wall | 21.10 | 28.5 | 23.40 | 20.90 | 27.3 | 30.8 |
| Fence | 18.20 | 25.2 | 20.50 | 17.90 | 25.1 | 24.8 |
| Pole | 22.50 | 17.1 | 26.10 | - | 33.2 | 34.0 |
| Light | 21.50 | 20.3 | 23.00 | 33.40 | 38.8 | 39.1 |
| Sign | 11.00 | 10.9 | 14.80 | 19.70 | 23.0 | 24.5 |
| Vegetation | 79.70 | 80.0 | 81.60 | 83.20 | 83.5 | 84.1 |
| Terrain | 29.60 | 26.4 | 34.50 | - | 34.1 | 34.9 |
| Sky | 71.30 | 70.2 | 72.00 | 70.10 | 70.7 | 78.7 |
| Person | 46.80 | 47.1 | 45.80 | 43.30 | 58.5 | 51.9 |
| Rider | 6.50 | 8.4 | 7.90 | - | 29.4 | 19.2 |
| Car | 80.10 | 81.5 | 80.50 | 77.40 | 84.2 | 84.3 |
| Truck | 23.00 | 26.0 | 26.60 | - | 27.8 | 28.3 |
| Bus | 26.90 | 17.2 | 29.90 | 22.50 | 34.8 | 38.3 |
| Train | 0.01 | 18.9 | 0.01 | 3.40 | 4.8 | 3.6 |
| Motorbike | 10.60 | 11.7 | 10.70 | 11.30 | 25.1 | 13.2 |
| Bike | 0.30 | 1.6 | 0.00 | 12.90 | 19.4 | 20.4 |
| mIoU | 35.00 | 36.1 | 36.60 | 37.20 | 44.8 | 45.8 |
Table 3 Comparative experiments with typical domain adaptive segmentation methods on GTA5→Cityscapes (%)
| Class | AdaptSegNet | AdvEnt | CLAN | Cycada | SEGL | Ours |
|---|---|---|---|---|---|---|
| Road | 78.9 | 67.9 | 80.4 | 84.4 | 83.2 | 78.9 |
| Sidewalk | 29.2 | 29.4 | 30.7 | 29.6 | 40.6 | 31.8 |
| Building | 75.5 | 71.9 | 74.7 | 74.1 | 80.3 | 78.8 |
| Light | 0.1 | 0.6 | 1.4 | 12.6 | 7.9 | 9.1 |
| Sign | 4.8 | 2.6 | 8.0 | 14.3 | 11.2 | 8.7 |
| Vegetation | 72.6 | 74.9 | 77.1 | 79.2 | 79.4 | 79.4 |
| Sky | 76.7 | 74.9 | 79.0 | 80.8 | 84.6 | 74.1 |
| Person | 43.4 | 35.4 | 46.5 | 44.9 | 54.1 | 45.0 |
| Rider | 8.8 | 9.6 | 8.9 | 7.9 | 20.9 | 18.1 |
| Car | 71.1 | 67.8 | 73.8 | 73.6 | 73.4 | 72.4 |
| Bus | 16.0 | 21.4 | 18.2 | 21.4 | 33.2 | 14.6 |
| Motorbike | 3.6 | 4.1 | 2.2 | 3.4 | 18.1 | 15.1 |
| Bike | 8.4 | 15.5 | 9.9 | 27.2 | 27.3 | 37.9 |
| mIoU | 37.6 | 36.6 | 39.3 | 41.6 | 47.2 | 48.5 |
Table 4 Comparative experiments with typical domain adaptive segmentation methods on SYNTHIA→Cityscapes (%)
| Method | Style translation | Dual-source discriminator | Class-balance factor | mIoU (%) |
|---|---|---|---|---|
| Adversarial learning | √ | - | - | 42.0 |
| | √ | √ | - | 43.2 |
| Self-training + adversarial learning | √ | - | - | 43.3 |
| | √ | √ | - | 44.9 |
| | √ | √ | √ | 45.8 |
Table 5 Cross-domain segmentation ablation experiments on GTA5→Cityscapes
| Method | Style translation | Dual-source discriminator | Class-balance factor | mIoU (%) |
|---|---|---|---|---|
| Adversarial learning | √ | - | - | 44.6 |
| | √ | √ | - | 45.8 |
| Self-training + adversarial learning | √ | - | - | 46.2 |
| | √ | √ | - | 46.9 |
| | √ | √ | √ | 48.5 |
Table 6 Cross-domain segmentation ablation experiments on SYNTHIA→Cityscapes
| Method | α | β | mIoU (%) |
|---|---|---|---|
| Ordinary | 1 | 0 | 35.0 |
| | 0 | 1 | 41.0 |
| Dual-source discriminator (S' as intermediate bridge) | 0.1 | 0.9 | 41.8 |
| | 0.9 | 0.1 | 41.6 |
| | 0.5 | 0.5 | 42.0 |
| Dual-source discriminator (S as intermediate bridge) | 0.5 | 0.5 | 41.7 |
Table 7 Comparison experiments on segmentation coefficients
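Table 7 indicates that α and β weight the two branches of the dual-source discriminator: one term against the original source S, the other against the style-translated source S', with (α, β) = (0.5, 0.5) performing best. The paper's exact loss is not reproduced in this excerpt; the sketch below only illustrates such a weighted combination using binary cross-entropy adversarial losses, and all function names are assumptions:

```python
import numpy as np

def bce(p, target):
    """Binary cross-entropy for discriminator outputs p in (0, 1)."""
    p = np.clip(p, 1e-7, 1.0 - 1e-7)
    return float(-(target * np.log(p) + (1.0 - target) * np.log(1.0 - p)).mean())

def dual_source_adv_loss(d_out_vs_s, d_out_vs_s_prime, alpha=0.5, beta=0.5):
    """Weighted sum of the segmenter's adversarial losses against the two
    discriminator branches: alpha weights the branch for the original source S,
    beta the branch for the style-translated source S'. The segmenter tries to
    make target-domain predictions look source-like (label 1) in both branches."""
    return alpha * bce(d_out_vs_s, 1.0) + beta * bce(d_out_vs_s_prime, 1.0)
```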
[1] QING C, YU J, XIAO C B, et al. Deep convolutional neural network for semantic image segmentation[J]. Journal of Image and Graphics, 2020, 25(6): 1069-1090. (in Chinese)
[2] LECUN Y, BENGIO Y, HINTON G. Deep learning[J]. Nature, 2015, 521(7553): 436-444.
[3] FAN C N, LIU P, XIAO T, et al. A review of deep domain adaptation: general situation and complex situation[J]. Acta Automatica Sinica, 2021, 47(3): 515-548. (in Chinese)
[4] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[5] | SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 1-9. |
[6] | HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778. |
[7] | LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[EB/OL]. [2023-01-19]. http://de.arxiv.org/pdf/1411.4038. |
[8] BADRINARAYANAN V, KENDALL A, CIPOLLA R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495.
[9] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848.
[10] | TSAI Y H, HUNG W C, SCHULTER S, et al. Learning to adapt structured output space for semantic segmentation[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 7472-7481. |
[11] | GOODFELLOW I, POUGETABADIE J, MIRZA M, et al. Generative adversarial nets[C]// Neural Information Processing Systems. Cambridge: MIT Press, 2014: 2672-2680. |
[12] | GONG R, LI W, CHEN Y H, et al. DLOW: domain flow for adaptation and generalization[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 2472-2481. |
[13] | ZHU J Y, PARK T, ISOLA P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 2242-2251. |
[14] | MATHUR A, ISOPOUSSU A, KAWSAR F, et al. FlexAdapt: flexible cycle-consistent adversarial domain adaptation[C]// 2019 18th IEEE International Conference on Machine Learning and Applications. New York: IEEE Press, 2020: 896-901. |
[15] | LI Y S, YUAN L, VASCONCELOS N. Bidirectional learning for domain adaptation of semantic segmentation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 6929-6938. |
[16] | VU T H, JAIN H, BUCHER M, et al. ADVENT: adversarial entropy minimization for domain adaptation in semantic segmentation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 2512-2521. |
[17] | HUANG J X, LU S J, GUAN D Y, et al. Contextual-relation consistent domain adaptation for semantic segmentation[M]// Computer Vision - ECCV 2020. Cham: Springer International Publishing, 2020: 705-722. |
[18] | WANG Y C, WANG H C, SHEN Y J, et al. Semi-supervised semantic segmentation using unreliable pseudo-labels[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 4238-4247. |
[19] SHAO W B, LIU Y J, SUN X R, et al. Cross modality person re-identification based on residual enhanced attention[J]. Journal of Graphics, 2023, 44(1): 33-40. (in Chinese)
[20] | CHEN S J, JIA X, HE J Z, et al. Semi-supervised domain adaptation based on dual-level domain mixing for semantic segmentation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 11013-11022. |
[21] | LI Y J, LIU M Y, LI X T, et al. A closed-form solution to photorealistic image stylization[M]// Computer Vision - ECCV 2018. Cham: Springer International Publishing, 2018: 468-483. |
[22] | LIN Y X, STANLEY TAN D, CHENG W H, et al. Spatially-aware domain adaptation for semantic segmentation of urban scenes[C]// 2019 IEEE International Conference on Image Processing. New York: IEEE Press, 2019: 1870-1874. |
[23] | LIN Y X, TAN D S, CHENG W H, et al. Adapting semantic segmentation of urban scenes via mask-aware gated discriminator[C]// 2019 IEEE International Conference on Multimedia and Expo. New York: IEEE Press, 2019: 218-223. |
[24] | ZOU Y, YU Z D, VIJAYA KUMAR B K, et al. Unsupervised domain adaptation for semantic segmentation via class-balanced self-training[C]// Computer Vision - ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part III. New York: ACM, 2018: 297-313. |
[25] | LUO Y W, ZHENG L, GUAN T, et al. Taking a closer look at domain shift: category-level adversaries for semantics consistent domain adaptation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 2502-2511. |
[26] ZHANG G M, LU F F, LONG B Y, et al. Domain adaptation semantic segmentation for urban scene combining self-ensembling and adversarial learning[J]. Pattern Recognition and Artificial Intelligence, 2021, 34(1): 58-67. (in Chinese)
[27] | HUANG J X, LU S J, GUAN D Y, et al. Contextual-relation consistent domain adaptation for semantic segmentation[M]// Computer Vision - ECCV 2020. Cham: Springer International Publishing, 2020: 705-722. |
[28] ZHANG G M, PAN G F, LIU J X. Domain adaptation for semantic segmentation based on adaption learning rate[J]. Journal of Image and Graphics, 2020, 25(5): 913-925. (in Chinese)
[29] | HOYER L, DAI D X, VAN GOOL L. DAFormer: improving network architectures and training strategies for domain-adaptive semantic segmentation[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 9914-9925. |
[30] | ZHANG P, ZHANG B, ZHANG T, et al. Prototypical pseudo label denoising and target structure learning for domain adaptive semantic segmentation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 12409-12419. |
[31] | PENG D, LEI Y J, HAYAT M, et al. Semantic-aware domain generalized segmentation[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 2584-2595. |
[32] LIN T Y, GOYAL P, GIRSHICK R B, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 318-327.