Journal of Graphics ›› 2023, Vol. 44 ›› Issue (2): 304-312. DOI: 10.11996/JG.j.2095-302X.2023020304
LUO Qi-ming, WU Hao, XIA Xin, YUAN Guo-wu
Received: 2022-08-12
Accepted: 2022-11-21
Online: 2023-04-30
Published: 2023-05-01
Contact: WU Hao (1982-), lecturer, Ph.D. His main research interests cover digital image processing.
About author: LUO Qi-ming (1997-), master student. His main research interests cover digital image processing and computer vision. E-mail: qimingluo@mail.ynu.edu.cn
Abstract: Predicting damaged areas in murals is a key step in virtual mural restoration. When applied to murals of Yunnan ethnic minorities, existing methods tend to predict damaged areas incompletely and to locate damage boundaries inaccurately in regions with complex textures. To address these problems, this paper proposes Dual Dense U-Net, a segmentation model improved upon U-Net, which enhances the positional and texture features of damaged areas to obtain more discriminative information and thus improve the accuracy of the predicted damage masks. To help the model learn mural features more effectively, a segmentation dataset of 5,000 Yunnan ethnic minority mural images was constructed. Dual Dense U-Net performs multi-scale fusion of mural features through a fusion module, reducing the loss of local texture and spatial position information during the forward pass. First, a U-Net structure extracts information from the input mural image; the fusion module contains multiple depthwise separable convolutions, which improve both its efficiency and the segmentation accuracy. Second, the fusion module connects the two U-Nets, further strengthening the link between shallow and deep features. Experimental results show that the model outperforms UNet++ by three percentage points in both IoU and Dice, and that the damage masks it predicts markedly improve the results of a mural inpainting network, verifying its effectiveness for predicting damaged areas in murals.
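The abstract describes the architecture only in words. As a rough illustration of the two components it names, a fusion module built from depthwise separable convolutions that bridges two U-Nets, the following is a minimal PyTorch sketch; the class names, channel sizes, and exact fusion wiring are our own assumptions, not the authors' code.

```python
# Hedged sketch of the fusion idea described in the abstract: depthwise
# separable convolutions fusing feature maps from two U-Nets.
# All names and hyper-parameters here are illustrative assumptions.
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """A 3x3 depthwise convolution followed by a 1x1 pointwise convolution."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3,
                                   padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.pointwise(self.depthwise(x)))


class FusionBlock(nn.Module):
    """Fuses same-resolution features from the first and second U-Net."""

    def __init__(self, ch: int):
        super().__init__()
        self.fuse = nn.Sequential(
            DepthwiseSeparableConv(2 * ch, ch),
            DepthwiseSeparableConv(ch, ch),
        )

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        # Concatenate along channels, then mix with two cheap separable convs.
        return self.fuse(torch.cat([feat_a, feat_b], dim=1))


# Example: fuse two 64-channel feature maps of size 128x128.
if __name__ == "__main__":
    a = torch.randn(1, 64, 128, 128)
    b = torch.randn(1, 64, 128, 128)
    print(FusionBlock(64)(a, b).shape)  # torch.Size([1, 64, 128, 128])
```

A depthwise separable convolution factors a standard convolution into a per-channel spatial filter and a 1x1 channel mixer, which is why the abstract credits it with improving the fusion module's efficiency.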
LUO Qi-ming, WU Hao, XIA Xin, YUAN Guo-wu. Prediction of damaged areas in Yunnan murals using Dual Dense U-Net[J]. Journal of Graphics, 2023, 44(2): 304-312.
Fig. 1 Examples of damage types of murals (scratch in yellow frame, peeling in white frame, crack in green frame, fading in blue frame) ((a) Including scratches and peeling; (b) Including peeling, cracks and fading)
Table 1 Quantitative evaluation results of different models (%)

| Metric | U-Net | UNet+ | UNet++ | DeepLabV3+ | Swin-Unet | GSCNN | MOoSe | HarDNet | DDU |
|---|---|---|---|---|---|---|---|---|---|
| IoU | 44.85 | 45.32 | 47.98 | 32.11 | 43.11 | 48.49 | 40.43 | 45.51 | 50.99 |
| Dice | 59.76 | 60.11 | 63.08 | 47.27 | 58.51 | 63.91 | 55.51 | 60.65 | 66.27 |
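IoU and Dice in Table 1 (and Table 2 below) are the standard overlap metrics between a predicted binary damage mask and the ground-truth mask. For reference, a minimal NumPy sketch of how they are usually computed; this is our illustration, not the paper's evaluation code.

```python
# Minimal illustration of the IoU and Dice metrics reported in Tables 1-2.
# Not the authors' evaluation code; assumes binary (0/1) mask arrays.
import numpy as np


def iou_and_dice(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7):
    """Return (IoU, Dice) in percent for two binary masks of equal shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = inter / (union + eps)
    dice = 2.0 * inter / (pred.sum() + target.sum() + eps)
    return 100.0 * iou, 100.0 * dice


# Example: a prediction that covers half of the ground truth.
pred = np.array([[1, 1, 0, 0]])
gt = np.array([[1, 1, 1, 1]])
print(iou_and_dice(pred, gt))  # (~50.0, ~66.7): Dice is always >= IoU
```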
Table 2 Ablation studies of different design choices (%)

| Dconv | MLP | IoU | Dice |
|---|---|---|---|
| - | - | 47.72 | 63.65 |
| - | √ | 49.52 | 65.19 |
| √ | - | 48.26 | 64.34 |
| √ | √ | 50.99 | 66.27 |
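Fig. 8 below equates the "MLP" design choice in Table 2 with an attention mechanism. The paper's exact module is not reproduced on this page; as one plausible reading, a CBAM-style channel attention built from a shared two-layer MLP (cf. reference [23]) would look like the following sketch, where the class name and reduction ratio are our assumptions.

```python
# Hypothetical CBAM-style channel attention for the 'MLP' ablation in Table 2.
# The paper's actual module may differ; names and ratios are assumptions.
import torch
import torch.nn as nn


class ChannelMLPAttention(nn.Module):
    """Reweights channels with a shared two-layer MLP over pooled features."""

    def __init__(self, ch: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(ch, ch // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(ch // reduction, ch),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling branch
        weights = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * weights                   # channel-wise reweighting
```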
Fig. 7 Comparison of masks produced by different methods in mural restoration ((a) Restoration results for large holes; (b) Restoration results for small damages)
Fig. 8 Ablation study results ("w/o MLP" means removing the attention mechanism, "FULL" means the full model) ((a), (b) Ablation results for large holes; (c), (d) Ablation results for small damages)
[1] WEN L L, XU D, ZHANG X, et al. The inpainting of irregular damaged areas in ancient murals using generative model[J]. Journal of Graphics, 2019, 40(5): 925-931. (in Chinese)
[2] PENG J L, LIU D, XU S C, et al. Generating diverse structure for image inpainting with hierarchical VQ-VAE[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 10770-10779.
[3] DONG Q L, CAO C J, FU Y W. Incremental transformer structure enhanced image inpainting with masking positional encoding[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 11348-11358.
[4] BEZDEK J C, EHRLICH R, FULL W. FCM: the fuzzy c-means clustering algorithm[J]. Computers & Geosciences, 1984, 10(2-3): 191-203.
[5] ZHANG Z Y, SHUI W Y, ZHOU M Q, et al. Research on disease extraction and inpainting algorithm of digital grotto murals[J]. Application Research of Computers, 2021, 38(8): 2495-2498, 2504. (in Chinese)
[6] WU M, WANG H Q, LI W Y. Research on multi-scale detection and image inpainting of Tang dynasty tomb murals[J]. Computer Engineering and Applications, 2016, 52(11): 169-174. (in Chinese)
[7] JAIDILERT S, FAROOQUE G. Crack detection and images inpainting method for Thai mural painting images[C]// 2018 IEEE International Conference on Image, Vision and Computing. New York: IEEE Press, 2018: 143-148.
[8] CAO J F, TIAN X D, JIA Y M, et al. Application of improved DeepLabV3+ model in mural segmentation[J]. Journal of Computer Applications, 2021, 41(5): 1471-1476. (in Chinese)
[9] LYU S Q, WANG S H, HOU M L, et al. Extraction of mural paint loss diseases based on improved U-net[J]. Geomatics World, 2022, 29(1): 69-74. (in Chinese)
[10] RONNEBERGER O, FISCHER P, BROX T. U-net: convolutional networks for biomedical image segmentation[M]// Lecture Notes in Computer Science. Cham: Springer International Publishing, 2015: 234-241.
[11] SHELHAMER E, LONG J, DARRELL T. Fully convolutional networks for semantic segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 640-651.
[12] LIN G S, MILAN A, SHEN C H, et al. RefineNet: multi-path refinement networks for high-resolution semantic segmentation[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 5168-5177.
[13] YU F, KOLTUN V. Multi-scale context aggregation by dilated convolutions[EB/OL]. [2022-07-11]. https://arxiv.org/abs/1511.07122.
[14] HUANG Z L, WANG X G, HUANG L C, et al. CCNet: criss-cross attention for semantic segmentation[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2020: 603-612.
[15] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[EB/OL]. [2022-07-11]. https://arxiv.org/abs/1606.00915.
[16] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848.
[17] HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
[18] CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[EB/OL]. [2022-07-11]. https://arxiv.org/abs/1706.05587.
[19] CHEN L C, YANG Y, WANG J, et al. Attention to scale: scale-aware semantic image segmentation[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 3640-3649.
[20] MNIH V, HEESS N, GRAVES A, et al. Recurrent models of visual attention[EB/OL]. [2022-07-11]. https://arxiv.org/abs/1406.6247.
[21] LEE S, CHOI W, KIM C, et al. ADAS: a direct adaptation strategy for multi-target domain adaptive semantic segmentation[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 19174-19184.
[22] CHENG Y T, WEI F Y, BAO J M, et al. Dual path learning for domain adaptation of semantic segmentation[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2022: 9062-9071.
[23] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[M]// Computer Vision - ECCV 2018. Cham: Springer International Publishing, 2018: 3-19.
[24] SANDLER M, HOWARD A, ZHU M L, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 4510-4520.
[25] CHOLLET F. Xception: deep learning with depthwise separable convolutions[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 1800-1807.
[26] HARTIGAN J A, WONG M A. Algorithm AS 136: a K-means clustering algorithm[J]. Applied Statistics, 1979, 28(1): 100.
[27] ZHOU Z W, SIDDIQUEE M M R, TAJBAKHSH N, et al. UNet++: redesigning skip connections to exploit multiscale features in image segmentation[J]. IEEE Transactions on Medical Imaging, 2020, 39(6): 1856-1867.
[28] CHEN L C, ZHU Y K, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[M]// Computer Vision - ECCV 2018. Cham: Springer International Publishing, 2018: 833-851.
[29] CAO H, WANG Y Y, CHEN J, et al. Swin-unet: unet-like pure transformer for medical image segmentation[EB/OL]. [2022-07-11]. https://arxiv.org/abs/2105.05537.
[30] TAKIKAWA T, ACUNA D, JAMPANI V, et al. Gated-SCNN: gated shape CNNs for semantic segmentation[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2020: 5228-5237.
[31] GALESSO S, BRAVO M A, NAOUAR M, et al. Probing contextual diversity for dense out-of-distribution detection[EB/OL]. [2022-07-11]. https://arxiv.org/abs/2208.14195.
[32] LIAO T Y, YANG C H, LO Y W, et al. HarDNet-DFUS: an enhanced harmonically-connected network for diabetic foot ulcer image segmentation and colonoscopy polyp segmentation[EB/OL]. [2022-07-11]. https://arxiv.org/abs/2209.07313.
[33] LEMPITSKY V, VEDALDI A, ULYANOV D. Deep image prior[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 9446-9454.
Related articles:
[1] BI Chun-yan, LIU Yue. A survey of human action recognition in videos based on deep learning[J]. Journal of Graphics, 2023, 44(4): 625-639. (in Chinese)
[2] CAO Yi-qin, ZHOU Yi-wei, XU Lu. Real-time metal surface defect detection algorithm based on E-YOLOX[J]. Journal of Graphics, 2023, 44(4): 677-690. (in Chinese)
[3] SHAO Jun-qi, QIAN Wen-hua, XU Qi-hao. Landscape image generation based on conditional residual generative adversarial network[J]. Journal of Graphics, 2023, 44(4): 710-717. (in Chinese)
[4] YU Wei-qun, LIU Jia-tao, ZHANG Ya-ping. Laplacian pyramid monocular depth estimation with fused attention[J]. Journal of Graphics, 2023, 44(4): 728-738. (in Chinese)
[5] GUO Yin-hong, WANG Li-chun, LI Shuang. Image feature matching based on repeatability and specificity constraints[J]. Journal of Graphics, 2023, 44(4): 739-746. (in Chinese)
[6] MAO Ai-kun, LIU Xin-ming, CHEN Wen-zhuang, SONG Shao-lou. Substation instrument detection method based on improved YOLOv5 algorithm[J]. Journal of Graphics, 2023, 44(3): 448-455. (in Chinese)
[7] WANG Jia-jing, WANG Chen, ZHU Yuan-yuan, WANG Xiao-mei. Matching and retrieval of graphic elements on banknotes of the Republic of China period[J]. Journal of Graphics, 2023, 44(3): 492-501. (in Chinese)
[8] YANG Liu, WU Xiao-qun. A survey of 3D shape completion based on deep learning[J]. Journal of Graphics, 2023, 44(2): 201-215. (in Chinese)
[9] ZENG Wu, ZHU Heng-liang, XING Shu-li, LIN Jiang-hong, MAO Guo-jun. Saliency detection-guided image data augmentation method[J]. Journal of Graphics, 2023, 44(2): 260-270. (in Chinese)
[10] LI Hong-an, ZHENG Qiao-xue, TAO Ruo-lin, ZHANG Min, LI Zhan-li, KANG Bao-sheng. A survey of image super-resolution based on deep learning[J]. Journal of Graphics, 2023, 44(1): 1-15. (in Chinese)
[11] SHAO Ying-jie, YIN Hui, XIE Ying, HUANG Hua. Sketch-guided face image inpainting network with selective recurrent inference[J]. Journal of Graphics, 2023, 44(1): 67-76. (in Chinese)
[12] GU Yu, ZHAO Jun. Research on image detection algorithms for train brake shoe bolt and brake shoe faults[J]. Journal of Graphics, 2023, 44(1): 88-94. (in Chinese)
[13] PAN Dong-hui, JIN Ying-han, SUN Xu, LIU Yu-sheng, ZHANG Dong-liang. CTH-Net: a CNN-Transformer hybrid network for generating garment images from sketches and color points[J]. Journal of Graphics, 2023, 44(1): 120-130. (in Chinese)
[14] FAN Zhen, LIU Xiao-jing, LI Xiao-bo, CUI Ya-chao. A homography estimation method robust to illumination and occlusion[J]. Journal of Graphics, 2023, 44(1): 166-176. (in Chinese)
[15] ZHU Lei, LI Dong-biao, YAN Xing-zhi, LIU Xiang-yang, SHEN Cai-hua. Intelligent tunnel crack detection method based on improved Mask R-CNN deep learning algorithm[J]. Journal of Graphics, 2023, 44(1): 177-183. (in Chinese)