Journal of Graphics ›› 2024, Vol. 45 ›› Issue (1): 65-77. DOI: 10.11996/JG.j.2095-302X.2024010065

Deep multimodal medical image fusion network based on high-low frequency feature decomposition
WANG Xinyu1,2, LIU Hui1,2, ZHU Jicheng1,2, SHENG Yurui3, ZHANG Caiming2,4
Received: 2023-07-20
Accepted: 2023-09-20
Published: 2024-02-29
Online: 2024-02-29
Corresponding author: LIU Hui (1978-), professor, Ph.D. Her main research interests cover data mining and visualization. E-mail: liuh_lh@sdufe.edu.cn
First author: WANG Xinyu (1999-), master student. Her main research interest covers multimodal data fusion. E-mail: wangxy@mail.sdufe.edu.cn
Abstract: Multimodal medical image fusion aims to exploit the correlation and complementary information across modalities to enhance the readability and applicability of medical images in clinical practice. However, existing hand-crafted models cannot effectively extract key target features, leading to problems such as blurred fused images and loss of texture detail. To address this, a new deep multimodal medical image fusion network based on high-low frequency feature decomposition is proposed. It introduces channel attention and spatial attention mechanisms into the fusion process, preserving local texture detail while maintaining the global structure, thereby achieving finer fusion. First, high-frequency features of the two modality images are extracted by the pre-trained VGG-19 model, and their low-frequency features are obtained by downsampling, forming high- and low-frequency intermediate feature maps. Second, a residual attention network is embedded in the feature fusion module to infer attention maps sequentially along the channel and spatial dimensions, which then guide the adaptive refinement of the input feature maps. Finally, the reconstruction module forms high-quality feature representations and outputs the fused image. Experimental results on the Harvard public dataset and a self-built abdominal dataset show that the proposed algorithm improves peak signal-to-noise ratio (PSNR) by 8.29%, structural similarity (SSIM) by 85.07%, correlation coefficient (CC) by 65.67%, feature mutual information (MI) by 46.76%, and visual information fidelity (VIF) by 80.89%.
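To make the pipeline described in the abstract concrete, the following is a minimal PyTorch sketch of its two key stages: high-low frequency decomposition (pre-trained VGG-19 features for the high-frequency branch, downsampling for the low-frequency branch) and a channel-then-spatial attention block in the CBAM style of ref. [26]. The layer cut-off, channel widths, and additive fusion rule are illustrative assumptions, not the authors' exact architecture.

```python
# Sketch only: layer choices and the fusion rule are assumptions, not the paper's design.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg19

class HighLowDecompose(nn.Module):
    """Split a grayscale slice into a low-frequency base (blur via down/up-sampling)
    and high-frequency features (shallow pre-trained VGG-19 activations)."""
    def __init__(self):
        super().__init__()
        feats = vgg19(weights="IMAGENET1K_V1").features[:4].eval()  # up to relu1_2
        for p in feats.parameters():
            p.requires_grad_(False)
        self.vgg = feats

    def forward(self, x):  # x: (B, 1, H, W), values in [0, 1]
        low = F.interpolate(F.avg_pool2d(x, 2), scale_factor=2,
                            mode="bilinear", align_corners=False)
        high = self.vgg(x.repeat(1, 3, 1, 1))  # replicate gray -> 3 channels
        return low, high

class ChannelSpatialAttention(nn.Module):
    """CBAM-style refinement: channel attention first, then spatial attention."""
    def __init__(self, ch: int, r: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(),
                                 nn.Linear(ch // r, ch))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, f):
        b, c, _, _ = f.shape
        w = torch.sigmoid(self.mlp(f.mean(dim=(2, 3))) + self.mlp(f.amax(dim=(2, 3))))
        f = f * w.view(b, c, 1, 1)                                   # channel step
        s = torch.cat([f.mean(1, keepdim=True), f.amax(1, keepdim=True)], dim=1)
        return f * torch.sigmoid(self.spatial(s))                    # spatial step

if __name__ == "__main__":
    dec, att = HighLowDecompose(), ChannelSpatialAttention(64)
    a, b = torch.rand(1, 1, 256, 256), torch.rand(1, 1, 256, 256)
    (low_a, high_a), (low_b, high_b) = dec(a), dec(b)
    fused_high = att(high_a) + att(high_b)  # attention-guided high-frequency fusion
    print(low_a.shape, fused_high.shape)
```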
WANG Xinyu, LIU Hui, ZHU Jicheng, SHENG Yurui, ZHANG Caiming. Deep multimodal medical image fusion network based on high-low frequency feature decomposition[J]. Journal of Graphics, 2024, 45(1): 65-77.
Fig. 3 Comparison of ablation experiment results ((a) The source image pairs of MR-T1/MR-T2; (b) The source image pairs of MR/PET; (c) Single attention; (d) Attention concatenation)
Table 1 Comparative study of average quantitative metrics in MR-T1/MR-T2 image fusion ablation experiments
Method | PSNR | SSIM | CC | MI | VIF
---|---|---|---|---|---
Single attention | 64.4502 | 0.5701 | 0.8728 | 3.0189 | 0.5168
Attention concatenation | 64.8894 | 0.7158 | 0.8881 | 3.5887 | 0.6156
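For orientation, the column metrics follow their standard definitions (refs. [28-32]). Two representative ones are written below, under the common assumption in fusion work that the fused image $F$ is scored against each source image $R$ in turn and the results averaged; $L$ denotes the peak intensity value:

$$
\mathrm{PSNR} = 10\log_{10}\frac{L^{2}}{\mathrm{MSE}}, \qquad
\mathrm{MSE} = \frac{1}{HW}\sum_{i=1}^{H}\sum_{j=1}^{W}\left(F_{ij}-R_{ij}\right)^{2},
$$

$$
\mathrm{CC}(F,R) = \frac{\sum_{i,j}\left(F_{ij}-\bar{F}\right)\left(R_{ij}-\bar{R}\right)}
{\sqrt{\sum_{i,j}\left(F_{ij}-\bar{F}\right)^{2}\sum_{i,j}\left(R_{ij}-\bar{R}\right)^{2}}}.
$$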
Table 2 Comparative study of average quantitative metrics in MR/PET image fusion ablation experiments
Method | PSNR | SSIM | CC | MI | VIF
---|---|---|---|---|---
Single attention | 63.1038 | 0.6152 | 0.8479 | 2.2401 | 0.4443
Attention concatenation | 63.2508 | 0.6929 | 0.8446 | 3.5963 | 0.7251
Table 3 Relevant information about the self-built abdominal dataset
Modality | Number of images | Resolution/pixel | Slice spacing/mm | Lesion data
---|---|---|---|---
CT | 329 | 256×256 | 5 | Calcification: max cross-section 2.3×1.9 cm²; metastatic tumor diameter: 4.5 cm
PET | 329 | 256×256 | 1 | Enlarged lymph node diameter: 1.4 cm
MR T1/T2 | 74 | 256×256 | 5 | Renal cysts: 3 in number, max diameter 0.6 cm, min diameter 0.3 cm, roughly round, well-defined boundary
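As a hedged illustration of how slices in a dataset like Table 3 might be standardized to 256×256 inputs, the sketch below reads DICOM files with pydicom, min-max normalizes, and center-crops or zero-pads; the directory layout (`data/CT/*.dcm`) is hypothetical, and the authors' actual preprocessing is not described in this excerpt.

```python
# Sketch under stated assumptions; the paper's preprocessing pipeline may differ.
from pathlib import Path

import numpy as np
import pydicom

def load_slice(path: Path, size: int = 256) -> np.ndarray:
    """Read one DICOM slice, normalize intensities to [0, 1], fit to size×size."""
    ds = pydicom.dcmread(str(path))
    img = ds.pixel_array.astype(np.float32)
    img = (img - img.min()) / (img.max() - img.min() + 1e-8)   # min-max normalize
    h, w = img.shape
    y, x = max((h - size) // 2, 0), max((w - size) // 2, 0)
    crop = img[y:y + size, x:x + size]                          # center crop
    out = np.zeros((size, size), dtype=np.float32)              # zero-pad if smaller
    out[:crop.shape[0], :crop.shape[1]] = crop
    return out

if __name__ == "__main__":
    for f in sorted(Path("data/CT").glob("*.dcm")):             # hypothetical layout
        print(f.name, load_slice(f).shape)
```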
Fig. 5 PSNR variations in fusion tasks under different α and β values ((a) MR-T1/MR-T2; (b) CT/PET)
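Fig. 5 is a sensitivity sweep over α and β. As a sketch only: assuming α and β weight two competing loss terms during training (their exact roles are not specified in this excerpt), the selection procedure amounts to a grid search maximizing validation PSNR, as below; `train_and_eval` is a placeholder for the real training loop.

```python
# Hypothetical grid search; the loss terms weighted by alpha/beta are assumptions.
import itertools

import torch
import torch.nn.functional as F

def psnr(fused: torch.Tensor, ref: torch.Tensor, max_val: float = 1.0) -> float:
    mse = F.mse_loss(fused, ref)
    return (10 * torch.log10(torch.tensor(max_val) ** 2 / mse)).item()

def train_and_eval(alpha: float, beta: float) -> float:
    """Placeholder: train with loss = alpha * L_a + beta * L_b, return val PSNR."""
    torch.manual_seed(0)
    fused, ref = torch.rand(1, 1, 256, 256), torch.rand(1, 1, 256, 256)
    return psnr(fused, ref)  # stand-in for the real fusion model's validation score

grid = [0.25, 0.5, 1.0, 2.0]
best = max(itertools.product(grid, repeat=2), key=lambda ab: train_and_eval(*ab))
print("best (alpha, beta):", best)
```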
Fig. 6 MR-T1/MR-T2 source images and fusion results using different comparative methods ((a) MR-T1; (b) MR-T2; (c) EMFusion; (d) FusionGAN; (e) IFCNN; (f) ZLFMIF; (g) DDIFN; (h) MLCF; (i) Ours)
Table 4 Average quantitative metrics for MR-T1/MR-T2 image fusion under different methods
Method | PSNR | SSIM | CC | MI | VIF
---|---|---|---|---|---
EMFusion | 64.2407 | 0.7171 | 0.8814 | 2.8793 | 0.4826
FusionGAN | 62.1454 | 0.2381 | 0.3049 | 2.3958 | 0.3049
IFCNN | 63.9062 | 0.7063 | 0.8748 | 2.9793 | 0.5132
ZLFMIF | 62.9716 | 0.6608 | 0.8384 | 4.2882 | 0.5304
DDIFN | 61.5949 | 0.4693 | 0.7226 | 1.9106 | 0.4928
MLCF | 61.7669 | 0.6141 | 0.7513 | 3.8955 | 0.7315
Ours | 64.8894 | 0.7158 | 0.8881 | 3.5887 | 0.6156
Fig. 7 MR/PET source images and fusion results using different comparative methods ((a) MR; (b) PET; (c) EMFusion; (d) FusionGAN; (e) IFCNN; (f) ZLFMIF; (g) DDIFN; (h) MLCF; (i) Ours)
Table 5 Average quantitative metrics for MR/PET image fusion under different methods
Method | PSNR | SSIM | CC | MI | VIF
---|---|---|---|---|---
EMFusion | 61.8781 | 0.6911 | 0.7969 | 3.0406 | 0.6583
FusionGAN | 58.0028 | 0.1034 | 0.7107 | 1.9499 | 0.2176
IFCNN | 62.1173 | 0.6788 | 0.8253 | 2.5298 | 0.5059
ZLFMIF | 61.2939 | 0.6714 | 0.6152 | 3.6382 | 0.6152
DDIFN | 60.2135 | 0.6720 | 0.7189 | 1.9013 | 0.2136
MLCF | 60.8956 | 0.5545 | 0.7498 | 2.0263 | 0.2506
Ours | 63.2508 | 0.6929 | 0.8446 | 3.5963 | 0.7251
Fig. 8 CT/PET source images and fusion results using different comparative methods ((a) CT; (b) PET; (c) EMFusion; (d) FusionGAN; (e) IFCNN; (f) ZLFMIF; (g) DDIFN; (h) MLCF; (i) Ours)
Table 6 Average quantitative metrics for CT/PET image fusion under different methods
Method | PSNR | SSIM | CC | MI | VIF
---|---|---|---|---|---
EMFusion | 65.6094 | 0.7228 | 0.7953 | 2.9826 | 0.7392
FusionGAN | 62.2885 | 0.1059 | 0.6438 | 2.1045 | 0.1771
IFCNN | 65.4816 | 0.7181 | 0.8261 | 2.9608 | 0.6855
ZLFMIF | 64.9593 | 0.7151 | 0.7874 | 3.2948 | 0.7353
DDIFN | 65.3031 | 0.6555 | 0.7345 | 2.5427 | 0.5256
MLCF | 65.7165 | 0.6923 | 0.8224 | 3.1806 | 0.6618
Ours | 66.8792 | 0.7317 | 0.8615 | 3.3947 | 0.8186
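To reproduce two of the table metrics on a fused/source pair, a hedged sketch using scikit-image's SSIM and a joint-histogram mutual information estimate is given below; the paper's exact MI variant ("feature mutual information", refs. [28, 30]) may differ from this plain intensity-based estimate.

```python
# Plain SSIM/MI estimates; the paper's exact metric implementations may differ.
import numpy as np
from skimage.metrics import structural_similarity

def mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 64) -> float:
    """MI (in bits) between two images via their joint intensity histogram."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()
    px = pxy.sum(axis=1, keepdims=True)          # marginal over rows
    py = pxy.sum(axis=0, keepdims=True)          # marginal over columns
    nz = pxy > 0                                 # avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fused = rng.random((256, 256), dtype=np.float32)
    source = rng.random((256, 256), dtype=np.float32)
    print("SSIM:", structural_similarity(fused, source, data_range=1.0))
    print("MI:", mutual_information(fused, source))
```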
[1] GAO Y, MA S W, LIU J J, et al. Fusion of medical images based on salient features extraction by PSO optimized fuzzy logic in NSST domain[J]. Biomedical Signal Processing and Control, 2021, 69: 102852.
[2] ZHU J C, LIU H, LI S S, et al. Two-channel medical image fusion combining local entropy and gradient energy[J/OL]. Journal of Computer-Aided Design & Computer Graphics. [2023-06-18]. https://www.jcad.cn/cn/search. (in Chinese)
[3] XU Z Q J, ZHANG Y Y, XIAO Y Y. Training behavior of deep neural network in frequency domain[C]// International Conference on Neural Information Processing. Cham: Springer, 2019: 264-274.
[4] WANG K P, ZHENG M Y, WEI H Y, et al. Multi-modality medical image fusion using convolutional neural network and contrast pyramid[J]. Sensors, 2020, 20(8): 2169.
[5] HERMESSI H, MOURALI O, ZAGROUBA E. Convolutional neural network-based multimodal image fusion via similarity learning in the shearlet domain[J]. Neural Computing and Applications, 2018, 30(7): 2029-2045.
[6] JOSE J, GAUTAM N, TIWARI M, et al. An image quality enhancement scheme employing adolescent identity search algorithm in the NSST domain for multimodal medical image fusion[J]. Biomedical Signal Processing and Control, 2021, 66: 102480.
[7] ZHAO C, YANG P, ZHOU F, et al. MHW-GAN: multidiscriminator hierarchical wavelet generative adversarial network for multimodal image fusion[J/OL]. IEEE Transactions on Neural Networks and Learning Systems. [2023-06-19]. https://ieeexplore.ieee.org/document/10177917.
[8] LIU Y, CHEN X, PENG H, et al. Multi-focus image fusion with a deep convolutional neural network[J]. Information Fusion, 2017, 36: 191-207.
[9] DENG X, DRAGOTTI P L. Deep convolutional neural network for multi-modal image restoration and fusion[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(10): 3333-3348.
[10] LI S S, ZOU Y H, WANG G J, et al. Infrared and visible image fusion method based on principal component analysis network and multi-scale morphological gradient[J]. Infrared Physics & Technology, 2023, 133: 104810.
[11] XU H, MA J Y. EMFusion: an unsupervised enhanced medical image fusion network[J]. Information Fusion, 2021, 76: 177-186.
[12] MA J Y, YU W, LIANG P W, et al. FusionGAN: a generative adversarial network for infrared and visible image fusion[J]. Information Fusion, 2019, 48: 11-26.
[13] ZHANG Y, LIU Y, SUN P, et al. IFCNN: a general image fusion framework based on convolutional neural network[J]. Information Fusion, 2020, 54: 99-118.
[14] LAHOUD F, SÜSSTRUNK S. Zero-learning fast medical image fusion[C]// 2019 22nd International Conference on Information Fusion. New York: IEEE Press, 2020: 1-8.
[15] LIU H, LI S S, ZHU J C, et al. DDIFN: a dual-discriminator multi-modal medical image fusion network[J]. ACM Transactions on Multimedia Computing, Communications, and Applications, 2023, 19(4): 145:1-145:17.
[16] ZOPH B, GHIASI G, LIN T Y, et al. Rethinking pre-training and self-training[C]// The 34th International Conference on Neural Information Processing Systems. New York: ACM, 2020: 3833-3845.
[17] DU P F, LI X Y, GAO Y L. Survey on multimodal visual language representation learning[J]. Journal of Software, 2021, 32(2): 327-348. (in Chinese)
[18] KORNBLITH S, SHLENS J, LE Q V. Do better ImageNet models transfer better?[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 2656-2666.
[19] HENDRYCKS D, LEE K, MAZEIKA M. Using pre-training can improve model robustness and uncertainty[EB/OL]. [2023-06-18]. https://arxiv.org/abs/1901.09960.pdf.
[20] LIU H, LI S S, GAO S S, et al. Research on dual-adversarial MR image fusion network using pre-trained model for feature extraction[J]. Journal of Software, 2023, 34(5): 2134-2151. (in Chinese)
[21] WANG J M, FAN Y Y, LI Z H. Texture image recognition based on deep convolutional neural network and transfer learning[J]. Journal of Computer-Aided Design & Computer Graphics, 2022, 34(5): 701-710. (in Chinese)
[22] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
[23] MNIH V, HEESS N, GRAVES A, et al. Recurrent models of visual attention[EB/OL]. [2023-06-18]. https://arxiv.org/abs/1406.6247.pdf.
[24] WANG F, JIANG M Q, QIAN C, et al. Residual attention network for image classification[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 6450-6458.
[25] JIN Y, XUE Z Z, JIANG Z W. Medical image segmentation based on recurrent residual convolution neural network[J]. Journal of Computer-Aided Design & Computer Graphics, 2022, 34(8): 1205-1215. (in Chinese)
[26] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// European Conference on Computer Vision. Cham: Springer, 2018: 3-19.
[27] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90.
[28] JAGALINGAM P, HEGDE A V. A review of quality metrics for fused image[J]. Aquatic Procedia, 2015, 4: 133-142.
[29] WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity[J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612.
[30] QU G H, ZHANG D L, YAN P F. Information measure for performance of image fusion[J]. Electronics Letters, 2002, 38(7): 313-315.
[31] ASUERO A, SAYAGO A, GONZÁLEZ A G. The correlation coefficient: an overview[J]. Critical Reviews in Analytical Chemistry, 2006, 36: 41-59.
[32] HAN Y, CAI Y Z, CAO Y, et al. A new image fusion performance metric based on visual information fidelity[J]. Information Fusion, 2013, 14(2): 127-135.
[33] TAN W, THITØN W, XIANG P, et al. Multi-modal brain image fusion based on multi-level edge-preserving filtering[J]. Biomedical Signal Processing and Control, 2021, 64: 102280.