
Journal of Graphics ›› 2024, Vol. 45 ›› Issue (1): 65-77. DOI: 10.11996/JG.j.2095-302X.2024010065

• Image Processing and Computer Vision •


Deep multimodal medical image fusion network based on high-low frequency feature decomposition

WANG Xinyu1,2, LIU Hui1,2, ZHU Jicheng1,2, SHENG Yurui3, ZHANG Caiming2,4

  1. School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, Shandong 250014, China
    2. Shandong Key Laboratory of Digital Media Technology, Jinan, Shandong 250014, China
    3. The First Affiliated Hospital of Shandong First Medical University, Jinan, Shandong 250014, China
    4. School of Software, Shandong University, Jinan, Shandong 250014, China
  • Received: 2023-07-20  Accepted: 2023-09-20  Published: 2024-02-29  Online: 2024-02-29
  • Corresponding author: LIU Hui (1978-), professor, Ph.D. Her main research interests include data mining and visualization. E-mail: liuh_lh@sdufe.edu.cn
  • First author: WANG Xinyu (1999-), master student. Her main research interests include multimodal data fusion. E-mail: wangxy@mail.sdufe.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62072274, U22A2033); Central Guidance on Local Science and Technology Development Project (YDZX2022009); Mount Taishan Scholar Distinguished Expert Plan of Shandong Province (tstp20221137)

Professor LIU Hui of Shandong University of Finance and Economics, her student WANG Xinyu, and collaborators propose a deep multimodal medical image fusion network based on high-low frequency feature decomposition. A pre-trained model extracts the high-frequency features of the images while downsampling extracts the low-frequency features, combining transform-domain methods with deep learning; the exchange of high- and low-frequency information makes feature extraction richer and more accurate. In the feature fusion module, a residual attention network infers attention maps along the channel and spatial dimensions to guide the adaptive optimization of the feature maps. Experiments on public and self-built datasets demonstrate the effectiveness of the proposed model.
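A minimal sketch of the high/low frequency decomposition step described above, assuming a PyTorch implementation (the paper publishes no code, so the VGG-19 cut-off layer, the pooling-based low-pass step, and all names below are illustrative assumptions, not the authors' method):

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

# Shallow VGG-19 slice (up to relu2_2) used as a fixed high-frequency
# (detail/texture) feature extractor; the exact cut-off layer is an assumption.
_vgg = vgg19(weights=VGG19_Weights.DEFAULT).features[:9].eval()

@torch.no_grad()
def decompose(img: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Split one modality into high/low frequency parts.

    img: (B, 1, H, W) grayscale medical image scaled to [0, 1].
    """
    rgb = img.repeat(1, 3, 1, 1)            # VGG-19 expects 3 input channels
    high = _vgg(rgb)                        # detail-rich high-frequency features
    low = F.avg_pool2d(img, kernel_size=2)  # average-pool downsampling as low-pass
    return high, low

# Applied independently to each modality (e.g. a CT and an MRI slice) before
# the resulting high/low frequency feature maps enter the fusion module.
```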


Abstract:

Multimodal medical image fusion aims to enhance the interpretability and applicability of medical images in clinical settings by leveraging the correlations and complementary information across imaging modalities. However, existing manually designed models often fail to effectively extract critical target features, resulting in blurred fused images and loss of textural detail. To address this, a novel deep multimodal medical image fusion network based on high-low frequency feature decomposition was proposed. It incorporated channel attention and spatial attention mechanisms into the fusion process, preserving both the global structure and local textural details and thereby achieving a finer-grained fusion. Firstly, the high-frequency features of the two modal images were extracted using the pre-trained model VGG-19, and their low-frequency features were extracted through downsampling, forming high- and low-frequency intermediate feature maps. Secondly, a residual attention network was embedded in the feature fusion module to sequentially infer attention maps along the channel and spatial dimensions; these maps were then used to guide the adaptive feature optimization of the input feature maps. Finally, the reconstruction module fused the high- and low-frequency features into a high-quality representation and output the fused image. Experimental results on both the Harvard public dataset and a self-built abdominal dataset demonstrated that, compared with the source images, the fused images produced by the proposed method achieved improvements of 8.29% in peak signal-to-noise ratio, 85.07% in structural similarity, 65.67% in correlation coefficient, 46.76% in feature mutual information, and 80.89% in visual fidelity.
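The channel-then-spatial attention step lends itself to a compact sketch. The following CBAM-style block is only an approximation of the residual attention network described above, under stated assumptions (the channel count, reduction ratio, and 7×7 spatial kernel are illustrative choices, not taken from the paper):

```python
import torch
import torch.nn as nn

class ResidualAttentionFusion(nn.Module):
    """Infer a channel attention map, then a spatial one, use both to
    reweight the fused features, and add a residual connection back to
    the input. Layer sizes are illustrative assumptions."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel attention: squeeze spatial dims, excite per channel.
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 conv over pooled per-pixel channel statistics.
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = fused.shape
        # --- channel attention ---
        avg = fused.mean(dim=(2, 3))                       # (B, C)
        mx = fused.amax(dim=(2, 3))                        # (B, C)
        ca = torch.sigmoid(self.channel_mlp(avg) + self.channel_mlp(mx))
        x = fused * ca.view(b, c, 1, 1)
        # --- spatial attention ---
        stats = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], 1)
        sa = torch.sigmoid(self.spatial_conv(stats))       # (B, 1, H, W)
        x = x * sa
        return fused + x                                   # residual connection

# Hypothetical usage: attention-guided fusion of the two modalities'
# concatenated high-frequency feature maps (each with c_high channels).
# fuse = ResidualAttentionFusion(channels=2 * c_high)
# fused_high = fuse(torch.cat([high_ct, high_mri], dim=1))
```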

Key words: multimodal medical image fusion, pre-trained model, deep learning, high-low frequency feature extraction, residual attention network

CLC number: