Lightweight skin lesion image segmentation network based on Mamba structure

doi:10.11996/JG.j.2095-302X.2025061257

Abstract

Skin lesion segmentation is an important task in medical image analysis and is of great significance for the early diagnosis and treatment of skin diseases. However, when processing high-resolution skin images and capturing subtle lesion features, existing models still suffer from high computational complexity and inadequate handling of redundant information. To address these problems, a lightweight skin lesion image segmentation network based on the Mamba structure, named ResMamba, was proposed. ResMamba adopted a six-level U-shaped structure, embedding Mamba in visual state space (VSS) blocks and introducing them into the encoder and decoder. The ResVSS module, the core component of the encoder, reduced the number of parameters by removing a redundant linear layer, and combined a depthwise convolution block with learnable scale parameters that scaled the residual connection, thereby reducing model complexity while improving segmentation accuracy. In the skip connections, a multi-level, multi-scale information fusion module generated spatial and channel attention maps to fuse multi-scale information effectively. Experiments on the public skin lesion datasets ISIC2017 and ISIC2018 demonstrated that ResMamba achieved a good balance between parameter count and segmentation performance, verifying the effectiveness of the model.
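
As a rough illustration of the residual scaling described in the abstract, the following Python sketch applies a per-channel (depthwise) 1-D filter and adds the input back through a scaled shortcut. It is not the ResVSS module itself: the form `scale * x + branch(x)`, the 1-D setting, and the names `depthwise_conv1d`, `scaled_residual`, and `linear_params` are all assumptions made for this example. The last helper only counts the weights of a single fully connected layer, which is the kind of saving that removing one such layer would give.

```python
def depthwise_conv1d(x, kernels):
    """Filter each channel with its own odd-length kernel (zero padding, same length).

    x: list of channels, each a list of floats.
    kernels: one kernel per channel; no mixing across channels (depthwise).
    As in most deep learning code, the kernel is not flipped (cross-correlation).
    """
    out = []
    for channel, kernel in zip(x, kernels):
        half = len(kernel) // 2
        padded = [0.0] * half + list(channel) + [0.0] * half
        out.append([
            sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(channel))
        ])
    return out


def scaled_residual(x, branch, scale):
    """Return scale * x + branch element-wise.

    `scale` stands in for a learnable scalar (assumed form of the shortcut).
    """
    return [
        [scale * a + b for a, b in zip(cx, cb)]
        for cx, cb in zip(x, branch)
    ]


def linear_params(d_in, d_out, bias=True):
    """Number of weights in one fully connected layer."""
    return d_in * d_out + (d_out if bias else 0)


x = [[1.0, 2.0, 3.0]]                          # one channel, three positions
branch = depthwise_conv1d(x, [[1.0, 1.0, 1.0]])  # per-channel smoothing filter
y = scaled_residual(x, branch, scale=0.5)
```

In this toy setting a depthwise filter needs only `channels * kernel_size` weights, whereas a 64-to-64 linear layer already carries `linear_params(64, 64)` weights, which hints at why dropping a linear layer matters for a very small network.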

Key words: deep learning, skin lesion segmentation, Mamba structure, state space models, lightweight

HE Mengmeng, ZHANG Xiaoyan, LI Hongan. Lightweight skin lesion image segmentation network based on Mamba structure[J]. Journal of Graphics, 2025, 46(6): 1257-1266.

Dataset	Train	Valid	Test	Total
ISIC2017	2 000	150	600	2 750
ISIC2018	1 815	259	520	2 594

Model	Params/M	GFLOPs	mIoU/%↑	DSC/%↑	ACC/%↑	Sen/%↑	Spe/%↑
U-Net	7.770	13.780	79.29	83.95	95.65	86.82	97.28
TransFuse	26.270	11.530	79.21	88.40	96.17	87.14	97.98
MALUNet	0.175	0.083	78.78	88.13	96.18	84.78	98.47
VM-UNet	27.430	4.112	80.23	89.03	96.62	86.89	97.45
VM-UNetV2	12.380	2.473	81.34	89.73	96.85	87.39	97.38
LightM-UNet	1.270	0.267	82.18	90.22	96.70	88.73	98.67
ResMamba (Ours)	0.043	0.059	83.89	89.95	96.99	89.21	98.73
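
The metric columns above follow the usual confusion-matrix definitions for binary segmentation. The sketch below shows one way they can be computed from a predicted mask and a ground-truth mask. It is an assumption-laden example, not the paper's evaluation script: the excerpt does not state how mIoU is averaged, so `IoU` here is simply the IoU of the lesion class, and the function names `confusion_counts` and `segmentation_metrics` are hypothetical.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for two flat binary masks (1 = lesion, 0 = background)."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    return tp, fp, fn, tn


def segmentation_metrics(tp, fp, fn, tn):
    """Standard definitions; assumes every denominator is non-zero."""
    return {
        "IoU": tp / (tp + fp + fn),          # intersection over union
        "DSC": 2 * tp / (2 * tp + fp + fn),  # Dice similarity coefficient
        "ACC": (tp + tn) / (tp + fp + fn + tn),
        "Sen": tp / (tp + fn),               # sensitivity (recall)
        "Spe": tn / (tn + fp),               # specificity
    }


pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
counts = confusion_counts(pred, target)
metrics = segmentation_metrics(*counts)
```

Multiplying each value by 100 gives percentages on the same scale as the table.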

Model	Params/M	GFLOPs	mIoU/%↑	DSC/%↑	ACC/%↑	Sen/%↑	Spe/%↑
U-Net	7.770	13.780	79.56	84.75	94.05	85.86	96.69
TransFuse	26.270	11.530	80.63	89.27	95.66	89.28	95.74
MALUNet	0.175	0.083	80.25	89.04	95.62	89.64	96.19
VM-UNet	27.430	4.112	81.35	89.71	96.19	88.49	96.93
VM-UNetV2	12.380	2.473	81.37	89.73	96.37	87.75	97.61
LightM-UNet	1.270	0.267	82.71	89.32	96.83	89.69	97.89
ResMamba (Ours)	0.043	0.059	84.37	90.21	97.04	89.62	97.92
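
To make the cost comparison concrete, the short Python sketch below divides each baseline's parameter count and GFLOPs by those of ResMamba. The numbers are copied from the Params/M and GFLOPs columns of the table above; the dictionary name `COSTS` and the helper `cost_ratios` are hypothetical names chosen for this example.

```python
# (Params in millions, GFLOPs) per model, copied from the comparison table.
COSTS = {
    "U-Net": (7.770, 13.780),
    "TransFuse": (26.270, 11.530),
    "MALUNet": (0.175, 0.083),
    "VM-UNet": (27.430, 4.112),
    "VM-UNetV2": (12.380, 2.473),
    "LightM-UNet": (1.270, 0.267),
    "ResMamba (Ours)": (0.043, 0.059),
}


def cost_ratios(costs, reference="ResMamba (Ours)"):
    """Return (params ratio, GFLOPs ratio) of every other model relative to `reference`."""
    ref_params, ref_flops = costs[reference]
    return {
        name: (params / ref_params, flops / ref_flops)
        for name, (params, flops) in costs.items()
        if name != reference
    }


ratios = cost_ratios(COSTS)
for name, (p_ratio, f_ratio) in ratios.items():
    print(f"{name}: {p_ratio:.1f}x params, {f_ratio:.2f}x GFLOPs")
```

With these figures, LightM-UNet has roughly 29.5 times the parameters of ResMamba, and MALUNet roughly 4.1 times.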