SAM-based mask generation and segmentation for dermatological images

doi:10.11996/JG.j.2095-302X.2026020322

Abstract

Skin cancer is a malignant tumor with a relatively high incidence, so its timely detection is of substantial clinical significance. Accurate identification and segmentation of skin lesions are critical prerequisites for computer-aided diagnosis. Although deep learning techniques perform remarkably well in medical image segmentation, existing models commonly suffer from insufficient segmentation accuracy at lesion edges and from the limited scale and diversity of training data. To address these issues, a boundary-enhanced model system named BESA-Diff was proposed. The system employed the boundary-enhanced diffusion model DermoSegDiff as its core segmentation architecture and optimized the model training workflow. The core technical contributions of this research were twofold. First, a framework for the automatic generation of pathological skin images and masks was constructed based on diffusion models. Second, a mask refinement pipeline was designed by integrating the Segment Anything Model (SAM) with the edge refinement module of DermoSegDiff, and a high-quality synthetic medical image dataset was established. Experimental evaluations on the ISIC2018, PH2, and HAM10000 datasets and on the synthetic dataset demonstrated that the proposed model significantly outperformed baseline models on key segmentation metrics, including the Dice similarity coefficient (Dice) and intersection over union (IoU). Ablation experiments confirmed that the introduction of SAM for mask refinement was the pivotal factor driving the performance improvement: this module effectively enhanced the segmentation of lesion edges, particularly in regions with blurred boundaries or low contrast.
The findings of this study validated that combining the data generation capability of diffusion models with the boundary optimization capability of general-purpose segmentation models can effectively improve the accuracy and robustness of skin lesion segmentation. This work provided a high-performance solution for the auxiliary diagnosis of skin cancer and highlighted the potential of synthetic data for overcoming the data bottleneck in medical artificial intelligence.
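The abstract does not say how SAM is prompted during mask refinement. As a minimal sketch, assuming a box prompt derived from the coarse diffusion-generated mask, the helper below (the hypothetical `box_prompt_from_mask`, with an assumed `pad` margin) computes a padded bounding box in XYXY order. It illustrates one plausible way to hand a coarse mask to a promptable segmenter and is not the authors' pipeline.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # rows of 0/1 pixels


def box_prompt_from_mask(mask: Mask, pad: int = 1) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max) around the foreground, padded and clipped.

    Hypothetical helper: the padding margin and the use of a box prompt are
    assumptions made for illustration, not details taken from the paper.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    # Rows and columns that contain at least one foreground pixel.
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for x in range(width) if any(row[x] for row in mask)]
    if not ys or not xs:
        return None  # empty coarse mask: nothing to prompt with
    return (
        max(min(xs) - pad, 0),
        max(min(ys) - pad, 0),
        min(max(xs) + pad, width - 1),
        min(max(ys) + pad, height - 1),
    )


coarse = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(box_prompt_from_mask(coarse))  # (0, 0, 4, 4)
```

In the official `segment_anything` package, `SamPredictor.predict` accepts a `box` argument in this XYXY format, so a box of this kind could be passed to it after `set_image`; whether BESA-Diff uses box, point, or mask prompts is not stated here.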

Key words: skin cancer, data augmentation, diffusion model, edge segmentation, image segmentation

CHEN Mengqi, ZHAO Junli, DENG Xiaodan. SAM-based mask generation and segmentation for dermatological images[J]. Journal of Graphics, 2026, 47(2): 322-331.

Figures/Tables 13

Fig. 1 Training enhancement framework

Fig. 2 Overall architecture of image generation by diffusion model

Fig. 3 Backbone network structure

Fig. 4 Real images and generated images ((a), (b) Real images; (c), (d) Generated images)

Fig. 5 Real masks and generated masks ((a), (b) Original masks; (c), (d) Generated masks)

Table 1 Mask performance comparison

Experimental group	Dice	IoU
Baseline (Real)	0.917 3	0.860 1
Syn-DiffMask	0.911 7	0.844 1
Syn-SAMMask	0.922 8	0.887 0
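Table 1 compares masks by Dice and IoU. For reference, the sketch below computes both metrics for a pair of binary masks using the standard definitions, Dice = 2|A∩B| / (|A| + |B|) and IoU = |A∩B| / |A∪B|. The function name `dice_and_iou` and the smoothing term `eps` are illustrative assumptions; the paper's exact evaluation code is not shown in this excerpt.

```python
from typing import List, Tuple

Mask = List[List[int]]  # rows of 0/1 pixels


def dice_and_iou(pred: Mask, target: Mask, eps: float = 1e-7) -> Tuple[float, float]:
    """Compute Dice and IoU for two binary masks of the same shape.

    eps is a small smoothing constant (an assumption here) that avoids
    division by zero when both masks are empty.
    """
    inter = sum(p & t for p_row, t_row in zip(pred, target) for p, t in zip(p_row, t_row))
    pred_area = sum(map(sum, pred))
    target_area = sum(map(sum, target))
    union = pred_area + target_area - inter
    dice = (2 * inter + eps) / (pred_area + target_area + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou


pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(round(dice, 4), round(iou, 4))  # 0.6667 0.5
```

For a single image the two metrics are tied by Dice = 2·IoU / (1 + IoU); dataset-level averages such as those in the table need not satisfy this identity, because the mean of per-image ratios is not the ratio of means.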

Fig. 6 Comparison of skin lesion segmentation results ((a), (e) Original synthetic images; (b), (f) Masks generated by the diffusion model (red areas show the diffusion model's segmentation); (c), (g) Masks generated by SAM (green areas show SAM's segmentation); (d), (h) Superimposed results of the two (yellow areas show where the two methods overlap))
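The colour coding in the Fig. 6 caption can be reproduced with a simple per-pixel lookup. The sketch below assumes binary masks and assumes that, in the superimposed panels, pixels found by only one method keep that method's colour (red or green) while shared pixels become yellow; the black background and the name `overlay_masks` are illustrative choices, not taken from the paper.

```python
from typing import List, Tuple

Mask = List[List[int]]  # rows of 0/1 pixels
RGB = Tuple[int, int, int]

RED: RGB = (255, 0, 0)        # diffusion-model mask only
GREEN: RGB = (0, 255, 0)      # SAM mask only
YELLOW: RGB = (255, 255, 0)   # both methods agree
BACKGROUND: RGB = (0, 0, 0)   # neither method (assumed black here)


def overlay_masks(diffusion_mask: Mask, sam_mask: Mask) -> List[List[RGB]]:
    """Colour each pixel by which of the two masks marks it as lesion."""
    palette = {(1, 0): RED, (0, 1): GREEN, (1, 1): YELLOW, (0, 0): BACKGROUND}
    return [
        [palette[(int(bool(d)), int(bool(s)))] for d, s in zip(d_row, s_row)]
        for d_row, s_row in zip(diffusion_mask, sam_mask)
    ]


diffusion_mask = [[1, 1, 0], [0, 1, 0]]
sam_mask = [[0, 1, 1], [0, 1, 0]]
for row in overlay_masks(diffusion_mask, sam_mask):
    print(row)
```

Yellow is simply red plus green in RGB, so the same picture results from adding a red layer for one mask to a green layer for the other.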

Table 2 Model segmentation performance on ISIC2018

Method	Dice↑	IoU↑	BF4↑
BASE-Diff (ISIC2018)	0.893 2	0.819 2	0.192 3
BASE-Diff (+ 1 000 synthetic images)	0.918 5	0.855 4	0.222 9

Table 3 Segmentation performance across different skin lesion types

Lesion type	Method	Dice↑	IoU↑
Melanoma	Baseline	0.878 3	0.805 0
Melanoma	BESA-Diff	0.901 5	0.825 5
Basal cell carcinoma	Baseline	0.885 1	0.812 2
Basal cell carcinoma	BESA-Diff	0.912 0	0.840 1
Nevus	Baseline	0.915 8	0.848 9
Nevus	BESA-Diff	0.925 3	0.862 0
Actinic keratosis	Baseline	0.868 8	0.798 8
Actinic keratosis	BESA-Diff	0.895 2	0.815 5
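To make the per-lesion comparison in Table 3 easier to read, the short script below computes the absolute gain of BESA-Diff over the baseline for each lesion type. The values are copied from the table (lesion names given in English); the dictionary layout and the name `metric_gains` are illustrative.

```python
from typing import Dict, Tuple

# (Dice, IoU) pairs copied from Table 3.
TABLE3: Dict[str, Dict[str, Tuple[float, float]]] = {
    "Melanoma": {"Baseline": (0.8783, 0.8050), "BESA-Diff": (0.9015, 0.8255)},
    "Basal cell carcinoma": {"Baseline": (0.8851, 0.8122), "BESA-Diff": (0.9120, 0.8401)},
    "Nevus": {"Baseline": (0.9158, 0.8489), "BESA-Diff": (0.9253, 0.8620)},
    "Actinic keratosis": {"Baseline": (0.8688, 0.7988), "BESA-Diff": (0.8952, 0.8155)},
}


def metric_gains(table: Dict[str, Dict[str, Tuple[float, float]]]) -> Dict[str, Tuple[float, float]]:
    """Return the absolute (Dice, IoU) gain of BESA-Diff over the baseline per lesion type."""
    gains = {}
    for lesion, rows in table.items():
        base_dice, base_iou = rows["Baseline"]
        new_dice, new_iou = rows["BESA-Diff"]
        # Round to four decimals, the precision reported in the table.
        gains[lesion] = (round(new_dice - base_dice, 4), round(new_iou - base_iou, 4))
    return gains


for lesion, (dice_gain, iou_gain) in metric_gains(TABLE3).items():
    print(f"{lesion}: Dice +{dice_gain:.4f}, IoU +{iou_gain:.4f}")
```

On these numbers the Dice gain is largest for basal cell carcinoma (+0.0269) and smallest for nevus (+0.0095), the class on which the baseline was already strongest.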

Fig. 7 Training and validation loss curves of the baseline model (trained solely using ground-truth masks)

Fig. 8 Training and validation loss curves of the BESA-Diff model (trained with SAM-generated masks followed by data cleaning)

Fig. 9 Comparison of skin lesion segmentation results ((a) Original dermatological image; (b) Ground truth mask; (c) Predicted results (green regions))

Table 4 Model segmentation performance on PH2 and HAM10000

Method	PH2 Dice↑	PH2 IoU↑	HAM10000 Dice↑	HAM10000 IoU↑
Baseline (PH2)	0.865 8	0.770 1	0.848 5	0.752 8
BESA-Diff (generated images)	0.881 2	0.788 3	0.862 7	0.763 9
