
Journal of Graphics ›› 2025, Vol. 46 ›› Issue (5): 969-979. DOI: 10.11996/JG.j.2095-302X.2025050969

• Image Processing and Computer Vision •

SAM2-based multi-objective automatic segmentation method for laparoscopic surgery

LIU Cheng1,2, ZHANG Jiayi1,2,3, YUAN Feng1,2, ZHANG Rui1,2,3, GAO Xin2,3

  1. School of Biomedical Engineering (Suzhou), Department of Life Sciences and Medicine, University of Science and Technology of China, Suzhou, Jiangsu 215163, China
  2. Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, Jiangsu 215163, China
  3. Jinan Guoke Medical Engineering and Technology Development Co., Ltd., Jinan, Shandong 250101, China
  • Received: 2025-06-26  Accepted: 2025-08-12  Published: 2025-10-30  Online: 2025-09-10
  • Corresponding author: GAO Xin (1975-), male, researcher, Ph.D. His main research interests cover intelligent computing-based precision medicine, surgical navigation and robotics, and low-dose cone-beam CT imaging. E-mail: xingaosam@163.com
  • First author: LIU Cheng (2001-), male, master's student. His main research interest is surgical navigation. E-mail: 1011948636@qq.com
  • Supported by: National Natural Science Foundation of China (82372052, 82402373); Natural Science Foundation of Shandong Province (ZR2022QF071, ZR2022QF099); Taishan Industrial Experts Program (tscx202312131)

Abstract:

Automatic segmentation of laparoscopic surgical scenes is a critical foundation for enabling surgical robots to perform autonomous operations. However, this task still faces three major challenges: surgical targets exhibit highly similar textures and blurred boundaries, making similar targets difficult to segment precisely; significant scale differences, from sub-millimeter sutures to centimeter-scale organ tissue, limit the accuracy of synchronous multi-target segmentation; and intraoperative interferences such as motion artifacts and smoke occlusion further undermine the robustness of complete multi-target segmentation. To address these challenges, a multi-objective automatic segmentation method for laparoscopic surgery (SAM2-MSNet) based on the visual large model SAM2 was proposed. The network employed a LoRA+ fine-tuning strategy to optimize SAM2's image encoder, enabling efficient adaptation to the texture features of laparoscopic images. A cross-scale feature synchronous extraction module was designed to achieve accurate segmentation of multi-scale targets. Furthermore, a global perception module of feature relationships was constructed to enhance robustness against interferences such as motion artifacts and smoke occlusion. In addition, a pseudo-label-assisted supervision mechanism driven by histograms of oriented gradients (HOG) significantly improved the accuracy of target edge segmentation. Experimental results demonstrated that SAM2-MSNet achieved a mean intersection over union (mIoU) of 70.2%/69.6% and a mean Dice coefficient (mDice) of 78.5%/75.0% on the Endovis2018 and AutoLaparo datasets, respectively. With an inference speed comparable to that of SAM2-UNet (23 vs. 25 frames per second), its segmentation accuracy was improved by 3.0%/6.7% in mIoU and 2.8%/6.8% in mDice. This work enabled high-precision, fully automatic segmentation of laparoscopic surgical scenes, providing a robust technical foundation for the autonomous operation of surgical robots.
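The abstract names a pseudo-label supervision signal derived from histograms of oriented gradients (HOG). As a minimal sketch of that general idea, and not the authors' implementation, the snippet below turns the HOG response of a frame into a coarse edge pseudo-label; the function name, HOG parameters, and threshold are illustrative assumptions.

```python
import numpy as np
from skimage import color, io
from skimage.feature import hog

def hog_edge_pseudo_label(image_path, threshold=0.2):
    """Illustrative sketch: derive an edge-emphasis pseudo-label from the
    HOG response of a frame. The parameters here are assumptions for
    demonstration, not the settings used in SAM2-MSNet."""
    rgb = io.imread(image_path)
    gray = color.rgb2gray(rgb)

    # The HOG visualization image highlights local gradient-orientation
    # energy, which is strongest along instrument and tissue boundaries.
    _, hog_map = hog(
        gray,
        orientations=9,
        pixels_per_cell=(8, 8),
        cells_per_block=(2, 2),
        visualize=True,
    )

    # Normalize to [0, 1] and binarize into a coarse edge pseudo-label that
    # could serve as an auxiliary supervision target alongside manual masks.
    span = hog_map.max() - hog_map.min()
    hog_map = (hog_map - hog_map.min()) / (span + 1e-8)
    return (hog_map > threshold).astype(np.uint8)
```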

Key words: laparoscopic surgical scene segmentation, visual large model, synchronous extraction of cross-scale features, global perception of feature relationships, pseudo-label assisted supervision
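For reference, the reported mIoU and mDice follow the standard per-class definitions. The sketch below shows one generic way to compute such scores from predicted and ground-truth label maps; the class indexing and background handling are assumptions, not the paper's evaluation protocol.

```python
import numpy as np

def miou_mdice(pred, gt, num_classes):
    """Generic sketch of mean IoU and mean Dice over foreground classes,
    averaged across classes present in either the prediction or the labels."""
    ious, dices = [], []
    for c in range(1, num_classes):  # skip background class 0 (assumption)
        p, g = (pred == c), (gt == c)
        if not p.any() and not g.any():
            continue  # class absent from both maps; leave it out of the mean
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        ious.append(inter / union if union else 0.0)
        denom = p.sum() + g.sum()
        dices.append(2 * inter / denom if denom else 0.0)
    return float(np.mean(ious)), float(np.mean(dices))

# Example with random label maps (5 classes including background).
pred = np.random.randint(0, 5, (256, 256))
gt = np.random.randint(0, 5, (256, 256))
print(miou_mdice(pred, gt, num_classes=5))
```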
