
Journal of Graphics ›› 2024, Vol. 45 ›› Issue (4): 659-669. DOI: 10.11996/JG.j.2095-302X.2024040659

• Image Processing and Computer Vision •

ASC-Net: fast segmentation network for surgical instruments and organs in laparoscopic video

ZHANG Xinyu1,2, ZHANG Jiayi1,2,3, GAO Xin2,3

  1. School of Biomedical Engineering (Suzhou), Department of Life Sciences and Medicine, University of Science and Technology of China, Hefei Anhui 230026, China
    2. Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou Jiangsu 215163, China
    3. Jinan Guoke Medical Engineering and Technology Development Co., Ltd., Jinan Shandong 250101, China
  • Received: 2024-03-08  Accepted: 2024-05-08  Online: 2024-08-31  Published: 2024-09-02
  • Contact: GAO Xin
  • About author: First author:

    ZHANG Xinyu (1998-), master's student. His main research interest covers surgical navigation. E-mail: 798091761@qq.com

  • Supported by:
    National Natural Science Foundation of China (82372052); Key Research and Development Program of China (2022YFC2408400); Key Research and Development Program of Jiangsu (BE2021663); Key Research and Development Program of Jiangsu (BE2023714); Key Research and Development Program of Shandong (2021SFGC0104); National Natural Science Foundation of Shandong (ZR2022QF071)

Abstract:

Laparoscopic surgery automation is an important component of intelligent surgery, and it is premised on the real-time, precise segmentation of surgical instruments and organs in the laparoscopic view. Hindered by complex intraoperative factors such as blood contamination and smoke interference, this segmentation task faces great challenges, and existing image segmentation methods perform poorly under these conditions. Therefore, a fast segmentation network based on an attention perceptron and spatial channels (attention spatial channel net, ASC-Net) was proposed to achieve the rapid and precise segmentation of surgical instruments and organs in laparoscopic images. Under the UNet architecture, attention perceptron and spatial channel modules were designed and embedded between the network's encoding and decoding modules through skip connections. This enabled the network to focus on the deep semantic differences between similar targets in the images while learning multi-dimensional features of each target at multiple scales. In addition, a pre-training and fine-tuning strategy was adopted to reduce network computation. Experimental results demonstrated that on the EndoVis2018 (EndoVis robotic scene segmentation challenge 2018) dataset, the mean Dice coefficient (mDice), mean intersection-over-union (mIoU), and mean inference time (mIT) of this method were 90.64%, 86.40%, and 16.73 ms (about 60 frames/s), respectively; mDice and mIoU were 26% and 39% higher than those of existing SOTA methods, and mIT was reduced by 56%. On the AutoLaparo (automation in laparoscopic hysterectomy) dataset, the mDice, mIoU, and mIT of this method were 93.72%, 89.43%, and 16.41 ms (about 61 frames/s), respectively, outperforming the comparison methods.
While maintaining segmentation speed, the proposed method effectively improved segmentation accuracy, achieving the rapid and precise segmentation of surgical instruments and organs in laparoscopic images and advancing laparoscopic surgery automation.
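The abstract does not specify the internals of the attention perceptron and spatial channel modules. For readers unfamiliar with attention over skip connections, the sketch below shows a generic squeeze-and-excitation-style channel attention gate of the kind commonly applied to encoder features before they are passed to the decoder in UNet-style networks; it is an illustrative assumption, not the authors' ASC module, and all names (`channel_attention`, `w1`, `w2`) are hypothetical.

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """Reweight encoder feature channels with a learned gate (illustrative sketch).

    feature_map: (C, H, W) array of encoder features on a skip connection
    w1: (C//r, C) bottleneck weights; w2: (C, C//r) expansion weights
    """
    squeezed = feature_map.mean(axis=(1, 2))       # global average pool -> (C,)
    hidden = np.maximum(0.0, w1 @ squeezed)        # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate, each value in (0, 1)
    return feature_map * gate[:, None, None]       # scale every channel by its gate

# Toy usage: 8 channels, 4x4 spatial map, bottleneck ratio r = 4
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1
out = channel_attention(feats, w1, w2)
print(out.shape)  # same shape as the input features
```

Because the sigmoid gate lies strictly in (0, 1), the module can only attenuate channels, letting the decoder emphasize the feature channels most discriminative between similar-looking targets (e.g. different instruments).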

Key words: automated surgery, laparoscopic image, multi-object segmentation, attention perceptron, multi-scale features, pre-training fine-tuning
