
Journal of Graphics, 2024, Vol. 45, Issue (4): 683-695. DOI: 10.11996/JG.j.2095-302X.2024040683

• Image Processing and Computer Vision •

Automatic portrait matting model based on semantic guidance

CHENG Yan1,4, YAN Zhihang2,4, LAI Jianming2,4, WANG Guixi2,4, ZHONG Linhui3,4

    1. School of Software, Jiangxi Normal University, Nanchang Jiangxi 330022, China
    2. School of Digital Industry, Jiangxi Normal University, Shangrao Jiangxi 334000, China
    3. School of Computer Information and Engineering, Jiangxi Normal University, Nanchang Jiangxi 330022, China
    4. Key Laboratory of Intelligent Information Processing and Emotional Computing of Jiangxi Province, Nanchang Jiangxi 330022, China
  • Received: 2024-02-27 Accepted: 2024-05-10 Online: 2024-08-31 Published: 2024-09-03
  • About the first author:

    CHENG Yan (1976-), professor, Ph.D. Her main research interests include artificial intelligence and image processing. E-mail: chyan88888@jxnu.edu.cn

  • Supported by:
    National Natural Science Foundation of China (62167006); National Natural Science Foundation of China (61967011); Jiangxi Province Science and Technology Innovation Base Plan Jiangxi Province Key Laboratory Project (2024SSY03131); Natural Science Foundation of Jiangxi Province (20212BAB202017); Jiangxi Province 03 Special and 5G Projects (20212ABC03A22); Jiangxi Province Major Disciplines Academic and Technical Leaders Training Plan Leading Talent Project (20213BCJL22047)

Abstract:

To address semantic discrimination errors and unclear details in existing portrait matting methods, an automatic matting model based on semantic guidance was proposed. First, EMO, a hybrid CNN-Transformer architecture, was introduced for feature encoding. Then, the semantic segmentation decoding branch applied a multi-scale hybrid attention module to the top-level encoded features, enhancing multi-scale representation and pixel-level discrimination. Next, a feature enhancement module merged high-level features so that high-level semantic information could flow into the shallow network. Meanwhile, the aggregation guidance module in the detail extraction decoding branch aggregated features from the different branches and used the aggregated features to guide the extraction of shallow features, improving the accuracy of edge and detail extraction. Experiments on three datasets showed that the approach outperformed the compared methods while significantly reducing parameter count and computational complexity, validating the competitiveness of the proposed method.
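
The abstract does not say how the outputs of the semantic segmentation branch and the detail extraction branch are combined into the final alpha matte. As a rough illustration only, the sketch below uses a fusion rule common in two-branch matting designs: it assumes the semantic branch yields a three-class map (background, transition, foreground) and that the detail branch's estimate is trusted only inside the transition region. The function name `fuse_alpha`, the class encoding, and the rule itself are assumptions for illustration, not the authors' method.

```python
from typing import List

# Hypothetical class labels for the semantic branch output (an assumption,
# not taken from the paper).
BACKGROUND, TRANSITION, FOREGROUND = 0, 1, 2


def fuse_alpha(semantic: List[List[int]],
               detail: List[List[float]]) -> List[List[float]]:
    """Fuse a three-class semantic map with a detail-branch alpha estimate.

    Background pixels get alpha 0, foreground pixels get alpha 1, and
    transition pixels take the detail-branch value clamped to [0, 1].
    """
    fused = []
    for sem_row, det_row in zip(semantic, detail):
        row = []
        for label, alpha in zip(sem_row, det_row):
            if label == BACKGROUND:
                row.append(0.0)
            elif label == FOREGROUND:
                row.append(1.0)
            else:
                # Transition region: defer to the detail branch.
                row.append(min(1.0, max(0.0, alpha)))
        fused.append(row)
    return fused


if __name__ == "__main__":
    semantic_map = [[0, 1, 2]]
    detail_alpha = [[0.7, 0.4, 0.2]]
    print(fuse_alpha(semantic_map, detail_alpha))
```

Under this assumed rule, the semantic map decides the confident regions and the detail estimate only refines hair and edge pixels, which is one way a semantic result can guide detail extraction.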

Key words: portrait matting, semantic guidance, multi-scale, feature enhancement, aggregation guidance
