图学学报 ›› 2025, Vol. 46 ›› Issue (1): 126-138.DOI: 10.11996/JG.j.2095-302X.2025010126
孟思弘1,2, 刘浩1,2, 方昊天1,2, 僧冰枫1,2, 杜正君1,2
收稿日期: 2024-07-01
接受日期: 2024-11-20
出版日期: 2025-02-28
发布日期: 2025-02-14
通讯作者: 杜正君(1989-),男,讲师,博士。主要研究方向为计算机图形学、图像视频处理。E-mail: dzj@qhu.edu.cn
第一作者: 孟思弘(1999-),女,硕士研究生。主要研究方向为图像视频处理。E-mail: meng_sihong@163.com
MENG Sihong1,2, LIU Hao1,2, FANG Haotian1,2, SENG Bingfeng1,2, DU Zhengjun1,2
Received: 2024-07-01
Accepted: 2024-11-20
Published: 2025-02-28
Online: 2025-02-14
Contact: DU Zhengjun (1989-), lecturer, Ph.D. His research interests include computer graphics and image/video processing. E-mail: dzj@qhu.edu.cn
First author: MENG Sihong (1999-), master's student. Her research interests include image and video processing. E-mail: meng_sihong@163.com
Abstract:
Image colorization aims to convert grayscale images into color images. The technique has long attracted wide attention from researchers in computer graphics and computer vision, and has been broadly applied in image restoration, medical imaging, film restoration, artistic creation, and many other fields, showing great practical potential. Over decades of development, researchers have proposed numerous interaction-based, rule-based, and deep-learning-based algorithms to improve colorization quality. Nevertheless, existing colorization algorithms still exhibit notable drawbacks, such as low computational efficiency, tedious interaction, low color saturation, and unavoidable color bleeding. To address these problems, an image colorization algorithm based on semantic similarity propagation is proposed. The algorithm first extracts semantic features of the input grayscale image with a deep neural network and constructs a feature space. It then formulates image colorization as an efficient energy optimization problem based on semantic similarity propagation: the chrominance values of the grayscale image are obtained by minimizing the energy function, which propagates the colors of user-provided strokes to the remaining regions of the image. In addition, trilinear interpolation is employed to accelerate the energy optimization and color propagation, greatly improving computational efficiency. To verify its effectiveness, the algorithm was evaluated on a collected image set from multiple perspectives, including visual quality, generated-image quality, running time, and user interaction experience. Extensive qualitative and quantitative results demonstrate that the algorithm achieves more accurate, efficient, and natural colorization with less user interaction.
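The core idea in the abstract, i.e., solving stroke-based colorization as a quadratic energy over feature-space similarities, can be sketched as follows. This is a minimal dense illustration, not the paper's implementation: the function name, the Gaussian affinity, and the parameters `sigma` and `lam` are assumptions, and the actual algorithm works on deep semantic features with trilinear-interpolation acceleration rather than a dense per-pixel solve.

```python
import numpy as np

def propagate_colors(features, stroke_mask, stroke_chroma, sigma=0.2, lam=1.0):
    """Propagate user-stroke chroma to all pixels by minimizing a quadratic
    energy: stroked pixels keep their given colors, and pixels with similar
    features are encouraged to receive similar colors.

    features:      (n, d) per-pixel feature vectors (e.g. semantics + luminance)
    stroke_mask:   (n,) bool, True where the user drew a stroke
    stroke_chroma: (n,) chroma value at stroked pixels (ignored elsewhere)
    """
    # Affinity from feature-space distance: similar pixels -> weight near 1.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    L = np.diag(W.sum(axis=1)) - W                  # graph Laplacian
    G = np.diag(stroke_mask.astype(float))          # data-term selector
    # Minimize ||G(c - g)||^2 + lam * c^T L c  =>  (G + lam*L) c = G g
    return np.linalg.solve(G + lam * L, G @ stroke_chroma)
```

With two feature clusters and one stroke per cluster, each stroke's chroma spreads to the pixels nearest it in feature space. The dense affinity here is O(n²), which is exactly why a real solver quantizes the feature space and interpolates instead.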
孟思弘, 刘浩, 方昊天, 僧冰枫, 杜正君. 基于语义相似性传播的图像彩色化[J]. 图学学报, 2025, 46(1): 126-138.
MENG Sihong, LIU Hao, FANG Haotian, SENG Bingfeng, DU Zhengjun. Image colorization via semantic similarity propagation[J]. Journal of Graphics, 2025, 46(1): 126-138.
图2 优化前后及添加亮度前后彩色化效果对比((a)输入图像及笔触;(b)初始语义可视化结果;(c)结合(b)的语义和亮度的彩色化结果;(d)优化后的语义可视化结果;(e)结合(d)的语义和亮度得到的彩色化结果;(f)仅根据(d)的语义得到的彩色化结果)
Fig. 2 Comparison of colorization results before and after optimization and before and after adding lightness ((a) Input image and strokes; (b) Initial semantic visualization results; (c) Results by combining the semantics and brightness of (b); (d) Optimized semantic visualization results; (e) Results obtained by combining the semantics and brightness of (d); (f) Results based only on the semantics of (d))
图4 维度的选择对彩色化结果以及运行时间的影响((a)输入图像及用户笔触;(b)二维;(c)三维;(d)五维)
Fig. 4 The effect of the choice of dimensions on our colorization results and running time ((a) Input image and strokes; (b) 2 dimensions; (c) 3 dimensions; (d) 5 dimensions)
图5 参数b对彩色化结果以及运行时间的影响((a)输入图像及用户笔触;(b) b=2;(c) b=4;(d) b=8)
Fig. 5 Effect of parameter b on our colorization results and running time ((a) Input images and strokes; (b) b=2; (c) b=4; (d) b=8)
图6 本文算法与其他编辑传播算法彩色化结果以及运行时间对比((a)输入图像及用户笔触;(b)文献[1];(c)本文算法)
Fig. 6 Comparison of colorization results and running time of ours with other edit propagation algorithms ((a) Input images and strokes; (b) Reference [1]; (c) Ours)
图7 本文算法与其他基于笔触的深度学习算法对比((a)输入图像及用户笔触;(b)文献[23];(c)文献[33];(d)文献[25];(e)本文算法)
Fig. 7 Comparison of our algorithm with other stroke-based deep learning algorithms ((a) Input images and strokes; (b) Reference [23]; (c) Reference [33]; (d) Reference [25]; (e) Ours)
图8 与现有无需涂鸦的深度学习算法对比((a)输入图像;(b)文献[15];(c)文献[24];(d)文献[23];(e)本文算法)
Fig. 8 Comparison of our algorithm with other scribble-free deep learning algorithms ((a) Input images; (b) Reference [15]; (c) Reference [24]; (d) Reference [23]; (e) Ours)
图9 本算法与其他彩色化算法的定量对比((a)输入图像;(b)文献[15];(c)文献[25];(d)文献[24];(e)本文算法)
Fig. 9 Quantitative comparison of our algorithm with other colorization algorithms ((a) Input images; (b) Reference [15]; (c) Reference [25]; (d) Reference [24]; (e) Ours)
图10 验证语义特征有效性的消融实验((a)输入图像;(b)仅亮度;(c)仅语义信息;(d)语义+亮度)
Fig. 10 Ablation study to verify the effectiveness of semantic feature ((a) Input images; (b) Luminance only; (c) Semantics only; (d) Semantics and luminance)
| Resolution | Example | Before acceleration/s | After acceleration/s |
|---|---|---|---|
| 90×60 | Bedroom | 19.14 | 0.02 |
| | Back view | 18.97 | 0.03 |
| | Countryside | 17.82 | 0.03 |
| | Treehouse | 19.64 | 0.03 |
| | Seaside | 18.74 | 0.03 |
| | Average | 18.86 | 0.03 |
| 128×85 | Bedroom | 150.89 | 0.03 |
| | Back view | 143.87 | 0.02 |
| | Countryside | 155.31 | 0.03 |
| | Treehouse | 154.87 | 0.03 |
| | Seaside | 165.28 | 0.02 |
| | Average | 154.04 | 0.03 |
| 180×120 | Bedroom | 1207.19 | 0.04 |
| | Back view | 1195.92 | 0.05 |
| | Countryside | 1183.51 | 0.04 |
| | Treehouse | 1164.31 | 0.05 |
| | Seaside | 1169.94 | 0.04 |
| | Average | 1184.17 | 0.04 |
| 256×171 | Bedroom | 10398.67 | 0.10 |
| | Back view | 10274.71 | 0.07 |
| | Countryside | 10904.38 | 0.09 |
| | Treehouse | 10566.61 | 0.12 |
| | Seaside | 10568.34 | 0.09 |
| | Average | 10542.54 | 0.09 |
表1 使用加速策略前后的运行时间对比
Table 1 Running time before and after accelerating
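The speedups in Table 1 come from avoiding a per-pixel energy solve: per the abstract, chroma is solved on a coarse grid over the feature space and read back per pixel with trilinear interpolation. Below is a minimal sketch of that read-back step only; the function name and the assumption of a 3D feature grid with non-negative coordinates are illustrative, not the paper's code.

```python
import numpy as np

def trilinear_sample(grid, pts):
    """Sample a 3D grid of solved chroma values at continuous feature
    coordinates by blending the 8 surrounding lattice vertices.

    grid: (X, Y, Z) array of values at integer lattice vertices
    pts:  (n, 3) coordinates in [0, X-1] x [0, Y-1] x [0, Z-1]
    """
    lo = np.floor(pts).astype(int)
    # Clamp so the upper corner lo+1 stays inside the grid.
    lo = np.clip(lo, 0, np.array(grid.shape) - 2)
    t = pts - lo                        # fractional offsets in [0, 1]
    out = np.zeros(len(pts))
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                # Weight of this corner: product of per-axis lerp factors.
                w = (np.where(dx, t[:, 0], 1 - t[:, 0]) *
                     np.where(dy, t[:, 1], 1 - t[:, 1]) *
                     np.where(dz, t[:, 2], 1 - t[:, 2]))
                out += w * grid[lo[:, 0] + dx, lo[:, 1] + dy, lo[:, 2] + dz]
    return out
```

Because each pixel touches only 8 grid vertices, the per-pixel cost is constant, which is consistent with the near-flat "after acceleration" column in Table 1.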
| Example | Resolution | Running time/s |
|---|---|---|
| Bedroom | 1280×853 | 0.58 |
| Back view | 1280×853 | 0.53 |
| Countryside | 1280×853 | 0.40 |
| Kitten | 300×199 | 0.03 |
| Cattle | 1280×853 | 0.49 |
| Rose | 427×285 | 0.06 |
| Walnut | 400×276 | 0.05 |
| Banana | 640×960 | 0.24 |
| Flower | 853×1280 | 0.39 |
| Biscuit | 848×1280 | 0.44 |
| Ceramics | 1280×850 | 0.41 |
| Capsule | 1280×853 | 0.38 |
| Treehouse | 1280×853 | 0.44 |
| Moon | 853×1280 | 0.39 |
| Seaside | 1280×853 | 0.42 |
| Field | 1280×853 | 0.49 |
| Yellow flower | 853×1280 | 0.40 |
| Average | | 0.34 |
表2 所有示例的彩色化运行时间
Table 2 Running time of all examples
图11 失败案例((a)输入图像;(b)初始语义;(c)优化后语义;(d)彩色化结果)
Fig. 11 Failure case ((a) Input image; (b) Initial semantics; (c) Optimized semantics; (d) Colorization results)
[1] | LEVIN A, LISCHINSKI D, WEISS Y. Colorization using optimization[C]// ACM SIGGRAPH 2004 Papers. New York: ACM, 2004: 689-694. |
[2] | HUANG Y C, TUNG Y S, CHEN J C, et al. An adaptive edge detection based colorization algorithm and its applications[C]// The 13th Annual ACM International Conference on Multimedia. New York: ACM, 2005: 351-354. |
[3] | YATZIV L, SAPIRO G. Fast image and video colorization using chrominance blending[J]. IEEE Transactions on Image Processing, 2006, 15(5): 1120-1129. |
[4] | HEU J H, HYUN D Y, KIM C S, et al. Image and video colorization based on prioritized source propagation[C]// The 2009 16th IEEE International Conference on Image Processing. New York: IEEE Press, 2009: 465-468. |
[5] | QU Y G, WONG T T, HENG P A. Manga colorization[J]. ACM Transactions on Graphics (TOG), 2006, 25(3): 1214-1220. |
[6] | LUAN Q, WEN F, COHEN-OR D, et al. Natural image colorization[C]// The 18th Eurographics Conference on Rendering Techniques. Goslar: Eurographics Association, 2007: 309-320. |
[7] | 王泽文, 张小明. 基于p-Laplace方程的图像彩色化方法[J]. 工程图学学报, 2010, 31(6): 62-67. WANG Z W, ZHANG X M. Image colorization based on p-Laplace equation[J]. Journal of Engineering Graphics, 2010, 31(6): 62-67 (in Chinese). |
[8] | 田建勇, 石林江. 局部线性嵌入与模糊C-均值聚类的红外图像彩色化算法[J]. 图学学报, 2018, 39(5): 917-925. TIAN J Y, SHI L J. An infrared image colorization algorithm based on local linear embedding and fuzzy C-means clustering[J]. Journal of Graphics, 2018, 39(5): 917-925 (in Chinese). |
[9] | REINHARD E, ADHIKHMIN M, GOOCH B, et al. Color transfer between images[J]. IEEE Computer Graphics and Applications, 2001, 21(5): 34-41. |
[10] | WELSH T, ASHIKHMIN M, MUELLER K. Transferring color to greyscale images[C]// The 29th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 2002: 277-280. |
[11] | IRONY R, COHEN-OR D, LISCHINSKI D. Colorization by example[C]// The 16th Eurographics Conference on Rendering Techniques. Goslar: Eurographics Association, 2005: 201-210. |
[12] | CHIA A Y S, ZHUO S J, GUPTA R K, et al. Semantic colorization with internet images[J]. ACM Transactions on Graphics (TOG), 2011, 30(6): 1-8. |
[13] | GUPTA R K, CHIA A Y S, RAJAN D, et al. Image colorization using similar images[C]// The 20th ACM International Conference on Multimedia. New York: ACM, 2012: 369-378. |
[14] | CHENG Z Z, YANG Q X, SHENG B. Deep colorization[C]// 2015 IEEE International Conference on Computer Vision. New York: IEEE Press, 2015: 415-423. |
[15] | ZHANG R, ISOLA P, EFROS A A. Colorful image colorization[C]// The 14th European Conference on Computer Vision. Cham: Springer, 2016: 649-666. |
[16] | ZHAO J J, HAN J G, SHAO L, et al. Pixelated semantic colorization[J]. International Journal of Computer Vision, 2020, 128(4): 818-834. |
[17] | GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial networks[J]. Communications of the ACM, 2020, 63(11): 139-144. |
[18] | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// The 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010. |
[19] | HO J, JAIN A, ABBEEL P. Denoising diffusion probabilistic models[C]// The 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 574. |
[20] | WU Y Z, WANG X T, LI Y, et al. Towards vivid and diverse image colorization with generative color prior[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 14357-14366. |
[21] | KIM G, KANG K, KIM S, et al. BigColor: colorization using a generative color prior for natural images[C]// The 17th European Conference on Computer Vision. Cham: Springer, 2022: 350-366. |
[22] | WANG Y, XIA M H, QI L, et al. PalGAN: image colorization with palette generative adversarial networks[C]// The 17th European Conference on Computer Vision. Cham: Springer, 2022: 271-288. |
[23] | HUANG Z T, ZHAO N X, LIAO J. UniColor: a unified framework for multi-modal colorization with transformer[J]. ACM Transactions on Graphics (TOG), 2022, 41(6): 205. |
[24] | KANG X Y, YANG T, OUYANG W Q, et al. DDColor: towards photo-realistic image colorization via dual decoders[C]// 2023 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2023: 328-338. |
[25] | ZHANG R, ZHU J Y, ISOLA P, et al. Real-time user-guided image colorization with learned deep priors[J]. ACM Transactions on Graphics (TOG), 2017, 36(4): 119. |
[26] | YUN J, LEE S, PARK M, et al. iColoriT: towards propagating local hints to the right region in interactive colorization by leveraging vision transformer[C]// 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2023: 1787-1796. |
[27] | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale[EB/OL]. [2024-05-01]. https://dblp.org/db/conf/iclr/iclr2021.html#DosovitskiyB0WZ21. |
[28] | BAHNG H, YOO S, CHO W, et al. Coloring with words: Guiding image colorization through text-based palette generation[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 443-459. |
[29] | WENG S C, WU H, CHANG Z, et al. L-code: language-based colorization using color-object decoupled conditions[C]// The 36th AAAI Conference on Artificial Intelligence. Washington: AAAI, 2022: 2677-2684. |
[30] | CHANG Z, WENG S C, LI Y, et al. L-CoDer: language-based colorization with color-object decoupling transformer[C]// The 17th European Conference on Computer Vision. Cham: Springer, 2022: 360-375. |
[31] | HE M M, CHEN D D, LIAO J, et al. Deep exemplar-based colorization[J]. ACM Transactions on Graphics (TOG), 2018, 37(4): 47. |
[32] | CONG X Y, WU Y, CHEN Q F, et al. Automatic controllable colorization via imagination[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 2609-2619. |
[33] | LIANG Z X, LI Z C, ZHOU S C, et al. Control color: multimodal diffusion-based interactive image colorization[EB/OL]. [2024-05-07]. https://arxiv.org/abs/2402.10855. |
[34] | ZHANG L M, RAO A, AGRAWALA M. Adding conditional control to text-to-image diffusion models[C]// 2023 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2023: 3813-3824. |
[35] | AKSOY Y, OH T H, PARIS S, et al. Semantic soft segmentation[J]. ACM Transactions on Graphics (TOG), 2018, 37(4): 72. |
[36] | KIRILLOV A, MINTUN E, RAVI N, et al. Segment anything[C]// 2023 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2023: 3992-4003. |