图学学报 ›› 2025, Vol. 46 ›› Issue (1): 126-138.DOI: 10.11996/JG.j.2095-302X.2025010126
孟思弘1,2, 刘浩1,2, 方昊天1,2, 僧冰枫1,2, 杜正君1,2
收稿日期: 2024-07-01
接受日期: 2024-11-20
出版日期: 2025-02-28
发布日期: 2025-02-14
通讯作者: 杜正君(1989-),男,讲师,博士。主要研究方向为计算机图形学、图像视频处理。E-mail: dzj@qhu.edu.cn
第一作者: 孟思弘(1999-),女,硕士研究生。主要研究方向为图像视频处理。E-mail: meng_sihong@163.com
MENG Sihong1,2, LIU Hao1,2, FANG Haotian1,2, SENG Bingfeng1,2, DU Zhengjun1,2
Received: 2024-07-01
Accepted: 2024-11-20
Published: 2025-02-28
Online: 2025-02-14
Contact: DU Zhengjun (1989-), lecturer, Ph.D. His research interests include computer graphics and image/video processing. E-mail: dzj@qhu.edu.cn
First author: MENG Sihong (1999-), master's student. Her research interests include image and video processing. E-mail: meng_sihong@163.com
Abstract:
Image colorization aims to convert grayscale images into color images. The technique has long attracted wide attention from researchers in computer graphics and computer vision, and has been broadly applied in image restoration, medical imaging, film restoration, artistic creation, and many other fields, showing great practical potential. Over decades of development, researchers have proposed numerous interaction-based, rule-based, and deep-learning-based algorithms to improve colorization quality. Nevertheless, existing colorization algorithms still exhibit notable drawbacks, such as low computational efficiency, tedious interaction, low color saturation, and unavoidable color bleeding. To address these problems, an image colorization algorithm based on semantic similarity propagation is proposed. The algorithm first extracts semantic features of the input grayscale image with a deep neural network and constructs a feature space. It then formulates image colorization as an efficient energy optimization problem based on semantic similarity propagation: the chrominance values of the grayscale image are obtained by minimizing the energy function, which propagates the colors of user-provided strokes to the remaining regions of the image. In addition, trilinear interpolation is employed to accelerate the energy optimization and color propagation, greatly improving computational efficiency. To verify its effectiveness, the algorithm was evaluated on a collected image set from multiple perspectives, including visual quality, generated-image quality, running time, and user interaction experience. Extensive qualitative and quantitative results demonstrate that the algorithm achieves more accurate, efficient, and natural colorization with less user interaction.
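The core idea in the abstract, i.e., solving stroke-based colorization as a quadratic energy over feature-space similarities, can be sketched as follows. This is a minimal dense illustration, not the paper's implementation: the function name, the Gaussian affinity, and the parameters `sigma` and `lam` are assumptions, and the actual algorithm works on deep semantic features with trilinear-interpolation acceleration rather than a dense per-pixel solve.

```python
import numpy as np

def propagate_colors(features, stroke_mask, stroke_chroma, sigma=0.2, lam=1.0):
    """Propagate user-stroke chroma to all pixels by minimizing a quadratic
    energy: stroked pixels keep their given colors, and pixels with similar
    features are encouraged to receive similar colors.

    features:      (n, d) per-pixel feature vectors (e.g. semantics + luminance)
    stroke_mask:   (n,) bool, True where the user drew a stroke
    stroke_chroma: (n,) chroma value at stroked pixels (ignored elsewhere)
    """
    # Affinity from feature-space distance: similar pixels -> weight near 1.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    L = np.diag(W.sum(axis=1)) - W                  # graph Laplacian
    G = np.diag(stroke_mask.astype(float))          # data-term selector
    # Minimize ||G(c - g)||^2 + lam * c^T L c  =>  (G + lam*L) c = G g
    return np.linalg.solve(G + lam * L, G @ stroke_chroma)
```

With two feature clusters and one stroke per cluster, each stroke's chroma spreads to the pixels nearest it in feature space. The dense affinity here is O(n²), which is exactly why a real solver quantizes the feature space and interpolates instead.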
孟思弘, 刘浩, 方昊天, 僧冰枫, 杜正君. 基于语义相似性传播的图像彩色化[J]. 图学学报, 2025, 46(1): 126-138.
MENG Sihong, LIU Hao, FANG Haotian, SENG Bingfeng, DU Zhengjun. Image colorization via semantic similarity propagation[J]. Journal of Graphics, 2025, 46(1): 126-138.
图2 优化前后及添加亮度前后彩色化效果对比((a)输入图像及笔触;(b)初始语义可视化结果;(c)结合(b)的语义和亮度的彩色化结果;(d)优化后的语义可视化结果;(e)结合(d)的语义和亮度得到的彩色化结果;(f)仅根据(d)的语义得到的彩色化结果)
Fig. 2 Comparison of colorization results before and after optimization and before and after adding lightness ((a) Input image and strokes; (b) Initial semantic visualization results; (c) Results by combining the semantics and brightness of (b); (d) Optimized semantic visualization results; (e) Results obtained by combining the semantics and brightness of (d); (f) Results based only on the semantics of (d))
图4 维度的选择对彩色化结果以及运行时间的影响((a)输入图像及用户笔触;(b)二维;(c)三维;(d)五维)
Fig. 4 The effect of the choice of dimensions on our colorization results and running time ((a) Input image and strokes; (b) 2 dimensions; (c) 3 dimensions; (d) 5 dimensions)
图5 参数b对彩色化结果以及运行时间的影响((a)输入图像及用户笔触;(b) b=2;(c) b=4;(d) b=8)
Fig. 5 Effect of parameter b on our colorization results and running time ((a) Input images and strokes; (b) b=2; (c) b=4; (d) b=8)
图6 本文算法与其他编辑传播算法彩色化结果以及运行时间对比((a)输入图像及用户笔触;(b)文献[1];(c)本文算法)
Fig. 6 Comparison of colorization results and running time of ours with other edit propagation algorithms ((a) Input images and strokes; (b) Reference [1]; (c) Ours)
图7 本文算法与其他基于笔触的深度学习算法对比((a)输入图像及用户笔触;(b)文献[23];(c)文献[33];(d)文献[25];(e)本文算法)
Fig. 7 Comparison of our algorithm with other stroke-based deep learning algorithms ((a) Input images and strokes; (b) Reference [23]; (c) Reference [33]; (d) Reference [25]; (e) Ours)
图8 与现有无需涂鸦的深度学习算法对比((a)输入图像;(b)文献[15];(c)文献[24];(d)文献[23];(e)本文算法)
Fig. 8 Comparison of our algorithm with other scribble-free deep learning algorithms ((a) Input images; (b) Reference [15]; (c) Reference [24]; (d) Reference [23]; (e) Ours)
图9 本算法与其他彩色化算法的定量对比((a)输入图像;(b)文献[15];(c)文献[25];(d)文献[24];(e)本文算法)
Fig. 9 Quantitative comparison of our algorithm with other colorization algorithms ((a) Input images; (b) Reference [15]; (c) Reference [25]; (d) Reference [24]; (e) Ours)
图10 验证语义特征有效性的消融实验((a)输入图像;(b)仅亮度;(c)仅语义信息;(d)语义+亮度)
Fig. 10 Ablation study to verify the effectiveness of semantic feature ((a) Input images; (b) Luminance only; (c) Semantics only; (d) Semantics and luminance)
| Resolution | Example | Before acceleration/s | After acceleration/s |
|---|---|---|---|
| 90×60 | Bedroom | 19.14 | 0.02 |
| | Back view | 18.97 | 0.03 |
| | Countryside | 17.82 | 0.03 |
| | Treehouse | 19.64 | 0.03 |
| | Seaside | 18.74 | 0.03 |
| | Average | 18.86 | 0.03 |
| 128×85 | Bedroom | 150.89 | 0.03 |
| | Back view | 143.87 | 0.02 |
| | Countryside | 155.31 | 0.03 |
| | Treehouse | 154.87 | 0.03 |
| | Seaside | 165.28 | 0.02 |
| | Average | 154.04 | 0.03 |
| 180×120 | Bedroom | 1207.19 | 0.04 |
| | Back view | 1195.92 | 0.05 |
| | Countryside | 1183.51 | 0.04 |
| | Treehouse | 1164.31 | 0.05 |
| | Seaside | 1169.94 | 0.04 |
| | Average | 1184.17 | 0.04 |
| 256×171 | Bedroom | 10398.67 | 0.10 |
| | Back view | 10274.71 | 0.07 |
| | Countryside | 10904.38 | 0.09 |
| | Treehouse | 10566.61 | 0.12 |
| | Seaside | 10568.34 | 0.09 |
| | Average | 10542.54 | 0.09 |
表1 使用加速策略前后的运行时间对比
Table 1 Running time before and after accelerating
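The speedups in Table 1 come from avoiding a per-pixel energy solve: per the abstract, chroma is solved on a coarse grid over the feature space and read back per pixel with trilinear interpolation. Below is a minimal sketch of that read-back step only; the function name and the assumption of a 3D feature grid with non-negative coordinates are illustrative, not the paper's code.

```python
import numpy as np

def trilinear_sample(grid, pts):
    """Sample a 3D grid of solved chroma values at continuous feature
    coordinates by blending the 8 surrounding lattice vertices.

    grid: (X, Y, Z) array of values at integer lattice vertices
    pts:  (n, 3) coordinates in [0, X-1] x [0, Y-1] x [0, Z-1]
    """
    lo = np.floor(pts).astype(int)
    # Clamp so the upper corner lo+1 stays inside the grid.
    lo = np.clip(lo, 0, np.array(grid.shape) - 2)
    t = pts - lo                        # fractional offsets in [0, 1]
    out = np.zeros(len(pts))
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                # Weight of this corner: product of per-axis lerp factors.
                w = (np.where(dx, t[:, 0], 1 - t[:, 0]) *
                     np.where(dy, t[:, 1], 1 - t[:, 1]) *
                     np.where(dz, t[:, 2], 1 - t[:, 2]))
                out += w * grid[lo[:, 0] + dx, lo[:, 1] + dy, lo[:, 2] + dz]
    return out
```

Because each pixel touches only 8 grid vertices, the per-pixel cost is constant, which is consistent with the near-flat "after acceleration" column in Table 1.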
| Example | Resolution | Running time/s |
|---|---|---|
| Bedroom | 1280×853 | 0.58 |
| Back view | 1280×853 | 0.53 |
| Countryside | 1280×853 | 0.40 |
| Kitten | 300×199 | 0.03 |
| Cattle | 1280×853 | 0.49 |
| Rose | 427×285 | 0.06 |
| Walnut | 400×276 | 0.05 |
| Banana | 640×960 | 0.24 |
| Flower | 853×1280 | 0.39 |
| Biscuit | 848×1280 | 0.44 |
| Ceramics | 1280×850 | 0.41 |
| Capsule | 1280×853 | 0.38 |
| Treehouse | 1280×853 | 0.44 |
| Moon | 853×1280 | 0.39 |
| Seaside | 1280×853 | 0.42 |
| Field | 1280×853 | 0.49 |
| Yellow flower | 853×1280 | 0.40 |
| Average | | 0.34 |
表2 所有示例的彩色化运行时间
Table 2 Running time of all examples
图11 失败案例((a)输入图像;(b)初始语义;(c)优化后语义;(d)彩色化结果)
Fig. 11 Failure case ((a) Input image; (b) Initial semantics; (c) Optimized semantics; (d) Colorization results)
[1] | LEVIN A, LISCHINSKI D, WEISS Y. Colorization using optimization[C]// ACM SIGGRAPH 2004 Papers. New York: ACM, 2004: 689-694. |
[2] | HUANG Y C, TUNG Y S, CHEN J C, et al. An adaptive edge detection based colorization algorithm and its applications[C]// The 13th Annual ACM International Conference on Multimedia. New York: ACM, 2005: 351-354. |
[3] | YATZIV L, SAPIRO G. Fast image and video colorization using chrominance blending[J]. IEEE Transactions on Image Processing, 2006, 15(5): 1120-1129. |
[4] | HEU J H, HYUN D Y, KIM C S, et al. Image and video colorization based on prioritized source propagation[C]// The 2009 16th IEEE International Conference on Image Processing. New York: IEEE Press, 2009: 465-468. |
[5] | QU Y G, WONG T T, HENG P A. Manga colorization[J]. ACM Transactions on Graphics (TOG), 2006, 25(3): 1214-1220. |
[6] | LUAN Q, WEN F, COHEN-OR D, et al. Natural image colorization[C]// The 18th Eurographics Conference on Rendering Techniques. Goslar: Eurographics Association, 2007: 309-320. |
[7] | 王泽文, 张小明. 基于p-Laplace方程的图像彩色化方法[J]. 工程图学学报, 2010, 31(6): 62-67. WANG Z W, ZHANG X M. Image colorization based on p-Laplace equation[J]. Journal of Engineering Graphics, 2010, 31(6): 62-67 (in Chinese). |
[8] | 田建勇, 石林江. 局部线性嵌入与模糊C-均值聚类的红外图像彩色化算法[J]. 图学学报, 2018, 39(5): 917-925. TIAN J Y, SHI L J. An infrared image colorization algorithm based on local linear embedding and fuzzy C-means clustering[J]. Journal of Graphics, 2018, 39(5): 917-925 (in Chinese). |
[9] | REINHARD E, ADHIKHMIN M, GOOCH B, et al. Color transfer between images[J]. IEEE Computer Graphics and Applications, 2001, 21(5): 34-41. |
[10] | WELSH T, ASHIKHMIN M, MUELLER K. Transferring color to greyscale images[C]// The 29th Annual Conference on Computer Graphics and Interactive Techniques. New York: ACM, 2002: 277-280. |
[11] | IRONY R, COHEN-OR D, LISCHINSKI D. Colorization by example[C]// The 16th Eurographics Conference on Rendering Techniques. Goslar: Eurographics Association, 2005: 201-210. |
[12] | CHIA A Y S, ZHUO S J, GUPTA R K, et al. Semantic colorization with internet images[J]. ACM Transactions on Graphics (TOG), 2011, 30(6): 1-8. |
[13] | GUPTA R K, CHIA A Y S, RAJAN D, et al. Image colorization using similar images[C]// The 20th ACM International Conference on Multimedia. New York: ACM, 2012: 369-378. |
[14] | CHENG Z Z, YANG Q X, SHENG B. Deep colorization[C]// 2015 IEEE International Conference on Computer Vision. New York: IEEE Press, 2015: 415-423. |
[15] | ZHANG R, ISOLA P, EFROS A A. Colorful image colorization[C]// The 14th European Conference on Computer Vision. Cham: Springer, 2016: 649-666. |
[16] | ZHAO J J, HAN J G, SHAO L, et al. Pixelated semantic colorization[J]. International Journal of Computer Vision, 2020, 128(4): 818-834. |
[17] | GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial networks[J]. Communications of the ACM, 2020, 63(11): 139-144. |
[18] | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// The 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010. |
[19] | HO J, JAIN A, ABBEEL P. Denoising diffusion probabilistic models[C]// The 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 574. |
[20] | WU Y Z, WANG X T, LI Y, et al. Towards vivid and diverse image colorization with generative color prior[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 14357-14366. |
[21] | KIM G, KANG K, KIM S, et al. BigColor: colorization using a generative color prior for natural images[C]// The 17th European Conference on Computer Vision. Cham: Springer, 2022: 350-366. |
[22] | WANG Y, XIA M H, QI L, et al. PalGAN: image colorization with palette generative adversarial networks[C]// The 17th European Conference on Computer Vision. Cham: Springer, 2022: 271-288. |
[23] | HUANG Z T, ZHAO N X, LIAO J. UniColor: a unified framework for multi-modal colorization with transformer[J]. ACM Transactions on Graphics (TOG), 2022, 41(6): 205. |
[24] | KANG X Y, YANG T, OUYANG W Q, et al. DDColor: towards photo-realistic image colorization via dual decoders[C]// 2023 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2023: 328-338. |
[25] | ZHANG R, ZHU J Y, ISOLA P, et al. Real-time user-guided image colorization with learned deep priors[J]. ACM Transactions on Graphics (TOG), 2017, 36(4): 119. |
[26] | YUN J, LEE S, PARK M, et al. iColoriT: towards propagating local hints to the right region in interactive colorization by leveraging vision transformer[C]// 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2023: 1787-1796. |
[27] | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale[EB/OL]. [2024-05-01]. https://dblp.org/db/conf/iclr/iclr2021.html#DosovitskiyB0WZ21. |
[28] | BAHNG H, YOO S, CHO W, et al. Coloring with words: Guiding image colorization through text-based palette generation[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 443-459. |
[29] | WENG S C, WU H, CHANG Z, et al. L-code: language-based colorization using color-object decoupled conditions[C]// The 36th AAAI Conference on Artificial Intelligence. Washington: AAAI, 2022: 2677-2684. |
[30] | CHANG Z, WENG S C, LI Y, et al. L-CoDer: language-based colorization with color-object decoupling transformer[C]// The 17th European Conference on Computer Vision. Cham: Springer, 2022: 360-375. |
[31] | HE M M, CHEN D D, LIAO J, et al. Deep exemplar-based colorization[J]. ACM Transactions on Graphics (TOG), 2018, 37(4): 47. |
[32] | CONG X Y, WU Y, CHEN Q F, et al. Automatic controllable colorization via imagination[C]// 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2024: 2609-2619. |
[33] | LIANG Z X, LI Z C, ZHOU S C, et al. Control color: multimodal diffusion-based interactive image colorization[EB/OL]. [2024-05-07]. https://arxiv.org/abs/2402.10855. |
[34] | ZHANG L M, RAO A, AGRAWALA M. Adding conditional control to text-to-image diffusion models[C]// 2023 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2023: 3813-3824. |
[35] | AKSOY Y, OH T H, PARIS S, et al. Semantic soft segmentation[J]. ACM Transactions on Graphics (TOG), 2018, 37(4): 72. |
[36] | KIRILLOV A, MINTUN E, RAVI N, et al. Segment anything[C]// 2023 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2023: 3992-4003. |