Journal of Graphics ›› 2025, Vol. 46 ›› Issue (6): 1316-1326. DOI: 10.11996/JG.j.2095-302X.2025061316
• Image Processing and Computer Vision •
Received: 2025-02-12
Accepted: 2025-04-25
Online: 2025-12-30
Published: 2025-12-27
Contact: LU Peng
About the first author: CAO Lujing (2000-), master's student. Her main research interests cover computer vision and video colorization. E-mail: Una@bupt.edu.cn
CAO Lujing, LU Peng. A video colorization method based on multiple reference images[J]. Journal of Graphics, 2025, 46(6): 1316-1326.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2025061316
Fig. 5 Typical scenarios and examples of lighting conditions ((a) Typical scenes under different themes; (b) Samples under different lighting conditions)
| Method | PSNR | LPIPS | FID | SSIM | CF | tOF | tLP |
|---|---|---|---|---|---|---|---|
| Ref. [ ] | 20.90 | 0.192 | 28.88 | 0.855 | 82.43 | 0.6822 | 0.7291 |
| Ref. [ ] | 18.51 | 0.354 | 33.45 | 0.610 | 83.06 | 0.1765 | 0.7488 |
| Ref. [ ] | 17.69 | 0.494 | 227.50 | 0.547 | 74.29 | 1.0483 | 7.2652 |
| Ref. [ ] | 27.71 | 0.143 | 47.60 | 0.896 | 84.24 | 0.1581 | 0.6880 |
| Ours | 30.27 | 0.071 | 24.75 | 0.894 | 86.11 | 0.0692 | 0.5354 |
Table 1 Comparison with the state-of-the-art methods
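For readers reproducing Table 1, the sketch below shows how two of its columns can be computed: PSNR and the Hasler-Süsstrunk colorfulness measure (the CF column, reference [35]). This is a minimal NumPy illustration assuming 8-bit RGB frames, not the paper's actual evaluation code; the function names are our own.

```python
import numpy as np

def psnr(ref: np.ndarray, out: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio between ground-truth and colorized frames."""
    mse = np.mean((ref.astype(np.float64) - out.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def colorfulness(img: np.ndarray) -> float:
    """Hasler-Süsstrunk colorfulness (the CF column); img in RGB channel order."""
    r, g, b = (img[..., c].astype(np.float64) for c in range(3))
    rg = r - g              # red-green opponent channel
    yb = 0.5 * (r + g) - b  # yellow-blue opponent channel
    std_root = np.hypot(rg.std(), yb.std())
    mean_root = np.hypot(rg.mean(), yb.mean())
    return std_root + 0.3 * mean_root
```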
Fig. 6 Experimental comparison and qualitative analysis of different methods in multiple types of video clips ((a) Reference image; (b) Grayscale video frames; (c) Results of reference [24]; (d) Results of reference [37]; (e) Results of reference [9]; (f) Results of reference [10]; (g) Ours)
| Method | Ref. [ ] | Ref. [ ] | Ref. [ ] | Ref. [ ] | Ours |
|---|---|---|---|---|---|
| Ref. [ ] | - | 1.48×10⁻³ | 1.37×10⁻¹⁰ | 6.98×10⁻⁶ | 7.56×10⁻³¹ |
| Ref. [ ] | - | - | 4.79×10⁻²⁵ | 2.01×10⁻¹⁷ | 1.89×10⁻⁵³ |
| Ref. [ ] | - | - | - | 7.20×10⁻¹ | 6.25×10⁻⁶ |
| Ref. [ ] | - | - | - | - | 1.18×10⁻¹⁰ |
| Ours | - | - | - | - | - |
Table 2 Dunn test results
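Table 2's pairwise p-values come from Dunn's post-hoc test. Below is a minimal sketch of how such a matrix can be produced, assuming per-clip rating scores for the five methods; the data is synthetic and the scikit-posthocs call is a standard implementation, not necessarily the authors' tooling.

```python
import numpy as np
from scipy import stats
import scikit_posthocs as sp

# Hypothetical per-clip rating scores, one array per method
# (four compared references + ours); real data would come from raters.
rng = np.random.default_rng(0)
scores = [rng.normal(loc=m, scale=1.0, size=30)
          for m in (3.2, 3.5, 2.8, 3.9, 4.4)]

# Omnibus Kruskal-Wallis test: does at least one method differ?
h_stat, p_omnibus = stats.kruskal(*scores)
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_omnibus:.2e}")

# Dunn's post-hoc test yields the pairwise p-value matrix, as in Table 2.
p_matrix = sp.posthoc_dunn(scores, p_adjust="bonferroni")
print(p_matrix)
```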
| Method | PSNR | LPIPS | FID | SSIM |
|---|---|---|---|---|
| w/o temporal color feature recommendation module | 19.99 | 0.290 | 70.51 | 0.877 |
| w/o color fusion network | 19.48 | 0.325 | 84.35 | 0.880 |
| Ours | 28.62 | 0.084 | 27.11 | 0.894 |
Table 3 Ablation experiments on the multi-reference image video colorization model
| L1 loss | LS-GAN loss | Cycle-consistency loss | TV regularization loss | PSNR | LPIPS | FID | SSIM |
|---|---|---|---|---|---|---|---|
|  | √ | √ | √ | 12.92 | 0.560 | 86.69 | 0.335 |
| √ |  | √ | √ | 14.17 | 0.459 | 59.72 | 0.342 |
| √ | √ |  | √ | 15.27 | 0.294 | 49.58 | 0.347 |
| √ | √ | √ |  | 15.86 | 0.221 | 47.52 | 0.347 |
| √ | √ | √ | √ | 28.62 | 0.084 | 27.11 | 0.894 |
Table 4 Loss function ablation experiment
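Table 4 ablates the four terms of the training objective: L1, LS-GAN [28], cycle-consistency, and TV regularization. The PyTorch sketch below shows how such a weighted combination is typically assembled; the weights, tensor names, and signatures are illustrative assumptions, not the paper's actual hyperparameters or code.

```python
import torch
import torch.nn.functional as F

def tv_loss(x: torch.Tensor) -> torch.Tensor:
    """Total-variation regularizer: penalizes abrupt spatial color changes."""
    dh = (x[..., :, 1:] - x[..., :, :-1]).abs().mean()
    dv = (x[..., 1:, :] - x[..., :-1, :]).abs().mean()
    return dh + dv

def lsgan_generator_loss(d_fake: torch.Tensor) -> torch.Tensor:
    """Least-squares GAN generator term [28]: push D(fake) toward 1."""
    return F.mse_loss(d_fake, torch.ones_like(d_fake))

def total_loss(pred, target, d_fake, cycle_pred, cycle_target,
               w_l1=1.0, w_gan=0.1, w_cyc=1.0, w_tv=1e-4):
    """Weighted sum of the four Table 4 terms; weights are illustrative."""
    return (w_l1 * F.l1_loss(pred, target)
            + w_gan * lsgan_generator_loss(d_fake)
            + w_cyc * F.l1_loss(cycle_pred, cycle_target)
            + w_tv * tv_loss(pred))
```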
References
[1] WU Y Z, WANG X T, LI Y, et al. Towards vivid and diverse image colorization with generative color prior[C]// 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2021: 14357-14366.
[2] PAN X G, ZHAN X H, DAI B, et al. Exploiting deep generative prior for versatile image restoration and manipulation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(11): 7474-7489.
[3] CHENG Z Z, YANG Q X, SHENG B. Colorization using neural network ensemble[J]. IEEE Transactions on Image Processing, 2017, 26(11): 5491-5505.
[4] REINHARD E, ADHIKHMIN M, GOOCH B, et al. Color transfer between images[J]. IEEE Computer Graphics and Applications, 2001, 21(5): 34-41.
[5] IRONY R, COHEN-OR D, LISCHINSKI D. Colorization by example[C]// The 16th Eurographics Conference on Rendering Techniques. Aire-la-Ville: Eurographics Association Press, 2005: 201-210.
[6] LEI C Y, CHEN Q F. Fully automatic video colorization with self-regularization and diversity[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 3748-3756.
[7] LIU Y M, COHEN M, UYTTENDAELE M, et al. AutoStyle: automatic style transfer from image collections to users’ images[J]. Computer Graphics Forum, 2014, 33(4): 21-31.
[8] KHAN A, JIANG L, LI W, et al. Fast color transfer from multiple images[J]. Applied Mathematics-A Journal of Chinese Universities, 2017, 32(2): 183-200.
[9] WANG H Z, ZHAI D M, LIU X M, et al. Unsupervised deep exemplar colorization via pyramid dual non-local attention[J]. IEEE Transactions on Image Processing, 2023, 32: 4114-4127.
[10] YANG Y X, PAN J S, PENG Z Z, et al. BiSTNet: semantic image prior guided bidirectional temporal feature fusion for deep exemplar-based video colorization[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46(8): 5612-5624.
[11] SINGH A, CHANANI A, KARNICK H. Video colorization using CNNs and keyframes extraction: an application in saving bandwidth[C]// The 4th International Conference on Computer Vision and Image Processing. Cham: Springer, 2020: 190-198.
[12] MEYER S, CORNILLÈRE V, DJELOUAH A, et al. Deep video color propagation[EB/OL]. [2024-08-12]. http://bmvc2018.org/contents/papers/0521.pdf.
[13] YAO C H, CHANG C Y, CHIEN S Y. Occlusion-aware video temporal consistency[C]// The 25th ACM International Conference on Multimedia. New York: ACM, 2017: 777-785.
[14] BONNEEL N, TOMPKIN J, SUNKAVALLI K, et al. Blind video temporal consistency[J]. ACM Transactions on Graphics (TOG), 2015, 34(6): 196.
[15] VONDRICK C, SHRIVASTAVA A, FATHI A, et al. Tracking emerges by colorizing videos[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 402-419.
[16] ILG E, MAYER N, SAIKIA T, et al. FlowNet 2.0: evolution of optical flow estimation with deep networks[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 1647-1655.
[17] JAMPANI V, GADDE R, GEHLER P V. Video propagation networks[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 3154-3164.
[18] PAUL S, BHATTACHARYA S, GUPTA S. Spatiotemporal colorization of video using 3D steerable pyramids[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2017, 27(8): 1605-1619.
[19] WU R Z, LIN H J, QI X J, et al. Memory selection network for video propagation[C]// The 16th European Conference on Computer Vision. Cham: Springer, 2020: 175-190.
[20] EILERTSEN G, MANTIUK R K, UNGER J. Single-frame regularization for temporally stable CNNs[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 11168-11177.
[21] LAI W S, HUANG J B, WANG O, et al. Learning blind video temporal consistency[C]// The 15th European Conference on Computer Vision. Cham: Springer, 2018: 179-195.
[22] SHI X J, CHEN Z R, WANG H, et al. Convolutional LSTM network: a machine learning approach for precipitation nowcasting[C]// The 29th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015: 802-810.
[23] LEI C Y, XING Y Z, CHEN Q S. Blind video temporal consistency via deep video prior[C]// The 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 92.
[24] ZHANG B, HE M M, LIAO J, et al. Deep exemplar-based video colorization[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 8044-8053.
[25] ZHAO Y Z, PO L M, LIU K C, et al. SVCNet: scribble-based video colorization network with temporal aggregation[J]. IEEE Transactions on Image Processing, 2023, 32: 4443-4458.
[26] HORN B K P, SCHUNCK B G. Determining optical flow[J]. Artificial Intelligence, 1981, 17(1/3): 185-203.
[27] ZHANG H, XU T, LI H S, et al. StackGAN++: realistic image synthesis with stacked generative adversarial networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(8): 1947-1962.
[28] MAO X D, LI Q, XIE H R, et al. Least squares generative adversarial networks[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 2813-2821.
[29] WANG X L, JABRI A, EFROS A A. Learning correspondence from the cycle-consistency of time[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 2561-2571.
[30] SANGKLOY P, LU J W, FANG C, et al. Scribbler: controlling deep image synthesis with sketch and color[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 6836-6845.
[31] MARSZALEK M, LAPTEV I, SCHMID C. Actions in context[C]// 2009 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2009: 2929-2936.
[32] WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity[J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612.
[33] ZHANG R, ISOLA P, EFROS A A, et al. The unreasonable effectiveness of deep features as a perceptual metric[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 586-595.
[34] HEUSEL M, RAMSAUER H, UNTERTHINER T, et al. GANs trained by a two time-scale update rule converge to a Nash equilibrium[EB/OL]. [2024-08-12]. https://arxiv.org/abs/1706.08500v1.
[35] HASLER D, SUESSTRUNK S E. Measuring colorfulness in natural images[C]// 2003 Human Vision and Electronic Imaging VIII. Bellingham: SPIE, 2003: 87-95.
[36] CHU M Y, THUEREY N. Data-driven synthesis of smoke flows with CNN-based feature descriptors[J]. ACM Transactions on Graphics (TOG), 2017, 36(4): 69.
[37] LU P, YU J B, PENG X J, et al. Gray2ColorNet: transfer more colors from reference image[C]// The 28th ACM International Conference on Multimedia. New York: ACM, 2020: 3210-3218.