Journal of Graphics ›› 2023, Vol. 44 ›› Issue (1): 166-176. DOI: 10.11996/JG.j.2095-302X.2023010166
FAN Zhen, LIU Xiao-jing, LI Xiao-bo, CUI Ya-chao
Received:
2022-06-16
Revised:
2022-07-20
Online:
2023-10-31
Published:
2023-02-16
Contact:
LIU Xiao-jing
About author:
FAN Zhen (1998-), master student. His main research interests cover computer vision and artificial intelligence. E-mail:772591989@qq.com
Supported by:
Abstract:
Homography estimation is a fundamental task in computer vision. To improve its robustness to illumination changes and occlusion, an unsupervised homography estimation model was proposed that takes two stacked images as input and outputs the estimated homography matrix. First, a bidirectional homography estimation average photometric loss was proposed. Then, to enlarge the receptive field and strengthen the network against deformation and positional changes, a spatial transformer network (STN) module and deformable convolution were introduced into the network. Finally, by inserting random occlusion shapes, the occlusion factor was introduced into the synthetic dataset for the homography estimation task for the first time, making the trained model robust to occlusion. Compared with traditional methods, the proposed method achieves comparable or better accuracy and performs better when estimating the homography of image pairs with low texture or large illumination changes; compared with learning-based homography estimation methods, it is robust to occlusion and performs better on real datasets.
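The bidirectional average photometric loss described in the abstract can be sketched as follows. This is a minimal illustration, assuming the two warped images have already been produced by the network's spatial transformer from the forward and backward homographies; the function names are ours, not the paper's:

```python
import numpy as np

def photometric_loss(warped, target):
    # Mean absolute photometric error between a warped image and its target.
    return np.mean(np.abs(warped.astype(np.float64) - target.astype(np.float64)))

def bidirectional_average_photometric_loss(a_warped_to_b, img_b, b_warped_to_a, img_a):
    # Average of the forward (A -> B) and backward (B -> A) photometric
    # errors; penalizing both directions is what makes the loss bidirectional.
    return 0.5 * (photometric_loss(a_warped_to_b, img_b) +
                  photometric_loss(b_warped_to_a, img_a))
```

Averaging the two directions prevents the network from overfitting to whichever image of the pair happens to be the warping source.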
FAN Zhen, LIU Xiao-jing, LI Xiao-bo, CUI Ya-chao. A homography estimation method robust to illumination and occlusion[J]. Journal of Graphics, 2023, 44(1): 166-176.
Fig. 3 Examples of different convolutions ((a) Ordinary convolution; (b) General deformable convolution; (c) Dilated convolution; (d) Special deformable convolution)
Fig. 4 S-COCO dataset generation algorithm ((a) Randomly extract a square patch, Patch A, from an image; (b) Randomly perturb the 4 corners of the square; (c) Compute HAB from the perturbations (Δxi, Δyi) obtained in step (b); (d) Compute the inverse of HAB, apply it to the whole image, and extract a square patch of the same size at the same location)
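Steps (b) and (c) of the generation algorithm amount to perturbing four corners and solving the direct linear transform (DLT) for the homography that maps the original corners onto the perturbed ones. A sketch under assumed patch size (128) and perturbation range (±32), which are illustrative, not the paper's exact settings:

```python
import numpy as np

def homography_from_corners(src, dst):
    # Solve the 8-DoF DLT linear system for the homography H with H33 = 1
    # that maps each src corner (x, y) onto the corresponding dst corner (u, v).
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

rng = np.random.default_rng(0)
# Steps (a)-(b): corners of a square patch, then random perturbation.
corners = np.array([[0, 0], [127, 0], [127, 127], [0, 127]], float)
perturbed = corners + rng.uniform(-32, 32, size=(4, 2))
# Step (c): HAB maps the original corners onto the perturbed ones.
H_AB = homography_from_corners(corners, perturbed)
```

Step (d) then applies the inverse of `H_AB` to the full image before cropping the second patch, so the pair is related by a known ground-truth homography.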
Fig. 5 Examples of manually inserted random occlusion shapes (an occlusion shape is inserted into each image of the image pair)
Fig. 6 Schematic diagram of the occlusion shape insertion strategy ((a) Image pairs generated by the original dataset generation algorithm; (b) Image pairs generated after adding the random occlusion insertion strategy; (c) Detailed process of the occlusion insertion strategy)
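The occlusion insertion strategy of Figs. 5-6 can be approximated as painting a random occluder into each image of a pair. The sketch below uses a rectangle of a random gray level; the paper inserts random occlusion *shapes*, so the rectangle, the size cap, and the fill choice here are illustrative assumptions:

```python
import numpy as np

def insert_random_occlusion(img, rng, max_frac=0.3):
    # Paint one randomly placed, randomly sized rectangle of a random
    # non-zero gray level over a copy of the image to simulate an occluder.
    h, w = img.shape[:2]
    oh = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    ow = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
    y = int(rng.integers(0, h - oh + 1))
    x = int(rng.integers(0, w - ow + 1))
    out = img.copy()
    out[y:y + oh, x:x + ow] = int(rng.integers(1, 256))
    return out
```

Because the occluder appears in only one image of each pair at that location, the photometric loss there is irreducible, which is what forces the trained model to down-weight occluded regions.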
Table 1 Comparison of PDSO-COCO with other synthetic datasets

| Factor | S-COCO | PDS-COCO | PDSO-COCO |
| --- | --- | --- | --- |
| Illumination | × | √ | √ |
| Noise | × | √ | √ |
| Displacement | √ | √ | √ |
| Parallax | × | √ | √ |
| Occlusion | × | × | √ |
Fig. 7 Examples from the real dataset ((a) Different parallax; (b) Different displacement degrees; (c) Different scenes)
Table 2 RMSE of each model on the WarpedMS-COCO dataset

| Rank | SIFT+RANSAC | PFNet | HomographyNet | CAUDHEN | UDHEN | Ours |
| --- | --- | --- | --- | --- | --- | --- |
| Top 0-30% | 0.533 | 2.013 | 3.277 | 14.867 | 2.227 | 2.243 |
| 31%-60% | 1.174 | 3.768 | 4.919 | 18.066 | 3.361 | 2.671 |
| 61%-100% | 19.017 | 5.437 | 7.688 | 23.421 | 6.374 | 3.095 |
| Mean | 9.738 | 3.857 | 5.673 | 18.798 | 4.176 | 2.781 |
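The RMSE reported in Table 2 compares where the patch corners land under the estimated versus the ground-truth homography. A sketch under the common four-corner definition (the function name and the exact averaging convention are our assumptions, not confirmed by the paper):

```python
import numpy as np

def corner_rmse(h_est, h_gt, corners):
    # Root-mean-square distance between the corners warped by the
    # estimated homography and by the ground-truth homography.
    def warp(H, pts):
        p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
        return p[:, :2] / p[:, 2:3]  # dehomogenize
    d = warp(h_est, corners) - warp(h_gt, corners)
    return float(np.sqrt((d ** 2).sum(axis=1).mean()))
```

Measuring error in corner space rather than on the 8 matrix entries keeps the metric in pixels and comparable across methods.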
Fig. 8 Illustration of the overlap rate of the images used for stitching ((a) Image pairs with very low overlap; (b) Image pairs with relatively high overlap)
Table 3 Laplacian of the images stitched by each model on the real dataset

| Rank | SIFT+RANSAC | PFNet | HomographyNet | CAUDHEN | UDHEN | Ours |
| --- | --- | --- | --- | --- | --- | --- |
| Top 0-30% | 1133.175 | 962.593 | 898.766 | - | 933.278 | 1074.563 |
| 31%-60% | 721.158 | 654.946 | 578.645 | - | 664.295 | 698.279 |
| 61%-100% | 425.337 | 392.551 | 367.527 | - | 381.527 | 474.325 |
| Mean | 724.475 | 645.341 | 590.234 | - | 632.784 | 715.443 |
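Tables 3, 5 and 7 score stitched images with the Laplacian, a standard no-reference sharpness proxy: misaligned stitching blurs or ghosts the overlap, lowering the Laplacian response. A minimal numpy sketch using the variance of the 3×3 Laplacian (one common variant; the paper's exact kernel and statistic are not specified here):

```python
import numpy as np

def laplacian_sharpness(gray):
    # Variance of the 3x3 four-neighbour Laplacian response on the interior
    # pixels; higher values indicate a sharper (better aligned) image.
    g = np.asarray(gray, dtype=np.float64)
    lap = (-4.0 * g[1:-1, 1:-1]
           + g[:-2, 1:-1] + g[2:, 1:-1]
           + g[1:-1, :-2] + g[1:-1, 2:])
    return float(lap.var())
```

A perfectly flat image scores zero, while high-contrast detail (e.g. a checkerboard) scores high, which is why the sharper stitches in these tables have the larger values.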
Table 4 RMSE of the model on the WarpedMS-COCO dataset when different loss functions are backpropagated

| Rank | Bidirectional average photometric loss | Ordinary photometric loss |
| --- | --- | --- |
| Top 0-30% | 2.243 | 2.218 |
| 31%-60% | 2.671 | 3.074 |
| 61%-100% | 3.095 | 5.983 |
| Mean | 2.781 | 4.056 |
Table 5 Laplacian of the images stitched using the homography estimated on the real dataset when different loss functions are backpropagated

| Rank | Bidirectional average photometric loss | Ordinary photometric loss |
| --- | --- | --- |
| Top 0-30% | 1074.563 | 974.263 |
| 31%-60% | 698.279 | 652.379 |
| 61%-100% | 474.325 | 399.281 |
| Mean | 715.443 | 649.883 |
Fig. 9 Visual comparison of image stitching quality ((a) Low texture; (b) Night; (c) Day; (d) Large parallax; (e) Repeated texture)
Table 6 RMSE of the model on the WarpedMS-COCO dataset with and without STN and deformable convolution

| Rank | With STN and deformable convolution | Without STN and deformable convolution |
| --- | --- | --- |
| Top 0-30% | 2.243 | 2.219 |
| 31%-60% | 2.671 | 3.278 |
| 61%-100% | 3.095 | 6.221 |
| Mean | 2.781 | 4.109 |
Table 7 Laplacian of the images stitched using the homography estimated on the real dataset with and without STN and deformable convolution

| Rank | With STN and deformable convolution | Without STN and deformable convolution |
| --- | --- | --- |
| Top 0-30% | 1074.563 | 946.674 |
| 31%-60% | 698.279 | 668.151 |
| 61%-100% | 474.325 | 392.463 |
| Mean | 715.443 | 647.386 |
Fig. 10 Visual comparison of image stitching quality ((a) Low texture; (b) Night; (c) Day; (d) Large parallax; (e) Repeated texture)