Journal of Graphics ›› 2023, Vol. 44 ›› Issue (1): 166-176. DOI: 10.11996/JG.j.2095-302X.2023010166
FAN Zhen, LIU Xiao-jing, LI Xiao-bo, CUI Ya-chao
Received:
2022-06-16
Revised:
2022-07-20
Online:
2023-10-31
Published:
2023-02-16
Contact:
LIU Xiao-jing
About author:
FAN Zhen (1998-), master student. His main research interests cover computer vision and artificial intelligence. E-mail:772591989@qq.com
Supported by:
Abstract:
Homography estimation is a fundamental task in computer vision. To improve its robustness to illumination changes and occlusion, an unsupervised homography estimation model was proposed that takes two stacked images as input and outputs the estimated homography matrix. First, a bidirectional homography estimation average photometric loss was proposed. Then, to enlarge the receptive field and strengthen the network against deformation and positional changes, a spatial transformer network (STN) module and deformable convolution were introduced into the network. Finally, by inserting random occlusion shapes, the occlusion factor was introduced into the synthetic dataset for the homography estimation task for the first time, making the trained model robust to occlusion. Compared with traditional methods, the proposed method achieves comparable or better accuracy and performs better when estimating the homography of image pairs with low texture or large illumination changes; compared with learning-based homography estimation methods, it is robust to occlusion and performs better on real datasets.
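The bidirectional average photometric loss described in the abstract can be sketched as follows. This is a minimal illustration, assuming the two warped images have already been produced by the network's spatial transformer from the forward and backward homographies; the function names are ours, not the paper's:

```python
import numpy as np

def photometric_loss(warped, target):
    # Mean absolute photometric error between a warped image and its target.
    return np.mean(np.abs(warped.astype(np.float64) - target.astype(np.float64)))

def bidirectional_average_photometric_loss(a_warped_to_b, img_b, b_warped_to_a, img_a):
    # Average of the forward (A -> B) and backward (B -> A) photometric
    # errors; penalizing both directions is what makes the loss bidirectional.
    return 0.5 * (photometric_loss(a_warped_to_b, img_b) +
                  photometric_loss(b_warped_to_a, img_a))
```

Averaging the two directions prevents the network from overfitting to whichever image of the pair happens to be the warping source.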
FAN Zhen, LIU Xiao-jing, LI Xiao-bo, CUI Ya-chao. A homography estimation method robust to illumination and occlusion[J]. Journal of Graphics, 2023, 44(1): 166-176.
Fig. 3 Examples of different convolutions ((a) Ordinary convolution; (b) General deformable convolution; (c) Dilated convolution; (d) Special deformable convolution)
Fig. 4 S-COCO dataset generation algorithm ((a) Randomly extract a square patch, Patch A, from an image; (b) Randomly perturb the 4 corners of the square; (c) Compute HAB from the perturbations (Δxi, Δyi) obtained in step (b); (d) Compute the inverse of HAB, apply it to the whole image, and extract a square patch of the same size at the same location)
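Steps (b) and (c) of the generation algorithm amount to perturbing four corners and solving the direct linear transform (DLT) for the homography that maps the original corners onto the perturbed ones. A sketch under assumed patch size (128) and perturbation range (±32), which are illustrative, not the paper's exact settings:

```python
import numpy as np

def homography_from_corners(src, dst):
    # Solve the 8-DoF DLT linear system for the homography H with H33 = 1
    # that maps each src corner (x, y) onto the corresponding dst corner (u, v).
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

rng = np.random.default_rng(0)
# Steps (a)-(b): corners of a square patch, then random perturbation.
corners = np.array([[0, 0], [127, 0], [127, 127], [0, 127]], float)
perturbed = corners + rng.uniform(-32, 32, size=(4, 2))
# Step (c): HAB maps the original corners onto the perturbed ones.
H_AB = homography_from_corners(corners, perturbed)
```

Step (d) then applies the inverse of `H_AB` to the full image before cropping the second patch, so the pair is related by a known ground-truth homography.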
Fig. 5 Examples of manually inserted random occlusion shapes (an occlusion shape is inserted into each image of the image pair)
Fig. 6 Schematic diagram of the occlusion shape insertion strategy ((a) Image pairs generated by the original dataset generation algorithm; (b) Image pairs generated after adding the random occlusion insertion strategy; (c) Detailed process of the occlusion insertion strategy)
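The occlusion insertion strategy of Figs. 5-6 can be approximated as painting a random occluder into each image of a pair. The sketch below uses a rectangle of a random gray level; the paper inserts random occlusion *shapes*, so the rectangle, the size cap, and the fill choice here are illustrative assumptions:

```python
import numpy as np

def insert_random_occlusion(img, rng, max_frac=0.3):
    # Paint one randomly placed, randomly sized rectangle of a random
    # non-zero gray level over a copy of the image to simulate an occluder.
    h, w = img.shape[:2]
    oh = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    ow = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
    y = int(rng.integers(0, h - oh + 1))
    x = int(rng.integers(0, w - ow + 1))
    out = img.copy()
    out[y:y + oh, x:x + ow] = int(rng.integers(1, 256))
    return out
```

Because the occluder appears in only one image of each pair at that location, the photometric loss there is irreducible, which is what forces the trained model to down-weight occluded regions.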
Table 1 Comparison of PDSO-COCO with other synthetic datasets

| Factor | S-COCO | PDS-COCO | PDSO-COCO |
| --- | --- | --- | --- |
| Illumination | × | √ | √ |
| Noise | × | √ | √ |
| Displacement | √ | √ | √ |
| Parallax | × | √ | √ |
| Occlusion | × | × | √ |
Fig. 7 Examples from the real dataset ((a) Different parallax; (b) Different displacement degrees; (c) Different scenes)
Table 2 RMSE of each model on the WarpedMS-COCO dataset

| Rank | SIFT+RANSAC | PFNet | HomographyNet | CAUDHEN | UDHEN | Ours |
| --- | --- | --- | --- | --- | --- | --- |
| Top 0-30% | 0.533 | 2.013 | 3.277 | 14.867 | 2.227 | 2.243 |
| 31%-60% | 1.174 | 3.768 | 4.919 | 18.066 | 3.361 | 2.671 |
| 61%-100% | 19.017 | 5.437 | 7.688 | 23.421 | 6.374 | 3.095 |
| Mean | 9.738 | 3.857 | 5.673 | 18.798 | 4.176 | 2.781 |
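The RMSE reported in Table 2 compares where the patch corners land under the estimated versus the ground-truth homography. A sketch under the common four-corner definition (the function name and the exact averaging convention are our assumptions, not confirmed by the paper):

```python
import numpy as np

def corner_rmse(h_est, h_gt, corners):
    # Root-mean-square distance between the corners warped by the
    # estimated homography and by the ground-truth homography.
    def warp(H, pts):
        p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
        return p[:, :2] / p[:, 2:3]  # dehomogenize
    d = warp(h_est, corners) - warp(h_gt, corners)
    return float(np.sqrt((d ** 2).sum(axis=1).mean()))
```

Measuring error in corner space rather than on the 8 matrix entries keeps the metric in pixels and comparable across methods.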
Fig. 8 Illustration of the overlap rate of the images used for stitching ((a) Image pairs with very low overlap; (b) Image pairs with relatively high overlap)
Table 3 Laplacian of the images stitched by each model on the real dataset

| Rank | SIFT+RANSAC | PFNet | HomographyNet | CAUDHEN | UDHEN | Ours |
| --- | --- | --- | --- | --- | --- | --- |
| Top 0-30% | 1133.175 | 962.593 | 898.766 | - | 933.278 | 1074.563 |
| 31%-60% | 721.158 | 654.946 | 578.645 | - | 664.295 | 698.279 |
| 61%-100% | 425.337 | 392.551 | 367.527 | - | 381.527 | 474.325 |
| Mean | 724.475 | 645.341 | 590.234 | - | 632.784 | 715.443 |
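Tables 3, 5 and 7 score stitched images with the Laplacian, a standard no-reference sharpness proxy: misaligned stitching blurs or ghosts the overlap, lowering the Laplacian response. A minimal numpy sketch using the variance of the 3×3 Laplacian (one common variant; the paper's exact kernel and statistic are not specified here):

```python
import numpy as np

def laplacian_sharpness(gray):
    # Variance of the 3x3 four-neighbour Laplacian response on the interior
    # pixels; higher values indicate a sharper (better aligned) image.
    g = np.asarray(gray, dtype=np.float64)
    lap = (-4.0 * g[1:-1, 1:-1]
           + g[:-2, 1:-1] + g[2:, 1:-1]
           + g[1:-1, :-2] + g[1:-1, 2:])
    return float(lap.var())
```

A perfectly flat image scores zero, while high-contrast detail (e.g. a checkerboard) scores high, which is why the sharper stitches in these tables have the larger values.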
Table 4 RMSE of the model on the WarpedMS-COCO dataset when different loss functions are backpropagated

| Rank | Bidirectional average photometric loss | Ordinary photometric loss |
| --- | --- | --- |
| Top 0-30% | 2.243 | 2.218 |
| 31%-60% | 2.671 | 3.074 |
| 61%-100% | 3.095 | 5.983 |
| Mean | 2.781 | 4.056 |
Table 5 Laplacian of the images stitched using the homography estimated on the real dataset when different loss functions are backpropagated

| Rank | Bidirectional average photometric loss | Ordinary photometric loss |
| --- | --- | --- |
| Top 0-30% | 1074.563 | 974.263 |
| 31%-60% | 698.279 | 652.379 |
| 61%-100% | 474.325 | 399.281 |
| Mean | 715.443 | 649.883 |
Fig. 9 Visual comparison of image stitching quality ((a) Low texture; (b) Night; (c) Day; (d) Large parallax; (e) Repeated texture)
Table 6 RMSE of the model on the WarpedMS-COCO dataset with and without STN and deformable convolution

| Rank | With STN and deformable convolution | Without STN and deformable convolution |
| --- | --- | --- |
| Top 0-30% | 2.243 | 2.219 |
| 31%-60% | 2.671 | 3.278 |
| 61%-100% | 3.095 | 6.221 |
| Mean | 2.781 | 4.109 |
Table 7 Laplacian of the images stitched using the homography estimated on the real dataset with and without STN and deformable convolution

| Rank | With STN and deformable convolution | Without STN and deformable convolution |
| --- | --- | --- |
| Top 0-30% | 1074.563 | 946.674 |
| 31%-60% | 698.279 | 668.151 |
| 61%-100% | 474.325 | 392.463 |
| Mean | 715.443 | 647.386 |
Fig. 10 Visual comparison of image stitching quality ((a) Low texture; (b) Night; (c) Day; (d) Large parallax; (e) Repeated texture)