Domain generalization based on data representation invariance

Journal of Graphics, 2024, Vol. 45, Issue (4): 705-713. DOI: 10.11996/JG.j.2095-302X.2024040705

• Image Processing and Computer Vision •

NI Yunhao¹,², HUANG Lei¹,²

1. Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
2. State Key Laboratory of Complex & Critical Software Environment, Beihang University, Beijing 100191, China

Received: 2024-02-27 Accepted: 2024-06-20 Published: 2024-08-31 Online: 2024-09-03
Contact: HUANG Lei (1987-), associate professor, Ph.D. His main research interests include machine learning and computer vision. E-mail: huangleiAI@buaa.edu.cn
First author: NI Yunhao (2002-), master's student. His main research interest is machine learning. E-mail: musicath@buaa.edu.cn
Supported by:
National Key Research and Development Plan of China (2021ZD0112901); National Natural Science Foundation of China (62106012); Fundamental Research Funds for the Central Universities

Abstract:

Domain generalization has become a prominent research direction in artificial intelligence in recent years. It aims to learn task-relevant invariant representations across different data distributions, that is, to remove the influence of domain differences on the learning task and thereby improve the model's domain generalization ability. To this end, following the idea of invariant risk minimization, this paper divided the neural network into a feature extractor and an invariant classifier and studied each separately. For the feature extractor, a group whitening method based on Newton's iteration was used to control the distribution of activation values, so that part of the domain information is removed as different images pass through the network. For the invariant classifier, the effects of feature and weight normalization methods on domain generalization performance were explored, and a snowflake algorithm based on a cosine similarity loss function was proposed, which improved domain generalization accuracy. In addition, a theoretical derivation and an in-depth analysis of the snowflake algorithm were provided to support the experiments.

Key words: domain generalization, invariant risk minimization, group whitening, iterative whitening, snowflake algorithm
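
The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how activations are grouped, how many iterations are used, or how the snowflake loss is defined, so none of that is reproduced here. The sketch only shows (a) whitening a small matrix by approximating the inverse square root of its covariance with Newton's iteration, and (b) the cosine similarity between a feature vector and class weight vectors, which is what normalizing both features and weights yields. The function names `newton_whiten` and `cosine_logits`, the iteration count, and the ridge term `eps` are assumptions chosen for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def newton_whiten(x, iters=10, eps=1e-5):
    """Approximately whiten x (d rows = dimensions, m columns = samples).

    Sigma^{-1/2} is approximated without an eigendecomposition:
    P_0 = I, P_k = (3 P_{k-1} - P_{k-1}^3 Sigma_N) / 2,
    with Sigma_N = Sigma / trace(Sigma). Hypothetical helper for illustration.
    """
    d, m = len(x), len(x[0])
    # Center each dimension.
    means = [sum(row) / m for row in x]
    xc = [[v - mu for v in row] for row, mu in zip(x, means)]
    # Covariance matrix with a small ridge (eps) for numerical stability.
    xt = [list(col) for col in zip(*xc)]
    sigma = matmul(xc, xt)
    sigma = [[sigma[i][j] / m + (eps if i == j else 0.0) for j in range(d)]
             for i in range(d)]
    # Trace normalization keeps the eigenvalues in (0, 1] so the iteration converges.
    trace = sum(sigma[i][i] for i in range(d))
    sigma_n = [[v / trace for v in row] for row in sigma]
    p = [[float(i == j) for j in range(d)] for i in range(d)]
    for _ in range(iters):
        p3s = matmul(matmul(matmul(p, p), p), sigma_n)
        p = [[1.5 * p[i][j] - 0.5 * p3s[i][j] for j in range(d)] for i in range(d)]
    # Undo the trace normalization: Sigma^{-1/2} is approximately P / sqrt(trace).
    scale = 1.0 / math.sqrt(trace)
    w = [[v * scale for v in row] for row in p]
    return matmul(w, xc)


def cosine_logits(feature, class_weights, eps=1e-12):
    """Cosine similarity between one feature vector and each class weight vector."""
    f_norm = max(math.sqrt(sum(v * v for v in feature)), eps)
    logits = []
    for w in class_weights:
        w_norm = max(math.sqrt(sum(v * v for v in w)), eps)
        dot = sum(a * b for a, b in zip(feature, w))
        logits.append(dot / (f_norm * w_norm))
    return logits
```

After `newton_whiten`, the rows of the output are approximately decorrelated with unit variance, provided enough iterations are run for the conditioning of the input. In practice a tensor library would replace the list-based `matmul`; plain Python is used here only to keep the sketch dependency-free.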

CLC number:

TP391
TP18

NI Yunhao, HUANG Lei. Domain generalization based on data representation invariance[J]. Journal of Graphics, 2024, 45(4): 705-713.

Figures/Tables: 6

Fig. 1 The PACS dataset

Fig. 2 Results of ResNet-18 on PACS

Fig. 3 Results of ResNet-50 on PACS

Fig. 4 Results of VGG-16 on PACS

Fig. 5 Results of ResNet-18 on Office-Caltech-10

Fig. 6 Results for different values of the hyperparameter α in the snowflake algorithm
