Cross-domain structured deep dictionary learning for image classification

doi:10.11996/JG.j.2095-302X.2026020341

Abstract

Image classification plays a fundamental role in computer vision, yet conventional deep learning-based approaches typically rely on large-scale annotated datasets, which are difficult to obtain in many small-sample scenarios, especially when labeled samples in the target domain are scarce. To address this challenge, a Cross-Domain Structured Deep Dictionary Learning (CD-SDDL) method for image classification is proposed. CD-SDDL constructs multilayer dictionaries in the source and target domains and introduces a cross-domain dictionary regularization term to achieve structure-level soft alignment, thereby reducing domain shift. In addition, intra-class compactness, inter-class separability, and Laplacian locality-preserving constraints are incorporated to enhance the geometric consistency and discriminability of the learned representations. A layer-wise unfolded deep dictionary framework is further adopted to integrate these structural constraints with nonlinear transformations, enabling the model to capture more complex cross-domain feature patterns. Experimental results on cross-domain tasks show that CD-SDDL generalizes better than existing methods and significantly improves classification performance.
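As a rough illustration of two ingredients named in the abstract, the sketch below evaluates a cross-domain dictionary penalty and a Laplacian locality-preserving penalty on toy data. The abstract does not give the exact formulas, so both forms here are assumptions: a squared Frobenius distance between the source and target dictionaries, and the usual graph-smoothness term over sample codes. The function names and toy matrices are hypothetical and are not taken from the paper.

```python
# Illustrative only: the penalty forms below are assumed, not taken from the paper.


def dictionary_alignment_penalty(d_src, d_tgt):
    """Squared Frobenius distance between two same-shaped dictionaries."""
    return sum(
        (a - b) ** 2
        for row_s, row_t in zip(d_src, d_tgt)
        for a, b in zip(row_s, row_t)
    )


def laplacian_penalty(codes, weights):
    """Graph smoothness 0.5 * sum_ij w_ij * ||z_i - z_j||^2 over sample codes."""
    total = 0.0
    for i, z_i in enumerate(codes):
        for j, z_j in enumerate(codes):
            dist_sq = sum((a - b) ** 2 for a, b in zip(z_i, z_j))
            total += weights[i][j] * dist_sq
    return 0.5 * total


# Toy 2x2 dictionaries (rows = feature dimensions, columns = atoms).
D_SRC = [[1.0, 0.0], [0.0, 1.0]]
D_TGT = [[0.9, 0.1], [0.0, 1.0]]

# Toy codes for three samples and a symmetric neighbor graph
# that links samples 0 and 1 only.
CODES = [[1.0, 0.0], [0.8, 0.0], [0.0, 1.0]]
WEIGHTS = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

ALIGN = dictionary_alignment_penalty(D_SRC, D_TGT)
LOCAL = laplacian_penalty(CODES, WEIGHTS)
print(f"alignment penalty: {ALIGN:.4f}")
print(f"locality penalty:  {LOCAL:.4f}")
```

Under these assumptions, a small alignment penalty means the two dictionaries stay close without being forced to coincide (a soft alignment), and a small locality penalty means neighboring samples receive similar codes.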

Key words: cross-domain learning, structured dictionary learning, deep learning, domain adaptation, sparse representation

YAN Kang, ZENG Li, GU Xiaoqing. Cross-domain structured deep dictionary learning for image classification[J]. Journal of Graphics, 2026, 47(2): 341-350.

References

[1]	HUANG K Q, WU M Q, CHEN H H, et al. The three realms of visual turing: from seeing to imagining in the LLM era[J]. Journal of Graphics, 2025, 46(5): 919-930 (in Chinese).
[2]	SHI M W, FAN L W, WANG H, et al. Quaternion patch-group sparse coding for color image denoising[J]. Journal of Graphics, 2023, 44(2): 298-303 (in Chinese).
[3]	GOU J P, HE X, DU L, et al. Deep class-weighted and class-shared dictionary learning for image classification[J]. Expert Systems with Applications, 2026, 299: 130042.
[4]	YANG M, LING J, CHEN J M, et al. Discriminative semi-supervised learning via deep and dictionary representation for image classification[J]. Pattern Recognition, 2023, 140: 109521.
[5]	TAN B Y, LIN J, QIN Y, et al. Accelerated deep nonlinear dictionary learning[C]// The 17th Asian Conference on Computer Vision. Cham: Springer, 2024: 111-127.
[6]	CAI Y W, ZHANG Y J, ZHANG Y F. Generation and selection of synthetic data for cross-domain person re-identification[J]. Journal of Graphics, 2023, 44(4): 775-783 (in Chinese).
[7]	LI M Y, LI Y, LI Z M. A comprehensive survey of transfer dictionary learning[J]. Neurocomputing, 2025, 623: 129322.
[8]	ZHAO D D, ZHANG P, YIN H P, et al. A novel multi-layer discriminative dictionary learning approach for image classification[J]. Signal Processing, 2025, 226: 109670.
[9]	ZHENG X, LIN L Y, LIU B, et al. A multi-task transfer learning method with dictionary learning[J]. Knowledge-Based Systems, 2020, 191: 105233.
[10]	FAN Z Z, SHI L R, LIU Q, et al. Discriminative fisher embedding dictionary transfer learning for object recognition[J]. IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(1): 64-78.
[11]	CHEN D L, SONG P, ZHENG W M. Learning transferable sparse representations for cross-corpus facial expression recognition[J]. IEEE Transactions on Affective Computing, 2023, 14(2): 1322-1333.
[12]	YUAN X, GOU J P, YU B S, et al. Deep dictionary learning with an intra-class constraint[C]// 2022 IEEE International Conference on Multimedia and Expo. New York: IEEE Press, 2022: 1-6.
[13]	DHAINI M, BERAR M, HONEINE P, et al. Unsupervised domain adaptation for regression using dictionary learning[J]. Knowledge-Based Systems, 2023, 267: 110439.
[14]	SCETBON M, ELAD M, MILANFAR P. Deep K-SVD denoising[J]. IEEE Transactions on Image Processing, 2021, 30: 5944-5955.
[15]	LIU S J, MA J J, CUI C K. FPGA implementation of threshold projection orthogonal matching pursuit algorithm for compressed sensing reconstruction[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2024, 71(3): 1184-1197.
[16]	SAENKO K, KULIS B, FRITZ M, et al. Adapting visual category models to new domains[C]// The 11th European Conference on Computer Vision. Cham: Springer, 2010: 213-226.
[17]	VENKATESWARA H, EUSEBIO J, CHAKRABORTY S, et al. Deep hashing network for unsupervised domain adaptation[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 5385-5394.
[18]	GOU J P, HE X, DU L, et al. Hierarchical locality-aware deep dictionary learning for classification[J]. IEEE Transactions on Multimedia, 2024, 26: 447-461.
[19]	LONG M S, ZHU H, WANG J M, et al. Deep transfer learning with joint adaptation networks[EB/OL]. [2025-06-26]. https://dl.acm.org/doi/10.5555/3305890.3305909.
[20]	WANG J D, FENG W J, CHEN Y Q, et al. Visual domain adaptation with manifold embedded distribution alignment[C]// The 26th ACM International Conference on Multimedia. New York: ACM, 2018: 402-410.
[21]	ZHU Y C, ZHUANG F Z, WANG J D, et al. Deep subdomain adaptation network for image classification[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(4): 1713-1722.
[22]	ZHU Y C, ZHUANG F Z, WANG J D, et al. Multi-representation adaptation network for cross-domain image classification[J]. Neural Networks, 2019, 119: 214-221.
[23]	CHEN C, CHEN Z H, JIANG B Y, et al. Joint domain alignment and discriminative feature learning for unsupervised deep domain adaptation[EB/OL]. [2025-06-26]. https://dl.acm.org/doi/10.1609/aaai.v33i01.33013296.
[24]	XU B R, YIN J H, LIAN C, et al. Low-rank optimal transport for robust domain adaptation[J]. IEEE/CAA Journal of Automatica Sinica, 2024, 11(7): 1667-1680.

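Dataset	Domain	Classes	Samples	Transfer tasks	Characteristics
Office31	Amazon (A)	31	2,817	6	Online product images with simple backgrounds
	Webcam (W)		795		Webcam captures with large illumination variation
	DSLR (D)		498		DSLR photographs with high image quality
OfficeHome	Art (Ar)	65	2,427	12	Hand-drawn style with pronounced differences
	Clipart (Cl)		4,365		Simplified two-dimensional images
	Product (Pr)		4,439		Product photographs with clean backgrounds
	Real World (Rw)		4,357		Real-scene images, complex and diverse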

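Method	Core idea	Parameters
JAN^[19]	Aligns the feature distributions of multiple layers via joint maximum mean discrepancy	$\eta_0 = 0.01$, $\alpha = 10$, $\beta = 0.75$; penalty parameter $\lambda$: $\{0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1\}$; momentum = 0.9
MEDA^[20]	Dynamically aligns features in a reproducing kernel Hilbert space and exploits manifold structure for learning	Manifold subspace dimension $d$: $\{10, 20, \cdots, 100\}$; number of neighbors $p$: $\{2, 4, \cdots, 64\}$; $\eta$: $\{0.01, 1\}$; $\rho$: $\{0.01, 5\}$; regularization parameter $\lambda \in [0.5, 1000]$
DSAN^[21]	Aligns the distributions of related but differently distributed subdomains via local maximum mean discrepancy	$\gamma = 10$, $\eta_0 = 0.01$, $\alpha = 10$, $\beta = 0.75$; momentum = 0.9
MRAN^[22]	Learns multiple domain-invariant representations and aggregates them adaptively with an attention mechanism	Multi-adaptation loss parameter $\lambda$: $\{0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2\}$; $\eta_0 = 0.01$, $\alpha = 10$, $\beta = 0.75$, $\gamma = 10$; momentum = 0.9
JDDA^[23]	Jointly performs domain alignment and discriminative feature learning, strengthening intra-class compactness through center loss and related terms	$\mu = 10$; constraint thresholds $m_1 = 0$, $m_2 = 100$; learning rate $\eta = 10^{-4}$; momentum = 0.9; discriminative loss parameter $\lambda$: $\{0.0001, 0.001, 0.003, 0.01, 0.03, 0.1, 1, 10\}$
LROT^[24]	Aligns the source- and target-domain distributions with a low-rank-constrained optimal transport mechanism, suppressing noise while preserving inter-class discriminability for robust domain adaptation	Penalty parameter $\lambda$: $\{0.1, 0.5, 1, 10, 100, 1000\}$; trade-off parameter $\delta$: $\{0.01, 0.1, 1, 10, 100, 1000\}$; learning rate $\eta = 0.001$; momentum = 0.9; weight decay = 0.001; batch size = 32; penalty parameter $\mu = 100$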
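The parameter table lists $\eta_0 = 0.01$, $\alpha = 10$, and $\beta = 0.75$ for several baselines without stating how they are combined. A minimal sketch follows, assuming they denote the learning-rate annealing rule $\eta_p = \eta_0 / (1 + \alpha p)^{\beta}$ that commonly appears with these symbols in the domain-adaptation literature, where $p \in [0, 1]$ is the training progress. The function name and the choice of evaluation points are hypothetical.

```python
def annealed_learning_rate(progress, eta0=0.01, alpha=10.0, beta=0.75):
    """Return eta0 / (1 + alpha * progress) ** beta for progress in [0, 1].

    Assumed schedule: the table gives only the symbol values, not the formula.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must lie in [0, 1]")
    return eta0 / (1.0 + alpha * progress) ** beta


# Learning rate at the start, middle, and end of training.
for p in (0.0, 0.5, 1.0):
    print(f"progress={p:.1f}  lr={annealed_learning_rate(p):.6f}")
```

With these values the rate starts at $\eta_0$ and decays monotonically as training progresses.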

Transfer task	CAN	JDDA	MEDA	MRAN	LROT	CD-SDDL
A->W	81.5	82.6	86.2	91.4	94.6	94.1
D->W	98.2	95.2	97.2	96.9	97.5	97.9
W->D	99.7	99.7	99.4	99.8	100.0	99.0
A->D	85.5	79.8	85.3	86.4	86.8	94.6
D->A	65.9	57.4	72.4	68.3	76.2	77.1
W->A	63.4	66.7	74.0	70.9	77.5	79.3
Avg	82.4	80.2	85.8	85.6	88.8	90.3
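The Avg row can be checked directly from the six per-task values. The sketch below recomputes each column mean and lists the tasks on which CD-SDDL has the highest value in the table. It assumes the entries are classification accuracies in percent, which the table itself does not state, and the variable names are illustrative.

```python
TASKS = ["A->W", "D->W", "W->D", "A->D", "D->A", "W->A"]

# Per-task values copied from the table, one list per method.
ACCURACY = {
    "CAN": [81.5, 98.2, 99.7, 85.5, 65.9, 63.4],
    "JDDA": [82.6, 95.2, 99.7, 79.8, 57.4, 66.7],
    "MEDA": [86.2, 97.2, 99.4, 85.3, 72.4, 74.0],
    "MRAN": [91.4, 96.9, 99.8, 86.4, 68.3, 70.9],
    "LROT": [94.6, 97.5, 100.0, 86.8, 76.2, 77.5],
    "CD-SDDL": [94.1, 97.9, 99.0, 94.6, 77.1, 79.3],
}

# Avg row as reported in the table.
REPORTED_AVG = {
    "CAN": 82.4, "JDDA": 80.2, "MEDA": 85.8,
    "MRAN": 85.6, "LROT": 88.8, "CD-SDDL": 90.3,
}

# Unrounded column means.
AVERAGES = {m: sum(v) / len(v) for m, v in ACCURACY.items()}

# Tasks where CD-SDDL holds the top value among all six columns.
BEST_TASKS = [
    task
    for i, task in enumerate(TASKS)
    if max(ACCURACY, key=lambda m: ACCURACY[m][i]) == "CD-SDDL"
]

for method, avg in AVERAGES.items():
    print(f"{method:8s} computed={avg:.2f} reported={REPORTED_AVG[method]}")
print("CD-SDDL is best on:", ", ".join(BEST_TASKS))
```

Comparing unrounded means against the reported one-decimal values avoids depending on how ties such as 85.75 are rounded.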