Journal of Graphics ›› 2023, Vol. 44 ›› Issue (3): 521-530.DOI: 10.11996/JG.j.2095-302X.2023030521
SHI Cai-juan1,2(), SHI Ze1,2, YAN Jin-wei1,2, BI Yang-yang1,2
Received: 2022-09-26
Accepted: 2022-12-06
Online: 2023-06-30
Published: 2023-06-30
About author: SHI Cai-juan (1977-), professor, Ph.D. Her main research interests include computer vision, image processing, and deep learning. E-mail: scj-blue@163.com
SHI Cai-juan, SHI Ze, YAN Jin-wei, BI Yang-yang. Bi-directionally aligned VAE based on double semantics for generalized zero-shot learning[J]. Journal of Graphics, 2023, 44(3): 521-530.
URL: http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2023030521
| Method type | Model | AwA1 U | AwA1 S | AwA1 H | AwA2 U | AwA2 S | AwA2 H | CUB U | CUB S | CUB H |
|---|---|---|---|---|---|---|---|---|---|---|
| Embedding-based methods | DAP | 0.0 | 88.7 | 0.0 | 0.0 | 84.7 | 0.0 | 1.7 | 67.9 | 3.3 |
| | ESZSL | 6.6 | 75.6 | 12.1 | 5.9 | 77.8 | 11.0 | 12.6 | 63.8 | 21.0 |
| | LATEM | 7.3 | 71.7 | 13.3 | 11.5 | 77.3 | 20.0 | 15.2 | 57.3 | 24.0 |
| | SJE | 11.3 | 74.6 | 19.6 | 8.0 | 73.9 | 14.4 | 23.5 | 59.2 | 33.6 |
| Generative model-based methods | f-CLSWGAN | 57.9 | 61.4 | 59.6 | 52.1 | 68.9 | 59.4 | 43.7 | 57.7 | 49.7 |
| | SE | 56.6 | 67.8 | 61.5 | 58.9 | 68.1 | 62.8 | 41.5 | 53.3 | 46.7 |
| | M2GAN | 42.1 | 80.7 | 54.7 | 38.3 | 85.0 | 52.8 | 20.7 | 58.2 | 30.5 |
| | MGM-VAE | - | - | - | 61.4 | 67.8 | 64.5 | 49.9 | 57.9 | 53.6 |
| | LVA | 54.4 | 70.0 | 61.2 | - | - | - | 45.6 | 53.2 | 48.5 |
| | SMF-VAE | - | - | 63.8 | - | - | - | - | - | 52.3 |
| | CADA-VAE | 57.3 | 72.8 | 64.1 | 55.8 | 75.0 | 63.9 | 51.6 | 53.5 | 52.4 |
| Ours | BAVAE-DS | 59.5 | 74.6 | 66.2 | 56.9 | 77.2 | 65.5 | 52.9 | 57.5 | 55.1 |
Table 1 Performance comparison of GZSL on three benchmark datasets (%)
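In Table 1, U and S denote per-class accuracy on unseen and seen classes, and H is their harmonic mean, the standard GZSL summary metric. A minimal sketch of the computation, checked against the BAVAE-DS row for AwA1:

```python
# Harmonic mean H of unseen-class accuracy U and seen-class accuracy S,
# as reported in the H columns of Table 1.
def harmonic_mean(u: float, s: float) -> float:
    """H = 2*U*S / (U + S); defined as 0 when both accuracies are 0."""
    return 0.0 if u + s == 0 else 2 * u * s / (u + s)

# BAVAE-DS on AwA1 (Table 1): U = 59.5, S = 74.6
print(round(harmonic_mean(59.5, 74.6), 1))  # 66.2
```

Because H collapses whenever either term is near zero, methods that ignore unseen classes (e.g. the DAP row with U = 0.0) score H = 0 despite high seen-class accuracy.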
Fig. 3 t-SNE visualization of the pseudo visual feature generation module ((a) SE, based on single semantics and a unidirectional VAE; (b) CADA-VAE, based on single semantics and a bi-directional VAE; (c) BAVAE-DS, based on double semantics and a bi-directional VAE)
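The kind of analysis in Fig. 3 can be reproduced with an off-the-shelf t-SNE: embed the generated pseudo visual features into 2-D and inspect how well classes separate. A minimal sketch, where the feature dimension, class count, and Gaussian stand-in features are illustrative assumptions rather than the paper's actual generator outputs:

```python
# Sketch of a t-SNE class-separability check on generated features.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
n_classes, per_class, dim = 5, 50, 64

# Stand-in for pseudo visual features: one Gaussian cluster per class.
features = np.vstack([
    rng.normal(loc=3.0 * c, scale=1.0, size=(per_class, dim))
    for c in range(n_classes)
])
labels = np.repeat(np.arange(n_classes), per_class)

# Project to 2-D; tight, well-separated clusters (as in Fig. 3(c))
# indicate more discriminative generated features.
emb = TSNE(n_components=2, perplexity=30.0, random_state=0).fit_transform(features)
print(emb.shape)
```

The resulting `emb` array can be scatter-plotted colored by `labels` to produce panels like those in Fig. 3.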
| Model | Loss function | AwA1 (%) | AwA2 (%) | CUB (%) |
|---|---|---|---|---|
| BAVAE-DS#1 | L1 | 64.1 | 63.9 | 52.4 |
| BAVAE-DS#2 | L2 | 58.4 | 55.7 | 50.9 |
| BAVAE-DS | L1 + L2 + L3 | 66.2 | 65.5 | 55.1 |
Table 2 Ablation study on three benchmark datasets
[1] | LECUN Y, BENGIO Y, HINTON G. Deep learning[J]. Nature, 2015, 521(7553): 436-444. |
[2] | LAMPERT C H, NICKISCH H, HARMELING S. Learning to detect unseen object classes by between-class attribute transfer[C]// 2009 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2009: 951-958. |
[3] | LU N N, LIU Y X, QIU M K. Zero-shot image classification based on random propagation graph convolution model[J]. Journal of Graphics, 2022, 43(4): 624-632. (in Chinese) |
[4] | ROMERA-PAREDES B, TORR P H. An embarrassingly simple approach to zero-shot learning[C]// The 32nd International Conference on Machine Learning. New York: ACM, 2015: 2152-2161. |
[5] | LI X. Learning unseen visual prototypes for zero-shot classification[J]. Knowledge-Based Systems, 2018, 160: 176-187. |
[6] | XIAN Y Q, SCHIELE B, AKATA Z. Zero-shot learning—the good, the bad and the ugly[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2017: 3077-3086. |
[7] | DING J Y, HU X, ZHONG X R. A semantic encoding out-of-distribution classifier for generalized zero-shot learning[J]. IEEE Signal Processing Letters, 2021, 28: 1395-1399. |
[8] | GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]// The 27th International Conference on Neural Information Processing Systems. New York: ACM, 2014, 2: 2672-2680. |
[9] | KINGMA D P, WELLING M. Auto-encoding variational Bayes[EB/OL]. [2022-03-05]. https://arxiv.org/abs/1312.6114. |
[10] | XIAN Y Q, SHARMA S, SCHIELE B, et al. F-VAEGAN-D2: a feature generating framework for any-shot learning[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 10267-10276. |
[11] | SCHÖNFELD E, EBRAHIMI S, SINHA S, et al. Generalized zero- and few-shot learning via aligned variational autoencoders[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2019: 8239-8247. |
[12] | FELIX R, VIJAY KUMAR B G, REID I, et al. Multi-modal cycle-consistent generalized zero-shot learning[M]//Computer Vision - ECCV 2018. Cham: Springer International Publishing, 2018: 21-37. |
[13] | JI Z. Multi-modal generative adversarial network for zero-shot learning[J]. Knowledge-Based Systems, 2020, 197: 105847. |
[14] | KOBYZEV I, PRINCE S J D, BRUBAKER M A. Normalizing flows: an introduction and review of current methods[EB/OL]. [2022-03-05]. https://arxiv.org/abs/1908.09257. |
[15] | XIAN Y Q, LORENZ T, SCHIELE B, et al. Feature generating networks for zero-shot learning[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 5542-5551. |
[16] | VERMA V K, BRAHMA D, RAI P. Meta-learning for generalized zero-shot learning[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(4): 6062-6069. |
[17] | LIU H. Dual-stream generative adversarial networks for distributionally robust zero-shot learning[J]. Information Sciences, 2020, 519: 407-422. |
[18] | GAO R, HOU X S, QIN J, et al. Zero-VAE-GAN: generating unseen features for generalized and transductive zero-shot learning[J]. IEEE Transactions on Image Processing, 2020, 29: 3665-3680. |
[19] | ZHANG Z L, LI Y J, YANG J, et al. Cross-layer autoencoder for zero-shot learning[J]. IEEE Access, 2019, 7: 167584-167592. |
[20] | MISHRA A, REDDY M S K, MITTAL A, et al. A generative model for zero shot learning using conditional variational autoencoders[EB/OL]. [2022-03-05]. https://arxiv.org/abs/1709.00663. |
[21] | VERMA V K, ARORA G, MISHRA A, et al. Generalized zero-shot learning via synthesized examples[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 4281-4289. |
[22] | YU H, LEE B. Zero-shot learning via simultaneous generating and learning[EB/OL]. [2022-03-05]. https://arxiv.org/abs/1910.09446. |
[23] | MA P R, HU X. A variational autoencoder with deep embedding model for generalized zero-shot learning[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 11733-11740. |
[24] | GU Y C, ZHANG L, LIU Y, et al. Generalized zero-shot learning via VAE-conditioned generative flow[EB/OL]. [2022-03-05]. https://arxiv.org/abs/2009.00303. |
[25] | AKATA Z, REED S, WALTER D, et al. Evaluation of output embeddings for fine-grained image classification[C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2015: 2927-2936. |
[26] | XIAN Y Q, AKATA Z, SHARMA G, et al. Latent embeddings for zero-shot classification[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 69-77. |
[27] | XIE C. Cross knowledge-based generative zero-shot learning approach with taxonomy regularization[J]. Neural Networks, 2021, 139: 168-178. |
[28] | XIANG H X, XIE C, ZENG T, et al. Multi-knowledge fusion for new feature generation in generalized zero-shot learning[EB/OL]. [2022-03-05]. https://arxiv.org/abs/2102.11566. |
[29] | MALL U, HARIHARAN B, BALA K. Zero-shot learning using multimodal descriptions[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. New York: IEEE Press, 2022: 3930-3938. |
[30] | MERCEA O B, RIESCH L, KOEPKE A S, et al. Audiovisual generalised zero-shot learning with cross-modal attention and language[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2022: 10543-10553. |
[31] | WAH C, BRANSON S, WELINDER P, et al. The caltech-UCSD Birds-200-2011 Dataset[EB/OL]. [2022-03-05]. https://www.researchgate.net/publication/251734721_The_Caltech-UCSD_Birds200-2011_Dataset. |
[32] | LAMPERT C H, NICKISCH H, HARMELING S. Attribute-based classification for zero-shot visual object categorization[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(3): 453-465. |
[33] | SHAO J, LI X R. A method for generalized zero-shot learning based on Gaussian mixture distribution[J]. Journal of Shanghai University of Electric Power, 2021, 37(5): 475-480. (in Chinese) |
[34] | ZHONG X R, HU X, DING J Y. Continual zero-shot learning algorithm based on latent vectors alignment[J]. Pattern Recognition and Artificial Intelligence, 2021, 34(12): 1152-1159. (in Chinese) |
[35] | LIN S, WANG X J. Semi-supervised generalized zero-shot learning using modal fusion[J]. Computer Engineering and Applications, 2022, 58(5): 163-171. (in Chinese) |