Unsupervised person re-identification with multi-branch attention network and similarity learning strategy

Journal of Graphics ›› 2023, Vol. 44 ›› Issue (2): 280-290. DOI: 10.11996/JG.j.2095-302X.2023020280

• Image Processing and Computer Vision •

FENG Zun-deng, WANG Hong-yuan, LIN Long, SUN Bo-yan, CHEN Hai-qin

School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu 213164, China

Received: 2022-07-27 Accepted: 2022-11-15 Online: 2023-04-30 Published: 2023-05-01
Contact: WANG Hong-yuan (1960-), male, professor, Ph.D. His main research interests cover image processing, computer vision, pattern recognition, etc. E-mail: hywang@cczu.edu.cn
About author: FENG Zun-deng (1997-), male, master's student. His main research interests cover computer vision and person re-identification. E-mail: 1106351887@qq.com
Supported by: National Natural Science Foundation of China (61976028); Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX22_3067)

Abstract

The central challenge in unsupervised person re-identification (Re-ID) is learning discriminative pedestrian features without ground-truth labels. To strengthen the network's ability to represent pedestrian features and to extract richer information along the spatial and channel dimensions, a feature extraction method based on a multi-branch attention network is proposed. By capturing the interactions between different branches across the spatial and channel dimensions, the method learns a more discriminative pedestrian feature representation. In addition, to address the interference of noisy labels with cluster centroids, a similarity learning strategy (SLS) is proposed. The strategy first computes the similarity between the sample features within each cluster and then selects the sample whose feature vector has the highest similarity score for contrastive learning, which effectively mitigates the cumulative training error caused by clustering noise. Experimental results show that, compared with the self-paced contrastive learning (SPCL) method in the unsupervised setting, rank-1 accuracy on the Market-1501, DukeMTMC-reID, and MSMT17 datasets improves by 4.6, 3.3, and 16.3 percentage points, respectively, a significant gain in the retrieval accuracy of unsupervised person re-identification.

Key words: unsupervised person re-identification, multi-branch attention network, cluster centroid, similarity learning strategy, contrastive learning
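
The selection step of the similarity learning strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the per-sample similarity score is aggregated, so the sketch assumes cosine similarity averaged over the other members of the same cluster, and it assumes that negative pseudo-labels mark un-clustered outliers. The function names `cosine` and `select_representatives` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_representatives(features, pseudo_labels):
    """Return {cluster_id: sample_index} of the highest-scoring sample per cluster.

    Assumption: a sample's score is its mean cosine similarity to the other
    members of its cluster; the abstract does not fix this detail.
    """
    clusters = {}
    for idx, label in enumerate(pseudo_labels):
        if label < 0:  # assumed convention: negative label = un-clustered outlier
            continue
        clusters.setdefault(label, []).append(idx)

    representatives = {}
    for label, members in clusters.items():
        if len(members) == 1:
            representatives[label] = members[0]
            continue
        best_idx, best_score = members[0], -math.inf
        for i in members:
            total = sum(cosine(features[i], features[j]) for j in members if j != i)
            score = total / (len(members) - 1)
            if score > best_score:
                best_idx, best_score = i, score
        representatives[label] = best_idx
    return representatives


# Toy 2-D features; index 2 plays the role of a noisy member of cluster 0.
features = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.0, 1.0],
    [0.1, 0.9],
    [0.5, 0.5],
]
pseudo_labels = [0, 0, 0, 1, 1, -1]
reps = select_representatives(features, pseudo_labels)
print(reps)
```

In this toy input the noisy member of cluster 0 receives the lowest score, so it is never chosen as the sample that represents the cluster in contrastive learning, which is the effect the abstract attributes to SLS.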

CLC number: TP391

FENG Zun-deng, WANG Hong-yuan, LIN Long, SUN Bo-yan, CHEN Hai-qin. Unsupervised person re-identification with multi-branch attention network and similarity learning strategy[J]. Journal of Graphics, 2023, 44(2): 280-290.

References (44)

[1]	ZHANG Y P, WANG H Y, ZHANG J, et al. One-shot video-based person re-identification based on neighborhood center iteration strategy[J]. Journal of Software, 2021, 32(12): 4025-4035. (in Chinese)
[2]	YAN S L, TANG H, ZHANG L Y, et al. Image-specific information suppression and implicit local alignment for text-based person search[EB/OL]. [2022-01-23]. https://arxiv.org/abs/2208.14365.
[3]	DING Z Y, WANG H Y, CHEN F H, et al. Person re-identification based on distance centralization and projection vectors learning[J]. Journal of Computer Research and Development, 2017, 54(8): 1785-1794. (in Chinese)
[4]	CAO L, WANG H Y, DAI C C, et al. Unsupervised video-based person re-identification based on diversity constraint and dispersion hierarchical clustering[J]. Journal of Nanjing University of Aeronautics & Astronautics, 2020, 52(5): 752-759. (in Chinese)
[5]	XU Z C, WANG H Y, QI P Y, et al. Video-based person re-identification based on graph model and weighted loss strategy[J]. Application Research of Computers, 2022, 39(2): 598-603. (in Chinese)
[6]	SUN X X, ZHENG L. Dissecting person re-identification from the viewpoint of viewpoint[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 608-617.
[7]	MIAO J X, WU Y, LIU P, et al. Pose-guided feature alignment for occluded person re-identification[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2020: 542-551.
[8]	CHEN G Y, GU T P, LU J W, et al. Person re-identification via attention pyramid[J]. IEEE Transactions on Image Processing, 2021, 30: 7663-7676.
[9]	MISRA D, NALAMADA T, ARASANIPALAI A U, et al. Rotate to attend: convolutional triplet attention module[C]// 2021 IEEE Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2021: 3138-3147.
[10]	WU Z R, XIONG Y J, YU S X, et al. Unsupervised feature learning via non-parametric instance discrimination[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 3733-3742.
[11]	NIKHAL K, RIGGAN B S. Unsupervised attention based instance discriminative learning for person re-identification[C]// 2021 IEEE Winter Conference on Applications of Computer Vision. New York: IEEE Press, 2021: 2421-2430.
[12]	WANG F H, ZHAO B, HUANG C, et al. Person re-identification based on multi-scale network attention fusion[J]. Journal of Electronics & Information Technology, 2020, 42(12): 3045-3052. (in Chinese)
[13]	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[M]//Computer Vision - ECCV 2018. Cham: Springer International Publishing, 2018: 3-19.
[14]	ZHANG Z Z, LAN C L, ZENG W J, et al. Relation-aware global attention for person re-identification[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 3183-3192.
[15]	YU H X, ZHENG W S, WU A C, et al. Unsupervised person re-identification by soft multilabel learning[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 2143-2152.
[16]	FU Y, WEI Y C, WANG G S, et al. Self-similarity grouping: a simple unsupervised cross domain adaptation approach for person re-identification[C]// 2019 IEEE/CVF International Conference on Computer Vision. New York: IEEE Press, 2020: 6111-6120.
[17]	ZHONG Z, ZHENG L, LUO Z M, et al. Invariance matters: exemplar memory for domain adaptive person re-identification[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 598-607.
[18]	GE Y X, CHEN D P, LI H S. Mutual mean-teaching: pseudo label refinery for unsupervised domain adaptation on person re-identification[EB/OL]. [2022-01-26]. https://arxiv.org/abs/2001.01526.
[19]	DAI C C, WANG H Y, NI T G, et al. Person re-identification based on deep convolutional generative adversarial network and expanded neighbor reranking[J]. Journal of Computer Research and Development, 2019, 56(8): 1632-1641. (in Chinese)
[20]	HAN X M, YU X H, LI G R, et al. Rethinking sampling strategies for unsupervised person re-identification[EB/OL]. [2022-01-11]. https://arxiv.org/abs/2107.03024.
[21]	WANG D K, ZHANG S L. Unsupervised person re-identification via multi-label classification[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 10978-10987.
[22]	GE Y X, ZHU F, CHEN D P, et al. Self-paced contrastive learning with hybrid memory for domain adaptive object re-ID[C]// The 34th International Conference on Neural Information Processing Systems. New York: ACM, 2020: 11309-11321.
[23]	LI M K, LI C G, GUO J. Cluster-guided asymmetric contrastive learning for unsupervised person re-identification[J]. IEEE Transactions on Image Processing, 2022, 31: 3606-3617.
[24]	DAI Z Z, WANG G Y, YUAN W H, et al. Cluster contrast for unsupervised person re-identification[EB/OL]. [2022-01-23]. https://arxiv.org/abs/2103.11568.
[25]	HE K M, FAN H Q, WU Y X, et al. Momentum contrast for unsupervised visual representation learning[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 9726-9735.
[26]	HÉNAFF O J, SRINIVAS A, DE FAUW J, et al. Data-efficient image recognition with contrastive predictive coding[C]// The 37th International Conference on Machine Learning. New York: ACM, 2020: 4182-4192.
[27]	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2016: 770-778.
[28]	ESTER M, KRIEGEL H P, SANDER J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise[C]// The 2nd International Conference on Knowledge Discovery and Data Mining. Palo Alto: AAAI Press, 1996, 96(34): 226-231.
[29]	LI M X, ZHU X T, GONG S G. Unsupervised person re-identification by deep learning tracklet association[M]// Computer Vision - ECCV 2018. Cham: Springer International Publishing, 2018: 772-788.
[30]	ZHANG B H, ZHU S Y, LV X Q, et al. Soft multilabel learning and deep feature fusion for unsupervised person re-identification[J]. Opto-Electronic Engineering, 2020, 47(12): 15-24. (in Chinese)
[31]	ZHENG L, SHEN L Y, TIAN L, et al. Scalable person re-identification: a benchmark[C]// 2015 IEEE International Conference on Computer Vision. New York: IEEE Press, 2016: 1116-1124.
[32]	ZHENG Z D, ZHENG L, YANG Y. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro[C]// 2017 IEEE International Conference on Computer Vision. New York: IEEE Press, 2017: 3774-3782.
[33]	WEI L H, ZHANG S L, GAO W, et al. Person transfer GAN to bridge domain gap for person re-identification[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2018: 79-88.
[34]	LIN Y T, DONG X Y, ZHENG L, et al. A bottom-up clustering approach to unsupervised person re-identification[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2019, 33(1): 8738-8745.
[35]	DING G D, KHAN S, TANG Z M, et al. Towards better validity: dispersion based clustering for unsupervised person re-identification[EB/OL]. [2022-01-23]. https://arxiv.org/abs/1906.01308.
[36]	LIN Y T, XIE L X, WU Y, et al. Unsupervised person re-identification via softened similarity learning[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 3387-3396.
[37]	ZENG K W, NING M N, WANG Y H, et al. Hierarchical clustering with hard-batch triplet loss for person re-identification[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2020: 13654-13662.
[38]	YANG F X, ZHONG Z, LUO Z M, et al. Joint noise-tolerant learning and meta camera shift adaptation for unsupervised person re-identification[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 4853-4862.
[39]	CHEN H, WANG Y H, LAGADEC B, et al. Joint generative and contrastive learning for unsupervised person re-identification[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 2004-2013.
[40]	LI J N, ZHANG S L. Joint visual and temporal consistency for unsupervised domain adaptive person re-identification[M]// Computer Vision - ECCV 2020. Cham: Springer International Publishing, 2020: 483-499.
[41]	DAI Y X, LIU J, BAI Y, et al. Dual-refinement: joint label and feature refinement for unsupervised domain adaptive person re-identification[J]. IEEE Transactions on Image Processing, 2021, 30: 7815-7829.
[42]	ZHENG K C, LAN C L, ZENG W J, et al. Exploiting sample uncertainty for domain adaptive person re-identification[EB/OL]. [2022-01-23]. https://arxiv.org/abs/2012.08733.
[43]	ZHENG K C, LIU W, HE L X, et al. Group-aware label transfer for domain adaptive person re-identification[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 5306-5315.
[44]	WANG F, LIU H P. Understanding the behavior of contrastive loss[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE Press, 2021: 2495-2504.

Dataset	Cameras	Training identities	Training images	Test identities	Test images
Market-1501	6	751	12,936	750	19,732
DukeMTMC-reID	8	702	16,522	702	19,889
MSMT17	15	4,101	32,621	3,060	93,820

Methods	Market-1501 (%)				DukeMTMC-reID (%)				MSMT17 (%)
	mAP	rank-1	rank-5	rank-10	mAP	rank-1	rank-5	rank-10	mAP	rank-1	rank-5	rank-10
BUC [34]	38.3	66.2	79.6	84.5	27.5	47.4	62.6	68.4	-	-	-	-
DBC [35]	41.3	69.1	83.0	87.8	30.0	51.5	64.6	70.1	-	-	-	-
SSL [36]	37.8	71.7	83.8	87.4	28.6	52.5	63.5	68.9	-	-	-	-
MMCL [21]	45.5	80.3	89.4	92.3	40.2	65.2	75.9	80.0	11.2	35.4	44.8	49.8
HCT [37]	56.4	80.0	91.6	95.2	50.7	69.6	83.4	87.4	-	-	-	-
JNTL [38]	61.7	83.9	92.3	-	53.8	73.8	84.2	-	15.5	35.2	48.3	-
JGCL [39]	66.8	87.3	93.5	95.5	62.8	82.9	87.1	88.5	21.3	45.7	58.6	64.5
SPCL [22]	73.1	88.1	95.1	97.0	65.3	81.2	90.3	92.2	19.1	42.3	55.6	61.2
RSS [20]	79.2	92.3	96.6	97.5	69.1	82.7	91.1	93.5	-	-	-	-
CACL [23]	80.9	92.7	97.4	98.5	69.6	83.3	91.5	94.1	21.0	45.4	58.2	63.6
SLS+MBA	82.2	92.7	97.5	98.5	71.7	84.5	91.9	94.2	30.0	58.6	69.3	73.7
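
The rank-1 gains over SPCL quoted in the abstract follow directly from the SPCL and SLS+MBA rows of the table above. The short check below only restates that arithmetic; the dictionary names are illustrative.

```python
# rank-1 accuracy (%) copied from the SPCL and SLS+MBA rows of the table above
spcl_rank1 = {"Market-1501": 88.1, "DukeMTMC-reID": 81.2, "MSMT17": 42.3}
ours_rank1 = {"Market-1501": 92.7, "DukeMTMC-reID": 84.5, "MSMT17": 58.6}

# Differences are in percentage points; round to one decimal to hide float noise.
gains = {name: round(ours_rank1[name] - spcl_rank1[name], 1) for name in spcl_rank1}

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
```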

Methods	Market-1501				DukeMTMC-reID				MSMT17
	Source	mAP (%)	rank-1 (%)	rank-5 (%)	Source	mAP (%)	rank-1 (%)	rank-5 (%)	Source	mAP (%)	rank-1 (%)	rank-5 (%)
ECN [17]	Duke	43.0	75.1	87.6	Market	40.4	63.3	75.8	Duke	10.2	30.2	41.5
SSG [16]	Duke	58.3	80.0	90.0	Market	53.4	73.0	80.6	Duke	13.3	32.2	-
MMCL [21]	Duke	60.4	84.4	92.8	Market	51.4	72.4	82.9	Duke	16.2	43.6	54.3
JVTC [40]	Duke	61.1	83.8	93.0	Market	56.2	75.0	85.1	Duke	20.3	45.4	58.4
MMT [18]	MSMT17	75.6	89.3	95.8	Market	65.1	78.9	88.8	Market	24.0	50.1	63.5
SPCL [22]	MSMT17	77.5	89.7	96.1	Market	68.8	82.9	90.1	Market	26.8	53.7	65.0
JLFR [41]	Duke	78.0	90.9	96.4	Market	67.7	82.1	90.1	Market	25.1	53.3	66.1
UNRN [42]	Duke	78.1	91.9	96.1	Market	69.1	82.0	90.7	Market	25.3	52.4	64.7
GLT [43]	Duke	79.5	92.2	96.5	Market	69.2	82.0	90.2	Market	26.5	56.6	67.5
SLS+MBA	None	82.2	92.7	97.5	None	71.7	84.5	91.9	None	30.0	58.6	69.3
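
For readers unfamiliar with the mAP and rank-k columns used in these tables, the sketch below shows how the two metrics are commonly computed from ranked retrieval results. It is a simplified illustration, not the evaluation code behind the reported numbers: each query is assumed to be given as a list of booleans (gallery images sorted by descending similarity, `True` marking a correct identity), and the usual benchmark step of discarding same-camera matches is omitted. All function names are hypothetical.

```python
def rank_k_hit(ranked_matches, k):
    """Return 1 if a correct gallery image appears in the top-k results, else 0."""
    return int(any(ranked_matches[:k]))


def average_precision(ranked_matches):
    """Mean of the precision values at every rank where a correct match occurs."""
    hits, precisions = 0, []
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def evaluate(all_ranked_matches, ks=(1, 5, 10)):
    """Return (mAP, {k: rank-k accuracy}) averaged over all queries."""
    n = len(all_ranked_matches)
    cmc = {k: sum(rank_k_hit(r, k) for r in all_ranked_matches) / n for k in ks}
    mean_ap = sum(average_precision(r) for r in all_ranked_matches) / n
    return mean_ap, cmc


# Two toy queries, each ranked against a four-image gallery.
queries = [
    [True, False, True, False],   # AP = (1/1 + 2/3) / 2
    [False, True, False, False],  # AP = 1/2
]
mean_ap, cmc = evaluate(queries, ks=(1, 5))
print(f"mAP = {mean_ap:.3f}, rank-1 = {cmc[1]:.2f}, rank-5 = {cmc[5]:.2f}")
```

Rank-k only asks whether the first correct match appears early, while mAP also rewards placing all correct matches near the top, which is why the two columns can move differently between methods.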

Batch size	P×Z	Market-1501		DukeMTMC-reID
		mAP (%)	rank-1 (%)	mAP (%)	rank-1 (%)
32	16×2	77.9	90.2	67.6	80.8
64	16×4	82.2	92.7	71.7	84.5
128	16×8	82.8	93.2	71.4	84.6
256	16×16	82.1	92.5	71.2	84.4
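
The P×Z column suggests that each mini-batch is built from P groups of Z images. The page does not define P and Z, so the sketch below assumes the common re-ID convention of P pseudo-identities with Z images each, which matches the table (16×4 = 64). The sampler `sample_pz_batch` and the toy index are hypothetical.

```python
import random


def sample_pz_batch(index_by_identity, p, z, rng=random):
    """Draw p identities, then z image indices per identity (batch size p * z)."""
    identities = rng.sample(sorted(index_by_identity), p)
    batch = []
    for pid in identities:
        pool = index_by_identity[pid]
        if len(pool) >= z:
            batch.extend(rng.sample(pool, z))      # without replacement
        else:
            batch.extend(rng.choices(pool, k=z))   # too few images: with replacement
    return batch


# Toy index: 20 pseudo-identities with 6 images each; image index // 10 == identity.
index_by_identity = {pid: list(range(pid * 10, pid * 10 + 6)) for pid in range(20)}
batch = sample_pz_batch(index_by_identity, p=16, z=4, rng=random.Random(0))
print(len(batch))
```

Under this reading, the rows of the table keep P fixed at 16 and vary only Z, so a larger batch means more images per pseudo-identity rather than more identities per batch.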