Cross-modal Pedestrian Re-identification based on Spatially Enhanced Dual-stream Network

Authors

  • Zheyu Fu
  • Xiaoguang Hu
  • Xu Wang

DOI:

https://doi.org/10.54691/5gwfwz21

Keywords:

Pedestrian Re-identification; Deep Learning; Cross-modality; Feature Learning; Convolutional Neural Network (CNN); Space Enhancement.

Abstract

In current research on cross-modal pedestrian re-identification, the main difficulty is the low recognition accuracy caused by the discrepancy between modalities. To address this problem, this paper proposes a new spatially enhanced dual-stream network, called the Multivariate Extended Network (DEN). The method embeds diverse learned features to reduce the gap between modalities. The network consists of a spatial embedding module (SEM) and a multi-feature aggregation module (CPM): the spatial embedding module embeds diversified information to improve performance, while the multi-feature aggregation module aggregates features from different stages to mine channel and spatial information, thereby improving the network's ability to exploit different embeddings at different levels.
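The paper itself does not give implementation details here; as a rough illustration only, the following NumPy sketch shows the general pattern such dual-stream cross-modal networks follow: modality-specific shallow branches, a shared deep embedding, and a simple channel re-weighting step standing in for multi-feature aggregation. All layer sizes, the random weights, and the `aggregate` helper are hypothetical, not the authors' actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    # Random projection standing in for a trained layer (hypothetical weights).
    return rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)

# Modality-specific shallow branches (one per stream).
W_vis = linear(512, 256)   # visible (RGB) branch
W_ir  = linear(512, 256)   # infrared branch
# Shared deep embedding applied to both streams.
W_shared = linear(256, 128)

def aggregate(feats):
    """Toy stand-in for multi-feature aggregation: softmax channel
    re-weighting followed by mean pooling over spatial positions."""
    chan = np.exp(feats.mean(axis=0))
    chan /= chan.sum()
    return (feats * chan).mean(axis=0)

def embed(x, W_branch):
    # x: (num_spatial_positions, 512) feature map flattened over space.
    h = np.maximum(x @ W_branch, 0.0)   # modality-specific branch + ReLU
    h = np.maximum(h @ W_shared, 0.0)   # shared embedding + ReLU
    return aggregate(h)                 # (128,) person descriptor

vis = rng.standard_normal((49, 512))   # e.g. a 7x7 visible feature map
ir  = rng.standard_normal((49, 512))   # matching infrared feature map

d_vis, d_ir = embed(vis, W_vis), embed(ir, W_ir)
# Cross-modal matching score: cosine similarity in the shared space.
score = d_vis @ d_ir / (np.linalg.norm(d_vis) * np.linalg.norm(d_ir) + 1e-12)
```

Because both streams end in the same shared projection, visible and infrared descriptors land in one embedding space, so a plain cosine similarity can rank gallery identities across modalities.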



Published

2024-09-29

Section

Articles

How to Cite

Fu, Zheyu, Xiaoguang Hu, and Xu Wang. 2024. “Cross-Modal Pedestrian Re-Identification Based on Spatially Enhanced Dual-Stream Network”. Scientific Journal of Intelligent Systems Research 6 (9): 13-23. https://doi.org/10.54691/5gwfwz21.