Cross‐modal retrieval based on deep regularized hashing constraints
2022; Wiley; Volume: 37; Issue: 9; Language: English
10.1002/int.22853
ISSN: 1098-111X
Authors: Asad Khan, Sakander Hayat, Muhammad Ahmad, Jinyu Wen, Muhammad Umar Farooq, Meie Fang, Wenchao Jiang
Topic(s): Image Retrieval and Classification Techniques
Abstract
International Journal of Intelligent Systems, Research Article. First published: 09 February 2022, https://doi.org/10.1002/int.22853 (Early View: online Version of Record before inclusion in an issue).
Asad Khan (orcid.org/0000-0002-1261-0418), School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, China
Sakander Hayat (orcid.org/0000-0002-6842-7604), School of Mathematics and Information Sciences, Guangzhou University, Guangzhou, China
Muhammad Ahmad (orcid.org/0000-0002-3320-2261), Department of Computer Science, National University of Computer and Emerging Sciences (NUCES-FAST), Faisalabad Campus, Chiniot, Pakistan
Jinyu Wen, School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, China
Muhammad Umar Farooq (orcid.org/0000-0002-5545-3379), School of Computer Science and Technology, University of Science and Technology of China, Hefei, China
Meie Fang (orcid.org/0000-0003-4292-8889), School of Computer Science and Cyber Engineering, Guangzhou University, 510006 Guangzhou, China; corresponding author, fme@gzhu.edu.cn
Wenchao Jiang (orcid.org/0000-0002-6300-1962), School of Computers, Guangdong University of Technology, Guangzhou, China
Asad Khan and Sakander Hayat contributed equally to this study.

Cross-modal retrieval has attracted great attention in recent years owing to the increasing demand created by tremendous amounts of multimodal data; such retrieval may be either text-to-image or image-to-text. To address the problem of mismatched information between images and texts, we propose two cross-modal retrieval techniques built on a dual-branch neural network defined on a common subspace and on hash learning. First, we present a cross-modal retrieval technique based on a multilabel information deep ranking model (MIDRM). In this method, we introduce a triplet-loss function into the dual-branch neural network. The loss exploits the semantic information of both modalities, accounting not only for the similarity between matching image and text features but also for the distance between non-matching images and texts. Second, we establish a new cross-modal hashing technique called the deep regularized hashing constraint (DRHC). Here, a regularization term replaces the binary constraint, and the discrete values are restricted to a fixed numerical range so that the network can be trained end to end. As a result, retrieval time is greatly reduced and the required storage space shrinks considerably. Experiments on the proposed MIDRM and DRHC models demonstrate performance superior to state-of-the-art methods on two widely used data sets, and the results show that our approach also increases the mean average precision of cross-modal retrieval.
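To make the triplet-ranking idea behind MIDRM concrete, the sketch below shows a generic bi-directional triplet ranking loss over a batch of paired image and text embeddings in the common subspace. It is an illustrative reconstruction, not the paper's exact MIDRM objective; the margin value, the use of cosine similarity, and the function names are assumptions.

import torch
import torch.nn.functional as F

def triplet_ranking_loss(img_emb, txt_emb, margin=0.2):
    """img_emb, txt_emb: (batch, dim) embeddings of matching image/text pairs."""
    img_emb = F.normalize(img_emb, dim=1)
    txt_emb = F.normalize(txt_emb, dim=1)
    sim = img_emb @ txt_emb.t()                 # pairwise cosine similarities
    pos = sim.diag().view(-1, 1)                # similarity of each matching pair
    # image-to-text direction: every non-matching text acts as a negative
    cost_i2t = (margin + sim - pos).clamp(min=0)
    # text-to-image direction: every non-matching image acts as a negative
    cost_t2i = (margin + sim - pos.t()).clamp(min=0)
    mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    return cost_i2t.masked_fill(mask, 0).sum() + cost_t2i.masked_fill(mask, 0).sum()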
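The core relaxation behind DRHC, replacing the hard binary constraint with a bounded continuous code plus a quantization penalty so that training stays differentiable end to end, can be sketched in the same spirit. The tanh activation, the squared-error penalty, and the helper names are assumptions made for illustration only.

import torch

def relaxed_hash_codes(features, hash_layer):
    """hash_layer: a learned linear projection to the code length (assumed)."""
    return torch.tanh(hash_layer(features))     # continuous codes kept in (-1, 1)

def quantization_penalty(codes):
    # penalize the gap between the relaxed codes and their binarized versions,
    # pushing each entry toward +1 or -1 during training
    return ((codes - codes.sign()) ** 2).mean()

def binarize(codes):
    # at retrieval time the relaxed codes are thresholded to binary hash codes
    return codes.sign()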