Searching for images with a text query, and for text with an image query, is a typical problem in cross-modal retrieval. The successful application of convolutional neural networks and generative adversarial networks across a wide range of fields, including image processing and natural language processing, provides a strong foundation for cross-modal retrieval technology. This project takes image and text modalities as its research objects. Its aim is to solve two problems, the construction of an image-text common subspace and the weak semantic association between features within that subspace, and thereby improve cross-modal retrieval performance. First, we use convolutional neural networks to build an image-text feature projection model and a modality classifier, and then apply the idea of generative adversarial networks to combine the two into a cross-modal retrieval model; this model constructs an image-text common subspace that unifies representation learning and association learning. Second, a semantic classifier based on a deep classification network and negative-sample constraints are introduced into the retrieval model to mine deeper semantic association features; this reduces the loss of semantic information in the data features while ensuring both semantic discrimination among features within each modality and semantic consistency across modalities in the common subspace. Third, we design a training procedure and an objective function so that the cross-modal retrieval model can be trained efficiently. Finally, we validate the models on public real-world datasets and develop a cross-modal retrieval demo system.
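The abstract's core idea, projecting both modalities into a common subspace, fooling a modality classifier in GAN fashion, and constraining matched pairs against negative samples, can be sketched numerically. The following is a minimal illustration only, assuming linear projections, a two-way softmax modality classifier, and a margin-based triplet constraint; the project itself uses convolutional networks and its own objective, so every name and dimension here is a hypothetical stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy pre-extracted features: 4 paired image/text samples (hypothetical sizes).
d_img, d_txt, d_common = 64, 32, 16
img_feat = rng.normal(size=(4, d_img))
txt_feat = rng.normal(size=(4, d_txt))

# Feature projection models (here: single linear maps) send both
# modalities into the shared common subspace.
W_img = rng.normal(scale=0.1, size=(d_img, d_common))
W_txt = rng.normal(scale=0.1, size=(d_txt, d_common))
z_img = img_feat @ W_img  # image features in the common subspace
z_txt = txt_feat @ W_txt  # text features in the common subspace

# Modality classifier (the "discriminator"): predicts image vs. text.
W_cls = rng.normal(scale=0.1, size=(d_common, 2))

def modality_probs(z):
    """Softmax over the two modality classes for each row of z."""
    logits = z @ W_cls
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Cross-entropy of the classifier on correctly labeled features. In GAN-style
# training the classifier minimizes this while the projections maximize it,
# making the two modalities indistinguishable in the subspace.
p_img = modality_probs(z_img)[:, 0]  # prob. of "image" for image features
p_txt = modality_probs(z_txt)[:, 1]  # prob. of "text" for text features
d_loss = -np.mean(np.log(np.concatenate([p_img, p_txt])))

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Negative-sample constraint: a matched image-text pair must be
    closer than a mismatched pair by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return np.mean(np.maximum(0.0, d_pos - d_neg + margin))

neg = np.roll(z_txt, 1, axis=0)  # mismatched text for each image anchor
t_loss = triplet_loss(z_img, z_txt, neg)

total = d_loss + t_loss  # joint objective the projections would minimize
```

In a full implementation the two losses would be optimized in alternating steps, the classifier descending on `d_loss` and the projections ascending on it while descending on `t_loss`, following the usual GAN training scheme.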
In recent years, data in different modalities, including text, images, video, and audio, has grown explosively, and single-modality retrieval can no longer satisfy users' increasingly rich retrieval needs. Cross-modal retrieval takes data in one modality (e.g., text) as input and finds similar data in another modality (e.g., images) in a database. However, the heterogeneity of different modalities makes measuring their similarity very difficult. Focusing on the image and text modalities, and addressing open problems in image-text cross-modal retrieval research, this project extended the originally proposed plan and carried out research in five areas: single-modality representation learning and multi-modal association learning; representation learning of shared image-text semantic features; inter-modal feature consistency and intra-modal feature discrimination; fine-grained image classification; and fine-grained image-text cross-modal retrieval. The proposed methods were validated on self-built cross-modal retrieval datasets of ultrasound image-text pairs and of goji berry pest image-text pairs, providing new methods and tools for retrieval applications in these industries. Supported by this project, 15 academic papers were published, including 3 in EI-indexed journals, 4 in EI-indexed conference proceedings, 1 in a CSSCI-indexed journal, 3 in the top 15% of CSCD journals, 1 in a CSCD core journal, and 3 in CSCD extended-edition journals; 2 invention patents were filed (1 granted); 3 software copyrights were registered; 1 achievement registration was completed; and 2 young faculty members and 7 master's students were trained. During the project period, principal investigator Professor Liu Libo was named a Ningxia science and technology leading talent in 2022 and was promoted to senior member of the China Computer Federation.
Data last updated: 2023-05-31
Genome-wide association analysis of leaf orientation value in maize
Eddy covariance technique and its application in terrestrial ecosystem flux research
On the impact of the big data environment on the development of information science
A survey of user alignment techniques across social networks
Research on benefit distribution in the farmer-supermarket direct-purchase model
Research on paraphrase generation based on generative adversarial networks
Research on weakly supervised multi-modal cross-domain machine learning for web image retrieval
Algorithms and theory of generative adversarial networks for text generation
Research on mobile image super-resolution based on cross-level generative adversarial networks