Deep Analysis and Cross-Domain Understanding of Heterogeneous Media

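Basic Information

Grant number: 61702136

Project type: Young Scientists Fund

Funding amount: 25.00

Principal investigator: Sun Xiaoshuai (孙晓帅)

Discipline classification:

Host institution: Harbin Institute of Technology

Year approved: 2017

Year concluded: 2020

Project period: 2018-01-01 - 2020-12-31

Project status: Concluded

Project participants: Zhou Shangchen (周尚辰), Xie Wenlong (谢文龙), Hou Yuxin (侯雨欣), Zheng Ying (郑影), Xia Yu (夏雨)

Keywords:

visual-lingual embedding; cross-domain understanding; image understanding; deep learning; multimedia retrieval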

Final Report Abstract

Multimedia big data is becoming increasingly socialized and heterogeneous. In response to this trend, we propose a set of computational theories for the deep analysis of heterogeneous multimedia data, intended to guide the exploration and solution of scientific problems in multimedia content analysis and understanding in the big data era. The project begins by investigating the intrinsic correlations within heterogeneous multimedia data and aims to build theoretical frameworks for hierarchical parsing of visual content, cross-domain feature representation of heterogeneous correlated data, and weakly supervised visual-lingual embedding learning. We seek to reveal the semantic correlations and statistical characteristics that are pervasive in socialized, heterogeneous media data, and to explore co-analysis and joint modeling strategies for visual and lingual content. By combining the spatio-temporal features and contextual constraints of both modalities, we will progressively build computational models for heterogeneous media analysis and achieve large-scale, fine-grained visual-lingual embedding learning. This establishes compact correspondences between visual content and lingual fragments and yields a unified representation of both in the embedding space, on top of which we aim for breakthroughs in key technologies such as multimedia retrieval with complex colloquial queries and the visualization and colloquial description of knowledge. The final goal is a unified system for the deep analysis and understanding of heterogeneous multimedia content, together with applications in industry, education and other fields.

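Project Abstract

The project explored the intrinsic correlations within heterogeneous multimedia data and built theoretical frameworks for hierarchical parsing of visual multimedia data, cross-domain feature representation of heterogeneous correlated data, and weakly supervised visual-lingual embedding learning. It revealed the semantic correlations and statistical characteristics that are pervasive in socialized, heterogeneous media data, and explored co-analysis and joint modeling strategies for visual and lingual content. By combining the spatio-temporal features and contextual constraints of both modalities, the project progressively built computational models for heterogeneous media analysis and achieved large-scale, fine-grained visual-lingual embedding learning. This established compact correspondences between visual content and lingual fragments and yielded a unified representation of both in the embedding space, leading to breakthroughs in key technologies such as multimedia retrieval with complex colloquial queries and the visualization and colloquial description of knowledge. The research group published 20 papers: 10 in SCI-indexed journals, including IJCV, and 10 at CCF-A international conferences, including CVPR and NeurIPS.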

Project Outcomes

No outcomes of this type are listed.

Data last updated: 2023-05-31
