Video data is now produced in enormous volumes, and its scale has grown continuously over the past decade. Video event detection is a promising field with broad applications in areas that strongly influence economic development, people's livelihood, and national security, such as traffic and transportation, security and surveillance, and national defense. It has therefore attracted considerable attention from both academia and industry.

Nevertheless, the task faces several challenges. First, videos often originate from multiple sources, which increases the risk of noise corruption and makes events harder to detect; this is the contradiction between multi-source visual scenes and the limited robustness of existing event detection methods. Second, labeling events in video is time-consuming and requires domain knowledge, which raises the cost of obtaining labeled data; hence there is a mismatch between the small number of labeled video instances and the heavy demands of event detection models. Finally, intelligent video understanding requires far more information than visual features alone provide, exposing the well-known gap between low-level visual features and high-level video semantics.

To address these issues, this project focuses on video event detection via deep multi-feature data representation for multi-source visual scenes, and explores ways to improve the reliability, accuracy, and computational efficiency of event detection. Specifically, we will investigate three aspects: multi-source visual feature extraction and encoding; constrained low-rank and low-dimensional representation learning; and the construction and prediction of event detection models.
Moreover, from the perspective of statistical machine learning, we will develop a series of relevant theories, methods, and key technologies adapted to video event detection in multi-source visual scenes, and attempt to form a preliminary architecture and build a unified platform for video event detection. If this proposal is approved and the project implemented, it will enrich the theories and methods of video event detection while providing sensible solutions to many real-world problems. The project therefore has significant value for both scientific research and a wide range of applications.
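The "constrained low-rank and low-dimensional representation learning" aspect above can be illustrated in generic form by singular-value soft-thresholding, the proximal step of nuclear-norm minimization that underlies many low-rank recovery methods. This is a minimal sketch of that standard building block, not the project's actual algorithm; the matrix sizes and the threshold `tau` are illustrative assumptions.

```python
import numpy as np

def svd_threshold(X, tau):
    """Singular-value soft-thresholding: the prox operator of tau * nuclear norm.

    Singular values below tau are zeroed and the rest are shrunk by tau,
    yielding a low-rank approximation of X.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return U @ np.diag(s_shrunk) @ Vt

# Toy example: a rank-2 matrix corrupted by small dense noise.
rng = np.random.default_rng(0)
L = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 30))  # ground-truth rank 2
X = L + 0.01 * rng.standard_normal((50, 30))                     # noisy observation

X_lowrank = svd_threshold(X, tau=1.0)
# The noise singular values fall below tau and are removed,
# so the recovered matrix has a small effective rank.
print(np.linalg.matrix_rank(X_lowrank))
```

In full robust-PCA-style algorithms this step is iterated together with a sparse-error update; here it is shown once to make the low-rank representation idea concrete.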
Event analysis and detection in big visual data is a challenging and important research topic in multimedia computing and computer vision, and it is especially critical in the Internet+ era, in which video data is growing explosively; it therefore urgently requires deep exploration. Centered on visual data understanding, this project studied constrained low-rank visual representation, video event detection models, deep video summarization, and video saliency prediction. We proposed: a constrained low-rank learning algorithm with least-mean-square regularization to obtain better low-rank representations of visual data; an online robust low-rank tensor modeling method for fast processing of streaming video; a short-video event detection model based on recurrent compressed convolutional networks; an unsupervised video summarization technique based on cycle-consistent adversarial long short-term memory networks; and a camera-assisted video salient-region prediction model based on nonnegative matrix factorization. These new theories and methods provide solutions and technical support for visual understanding problems in practical scenarios such as smart cities, digital logistics, and intelligent security. During the project, 7 funded papers were published or accepted, including 5 in first-tier IEEE Transactions (TNNLS and TCYB) and 2 at CCF-A top conferences (AAAI and IJCAI); 3 national invention patents were granted and 2 more are pending; and 1 team member was promoted to associate professor.
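The salient-region prediction result above builds on nonnegative matrix factorization (NMF). As a generic illustration only (not the project's camera-assisted model), the classical Lee-Seung multiplicative-update rules for minimizing the Frobenius objective ||V - WH||_F^2 can be sketched as follows; the matrix sizes, rank, and iteration count are illustrative assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-10):
    """Basic NMF via multiplicative updates for ||V - W H||_F^2.

    Both factors stay elementwise nonnegative because each update
    multiplies by a ratio of nonnegative quantities.
    """
    rng = np.random.default_rng(42)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update coefficients H
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # update basis W
    return W, H

# Toy example: factor a random nonnegative matrix into rank-5 parts.
rng = np.random.default_rng(0)
V = rng.random((40, 25))
W, H = nmf(V, rank=5)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {err:.3f}")
```

In a saliency setting, the columns of V would typically hold per-frame or per-region feature vectors, so the nonnegative basis W yields an additive, parts-based decomposition from which salient regions can be scored.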
Data last updated: 2023-05-31
High-resolution 3D imaging for wideband MIMO radar based on Kronecker compressed sensing
Channel allocation strategies for low-Earth-orbit satellite communications
Named entity recognition based on fine-grained word representations
Series arc fault diagnosis based on fractal dimension and support vector machines
Research progress on spin-orbit torque based on two-dimensional materials
Moving object detection and tracking in remote sensing video based on multi-view deep feature fusion
Occluded object detection in video using scene depth relations
Event detection in multi-source heterogeneous data based on transfer learning
Multi-source collaborative visual saliency detection models based on deep learning