Big data analytics stands at the forefront of modern science and industry. Beyond its sheer volume, big data usually spans a wide range of modalities (or “views”) drawn from different sources of varying quality. Human behaviour data, for example, comprise speech signals, expressions, poses, and the scene in which the behaviour takes place. Conventional multi-view learning (MVL) is often used to handle the variety property of big data (one of the four V’s: volume, velocity, variety, and value). MVL typically introduces one function per view to model that view's relation to the target label and then jointly optimises all the functions to improve learning performance. Although diverse views improve the capacity to model examples comprehensively, their rich semantics and complex structures can weaken the relatedness between each view and the target, which hampers efforts to learn meaningful individual view functions. For example, saying “hello” is only weakly related to the behaviour “shaking hands” because it is also associated with other behaviours such as “phone talking” and “classroom teaching”. This dependence on strong relatedness between views and the target impairs the efficacy of classical MVL algorithms. This project aims to resolve this conflict by developing a new learning scheme: multi-view synergistic learning (MVSL).
Compared with MVL, MVSL: (1) suggests that several views must be integrated as a whole to improve their relatedness with the target; for example, human speech “hello” and human pose “stretching out the hands” together relate much more to the behaviour “shaking hands”; (2) learns a synergy function between views to formulate how views work synergistically towards the target instead of learning a mapping from each view to the target based on their weak relatedness; and (3) provides a systematic approach to handling rich semantics, complex structures, and various qualities of views, largely improving the practicability of algorithms in real-world applications.
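The “hello” and “stretching out the hands” example can be made concrete with a small sketch. The code below is only an illustration of weak per-view relatedness versus joint relatedness, not the project's MVSL algorithm: the toy samples, the pose names, the extra “pointing” label, and the helper `majority_rule_accuracy` are all hypothetical and chosen purely for demonstration.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (speech view, pose view, behaviour label).
# The pose names and the "pointing" label are illustrative assumptions.
SAMPLES = [
    ("hello", "hand_out", "shaking hands"),
    ("hello", "hand_out", "shaking hands"),
    ("hello", "hand_at_ear", "phone talking"),
    ("hello", "hand_at_ear", "phone talking"),
    ("hello", "standing", "classroom teaching"),
    ("hello", "standing", "classroom teaching"),
    ("silence", "hand_out", "pointing"),
    ("silence", "hand_out", "pointing"),
]


def majority_rule_accuracy(samples, key):
    """Fit a majority-vote lookup on key(sample); return its accuracy on the same samples."""
    votes = defaultdict(Counter)
    for sample in samples:
        votes[key(sample)][sample[2]] += 1
    # For each observed feature value, predict the most frequent label.
    rule = {feature: counts.most_common(1)[0][0] for feature, counts in votes.items()}
    correct = sum(rule[key(sample)] == sample[2] for sample in samples)
    return correct / len(samples)


# One predictor per view: each view is only weakly related to the label.
speech_only = majority_rule_accuracy(SAMPLES, lambda s: s[0])
pose_only = majority_rule_accuracy(SAMPLES, lambda s: s[1])
# A predictor over both views taken as a whole.
joint = majority_rule_accuracy(SAMPLES, lambda s: (s[0], s[1]))

print(f"speech only: {speech_only:.2f}")
print(f"pose only:   {pose_only:.2f}")
print(f"joint views: {joint:.2f}")
```

On this toy data the speech-only rule is right half the time and the pose-only rule three quarters of the time, while the rule over both views together labels every sample correctly. That gap is the kind of effect the abstract attributes to integrating views as a whole.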
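Big data usually contains a variety of modalities (or views) from different sources and of varying quality. Conventional multi-view learning typically uses one function to model each view separately and then jointly optimises all the functions to improve learning performance. The rich semantics and complex structures of multi-view data can weaken the relatedness between views and the target, which hinders learning effective functions for individual views. For example, saying “hello” is only weakly related to the behaviour “shaking hands”, because it is also related to other behaviours such as “phone talking” and “classroom teaching”. The need for strong relatedness between views and the target reduces the effectiveness of classical multi-view learning algorithms. This project aims to resolve this difficulty by developing a new multi-view synergistic learning method. Compared with conventional multi-view learning algorithms, multi-view synergistic learning (1) emphasises that multiple views must be integrated to improve their relatedness with the target; (2) learns the synergy between views, which can enhance multi-view learning performance; and (3) provides a systematic way to handle views with rich semantics, complex structures, and uneven quality, greatly improving the practicability of the algorithms.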
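Big data usually contains a variety of modalities (or views) from different sources and of varying quality. Conventional multi-view learning typically uses one function to model each view separately and then jointly optimises all the functions to improve learning performance. The rich semantics and complex structures of multi-view data can weaken the relatedness between views and the target, which hinders learning effective functions for individual views. This project addresses the need for strong relatedness between views and the target by developing a new multi-view synergistic learning method. Compared with conventional multi-view learning algorithms, multi-view synergistic learning emphasises that multiple views must be integrated to improve their relatedness with the target; learns the synergy between views, which can enhance multi-view learning performance; and provides a systematic way to handle views with rich semantics, complex structures, and uneven quality, greatly improving the practicability of the algorithms. The project published a total of 12 high-quality papers in top international journals and conferences, including IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), CVPR, and AAAI. It studies robust multi-view synergistic learning algorithms that take full account of the view complexity, complex noise, and missing views commonly found in human behaviour video data in real applications. Algorithms based on multi-view synergistic learning can be widely applied to behaviour analysis, video surveillance, autonomous vehicles, and related areas.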
Data last updated: 2023-05-31
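Research on a crime prediction algorithm based on multimodal information feature fusion
Theme park evaluation based on public sentiment orientation: a case study of Volga Manor in Harbin
An efficient parallel radiation diffusion algorithm based on multi-block structured grids for inertial confinement fusion implosions
Application of collaborative-representation-based graph embedding discriminant analysis to face recognition
An improved multi-objective sine cosine optimisation algorithm
Research on multi-task multi-view learning algorithms for heterogeneous environments
RGBD human behaviour recognition and understanding with multi-view deep learning
Multi-view data representation, recovery, and learning for incomplete information
Research on multi-view learning identification methods for road traffic system behaviour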