Multi-modal visual perception of the environment and natural interaction with humans are fundamental topics in the study of Coexisting-Cooperative-Cognitive Robots, both for basic theory and for design methodology, and robots with these capabilities promise a wide range of applications. However, state-of-the-art methods in multi-modal vision and intelligent recognition cannot adequately describe how humans view the 3D world through multi-view observation, multi-channel sensing, and multi-mode fusion. As a result, robots still struggle to understand a scene and to interact naturally with people. The challenges include visual saliency computation for 3D scenes, saliency-driven 3D scene representation, and action understanding guided by visual selective attention. This project explores a visual saliency computation model based on multi-view depth fusion: through multi-view, multi-modal visual perception and pooling over complex, dynamic scenes, it obtains a temporally, spatially, and inter-view consistent 3D visual saliency. On top of this saliency model, a visual selective attention mechanism is built to drive hierarchical segmentation of dynamic scenes along the temporal, spatial, and view dimensions, yielding the key regions selected by attention. Within these regions, a joint action recognition model is established that combines key motion point detection with multi-modal motion features, so that recognition is performed only on the selected core regions. Together these components form an efficient framework for natural 3D action recognition and interaction driven by visual selective attention. The project aims to deliver innovations in fundamental theory and key technologies and to support breakthroughs in natural interaction for Coexisting-Cooperative-Cognitive Robots.
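The abstract describes fusing colour and depth cues across views into a joint, view-consistent saliency map but does not give the fusion rule. As a purely illustrative sketch, the Python snippet below combines precomputed per-view colour- and depth-saliency maps with a weighted average and mean-pools the result across views; the function name fuse_multiview_saliency, the color_weight parameter, and the pooling choice are hypothetical and not taken from the project.

```python
# Illustrative sketch only: the project text does not specify its fusion rule.
# Assumes per-view colour and depth saliency maps (H x W, values in [0, 1]) have
# already been produced by some upstream model and that the views are aligned.
import numpy as np

def fuse_multiview_saliency(color_sal, depth_sal, color_weight=0.6):
    """Fuse per-view colour/depth saliency maps into one joint saliency map.

    color_sal, depth_sal: lists of (H, W) arrays, one map per view.
    Returns a single (H, W) map pooled over all views.
    """
    fused_views = []
    for c, d in zip(color_sal, depth_sal):
        joint = color_weight * c + (1.0 - color_weight) * d       # modality fusion
        joint = (joint - joint.min()) / (joint.max() - joint.min() + 1e-8)  # rescale
        fused_views.append(joint)
    # inter-view pooling: element-wise mean keeps regions salient in most views
    return np.mean(fused_views, axis=0)

# toy usage with random maps for two views
views_c = [np.random.rand(4, 4) for _ in range(2)]
views_d = [np.random.rand(4, 4) for _ in range(2)]
saliency = fuse_multiview_saliency(views_c, views_d)
```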
This project addresses natural human-robot interaction for Coexisting-Cooperative-Cognitive Robots by studying multi-depth-fusion visual saliency computation for 3D scenes: spatially, temporally, and inter-view consistent and stable depth information is obtained and combined with colour image sequences in a multi-modal, multi-depth-fusion saliency computation, yielding a joint visual saliency for the 3D scene. For scene information expressed as video sequences, adaptive, optimised segmentation is then studied at the levels of action sequence, key frame, and key region, removing irrelevant environmental interference and providing accurate data for high-precision action recognition. A multi-person interactive 3D action recognition algorithm based on a similarity graph convolutional network is constructed, using the visual attention mechanism to improve both recognition efficiency and accuracy, ultimately achieving fused perception of complex environments and natural human-robot interaction. The project fully exploits the role of visual selection mechanisms in robot-environment-human coexistence. It aims to realise multi-view observation, multi-channel sensing, and multi-mode fusion in 3D human-robot interaction, to achieve breakthroughs in core theory and key technologies for human action recognition informed by visual characteristics, to build a robot interaction verification platform based on the visual attention mechanism, and to form theoretical and technical innovations that advance research on natural interaction under human-robot-environment coexistence.
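The summary names a "similarity graph convolutional network" for multi-person 3D action recognition but does not define its layers. The sketch below shows only a generic single graph-convolution step over skeleton-joint features (symmetrically normalised adjacency with self-loops), a common building block in such models; the toy skeleton, weight matrix, and mean pooling are assumptions for illustration, not the authors' architecture.

```python
# Minimal sketch only: a generic graph-convolution step over skeleton joints.
# The 3-joint chain, random weights, and global mean pooling are toy assumptions.
import numpy as np

def gcn_layer(X, A, W):
    """One graph-convolution step: H = ReLU(D^-1/2 (A + I) D^-1/2 X W).

    X: (J, F) joint features, A: (J, J) joint adjacency, W: (F, F_out) weights.
    """
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))      # symmetric normalisation
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

# toy example: 3 joints in a chain, 4-dim features per joint
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.random.rand(3, 4)
W = np.random.rand(4, 8)
H = gcn_layer(X, A, W)          # (3, 8) joint embeddings
action_logits = H.mean(axis=0)  # hypothetical global pooling before a classifier
```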
Research on benefit distribution in the farmer-supermarket direct purchase model
Research on SSVEP-based direct brain control of robot direction and speed
Research on named entity recognition based on fine-grained word representations
A new inductive approach to microblog rumour detection based on graph convolutional networks
Research on the stability and failure mechanism of the Yanyang Village landslide under earthquake action
Mechanism by which NLRP3 inflammasome-IDO1-kynurenine signalling regulates hippocampal microglial activation in the development of depression, and its therapeutic value
Research on natural interaction methods for Coexisting-Cooperative-Cognitive Robots in dynamic unstructured environments
Research on visual programming and online human-robot collaboration mechanisms for Coexisting-Cooperative-Cognitive Robots
Interaction methods and applications of a dexterous, compliant lower-limb rehabilitation robot for human-robot coexistence
Research on automatic task programming of interactive Coexisting-Cooperative-Cognitive Robots driven by contextualised natural language under few-sample constraints