Multi-sensor image fusion is a technique by which multiple images of the same scene, acquired from different sensors, are merged into a single composite that supports better understanding of the scene. The primary requirement for image fusion is that all of the important visual information in the input images be extracted and preserved in the fused image, so that the result is better perceived by the human visual system or more readily processed in subsequent tasks. However, most current studies rely only on local features to define the "usefulness" of each region in an input image; they do not consider the global visual attention over the whole source images, which indicates the "interesting" information of the scene. In addition, most of these methods are designed for the fusion of static images. This project will investigate multi-sensor image fusion and related issues using sparse representation and image/video saliency detection.

(1) Considering the different characteristics of the input images to be fused, together with their redundant and complementary information, we will study sparse-representation-based saliency detection for single-modality images (e.g., infrared) and joint saliency detection for multi-modality images (e.g., infrared and visible light).

(2) We will then study multi-sensor image fusion based on sparse representation and human visual attention. The saliency values computed by the methods in (1) will guide the design of the sparse representation models, the activity levels of local regions, and the fusion rules; a minimal sketch of this pipeline follows this list. The proposed methods are expected to extract more salient features from the input images and to be more robust to mis-registration between them.

(3) Building on the above, we will also investigate the fusion of videos with static backgrounds as well as videos with more complicated, dynamically changing backgrounds, for example videos containing occlusions or waving objects.

The research carried out in this project will lead to new methods for multi-sensor image fusion and thereby facilitate new practical applications.
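The abstract does not fix a concrete fusion algorithm, so the following is a minimal sketch only: two pre-registered grayscale images are fused patch by patch over a shared 2-D DCT dictionary, each patch is sparse-coded with orthogonal matching pursuit, the l1-norm of the coefficients serves as the activity level, and a per-patch saliency weight biases a choose-max rule. The patch size, the (1 + mean saliency) weighting, and the externally supplied saliency maps are all assumptions for illustration, not the project's actual models.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

P = 8  # patch size (assumption)

def dct_dictionary(p=P):
    """Complete 2-D DCT dictionary: p*p unit-norm atoms as columns."""
    k = np.arange(p)
    basis = np.cos(np.pi * (k[:, None] + 0.5) * k[None, :] / p)  # 1-D DCT-II atoms
    basis /= np.linalg.norm(basis, axis=0)
    return np.kron(basis, basis)  # separable 2-D atoms, shape (p*p, p*p)

def fuse(img_a, img_b, sal_a, sal_b, n_nonzero=6):
    """Saliency-weighted, choose-max sparse-coefficient fusion (sketch)."""
    D = dct_dictionary()
    fused = np.zeros(img_a.shape, dtype=float)
    h, w = img_a.shape
    for i in range(0, h - P + 1, P):        # non-overlapping patches, for brevity
        for j in range(0, w - P + 1, P):
            pa = img_a[i:i+P, j:j+P].ravel().astype(float)
            pb = img_b[i:i+P, j:j+P].ravel().astype(float)
            ca = orthogonal_mp(D, pa, n_nonzero_coefs=n_nonzero)
            cb = orthogonal_mp(D, pb, n_nonzero_coefs=n_nonzero)
            # Activity level = l1-norm of sparse coefficients, biased by the
            # patch's mean saliency (hypothetical weighting, for illustration).
            act_a = np.abs(ca).sum() * (1.0 + sal_a[i:i+P, j:j+P].mean())
            act_b = np.abs(cb).sum() * (1.0 + sal_b[i:i+P, j:j+P].mean())
            c = ca if act_a >= act_b else cb  # choose-max fusion rule
            fused[i:i+P, j:j+P] = (D @ c).reshape(P, P)
    return fused
```

A full method would more likely use overlapping patches with averaged reconstructions and a learned dictionary (e.g., via K-SVD) rather than the fixed DCT used here.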
This project carried out research around its planned topics and accomplished the corresponding objectives. First, we studied multi-sensor image fusion based on image representation, proposing several multi-focus image fusion algorithms based on robust sparse representation, non-negative sparse representation, and joint low-rank representation; the proposed algorithms consider not only the information of the current image patch but also that of its neighboring patches, which significantly improves fusion performance. Second, we studied salient object detection in single-modality and multi-modality images based on image representation and deep learning. We proposed several single-modality salient object detection methods based on robust sparse representation, local tree-structured low-rank representation, and two-stage graphs, as well as several deep salient object detection models based on convolutional neural networks and capsule networks, which substantially improved detection accuracy and the completeness of object segmentation in complex scenes. We also proposed several multi-modality salient object detection models for visible-infrared (RGB-T) and visible-depth (RGB-D) image pairs, based on cross-modal feature fusion and image quality awareness; these alleviate, to a certain extent, the incomplete or failed detections that single-modality methods suffer under low illumination, low contrast, and cluttered backgrounds. Third, we studied issues related to multi-sensor video fusion and proposed a video synchronization algorithm based on projective-invariant descriptors. Finally, we studied multi-modality scene understanding and medical image segmentation, proposing an RGB-T semantic segmentation model based on modality-discrepancy reduction, an RGB-T multi-modality object tracking model based on complementarity and interference awareness, and several MRI brain tumor and pancreas segmentation models built on multi-modal feature fusion, multi-task learning, and lightweight designs.

Under the support of this grant, the group conducted in-depth and systematic research on multi-sensor image fusion and its applications, publishing 29 papers in international journals and CCF-A conferences including IEEE TPAMI, IEEE TIP, IEEE TMM, IEEE TCSVT, Pattern Recognition, ICCV, and CVPR; filing 10 new Chinese invention patent applications and receiving 8 newly granted Chinese invention patents; winning one Wu Wenjun AI Science and Technology Award (Technological Invention Award, Second Prize); and training 5 doctoral students and 28 master's students, 21 of whom have completed their thesis defenses.
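The summary does not detail how the sparse-representation saliency detectors work. A common formulation, sketched here purely under assumptions, scores each patch by its reconstruction error against a dictionary built from image-border patches (assumed to be background): patches that the background dictionary reconstructs poorly are marked salient. The border-as-background heuristic and all parameters are illustrative stand-ins, not the project's robust or tree-structured low-rank models.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

P = 8  # patch size (assumption)

def patch_rows(img, step=P):
    """All P-by-P patches of img, flattened into rows."""
    h, w = img.shape
    return np.array([img[i:i+P, j:j+P].ravel()
                     for i in range(0, h - P + 1, step)
                     for j in range(0, w - P + 1, step)], dtype=float)

def saliency_map(img, n_nonzero=5):
    """Reconstruction-error saliency against a border (background) dictionary."""
    h, w = img.shape
    border = np.vstack([patch_rows(img[:2*P, :]), patch_rows(img[-2*P:, :]),
                        patch_rows(img[:, :2*P]), patch_rows(img[:, -2*P:])])
    D = border.T / (np.linalg.norm(border, axis=1) + 1e-8)  # unit-norm atoms
    sal = np.zeros((h, w))
    for i in range(0, h - P + 1, P):
        for j in range(0, w - P + 1, P):
            y = img[i:i+P, j:j+P].ravel().astype(float)
            c = orthogonal_mp(D, y, n_nonzero_coefs=n_nonzero)
            sal[i:i+P, j:j+P] = np.linalg.norm(y - D @ c)  # residual as saliency
    return sal / (sal.max() + 1e-8)
```

Joint saliency for multi-modality inputs could, for instance, combine per-modality residuals, but the original text does not specify such a rule.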