In recent years, multi-task clustering has attracted considerable attention in machine learning because it can improve clustering performance by learning multiple related tasks jointly. Despite their success, traditional multi-task clustering models typically solve a non-convex optimization problem and therefore tend to get stuck in local optima. They are also sensitive to noisy data and outliers. Self-paced learning is an emerging machine learning method for tackling non-convex optimization problems. This project will propose novel multi-task clustering models based on self-paced learning. Specifically, the research focuses on three aspects: (1) a self-paced multi-task clustering model that uses self-paced learning to select training examples in order of increasing complexity, from easy to hard, and uses a soft weighting scheme to reduce the impact of noisy data and outliers; (2) a self-paced multi-task deep embedded clustering model that uses a deep neural network to obtain stronger representation learning and thereby further improve clustering performance; (3) for clustering problems in which each task has multiple views, a self-paced multi-task multi-view clustering model that exploits the relationships among both tasks and views. This project provides new ideas and methods for the study of multi-task clustering. The proposed models will be applied to image clustering, text clustering, and the analysis of Alzheimer's disease data, and have both theoretical and practical value.
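The easy-to-hard selection and soft weighting described in the abstract can be illustrated with a minimal single-task sketch. This is not the project's model: it omits the multi-task coupling entirely, uses plain k-means as the base clusterer, and assumes the commonly used linear soft-weighting rule (weight 1 - loss/lam, clipped to [0, 1]). The function names `soft_weights` and `self_paced_kmeans` and the parameters `lam` and `growth` are hypothetical choices made for this illustration.

```python
import numpy as np


def soft_weights(losses, lam):
    """Linear soft weighting (assumed rule): 1 at zero loss, 0 once loss >= lam."""
    return np.clip(1.0 - losses / lam, 0.0, 1.0)


def self_paced_kmeans(X, k, lam=1.0, growth=1.5, n_iter=10, seed=0):
    """Illustrative self-paced k-means for a single task.

    Each round: assign points, turn each point's squared distance to its
    center into a weight, update centers as weighted means, then raise
    `lam` so that harder (higher-loss) points gradually join the training.
    """
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance from every point to every center, shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        losses = d2[np.arange(len(X)), labels]
        w = soft_weights(losses, lam)
        for j in range(k):
            mask = labels == j
            # Skip clusters that are empty or whose points all have weight 0.
            if w[mask].sum() > 0:
                centers[j] = np.average(X[mask], axis=0, weights=w[mask])
        lam *= growth  # admit more difficult examples in the next round
    return labels, centers, w
```

In this sketch a point whose loss stays above `lam` keeps weight 0 and does not move any center, which is how soft weighting limits the influence of outliers. The growth schedule for `lam` is an arbitrary choice here and would need tuning in practice.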
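Multi-task clustering tends to get trapped in local optima because it solves a non-convex optimization problem, and existing models are sensitive to noisy data and outliers. Self-paced learning is a machine learning method that addresses non-convex optimization problems and improves model generalization. This project therefore studied multi-task clustering methods based on self-paced learning and achieved the following results: (1) for single-task single-view clustering, new models including parallel clustering, semi-supervised deep embedded clustering, and semi-supervised ensemble clustering based on hierarchical feature sampling; (2) for self-paced multi-view clustering, models including self-paced multi-view clustering, self-paced auto-weighted multi-view clustering, nonlinear fusion for self-paced multi-view clustering, and bidirectional self-paced multi-view clustering; (3) for self-paced multi-task clustering, a self-paced multi-task clustering algorithm; (4) for self-paced multi-task multi-view clustering, a capped-norm-based self-paced multi-task multi-view clustering method; (5) in deep learning theory, models including density-based deep image clustering, co-training-based deep multi-view clustering, multi-view clustering based on variational autoencoders, and deep incomplete multi-view clustering. All methods were rated good to excellent in experimental evaluation. The project published 21 high-quality papers in journals such as TGRS, Neural Networks, KBS, Information Sciences, and Neurocomputing and at conferences such as AAAI, ICCV, ACM MM, ICONIP, and ISICDM, filed 2 patent applications, and exceeded its planned objectives. It provides new ideas and methods for research on multi-task clustering, multi-view clustering, and self-paced learning. The proposed models can be applied to image clustering, text clustering, and medical big data analysis, and have both theoretical significance and application prospects.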
Data last updated: 2023-05-31
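Research on a crime prediction algorithm based on multimodal information feature fusion
Theme park evaluation based on public sentiment orientation: a case study of Volga Manor in Harbin
An efficient parallel radiation diffusion algorithm based on multi-block structured meshes for inertial confinement fusion implosions
Application of collaborative-representation-based graph embedding discriminant analysis to face recognition
An improved multi-objective sine cosine optimization algorithm
Research on online self-paced multi-task feature learning for streaming data
Research on heterogeneous data clustering algorithms based on similarity learning and their applications
Research on fuzzy clustering algorithms based on a suppressed competitive learning mechanism
Research on intelligent multi-task high-performance optimization algorithms based on transfer learning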