Learning from imbalanced data is one of the challenges of big data processing. This program undertakes a systematic study of the primary problem in imbalanced data learning, namely “What to learn?”. At the theoretical level, we will explore which specific learning targets imbalanced data learning requires at the “linguistic” and “computational” levels, respectively. We will study the intrinsic properties of learning targets and evaluation criteria in order to explain theoretically why some measures are suitable for imbalanced data learning while others are not. We will further compare information-based learning targets and criteria with non-information-based ones, and derive their relations with respect to the imbalance ratio. The goal of this analytical study is to provide guidelines for selecting learning targets and evaluation criteria. At the approach level, we will extend current classifiers with abstaining functions for wider application, and we will study the optimization of the reject threshold and its associated properties. A novel boosting classifier will be developed with multiple learning targets, as a classifier case study aimed at large-scale data processing. These targets will include adaptation to the imbalance ratio of the data, abstaining and non-abstaining classification, and convex optimization. The final goal of this program is to advance the new research theme of “learning target selection” in machine learning and to provide a worked example of abstaining classifier design for imbalanced data learning.
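To make the contrast between non-information and information-based criteria concrete, the following minimal sketch compares accuracy with a normalized mutual information score on a confusion matrix. It is an illustration only, not the project's method: the choice of I(T;Y)/H(T) as the information-based criterion, and the function names `accuracy`, `normalized_mutual_information` and `majority_confusion`, are assumptions introduced here.

```python
import math


def accuracy(cm):
    """Fraction of correct decisions; cm[t][y] counts true class t, decision y."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def normalized_mutual_information(cm):
    """I(T;Y) / H(T) computed from confusion-matrix counts (assumed criterion).

    Rows are true classes T, columns are decisions Y. An extra column can be
    used for a reject decision, so the matrix need not be square.
    """
    total = sum(sum(row) for row in cm)
    p_t = [sum(row) / total for row in cm]
    p_y = [sum(row[y] for row in cm) / total for y in range(len(cm[0]))]
    h_t = -sum(p * math.log2(p) for p in p_t if p > 0)
    mi = 0.0
    for t, row in enumerate(cm):
        for y, count in enumerate(row):
            if count > 0:
                p = count / total
                mi += p * math.log2(p / (p_t[t] * p_y[y]))
    return mi / h_t if h_t > 0 else 0.0


def majority_confusion(n_major, n_minor):
    """Confusion matrix of a classifier that always predicts the majority class."""
    return [[n_major, 0], [n_minor, 0]]


if __name__ == "__main__":
    cm = majority_confusion(990, 10)  # imbalance ratio 99:1
    print("accuracy:", accuracy(cm))
    print("normalized MI:", normalized_mutual_information(cm))
```

With a 99:1 imbalance ratio, the always-majority classifier scores high on accuracy while the information-based score is zero, because its decisions carry no information about the true class.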
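Imbalanced data learning is one of the challenges in big data. This project aims to study systematically the primary problem in imbalanced data learning, “learning target selection”. At the theoretical level, it explores the specific learning targets that imbalanced data learning requires at the “semantic” and “computational” levels of expression; analyzes the intrinsic properties of various learning targets and evaluation criteria, explaining why some learning targets or criteria can accomplish imbalanced data learning tasks while others cannot; and derives the quantitative or qualitative relations between the imbalance ratio and learning targets or evaluation criteria of both the conventional performance type and the information type. The theoretical study will provide a theoretical basis for selecting learning targets or evaluation criteria in applications. At the method level, it extends the application of existing classifiers to include a reject option, and studies the optimization of reject-based learning targets and the properties of the optimized reject threshold; it also develops a Boosting classifier for large-scale data that can learn with a reject option, adapt its optimized threshold to the imbalance ratio, and remain as compatible as possible with “convex optimization” learning targets. The final goal of the project is to advance a new research direction centered on “learning target selection” and to provide a concrete research example of designing classifiers with a reject option for imbalanced data learning.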
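Imbalanced data learning is one of the challenges in big data. This project carried out a systematic study of the primary problem in imbalanced data learning, “learning target selection”. At the theoretical level, we explored the specific learning targets that imbalanced data learning requires at the “semantic” and “computational” levels of expression; analyzed the intrinsic properties of various learning targets and evaluation criteria, and explained, using face images as an example, whether a given learning target or evaluation criterion is adequate for learning tasks on imbalanced data; and derived the quantitative or qualitative relations between the imbalance ratio and two kinds of learning targets or evaluation criteria, the conventional performance type and the information type. The theoretical study provides a theoretical basis for selecting learning targets or evaluation criteria in applications. At the method level, we extended the application of existing classifiers to include a reject option, and studied the optimization of reject-based learning targets and the properties of the optimized reject threshold. The results of this project advance a new research direction centered on “learning target selection” and provide a concrete research example of designing classifiers with a reject option for imbalanced data learning.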
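As a rough illustration of what a reject threshold means for an abstaining binary classifier, the sketch below uses a standard cost-based Bayes decision rule with a reject option. This is a swapped-in textbook formulation, not the optimization studied in the project: the cost parameters `c_fp`, `c_fn`, `c_rn`, `c_rp`, their default values, and the function names are all hypothetical.

```python
def decide_with_reject(p_pos, c_fp=1.0, c_fn=10.0, c_rn=0.2, c_rp=0.2):
    """Pick the decision with the lowest expected cost given P(positive | x).

    c_fp: cost of predicting positive for a true negative
    c_fn: cost of predicting negative for a true positive
    c_rn, c_rp: cost of rejecting a true negative / a true positive
    (all cost values are illustrative assumptions)
    """
    risks = {
        "positive": c_fp * (1.0 - p_pos),
        "negative": c_fn * p_pos,
        "reject": c_rn * (1.0 - p_pos) + c_rp * p_pos,
    }
    return min(risks, key=risks.get)


def reject_interval(c_fp=1.0, c_fn=10.0, c_rn=0.2, c_rp=0.2):
    """Posterior thresholds (t_low, t_high) between which rejecting is cheapest.

    Obtained by equating the reject risk with each decision risk. The interval
    is meaningful only when t_low < t_high; otherwise rejecting never pays off.
    """
    t_low = c_rn / (c_fn - c_rp + c_rn)
    t_high = (c_fp - c_rn) / (c_fp - c_rn + c_rp)
    return t_low, t_high


if __name__ == "__main__":
    print(reject_interval())
    for p in (0.01, 0.05, 0.5, 0.95):
        print(p, decide_with_reject(p))
```

Under these assumed costs, a missed minority (positive) example is ten times as expensive as a false alarm, so the lower threshold sits very close to zero and the classifier abstains over a wide range of posteriors rather than risk a miss.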
Data last updated: 2023-05-31
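Research on ensemble learning algorithms for high-dimensional imbalanced data
Research on imbalanced data based on semi-supervised ensemble learning
Research on learning algorithms for imbalanced data and their applications
Research on classification of imbalanced data streams based on ensemble learning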