Intrusion detection is an effective technique for addressing network security problems. Intrusion detection based on mining massive data streams faces several pressing challenges, including high-dimensional features, class imbalance, concept drift, and semi-supervised learning. Traditional models struggle to meet intrusion detection needs in large-scale, complex network environments. The project therefore aims to build an intrusion detection scheme for massive data stream mining in a cloud computing environment. First, the project preprocesses the intrusion detection data stream, proposing a dimensionality reduction method based on feature subset selection and a sample reduction method based on hierarchical clustering and resampling. Second, it proposes an intrusion detection classification model based on a double-weighted online sequential extreme learning machine, which uses a double-weighting strategy to address class imbalance and a selective ensemble strategy to handle concept drift. Third, to address the scarcity of labeled samples in intrusion detection, the project incorporates an active learning strategy and proposes a semi-supervised learning method based on possibilistic c-means clustering; concept drift is handled through ensemble learning over the semi-supervised clusterings. Finally, to verify the effectiveness of the proposed scheme, the project develops and implements a MapReduce-based cloud platform for massive data stream mining. Together, these components form a complete intrusion detection solution for massive data stream mining in a cloud computing environment.
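As an illustration of the possibilistic c-means clustering named in the abstract, the following minimal sketch computes typicality values with the standard possibilistic c-means formula, t = 1 / (1 + (d^2 / eta)^(1/(m-1))). It is not the project's semi-supervised method. The function names, the sample points, the cluster centers, and the eta values are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def squared_distance(x: Sequence[float], c: Sequence[float]) -> float:
    """Squared Euclidean distance between a point and a cluster center."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def pcm_typicalities(
    points: Sequence[Sequence[float]],
    centers: Sequence[Sequence[float]],
    etas: Sequence[float],
    m: float = 2.0,
) -> List[List[float]]:
    """Return one row per point, holding its typicality for each cluster.

    etas[i] is the (assumed, hand-picked) scale parameter of cluster i;
    m > 1 is the fuzzifier.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    exponent = 1.0 / (m - 1.0)
    table = []
    for x in points:
        row = []
        for c, eta in zip(centers, etas):
            d2 = squared_distance(x, c)
            row.append(1.0 / (1.0 + (d2 / eta) ** exponent))
        table.append(row)
    return table


if __name__ == "__main__":
    # Hypothetical 2-D samples: two near the centers, one far from both.
    samples = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
    cluster_centers = [(0.0, 0.0), (2.0, 0.0)]
    for point, row in zip(samples, pcm_typicalities(samples, cluster_centers, [1.0, 1.0])):
        print(point, [round(t, 4) for t in row])
```

Unlike fuzzy c-means memberships, these typicalities are not constrained to sum to 1 across clusters, so a sample far from every center receives a low value for all clusters. That property is one reason possibilistic clustering is often considered for noisy data.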
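Intrusion detection is an effective technique for addressing network security problems, and intrusion detection based on mining massive data streams faces several problems, including high-dimensional features, class imbalance, concept drift, and semi-supervised learning. Traditional models struggle to meet intrusion detection needs in large-scale, complex network environments. For high-dimensional feature selection, the project proposed an attribute reduction method for intrusion detection based on information gain and rough sets, an intrusion detection method based on information gain and ReliefF feature selection, and a feature selection method based on fuzzy c-means and grey wolf optimization, among others. For the classification of massive imbalanced intrusion detection data streams, it proposed a neural-network-based ensemble classification method for imbalanced data streams, a network intrusion detection model combining ReliefF and Borderline-SMOTE, an intrusion detection system based on a support vector machine optimized with a PSO-GWO hybrid, and a multi-view-learning-based anomaly detection and ranking method for cloud computing platforms, among others. For clustering and semi-supervised learning over massive data streams, it proposed a safe Tri-training algorithm based on cross-entropy, an incremental clustering method based on difficult cluster center points, a possibilistic c-means clustering algorithm based on composite kernels, and a semi-supervised classification method based on spatial information and genetic algorithms, among others. For concept drift, it introduced a multi-label learning strategy and proposed a multi-label learning algorithm based on label-specific features and instance correlations, as well as a concept drift detection method for multi-label data streams based on hierarchical verification. In addition, in the area of transfer learning, it proposed a discriminative joint matching algorithm based on unsupervised domain adaptation. During the project period, the team published 31 papers, of which 9 are indexed by SCI and 8 by EI, and had 3 invention patents granted or applied for. Nine master's students trained under the project have graduated.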
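Several of the feature selection methods listed above build on information gain. The sketch below shows only the textbook information-gain score for discrete features and a simple ranking by that score. It does not reproduce the project's rough-set or ReliefF combinations, and the feature names, sample values, and function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a non-empty sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Entropy of the labels minus their entropy conditioned on a discrete feature."""
    total = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(
    columns: Dict[str, Sequence[Hashable]], labels: Sequence[Hashable]
) -> List[Tuple[str, float]]:
    """Return (feature name, gain) pairs sorted from most to least informative."""
    scored = [(name, information_gain(values, labels)) for name, values in columns.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical connection records; the names and values are illustrative only.
    labels = ["attack", "attack", "normal", "normal"]
    columns = {
        "flag": ["S0", "SF", "S0", "SF"],
        "protocol": ["tcp", "tcp", "udp", "udp"],
    }
    for name, gain in rank_features(columns, labels):
        print(f"{name}: {gain:.3f}")
```

Continuous features would need to be discretized before this score applies, a step the sketch leaves out.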
Data last updated: 2023-05-31