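Research on an Anomaly Detection Mechanism for Datacenter Network Devices Based on Logs with Multiple Syntaxes and Semantics

Basic Information

Grant No.: 61902200

Project category: Young Scientists Fund

Funding amount: 25.00

Principal investigator: 张圣林

Discipline classification:

Host institution: Nankai University

Approval year: 2019

Completion year: 2022

Project period: 2020-01-01 - 2022-12-31

Project status: Completed

Project participants:

Keywords: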

Active measurement; network measurement; network performance measurement

Final Report Abstract

Anomaly detection for network devices is essential to keeping datacenters stable, which in turn is vital to the safety and stability of a nation's administration, finance, electric power, telecommunications, and Internet services. Our earlier investigations found that log-based anomaly detection for network devices lays a solid foundation for subsequent anomaly localization and root cause analysis. This project therefore aims to propose a log-based anomaly detection mechanism for network devices that parses logs and detects both single anomalous logs and anomalous log sequences accurately, efficiently, and in a way that generalizes across devices. For log parsing, we propose a frequent-item prefix tree that dynamically adds branches to build the template set, so that events are extracted accurately, templates are learned incrementally, and logs are matched to templates quickly. Because logs from different models of network devices differ greatly in syntax and semantics, we weight words by both frequency and position when constructing bag-of-words feature vectors, and we use PU learning to learn anomalous patterns from partially labelled single anomalous logs. In addition, we propose a general anomaly detection method for log sequences that addresses log noise and sample imbalance, using word embeddings to cluster message templates by word semantics. The project is expected to improve the efficiency and accuracy of anomaly detection for datacenter network devices and to support subsequent anomaly localization and root cause analysis.
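The frequent-item prefix tree mentioned in the abstract can be illustrated with a minimal sketch. This is not the project's implementation: the class name `TemplateTree`, the `min_support` threshold, the sample log lines, and the choice to freeze word frequencies after `fit` are all assumptions made for illustration. The idea shown is that words appearing in many logs are treated as template words, ordered from most to least frequent, and stored as a path in a tree; rare words (interface names, numbers) are treated as variables and dropped.

```python
from collections import Counter


class TemplateTree:
    """Toy frequent-word prefix tree for log templates (illustrative only)."""

    def __init__(self, min_support=2):
        self.min_support = min_support  # hypothetical threshold, not from the source
        self.freq = Counter()           # word -> number of training logs containing it
        self.root = {}                  # nested dicts: word -> subtree

    def _frequent_words(self, line):
        # Keep words seen in at least min_support logs, most frequent first
        # (ties broken alphabetically so the ordering is deterministic).
        words = {w for w in line.split() if self.freq[w] >= self.min_support}
        return sorted(words, key=lambda w: (-self.freq[w], w))

    def fit(self, lines):
        # First pass: count in how many logs each word occurs.
        for line in lines:
            self.freq.update(set(line.split()))
        # Second pass: insert every log as a path of frequent words.
        for line in lines:
            self.add(line)

    def add(self, line):
        # Walk the tree, adding a branch whenever the path does not exist yet.
        # Assumption: frequencies stay frozen after fit().
        node = self.root
        for word in self._frequent_words(line):
            node = node.setdefault(word, {})

    def match(self, line):
        # Follow the line's frequent words down the tree and return the path.
        node, path = self.root, []
        for word in self._frequent_words(line):
            if word not in node:
                break
            node = node[word]
            path.append(word)
        return tuple(path)


# Made-up sample logs, not taken from any dataset used in the project.
logs = [
    "Interface Gi0/1 changed state to up",
    "Interface Gi0/2 changed state to up",
    "Interface Gi0/3 changed state to down",
    "Interface Gi0/4 changed state to down",
    "Fan 1 failed",
    "Fan 2 failed",
]
tree = TemplateTree(min_support=2)
tree.fit(logs)
print(tree.match("Interface Gi0/9 changed state to down"))
# → ('Interface', 'changed', 'state', 'to', 'down')
```

In this sketch, matching a new log only requires walking one path in the tree, which is one plausible reason a prefix tree suits fast template matching.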

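Project Abstract

Efficient and accurate log-based anomaly detection is crucial to service management and system maintenance, and it brings both opportunities and challenges to ICP service performance management. The proposed architecture not only discovers network device anomalies proactively, so that countermeasures can be taken in time, but also overcomes the shortcomings of anomaly detection based on monitoring metric data. The research consists of three parts: parsing network device logs; detecting single anomalous network device logs from partial anomaly labels; and detecting anomalous network device log sequences. After 2 years of research, the project team made significant progress on all three. The team proposed LogParse, which learns new templates incrementally and adapts across services; LogClass, a classification framework that automatically detects anomalous logs using partial labels and copes with large differences in syntax and semantics; and LogMerge, a general anomaly detection mechanism for multi-syntax logs that addresses the many noisy signals in anomalous log sequences and the differences among models of network devices. These results enrich the theory and methods of anomaly detection for datacenter network devices and represent substantial breakthroughs and innovations in the related key technologies. To evaluate the log anomaly detection mechanisms accurately and comprehensively, the project used 8 public log datasets in its experiments. During the project, the team published 11 academic papers, including 2 in IEEE Transactions international journals, and applied for 6 domestic invention patents. The project jointly trained 4 master's graduates, one of whom received Nankai University's "Outstanding Graduate" title. The team completed the research plan set out in the project proposal and achieved the expected research goals.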
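The record states that, for single-log classification, words are weighted by both frequency and position when building bag-of-words feature vectors, but it does not give the formula. The sketch below shows one possible reading under explicit assumptions: an IDF-style rarity term and a `1 / (1 + index)` position decay. Both choices, the function names `build_vocab` and `weighted_bow`, and the sample logs are hypothetical and should not be taken as the weighting actually used by LogClass.

```python
import math
from collections import Counter


def build_vocab(logs):
    """Return a sorted vocabulary and per-word document frequencies."""
    doc_freq = Counter()
    for line in logs:
        doc_freq.update(set(line.split()))
    return sorted(doc_freq), doc_freq


def weighted_bow(line, vocab, doc_freq, n_docs):
    """Bag-of-words vector weighted by word rarity and token position."""
    scores = Counter()
    for i, word in enumerate(line.split()):
        if word not in doc_freq:
            continue  # ignore words never seen in the training logs
        rarity = math.log(n_docs / doc_freq[word]) + 1.0  # assumed IDF-style term
        position = 1.0 / (1 + i)                          # assumed decay with index
        scores[word] += rarity * position
    # One entry per vocabulary word; words absent from the line get 0.
    return [scores[w] for w in vocab]


# Made-up sample logs for illustration only.
logs = ["link down on port 1", "link up on port 2", "fan failure on slot 3"]
vocab, doc_freq = build_vocab(logs)
vec = weighted_bow("fan failure on port 9", vocab, doc_freq, len(logs))
print(dict(zip(vocab, vec)))
```

A vector like `vec` could then be passed to any classifier; with only some anomalous logs labelled, the record says a PU learning model plays that role.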

Project Outputs

No outputs of this type are listed.

Data last updated: 2023-05-31

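Similar NSFC Grants

Network Anomaly Detection Based on Semantic and Visual Differences

Grant No.: U1836213

Approval year: 2018

Principal investigator: 段海新

Discipline classification: F0205

Funding amount: 247.00

Project category: Joint Fund Project

Research on Citation Recommendation for Academic Literature Based on Multi-Semantic Information Fusion

Grant No.: 71673211

Approval year: 2016

Principal investigator: 陆伟

Discipline classification: G0414

Funding amount: 51.00

Project category: General Program

Object-Oriented Hyperspectral Anomaly Detection

Grant No.: 61772510

Approval year: 2017

Principal investigator: 卢孝强

Discipline classification: F0210

Funding amount: 61.00

Project category: General Program

Research on Log Pattern Extraction and Cross-Type Log Analysis Methods

Grant No.: 61702477

Approval year: 2017

Principal investigator: 赵一宁

Discipline classification: F0211

Funding amount: 25.00

Project category: Young Scientists Fund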

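Other Related Literature

Interactive Random-Fuzzy Power Flow Method for Multi-Energy Coupled Three-Phase Unbalanced Active Distribution Networks and Transmission Networks

A Study of Domestic Traditional Chinese Medicine Academic Teams on Hypertension Based on Bibliometrics and Social Network Analysis

Dimensions and Measurement of IT Synergy in Diversified Enterprises

Design Method of an LQG Controller for Active Suspension for Vehicle Roll Motion Safety

Adaptive Control of Active Suspension Systems with Saturation Nonlinearity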
