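Foundational Theory and Algorithms for Genome Data Analysis

Basic information

Grant number: 61732009

Project category: Key Program

Funding amount: 260.00

Principal investigator: Zhu Daming (朱大铭)

Discipline code:

Host institution: Shandong University

Year approved: 2017

Year concluded: 2022

Project period: 2018-01-01 - 2022-12-31

Project status: Concluded

Project participants: Wu Fangxiang (吴方向), Guo Jiong (郭炅), Li Min (李敏), Jiang Haitao (姜海涛), Peng Xiaoqing (彭小清), Yu Ying (余颖), Guo Linyuan (郭林沅), Pu Lianrong (蒲莲容), Yang Runmin (杨润民)

Keywords:

functional region discovery; structural variant prediction; algorithm design; combinatorial optimization problem modeling; genome assembly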

Project abstract

Genome assembly is the foundation of genome data analysis. Structural variant prediction and functional region discovery are central to that analysis and are a prerequisite for moving from the study of genome structure to the study of genome function. This project addresses genome data from one-dimensional to higher-dimensional (including three-dimensional) and develops foundational theory and algorithms for genome data analysis, centered on genome assembly, structural variant prediction, and functional region discovery. It first seeks quality-control methods for genome data that reduce the influence of noise on analysis results. Guided by the computational requirements of specific analysis tasks, it then mines characteristic parameters that express the quality of the data analysis. Using these parameters, it formulates combinatorial optimization problems that capture those computational requirements and designs exact, parameterized, and approximation algorithms for them, with the goal of obtaining a body of foundational theoretical results that are influential in China and abroad. Efficient parallel algorithms are to be designed to meet the computational demands of genome data analysis. Finally, bioinformatics software with proprietary intellectual property rights is to be developed and applied to genome data analysis of lung cancer and other malignant tumors. Through this project, we aim to make major breakthroughs in the foundational theory and algorithms of genome data analysis and to contribute to understanding the pathogenesis of tumors and to finding means of diagnosing and treating them.

Completion abstract

The project carried out research on foundational theory and algorithms for genome data analysis, covering genome and transcriptome assembly, scaffold construction and filling, genome rearrangement sorting, sequencing data quality control, genome structural variant prediction, three-dimensional genome data analysis, protein identification by mass spectrometry, gene regulatory networks, and gene-disease association analysis. We designed genome rearrangement sorting algorithms, giving a positive answer on the complexity of a combinatorial problem that had been open for 20 years. We designed an approximation algorithm for the maximum internal spanning tree problem, clarifying for the first time the quantitative relationship between solutions of this problem and solutions of the maximum path cover problem. We proposed the first mixed integer linear programming model for transcriptome assembly, substantially improving the accuracy with which high-precision transcripts are predicted. We proposed an ensemble machine learning model for predicting loop structures in the three-dimensional genomes of unknown cell types or species, and used it to discover highly conserved loop structures in the mouse genome. We built an evaluation framework for loop prediction methods based on three-dimensional genome data, together with a gold-standard dataset that supports it. Based on the principles of genome rearrangement, we established a new model of the genome rearrangement sorting problem, overcoming the limitation that earlier models could not reflect real genome rearrangement events. The results were applied to the analysis of repeated segments and structural differences in the soybean genome, with preliminary success.

Project results

No results of this type are available yet.

Data last updated: 2023-05-31

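Other grants held by Zhu Daming

Grant number: 60573024

Year approved: 2005

Funding amount: 25.00

Project category: General Program

Grant number: 60273032

Year approved: 2002

Funding amount: 22.00

Project category: General Program

Grant number: 60073042

Year approved: 2000

Funding amount: 14.00

Project category: General Program

Grant number: 61472222

Year approved: 2014

Funding amount: 83.00

Project category: General Program

Grant number: 61070019

Year approved: 2010

Funding amount: 31.00

Project category: General Program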

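Similar NSFC grants

Algorithms for Genome Comparison and Analysis

Grant number: 61472222

Year approved: 2014

Principal investigator: Zhu Daming (朱大铭)

Discipline code: F0201

Funding amount: 83.00

Project category: General Program

Combinatorial Problems and Algorithms for Whole-Genome Structural Analysis

Grant number: 61872427

Year approved: 2018

Principal investigator: Jiang Haitao (姜海涛)

Discipline code: F0201

Funding amount: 63.00

Project category: General Program

Research on and Application of Statistical Algorithms for Metagenomic Data Analysis with Next-Generation Sequencing

Grant number: 61370131

Year approved: 2013

Principal investigator: Ai Dongmei (艾冬梅)

Discipline code: F0213

Funding amount: 73.00

Project category: General Program

Basic Theoretical Research on Data Envelopment Analysis (DEA)

Grant number: 10071095

Year approved: 2000

Principal investigator: Wei Quanling (魏权龄)

Discipline code: A0603

Funding amount: 9.00

Project category: General Program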

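Other grants held by Zhu Daming (project titles)

Algorithms and Complexity of Genome Rearrangement Comparison

Algorithm Design and Applications for the Multi-Center Problem

Algorithms and Complexity of Genome Rearrangement Phylogenetic Tree Problems

Algorithms for Genome Comparison and Analysis

Algorithms and Complexity of Genome Comparison Problems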
