The advent of high-throughput MeRIP-Seq technology has set off a new wave of research on N6-methyladenosine (m6A) RNA methylation and has also posed new computational challenges. In recent years, most MeRIP-Seq data analysis has remained at the level of basic data processing, and bioinformatics methods for mining regulatory relationships from MeRIP-Seq data are still seriously lacking. RNA methylases, which are currently the established regulators of m6A, exert a complex influence on the RNA methylome, and their dysregulation is closely associated with many biological processes, including disease. This project therefore addresses key problems in MeRIP-Seq data, namely the large number of lowly expressed genes, within-group variability among biological replicates, and the unknown number of regulating RNA methylases. It plans to build a Dirichlet process based beta-binomial infinite mixture model in a nonparametric Bayesian framework to decompose the RNA methylome. On that basis, it will investigate the regulatory relationships between RNA methylase genes and their modification sites, and design bioinformatics methods to predict RNA methylase specificity and the key disease-causing enzyme genes. The project is expected to accelerate the understanding of RNA methylation function and to provide a strong theoretical basis for applying RNA methylation to research on the diagnosis and treatment of human disease.
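This project focused on the analysis and processing of MeRIP-Seq data. It studied a Dirichlet process beta-binomial mixture model for decomposing the RNA methylome; built an m6A methylation database covering most published RNA methylation sequencing data; developed Guitar, a visualization tool for MeRIP-Seq data that plots the distribution curves of m6A sites on mRNA and lncRNA; and developed Trumpet, a quality-control tool for assessing the data quality of MeRIP-Seq data. Using these tools, the project assisted in analyzing the RNA methylomes of multiple cell types under KSHV infection. In addition, the project reviewed the various methods for analyzing and processing MeRIP-Seq data and outlined future bioinformatics topics that could be pursued around m6A methylation. Among these, for the association analysis problem, the HIWCF and NCFGER prediction models were first established, laying the groundwork for later integrating transcriptome-related information into association analysis of m6A methylation.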
Data last updated: 2023-05-31