Transfer Learning Theory and Methods for Genome-Wide Association Studies

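Basic information

Grant number: 11471256

Project type: General Program

Funding amount: 70.00

Principal investigator: 李丽敏

Discipline classification:

Host institution: Xi'an Jiaotong University

Approval year: 2014

Completion year: 2018

Project period: 2015-01-01 - 2018-12-31

Project status: Completed

Project participants: 刘庆芳, 曹相湧, 王栋, 戴明伟, 郭洪沙, 郑玄玄, 卢瑨, 李辰潼

Keywords:

bioinformatics; transfer learning; genome-wide association study; pattern recognition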

Final report abstract

Genome-wide association studies (GWAS) have been a popular research topic in recent years. They aim to identify disease-associated SNPs or genes across the whole genome. Because collecting samples is costly, studies in some populations or species can be difficult. In this project, we propose to explore how to transfer disease knowledge from one population to another, and how to transfer population or species knowledge from one disease to another, so that established, reliable knowledge in one domain can support the study of another. We approach this by borrowing the idea of transfer learning in manifold learning. However, the theory and methods of transfer learning are far from complete, which hampers its application to GWAS. In this project, we will explore theories, models, and applications of transfer learning motivated by problems in GWAS. We focus on three unsolved problems in transfer learning. The first is when knowledge can be transferred from one domain to another. The second is the theory and methods of multi-source-domain transfer learning. The third is how to avoid negative transfer. The research will help complete the theory and methods of transfer learning, help develop theories and methods for GWAS, and influence applications in fields such as computer science, engineering, biology, and medicine. Introducing transfer learning into GWAS will bring new vigor to bioinformatics, and the completion of this project will advance biology and medicine.

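Genome-wide association studies (GWAS) have been a research hotspot in bioinformatics in recent years. Their main goal is to find genes or loci associated with a given disease across the whole genome of a particular species. Because data collection is costly or constrained by factors beyond the researchers' control, studies of some species or populations inevitably face small sample sizes or strong noise. In this project, we propose that certain disease characteristics can be transferred between different species or populations, and that characteristics can also be transferred between different diseases within the same species or population, so that relatively mature knowledge in one domain can help with data interpretation or learning in another. We plan to use the idea of transfer learning to study these problems. To apply transfer learning principles to GWAS, we focus on three unsolved problems: (1) under what conditions transfer is possible; (2) how to carry out transfer learning with multiple source domains; and (3) how to avoid negative transfer. By solving these three problems, the project aims to develop new transfer learning theory suited to GWAS and thereby provide new theoretical and methodological support for GWAS. GWAS of Arabidopsis thaliana will serve as a case study for discussion and validation, with the aim of extending the approach to other species and diseases.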

Project abstract

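This project lies at the intersection of mathematics, computer science, and bioinformatics. Motivated by genome-wide association studies, it explores the theory and methods of transfer learning, with the aim of using them as tools to better identify disease-associated genetic loci. We focus on mathematical models and algorithms. The main results over the four years of the project are: 1. For the domain adaptation problem in transfer learning, we proposed DACoM, a semi-supervised learning method based on covariance matching. 2. For the multi-source data fusion problem, we proposed the UMDS method. 3. For the case, specific to genomic data, where the two domains have different feature spaces, we proposed a heterogeneous discriminative MMD method (DMMD) for classifying genomic data from different platforms. 4. For predicting drug-gene interactions, we proposed BBSS, a multi-task learning method based on block sparsity. 5. For breast cancer subtype classification, we proposed the ECMC method, which fuses multi-source data. 6. For drug-protein interactions, we proposed MLRE, a multi-source data fusion method. From the perspective of theory and models, these methods offer some approaches to transfer learning and its application to genome-wide association learning, and can promote the development of transfer learning and related areas of bioinformatics. The project essentially completed its original plan and achieved its expected research goals.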
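The record names covariance matching and MMD as building blocks but gives no formulas. As a rough illustration only, the sketch below computes two generic domain-discrepancy measures between a source sample and a target sample: a biased estimate of the squared maximum mean discrepancy with a Gaussian kernel (assuming MMD here stands for maximum mean discrepancy), and the squared Frobenius distance between the two sample covariance matrices. The function names, the `gamma` parameter, and the synthetic data are assumptions made for this example; this is not the project's DACoM or DMMD method.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    sq_dists = (
        np.sum(a * a, axis=1)[:, None]
        + np.sum(b * b, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def mmd2_biased(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two samples (rows = samples)."""
    k_ss = rbf_kernel(source, source, gamma)
    k_tt = rbf_kernel(target, target, gamma)
    k_st = rbf_kernel(source, target, gamma)
    return k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean()


def covariance_gap(source, target):
    """Squared Frobenius distance between the two sample covariance matrices."""
    c_s = np.cov(source, rowvar=False)
    c_t = np.cov(target, rowvar=False)
    return float(np.sum((c_s - c_t) ** 2))


# Synthetic stand-ins for two domains (hypothetical data, not genomic data).
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(200, 5))
shifted = rng.normal(0.5, 2.0, size=(200, 5))  # different mean and scale

print("MMD^2, source vs. itself: ", mmd2_biased(source, source))
print("MMD^2, source vs. shifted:", mmd2_biased(source, shifted))
print("covariance gap:           ", covariance_gap(source, shifted))
```

Both quantities are zero when a sample is compared with itself and grow as the two distributions move apart, which is why quantities of this kind are commonly minimized when aligning a source domain with a target domain.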

Project outputs

No outputs of this type are currently listed.

Data last updated: 2023-05-31

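Other grants by 李丽敏

Grant number: 11101328

Approval year: 2011

Funding amount: 22.00

Project type: Young Scientists Fund

Grant number: 81202887

Approval year: 2012

Funding amount: 23.00

Project type: Young Scientists Fund

Grant number: 31402174

Approval year: 2014

Funding amount: 25.00

Project type: Young Scientists Fund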

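Similar NSFC grants

Research on transfer learning theory and methods for object recognition

Grant number: 61402443

Approval year: 2014

Principal investigator: 阚美娜

Discipline classification: F0605

Funding amount: 27.00

Project type: Young Scientists Fund

Machine learning theory and methods for big data

Grant number: 61332007

Approval year: 2013

Principal investigator: 朱小燕

Discipline classification: F0201

Funding amount: 300.00

Project type: Key Program

Deep learning theory and methods for image sequences

Grant number: 61532009

Approval year: 2015

Principal investigator: 刘青山

Discipline classification: F06

Funding amount: 290.00

Project type: Key Program

Research on transfer learning theory and methods based on robust representation

Grant number: 61772141

Approval year: 2017

Principal investigator: 房小兆

Discipline classification: F0605

Funding amount: 60.00

Project type: General Program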


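Other grants by 李丽敏 (project titles)

Machine learning methods for the population structure problem in genome-wide association studies

Study of the correlation between arsenic speciation in realgar and its preparations and in vivo toxicity and efficacy, based on novel in vitro analytical techniques

Study of the pattern recognition of foot-and-mouth disease virus-like particles by mast cells and the response mechanism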
