An intelligent decision system that helps breeders design hybridization schemes is the future of crop breeding, and it requires integrating big data with molecular breeding technologies. Machine learning, the core of artificial intelligence (AI), is a powerful tool for data mining and modeling in the big-data era. In this project, we will use machine learning to build genomic selection models that predict the phenotypes and heterosis potential of F1 hybrids from their genotypes. The results will be implemented as a breeding decision-support system that helps breeders select promising parental lines for hybrid breeding. The genotype and phenotype data come from a population of 6,210 F1 hybrids created by crossing 30 paternal lines with 207 maternal lines. The 30 paternal lines are elite inbred lines with broad genetic backgrounds that are widely used in China's current maize breeding industry. This dataset is therefore not only well suited to theoretical research on maize heterosis but can also serve as a standard training population for promoting genomic selection technology in China. A major challenge for genomic selection is overfitting: when the training and testing populations have different genetic backgrounds, prediction accuracy drops significantly. By exploiting machine learning, the genomic selection models will take full account of the complex genetic structure of the study population so that predictions remain robust across genetic backgrounds. In addition, this project will identify the genomic regions, genes, and markers that determine heterosis in the maize genome. This information will be included as fixed effects when training the genomic selection models to further increase prediction power.
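Intelligent decision-making in crop breeding depends on the close integration of big data and molecular breeding. Machine learning, the core of artificial intelligence, is a powerful tool for data mining and modeling in the big-data era. This study applies machine learning to build genomic selection models that predict phenotypes from the genotypes of F1 hybrids, helping breeders precisely select parental combinations with heterosis potential and design breeding schemes. The genotype and phenotype data come from a population of 6,210 F1 hybrids, created by crossing 30 representative elite inbred lines from different heterotic groups, all widely used in China's maize community, as paternal parents with 207 broadly diverse inbred lines as maternal parents. The dataset is suitable for theoretical research on maize heterosis and also provides a standard training population for promoting genomic selection breeding in China. The machine-learning-based genomic selection models built in this study take full account of the population's broad genetic background and complex genetic structure and, while maintaining high prediction accuracy, focus on the robustness of prediction across subpopulations. The study will also mine the genomic regions, genes, and markers that determine maize heterosis and incorporate them into model training to further improve the accuracy of heterosis prediction.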
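The crossing design and the cross-subpopulation robustness goal described above can be illustrated with a small sketch. It is not the project's pipeline: it assumes fully inbred parents coded 0 or 2 at each marker, so that the F1 dosage is the average of the two parents, and it evaluates robustness by holding out all hybrids of one paternal line at a time. The line names, genotypes, and the helper names `f1_dosage` and `leave_one_father_out` are hypothetical.

```python
from collections import defaultdict


def f1_dosage(paternal, maternal):
    """Expected F1 allele dosage per marker for two fully inbred parents.

    Assumption: parents are coded 0 or 2 (copies of the alternate allele),
    so the hybrid is 0, 1 (heterozygous), or 2 at each marker.
    """
    if len(paternal) != len(maternal):
        raise ValueError("parents must be genotyped at the same markers")
    return [(p + m) // 2 for p, m in zip(paternal, maternal)]


def leave_one_father_out(crosses):
    """Yield (held_out_father, train_indices, test_indices).

    `crosses` is a list of (father_id, mother_id) pairs, one per F1 hybrid.
    All hybrids that share the held-out father form the test set.
    """
    by_father = defaultdict(list)
    for idx, (father, _mother) in enumerate(crosses):
        by_father[father].append(idx)
    for father in sorted(by_father):
        test = by_father[father]
        held = set(test)
        train = [i for i in range(len(crosses)) if i not in held]
        yield father, train, test


# Toy data with hypothetical line names (the real design is 30 x 207 = 6,210).
fathers = {"P1": [0, 2, 2, 0], "P2": [2, 2, 0, 0]}
mothers = {"M1": [0, 0, 2, 2], "M2": [2, 0, 0, 2], "M3": [2, 2, 2, 2]}

crosses = [(f, m) for f in sorted(fathers) for m in sorted(mothers)]
hybrids = [f1_dosage(fathers[f], mothers[m]) for f, m in crosses]
folds = list(leave_one_father_out(crosses))
```

Grouping the folds by parent rather than splitting hybrids at random means a model is always scored on hybrids whose father it has never seen, which is closer to the situation where training and testing populations differ in genetic background.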
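China's total grain output ranks first in the world and its maize output ranks second, yet the informatization and intelligence of its breeding technology still lag considerably behind those of developed countries in Europe and North America. China therefore urgently needs an intelligent design-breeding technology system that serves the maize seed industry and resolves the industry's "chokepoint" problems. This project set two main research goals: first, to build genomic selection models and genotype-to-phenotype prediction models using machine learning algorithms from the field of artificial intelligence; second, to mine the genomic regions and genes that determine heterosis for maize yield. Both goals were completed as planned. A total of 10 papers or monographs acknowledging this grant were published, of which 6 list it as the first acknowledged grant, 1 as the second, and 3 as the third. With the support of this project, three breeding models or tools were developed and three software copyrights were obtained: CropGBM, GOVS, and IP4GS. Two results stand out. First, the CropGBM toolbox for crop genomic design breeding was developed. CropGBM uses the ensemble-learning gradient boosting decision tree algorithm (LightGBM) for genotype-to-phenotype prediction; this work was published in Genome Biology, and one software copyright application was filed. Second, a large-scale maize hybrid population of more than 40,000 samples was used to explore the biological principles of heterosis utilization in maize, mine the key genes that determine heterotic patterns, and establish a theoretical framework for practicing precision design breeding; this work was also published in Genome Biology. In recognition of the principal investigator's contributions to intelligent design breeding, the plant science review journal Trends in Plant Science invited and published the review article "Machine Learning Bridge Omics Sciences and Plant Breeding". The "Smart Seed Industry" 14th Five-Year Plan sets the goal to "build digital breeding platforms, explore intelligent genotype-to-phenotype breeding technology systems, and accelerate the shift from experience-based breeding to precision breeding". The results of this project will promote the intelligent upgrading of China's seed industry and accelerate progress toward the era of Maize Breeding 4.0. Several seed companies, including Liaoning Dongya, Heilongjiang Kenfeng Seed, and Beijing Tongzhou Seed, have begun industry-sponsored collaborative projects with the principal investigator.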
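To make the gradient boosting idea mentioned above concrete, here is a minimal pure-Python sketch of boosting with one-split trees (stumps) under squared error. It is a generic illustration only, not CropGBM's code and not LightGBM's algorithm, which adds histogram binning, deeper trees, and many other refinements. The toy genotypes and phenotypes are invented, with a heterozygote advantage at the first marker to show a non-additive effect that a tree-based model can pick up.

```python
def fit_stump(X, residuals):
    """Return (marker, threshold, left_mean, right_mean) minimising squared
    error on the residuals, or None if no marker varies."""
    best = None
    for j in range(len(X[0])):
        # Candidate thresholds: every observed value except the largest.
        for threshold in sorted({row[j] for row in X})[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left_mean, right_mean)
    return None if best is None else best[1:]


def fit_boosting(X, y, n_rounds=50, learning_rate=0.1):
    """Fit stumps sequentially, each one to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, threshold, left_mean, right_mean = stump
        stumps.append(stump)
        for i, row in enumerate(X):
            step = left_mean if row[j] <= threshold else right_mean
            preds[i] += learning_rate * step
    return base, stumps, learning_rate


def predict(model, row):
    """Sum the base value and the shrunken contribution of every stump."""
    base, stumps, learning_rate = model
    out = base
    for j, threshold, left_mean, right_mean in stumps:
        out += learning_rate * (left_mean if row[j] <= threshold else right_mean)
    return out


def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# Invented toy data: allele dosages at two markers and a phenotype with a
# heterozygote advantage at marker 0 plus an additive effect at marker 1.
X = [[0, 0], [1, 0], [2, 0], [0, 2], [1, 2], [2, 2]]
y = [1.0, 3.0, 1.0, 2.0, 4.0, 2.0]

model = fit_boosting(X, y, n_rounds=200)
mse_base = mse(y, [sum(y) / len(y)] * len(y))
mse_boost = mse(y, [predict(model, row) for row in X])
```

Comparing `mse_boost` with `mse_base` on the training data only shows that the ensemble fits the pattern; any real assessment would need held-out hybrids.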
Data last updated: 2023-05-31
An intelligent decision-making expert system for maize heterosis prediction