With the rapid development of next-generation sequencing technology, whole-genome de novo sequencing has become the key strategy for assembling a reference genome sequence for a species. However, because genomes contain many highly heterozygous regions and repetitive regions, assembling them correctly and completely remains a great challenge. Several modified methods have been applied to address this challenge, such as constructing multiple sequencing libraries with different DNA fragment lengths and combining second-generation and third-generation sequencing platforms, but the assembly results are still unsatisfactory: they typically consist of thousands of short, unordered scaffolds. This research proposes a novel strategy that employs our recently developed and optimized Restriction-enzyme site Associated DNA (RAD) sequencing approach to assemble the highly heterozygous and repetitive regions. Our optimized approach provides both a computational tool and a library construction protocol, which together maximize the number of sequence reads that uniformly cover the genome and minimize the number of redundant reads. The computational tool can be used to design a suitable RAD-seq experiment that achieves the expected coverage of sequence variant markers in large populations. Genome-wide genetic polymorphisms can then be identified and used to construct a genetic linkage map. Finally, the linkage map information is used to further assemble the thousands of unordered scaffolds, improving the quality and completeness of the resulting reference genome sequence.
This research will fill a scientific gap in the processing and analysis of whole-genome de novo sequencing and assembly data, and it attempts to resolve, from a different perspective, the challenges that highly heterozygous and repetitive regions pose to genome assembly.
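The experiment-design step described above (choosing restriction enzymes so that RAD fragments cover the scaffolds at the expected marker density) can be illustrated with a minimal in silico digestion sketch. This is not the project's own tool. The function names, the size window, and the scoring rule (fragment count closest to a target) are assumptions made for illustration. The recognition sequences are the standard ones for these enzymes. For simplicity the cut is placed at the start of each site rather than at the true cleavage position.

```python
import re
from itertools import combinations

# Standard recognition sequences; the enzyme panel itself is an arbitrary example.
ENZYMES = {
    "EcoRI": "GAATTC",
    "MspI": "CCGG",
    "PstI": "CTGCAG",
}


def cut_sites(seq, site):
    """Return the start index of every (possibly overlapping) occurrence of site."""
    return [m.start() for m in re.finditer(f"(?={site})", seq)]


def count_usable_fragments(scaffolds, enzyme_a, enzyme_b, min_len, max_len):
    """Count fragments flanked by one site of each enzyme within the size window."""
    total = 0
    for seq in scaffolds:
        seq = seq.upper()
        cuts = sorted(
            [(p, enzyme_a) for p in cut_sites(seq, ENZYMES[enzyme_a])]
            + [(p, enzyme_b) for p in cut_sites(seq, ENZYMES[enzyme_b])]
        )
        # Adjacent cut sites delimit one fragment of a double digest.
        for (p1, e1), (p2, e2) in zip(cuts, cuts[1:]):
            if e1 != e2 and min_len <= p2 - p1 <= max_len:
                total += 1
    return total


def best_enzyme_pair(scaffolds, min_len, max_len, target):
    """Pick the enzyme pair whose usable-fragment count is closest to target."""
    scores = {
        pair: count_usable_fragments(scaffolds, *pair, min_len, max_len)
        for pair in combinations(sorted(ENZYMES), 2)
    }
    best = min(scores, key=lambda pair: abs(scores[pair] - target))
    return best, scores
```

In practice the scaffolds would be read from the draft assembly, and the target would follow from the number of markers needed for the mapping population. Both inputs are left to the caller here.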
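Whole-genome de novo sequencing yields reference genome sequence information, which is the foundation for subsequent multi-level omics research. Because genomes are highly heterozygous and contain many repetitive sequence segments, genome assembly is extremely challenging. Although current whole-genome de novo assembly combines second-generation and third-generation sequencing and builds sequencing libraries with different fragment lengths, the assembly results remain unsatisfactory. We apply our self-developed and optimized reduced-representation genome sequencing technology to whole-genome de novo assembly. Software analyzes the many short scaffold fragments produced by de novo assembly to find the optimal restriction enzyme combination and to design the reduced-representation sequencing experiment. The experiment is then applied to a mapping population to detect and genotype genetic polymorphism sites, from which a genetic map is constructed. Finally, the genetic map guides the re-assembly of the scaffold sequences. This approach resolves, to some extent, the difficulty of assembling repetitive and heterozygous regions of the genome, will improve the quality and results of whole-genome de novo assembly, and offers a new idea and option for whole-genome de novo sequencing.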
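The final step, in which the genetic map guides the re-assembly of scaffolds, can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's method. Each marker is assumed to be a hypothetical `(scaffold_id, linkage_group, position_cM)` tuple. Each scaffold is assigned to the linkage group holding most of its markers and is ordered by the median map position of those markers. Scaffold orientation is not handled.

```python
from collections import Counter, defaultdict
from statistics import median


def anchor_scaffolds(markers):
    """Order scaffolds along each linkage group.

    markers: iterable of (scaffold_id, linkage_group, position_cM).
    Returns {linkage_group: [scaffold_id, ...]} sorted by map position.
    """
    by_scaffold = defaultdict(list)
    for scaffold, group, cm in markers:
        by_scaffold[scaffold].append((group, cm))

    placed = defaultdict(list)
    for scaffold, hits in by_scaffold.items():
        # Majority vote guards against a few mis-mapped markers.
        group, _ = Counter(g for g, _ in hits).most_common(1)[0]
        # The median position is robust to outlying markers on the scaffold.
        pos = median(cm for g, cm in hits if g == group)
        placed[group].append((pos, scaffold))

    return {g: [s for _, s in sorted(items)] for g, items in placed.items()}
```

A scaffold whose markers split evenly between two linkage groups may itself be misassembled. A fuller implementation would flag such scaffolds rather than place them.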
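Whole-genome de novo sequencing can yield a high-quality reference genome and therefore comprehensive sequence information, which is the foundation for subsequent omics research at different levels. Because the genomes of most species are highly heterozygous and contain many repetitive sequence segments, assembling a complete genome is extremely challenging. Although current whole-genome de novo assembly combines second-generation and third-generation sequencing and builds sequencing libraries with different fragment lengths, the assembly results remain unsatisfactory. We applied our self-developed and optimized reduced-representation genome sequencing platform to whole-genome de novo assembly. We then applied this technology and strategy to mapping populations to detect and genotype genetic polymorphism sites, constructed genetic maps from large-scale sample sequencing results, and used them to guide genome assembly, offering a new idea and option for whole-genome de novo sequencing. The strategy can also be applied in practice to specific large-scale population samples for genetic mapping of specific phenotypes.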
Data last updated: 2023-05-31
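Whole-genome de novo sequencing of the gayal and a study of its origin, evolution, and environmental adaptation
Genome De Novo assembly scaled to a million cores
Taxonomy and genetic evolution of the blue-crowned laughingthrush studied with reduced-representation genome sequencing
Identification of mating-type-related genes in Nectriaceae fungi based on genome de novo sequencing, and their application in systematics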