DNA sequence encoding and design are critical to genetic engineering, and especially to the development of biocomputers. The encoding is constrained by specific hybridization and by melting temperature, and above all by two conflicting requirements: the strands must be short, yet the number of encoded sequences must be large. Under these constraints the DNA encoding problem turns out to be NP-complete. The problem asks for the largest set of single DNA strands that do not cross-hybridize with themselves or with their complements, and it can be formulated as a combinatorial optimization problem. This project will therefore combine the encoding problem with techniques from graph theory, including maximum independent sets, graph coloring, and graph connectivity, with the aim of establishing a complete and accurate DNA encoding system. Specifically, the research will address the following four aspects: (1) propose a new measure, called hybridization distance, of the similarity between two DNA sequences; (2) transform the DNA encoding problem into a maximum independent set problem in order to find the largest sets of DNA strands under various constraints; (3) design DNA encoding algorithms based on five kinds of constraints, namely strand length, GC content, hybridization distance, Tm value, and specific hybridization; (4) implement the encoding algorithms in software and demonstrate applications of the resulting codes.
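The independent-set formulation in aspects (2) and (3) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the project's method: the abstract does not define hybridization distance, so plain Hamming distance to a strand and to its reverse complement stands in for it; the Wallace rule stands in for the Tm computation; and the function names (`candidates`, `conflicts`, `greedy_code`) are hypothetical. The greedy pass returns a maximal independent set of the conflict graph, which is not guaranteed to be a maximum one.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(s):
    """Watson-Crick reverse complement of a strand."""
    return "".join(COMPLEMENT[b] for b in reversed(s))


def gc_content(s):
    """Fraction of G and C bases in the strand."""
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(s):
    """Wallace rule for short oligos: 2 degrees C per A/T, 4 per G/C."""
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def hamming(a, b):
    """Number of positions at which two equal-length strands differ."""
    return sum(x != y for x, y in zip(a, b))


def conflicts(a, b, d):
    """Assumed stand-in for 'hybridization distance below d':
    a is too close to b or to the reverse complement of b."""
    return hamming(a, b) < d or hamming(a, reverse_complement(b)) < d


def candidates(n, d, gc_range=(0.4, 0.6), tm_range=None):
    """All length-n strands that pass the per-strand constraints."""
    pool = []
    for p in product("ACGT", repeat=n):
        s = "".join(p)
        if not gc_range[0] <= gc_content(s) <= gc_range[1]:
            continue
        if tm_range and not tm_range[0] <= wallace_tm(s) <= tm_range[1]:
            continue
        # A strand must not be too close to its own reverse complement.
        if hamming(s, reverse_complement(s)) < d:
            continue
        pool.append(s)
    return pool


def greedy_code(n, d, gc_range=(0.4, 0.6), tm_range=None):
    """Greedy maximal independent set in the conflict graph whose
    vertices are candidate strands and whose edges join conflicting pairs."""
    chosen = []
    for s in candidates(n, d, gc_range, tm_range):
        if all(not conflicts(s, t, d) for t in chosen):
            chosen.append(s)
    return chosen
```

Calling `greedy_code(6, 3)` enumerates all 4^6 strands of length 6, so this brute-force sketch is only practical for short strands; for longer ones the candidate pool would have to be sampled rather than enumerated.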
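In genetic engineering, and particularly in the development of biocomputers, the theory and design methods of DNA encoding are key. Specific hybridization and melting temperature, and especially the conflicting demands for short strands and a large number of code sequences, make DNA code design very difficult, and the problem has been proven NP-complete. The encoding problem seeks the largest set of single DNA strands that do not cross-hybridize with themselves or with their complements, and it can be described as a combinatorial optimization problem. This project therefore plans to combine DNA encoding with graph-theoretic methods, including the theory of independent sets, graph coloring, and graph connectivity, and strives to establish a complete and accurate DNA encoding system. The research is divided into four main parts: (1) propose hybridization distance as a measure of the similarity between DNA sequences; (2) transform the encoding problem into an independent set problem on graphs in order to find the largest code sets under different constraints; (3) design DNA encoding methods under five constraints, namely code length, GC content, hybridization distance, Tm value, and specific hybridization; (4) develop the corresponding encoding software and present applications.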
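DNA encoding design is crucial to the development of biocomputers. In practice, DNA encoding is affected and constrained by many factors, such as specific hybridization, melting temperature, and the need for short strands together with a large number of code sequences. These factors make DNA encoding very difficult, and the problem has been proven NP-complete. The encoding problem seeks the largest set of single DNA strands that do not cross-hybridize with themselves or with their complements, and it can be described as a combinatorial optimization problem. To this end, the project studied several combinatorial optimization problems, including independent sets of graphs, adjacent-vertex coloring and acyclic coloring of graphs, and graph connectivity, and obtained a number of algorithms and theoretical results for solving the DNA encoding problem. This work provides a theoretical basis and guidance for further establishing a complete DNA encoding system.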
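One link between graph coloring and DNA encoding is that every color class of a proper vertex coloring is an independent set, so a coloring of a conflict graph splits the strands into groups that are each free of conflicts. The sketch below illustrates this with ordinary greedy vertex coloring, which is a simpler technique than the adjacent-vertex and acyclic colorings the project studied. The function names and the strand labels `s0` to `s4` are hypothetical.

```python
def greedy_coloring(adj):
    """Proper vertex coloring of a graph given as {vertex: set of neighbors}.
    Vertices are processed in order of decreasing degree; each receives the
    smallest color not already used by a colored neighbor."""
    color = {}
    for v in sorted(adj, key=lambda u: len(adj[u]), reverse=True):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def color_classes(color):
    """Group vertices by color; each group is an independent set."""
    classes = {}
    for v, c in color.items():
        classes.setdefault(c, []).append(v)
    return classes


# Hypothetical conflict graph: five strands whose conflicts form a 5-cycle.
conflict_graph = {
    "s0": {"s1", "s4"},
    "s1": {"s0", "s2"},
    "s2": {"s1", "s3"},
    "s3": {"s2", "s4"},
    "s4": {"s3", "s0"},
}

coloring = greedy_coloring(conflict_graph)
groups = color_classes(coloring)
```

Greedy coloring uses at most one more color than the maximum degree, but it does not in general minimize the number of colors, and the largest color class is not necessarily a maximum independent set.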
Data last updated: 2023-05-31
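Design of code sequence sets in DNA computing
Coding and sequence design based on character-sum methods
High-dimensional numerical encoding of DNA sequences and research on DNA computing
Applications of DNA computing in graph theory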