A central task in computational chemistry is the accurate and efficient simulation of molecular free energy, which provides a theoretical basis and quantitative support for fields such as structure-based drug design and protein engineering. Achieving that goal requires reliable molecular potential energy models that balance computational accuracy against operational simplicity. Quantum mechanical (QM) methods achieve high prediction accuracy but are extremely expensive for complicated biomolecular systems. Force fields have long been the most widely used potential energy models for molecular free energy simulations because of their well-trained parameter sets and easy-to-compute functional forms. However, their single-atom or fragment-based pairwise potential functions face inherent difficulties in describing quantum-level phenomena caused by many-body effects, e.g. electric polarizability and charge transfer, which has brought force field development to a bottleneck. Inspired by ideas from graph theory and machine learning, we propose a new protocol for generating molecular potential energy models using Bayesian Field Theory (BFT) and Artificial Neural Network (ANN) based machine learning. This project aims to introduce an ANN-based potential model for protein systems and protein-ligand complex systems that includes not only atom pairwise potentials but also multi-dimensional configurational effects. We plan to use BFT to set up isolated short-range many-body systems centered on every atom under study within a molecule, so that the environmental structural configuration around each atom descriptor can be tracked across all molecules in the training database.
We then plan to use deep learning to train a multi-layered ANN as the biomolecular energy model against a massive number of molecular single-point energies calculated with high-level QM methods, together with high-quality molecular global minimum structures. This project will provide new insight into the benefit of using machine learning methods to simulate and interpret complicated configurational effects beyond what molecular mechanics can describe, without the need for high-cost quantum-level computation. Finally, this project plans to embed the new ANN-based potential model in the commercialized “Movable Type” free energy method initially developed by the applicant, to achieve both high speed and high accuracy in free energy simulation of biomolecular systems.
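Molecular free energy simulation is an important research direction in computational chemistry, providing a theoretical basis and methodological guidance for frontier fields such as new drug development and protein engineering. Energy models that combine computational accuracy with efficiency are the cornerstone of free energy simulation. Quantum mechanical models are highly accurate, but their computational burden is too large for complex macromolecular systems. Molecular mechanics models are comparatively cheap to compute, yet they lack an accurate description of the energies arising from many-body effects, which leads to large cumulative errors. Building on recent research practice and results, this project proposes to combine Bayesian fields with deep learning to study many-body energies within protein systems and to establish a molecular energy model based on a multi-layer neural network. The Bayesian field performs effective information transformation and dimensionality reduction on the structural and energy data of complex molecules, generating the input variables required for machine learning. The conformational energy levels of small organic molecules and the globally optimal conformations of protein macromolecules are then trained layer by layer, combining molecular structure and energy information to build the multi-layer network model. The mechanism of interatomic many-body energies is studied through analysis of the model's data structure. Finally, the model is combined with the “Movable Type” (活字印刷) free energy algorithm invented by this research group to achieve free energy simulation of biological macromolecules with both high accuracy and high efficiency.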
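The general idea of an atom-centered neural network potential can be illustrated with a minimal sketch. It is not the project's BFT protocol: the Gaussian radial descriptor, the cosine cutoff, the function names (`atom_descriptor`, `molecular_energy`), the 4.0 cutoff, and the network sizes are all assumptions chosen for illustration, the weights are random rather than fitted to QM single-point energies, and element types are ignored. What it shows is only the structure: each atom gets a short-range environment, the environment is reduced to a fixed-length input vector, and a small multi-layer network maps that vector to an atomic energy contribution that is summed over the molecule.

```python
import numpy as np

def pairwise_distances(coords):
    """Full matrix of interatomic distances for an (N, 3) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def atom_descriptor(dist_row, i, cutoff, centers, width):
    """Hypothetical descriptor: Gaussian-smeared neighbor distances within `cutoff`."""
    mask = dist_row <= cutoff
    mask[i] = False                      # exclude the central atom itself
    d = dist_row[mask]
    # Cosine factor takes each neighbor's weight smoothly to zero at the cutoff.
    smooth = 0.5 * (np.cos(np.pi * d / cutoff) + 1.0)
    gauss = np.exp(-((d[:, None] - centers[None, :]) / width) ** 2)
    return (gauss * smooth[:, None]).sum(axis=0)

def init_mlp(rng, sizes):
    """Random (untrained) weights and zero biases for a multi-layer perceptron."""
    return [(rng.normal(0.0, 0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def atomic_energy(params, x):
    """Map one descriptor vector to a scalar atomic energy contribution."""
    for w, b in params[:-1]:
        x = np.tanh(x @ w + b)           # hidden layers
    w, b = params[-1]
    return (x @ w + b).item()            # linear output layer

def molecular_energy(coords, params, cutoff=4.0, n_centers=8, width=0.5):
    """Sum of per-atom energies, each computed from a local environment only."""
    centers = np.linspace(0.5, cutoff, n_centers)
    dists = pairwise_distances(coords)
    return sum(
        atomic_energy(params, atom_descriptor(dists[i], i, cutoff, centers, width))
        for i in range(len(coords))
    )

# Toy usage with random coordinates; no real molecule or QM data is involved.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 5.0, size=(6, 3))
params = init_mlp(rng, [8, 16, 1])
energy = molecular_energy(coords, params)
```

Because the descriptor depends only on distances to neighbors and the total is a sum over atoms, the predicted energy does not change when the molecule is translated or its atoms are reordered. Any descriptor used in practice would need that property as well. Training, which would fit `params` to reference energies, is omitted here.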
Data last updated: 2023-05-31