Emerging infectious diseases pose an ongoing threat to the health and economic well-being of every country. Surveillance programs that collect isolates and record clinical data are a common way to assess the seriousness of these threats and to support the development of an effective response. A central challenge for these programs, however, is how to analyze the data so as to extract as much information as possible. One of the most common approaches is to screen samples for isolates that are positive for the virus of interest and then sequence one or more genes in the viral genome. Phylogenetic analysis is then used to classify the strains against known reference sequences (for example, by genotype or serotype). Alternatively, the progression of a disease outbreak may be followed by recording the number of cases, together with clinical and background data, and monitoring how these quantities change over time. One problem with these approaches is that they provide little insight into the mechanisms of infection or the evolution of the virus; that is, they tell us how the virus is emerging, but not why. Many other data sources could be integrated into an analysis, and additional methods could be used to identify features of interest (such as motifs within a sequence, or significant changes in the geographical distribution of a virus). However, integrating these additional data sources or software tools can be challenging and requires bioinformatics or programming expertise. Consequently, virologists often resort to performing these tasks semi-manually, which becomes a limiting factor when analyzing large datasets. Another problem is that the datasets are often heterogeneous, and it is not always apparent how to combine the information. In this project, we propose a method that integrates typical surveillance datasets (i.e., sequence, clinical and epidemiological data) and mines them using a novel technique centered on a tree estimated from the isolate sequences. Using this method, we can filter out random mutations and identify significant or correlated nucleotide and amino acid mutations that are likely to play functional roles in the virus life cycle and therefore represent good targets for experimental study. Our method is applicable to any virus and will be implemented so as to simplify the analysis process, allowing it to be used by scientists without a bioinformatics or strong computational background. As a demonstration of the technique, we propose to study Japanese encephalitis virus, which is responsible for more than 50,000 deaths annually in South Asia and represents a growing problem in China. We will use our method to analyze surveillance data collected by the Chinese Centre for Disease Control to generate functional site predictions, and we will then test the validity of these predictions experimentally.
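The abstract does not specify the tree-based algorithm, so the following is only a minimal sketch of one simple way such filtering could work, not the project's method. It assumes a rooted tree given as nested tuples of isolate names and a toy alignment (`TREE`, `ALIGNMENT`, `clades` and `clade_supported_sites` are all hypothetical names and data introduced for illustration). A base at an alignment site is reported only when the isolates carrying it form exactly one clade of the tree; bases seen in a single isolate, or scattered across unrelated isolates, are treated as sporadic and skipped.

```python
# Hypothetical toy data: a rooted tree as nested tuples of isolate names,
# and an alignment mapping each isolate to an equal-length sequence.
TREE = ((("A", "B"), "C"), ("D", "E"))
ALIGNMENT = {
    "A": "ATGCA",
    "B": "ATGCA",
    "C": "ATGTA",
    "D": "CTGTA",
    "E": "CTGTT",
}


def clades(tree):
    """Return the leaf set (a frozenset) under every internal node."""
    found = []

    def walk(node):
        if isinstance(node, str):  # a leaf is just an isolate name
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.append(leaves)
        return leaves

    walk(tree)
    return found


def clade_supported_sites(tree, alignment, min_size=2):
    """List (site, base, isolates) where exactly one clade carries the base.

    Clades smaller than min_size and the root clade (all isolates) are
    ignored, so singletons and invariant sites are never reported.
    """
    wanted = {c for c in clades(tree) if min_size <= len(c) < len(alignment)}
    length = len(next(iter(alignment.values())))
    hits = []
    for site in range(length):
        # Group isolates by the base they carry at this site.
        carriers = {}
        for name, seq in alignment.items():
            carriers.setdefault(seq[site], set()).add(name)
        for base, names in carriers.items():
            if frozenset(names) in wanted:
                hits.append((site, base, tuple(sorted(names))))
    return hits


if __name__ == "__main__":
    for site, base, names in clade_supported_sites(TREE, ALIGNMENT):
        print(f"site {site}: base {base} shared by clade {names}")
```

On this toy input, sites 0 and 3 (zero-based) are reported because their variants coincide with clades, while the change at site 4, which occurs only in isolate E, is filtered out. A real analysis would need an estimated phylogeny, a statistical test for significance, and the clinical and epidemiological annotations described above.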
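Routine infectious disease surveillance has played an important role in tracking the early spread of disease, preventing outbreaks and epidemics, and protecting public health. However, with frequent population movement and rapid changes in the natural and social environment, routine surveillance can no longer effectively control many emerging and re-emerging infectious diseases. This project takes Japanese encephalitis virus as its subject. On the one hand, it will integrate surveillance data, background information on viral strains, protein structure and other information to analyze the structure of phylogenetic trees, screen for nucleotide and amino acid mutation sites that play key roles in the virus life cycle and in determining virulence, and analyze their role in viral evolution. On the other hand, it will integrate multiple data sources and develop new algorithms and software tools in order to investigate new methods for mining the patterns of viral genetic variation and evolution in depth, and to clarify the mechanisms of viral spread and evolution. Experimental methods will then be used to screen and validate functional sites in the Japanese encephalitis virus sequence. If successful, the project will provide an integrated research platform for studying Japanese encephalitis virus and for gaining deeper insight into the mechanisms of viral infection and transmission, greatly advancing work in epidemiology, vaccine development and related areas; it will also lay a solid foundation for research on other viral epidemic diseases.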
Data last updated: 2023-05-31
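Identification and functional analysis of host miRNAs associated with Japanese encephalitis virus infection
Characteristics and significance of cross-immunity between Japanese encephalitis virus and Zika virus
Structural and functional study of a set of B-cell epitopes on the Japanese encephalitis virus E protein
The role of bats in the transmission cycle of Japanese encephalitis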