AIDS clinical datasets mainly consist of correlated, unbalanced longitudinal data (e.g. CD4 counts measured repeatedly over time) and survival data (time-to-event data, e.g. time to death). Longitudinal biomarkers (e.g. CD4 count, viral load) may be strongly associated with time-to-event outcomes such as relapse-free survival or overall survival, and can serve as important predictors of, or surrogates for, the time to an event. Classical models, such as the linear mixed model for longitudinal data and the Cox proportional hazards model for time-to-event data, do not account for the dependence between these two data types. Joint modeling of longitudinal and time-to-event data is a powerful method that takes this dependence and association into account. Although joint modeling has been studied extensively over the last two decades, motivated by growing application demand, and its importance is increasingly recognized, researchers are still challenged by the balance among computational load, inferential efficiency, and model complexity. For complex AIDS clinical datasets obtained from long-term observation, which typically contain measurement errors, missing values, censored values, and outliers, robust and efficient joint models and computational tools are especially scarce, and easy-to-use standard software is seriously lacking.
In this research, based on the Yunnan AIDS clinical dataset in the Chinese HAART database, we will propose a novel semi-parametric joint model for AIDS clinical data analysis. The model consists of a semi-parametric mixed-effects model for the longitudinal data and a semi-parametric Cox proportional hazards model for the survival data, linked through shared random effects, and aims to address the 'balance challenge' among computational load, inferential efficiency, and model complexity. We will also propose efficient likelihood-based approaches to parameter estimation and hypothesis testing. In addition, we will develop an R software package for the model. Finally, the model will be used to construct an assessment system for evaluating Chinese HAART, supplying new methods and effective tools for AIDS clinical data analysis.
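For orientation, a shared-random-effects joint model of the kind described above is often written as follows. This is only a sketch of the standard textbook formulation, not the project's actual specification: the smooth function f(t), the covariate vectors x_i, z_i, w_i, the association parameter α, and the Gaussian assumptions are all illustrative assumptions that do not appear in the abstract.

```latex
\begin{align}
% Longitudinal submodel: observed biomarker = true trajectory + error
y_{ij} &= m_i(t_{ij}) + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N(0, \sigma^2), \\
% Semi-parametric trajectory: unspecified smooth f plus fixed and random effects
m_i(t) &= f(t) + x_i(t)^\top \beta + z_i(t)^\top b_i,
  \qquad b_i \sim N(0, D), \\
% Survival submodel: Cox model with unspecified baseline hazard h_0
h_i(t) &= h_0(t) \exp\{ w_i^\top \gamma + \alpha\, m_i(t) \}, \\
% Joint likelihood: the two submodels are conditionally independent given b_i
L(\theta) &= \prod_{i=1}^{n} \int p(T_i, \delta_i \mid b_i; \theta)
  \Big[ \prod_{j=1}^{n_i} p(y_{ij} \mid b_i; \theta) \Big]
  p(b_i; \theta) \, db_i .
\end{align}
```

Here T_i is the observed event or censoring time and δ_i the event indicator for subject i. The integral over the random effects b_i is what makes likelihood-based estimation computationally demanding, which is one source of the balance between computational load and model complexity noted above.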
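Follow-up data collected during AIDS treatment consist mainly of highly correlated, unbalanced longitudinal data and survival data. Unlike traditional analyses that model longitudinal and survival data separately and ignore their association, joint modeling incorporates the association and dependence between the two data types and is a powerful method for analyzing correlated longitudinal and survival data. Over the past 20 years, joint modeling has developed extensively, driven by rapidly growing application demand, but it still faces challenges in balancing model complexity, computational simplicity, and inferential power. For complex AIDS clinical treatment data obtained from long-term observation, which contain measurement errors, missing values, outliers, and censored values, there are as yet no satisfactory robust and efficient joint models or associated computational tools, and practical software packages are lacking. Based on the Yunnan Province data in the national free AIDS antiretroviral treatment database, this project will construct a new joint model whose submodels, a semi-parametric mixed-effects model and a Cox model, are linked through random effects; develop likelihood-based methods for parameter estimation and hypothesis testing; and develop a corresponding R software package. On this basis, it will build a quantitative analysis system for monitoring and evaluating the AIDS clinical treatment process, providing new methods and effective tools for AIDS clinical data analysis.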
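Over the past three decades, joint modeling of longitudinal and survival data has shown growing importance in data analysis across a wide range of fields, including biomedicine, engineering, economics, and sociology, yet progress has been slow both in developing parameter estimation methods that are highly accurate and computationally cheap and in developing methods for simultaneous variable selection. This project pursued progressively deeper research on these two key problems and obtained a series of results. (1) We studied the missing-data mechanisms of longitudinal and survival data typical of AIDS clinical treatment and obtained effective imputation methods. (2) We derived iterative formulas for the penalized likelihood estimating equations of the parameters of the linear mixed-effects model, discussed the desirable properties of the penalized likelihood estimator, and constructed a new linear mixed-effects model; simulation results indicate that our method outperforms restricted maximum likelihood estimation. We also gave a simulation-based algorithm for the high-dimensional integrals involved in likelihood estimation. (3) Following the idea of the two-step approach, we applied penalized likelihood to the joint model to select variables in the survival and longitudinal submodels simultaneously; simulations show that our method achieves higher accuracy than traditional methods. (4) Based on the simultaneous variable selection method of He (2015, Biometrics), we built joint models for AIDS clinical data using the Lasso penalty, the SCAD penalty, and the traditional p-value approach, compared the three methods, and described their respective strengths and weaknesses. (5) For longitudinal data on the protein content of milk, we compared the predictions of machine learning methods (decision trees, boosting, bagging, random forests, neural networks, and support vector machines) with those of the linear mixed-effects model traditionally used for longitudinal data, illustrating advantages of the machine learning methods such as robustness. (6) We constructed a joint model of longitudinal and survival data based on machine learning methods and explored its properties. (7) We gave a new exact algorithm for the confidence limits of the reliability and conditional reliability of the Weibull distribution under Type I censoring, a result that improves on existing approximate algorithms. Our results contribute valuable new perspectives and algorithms to joint modeling research and have considerable application value in the fields mentioned above.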
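As a point of reference for result (4), the two penalty functions it names can be sketched in a few lines of Python. This is a minimal illustration of the standard Lasso and SCAD penalties on a single coefficient, not the project's implementation; the function names, the tuning parameter `lam`, and the conventional default `a = 3.7` are assumptions introduced here.

```python
def lasso_penalty(theta: float, lam: float) -> float:
    """Lasso (L1) penalty: grows linearly with |theta| without bound."""
    return lam * abs(theta)


def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """SCAD penalty in its standard piecewise form (assumed, not from the source).

    Matches the Lasso for small |theta|, bends quadratically for moderate
    |theta|, and is constant for large |theta|.
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Small coefficients: identical to the Lasso penalty.
        return lam * t
    if t <= a * lam:
        # Moderate coefficients: quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so no further shrinkage pressure.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    for theta in (0.5, 2.0, 10.0):
        print(theta, lasso_penalty(theta, 1.0), scad_penalty(theta, 1.0))
```

The practical difference is visible for large coefficients: the Lasso penalty keeps growing, whereas the SCAD penalty levels off, so strong effects are penalized less heavily under SCAD.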
Data last updated: 2023-05-31