This proposal initiates a comprehensive research program on continuous-time Markov controlled processes (MCPs, also known as Markov decision processes) with a random horizon and uncertain discount factors, generalizing the standard MCP with a constant discount factor and a fixed finite horizon. The proposal is motivated by the following real-world observations: a) the discount factor in an MCP, which corresponds to an uncertain interest rate in economic and financial systems, should be allowed to vary; b) in equipment maintenance one aims to minimize the total discounted cost incurred during the lifetime of a piece of equipment, so the horizon of decision epochs can be random; c) existing research on MCPs is largely limited to the case of a constant discount factor and a fixed horizon. For continuous-time MCPs with varying discount factors and a random horizon, this proposal addresses how to optimally control a stochastic dynamic system (under a given utility criterion) by designing the system's parameters, such as equipment maintenance actions or portfolio strategies in finance. We aim to accomplish the following technical objectives. 1) To give reasonable conditions for the existence and computation of optimal control policies for the first passage criterion of continuous-time MCPs with a random horizon and uncertain discount factors. 2) To find mild conditions for the existence and computation of a so-called mean-variance optimal control policy, which minimizes the variance over the set of control policies attaining a given utility value (rather than the optimal one). This not only extends the standard variance minimization problem for discounted continuous-time MCPs to the case of varying discount factors and a random horizon, but also generalizes H. M. Markowitz's mean-variance portfolio selection theory to continuous-time MCPs. 3) To find suitable conditions for the existence and computation of an optimal control policy for the probability criterion in continuous-time MCPs, which generalizes system reliability and has not yet been studied. 4) To analyze the structure and characterization of the optimal control policies obtained in items 1)-3) above, which are very important for applications, and then to study simulations and applications to real-world situations. The main contents of items 1)-4) are new and arise from real-world situations. Hence, the technical objectives 1)-4) show that this project seeks to provide new developments in both the theoretical and the applied research of Markov controlled processes.
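This project proposes and studies Markov controlled processes (MCPs) with a random termination time and uncertain discount factors, extending the current MCP with a fixed discount factor and a finite horizon. The project is motivated by the following facts: a) the termination time of a decision process may be random (e.g., the lifetime of a machine); b) the discount factor may be uncertain (e.g., a bank interest rate); c) existing research on MCPs focuses mainly on the case where both the discount factor and the termination time are constants. For continuous-time MCPs with uncertain discount factors and a random termination time, this project studies how to design control policies (e.g., machine maintenance schemes or investment strategies in finance) based on the state of the controlled stochastic dynamic system, so that the system's performance before termination (e.g., performance measures such as system reliability and operating cost) is optimal. The research contents are: 1) conditions for the existence of, and algorithms for, first-passage discounted optimal control policies; 2) the existence and computation of first-passage "mean-variance" optimal control policies; 3) the existence and computation of optimal control policies under the probability criterion; 4) the structure of optimal control policies and applications to specific models. These research contents are new for continuous-time MCPs and will advance the development of MCPs.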
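This project not only extends the earlier MCP with a fixed discount factor and a finite horizon, but is also driven by practical needs. The research contents are: 1) conditions for the existence of, and algorithms for, first-passage discounted optimal control policies; 2) the existence and computation of first-passage mean-variance optimal control policies; 3) the existence and computation of optimal control policies under the probability criterion; 4) the structure of optimal control policies and applications to specific models. The main results are as follows. On content 1): we proposed and studied the first-passage criterion for continuous-time MCPs and, by extending the E. B. Dynkin formula and using a new shifted-policy technique, gave for the first time existence conditions and algorithms for optimal policies; the results were published in the well-known international journal IEEE Trans. Automat. Control. On content 2): for the first-passage model of general-state MCPs with varying discount factors, we gave for the first time, based on a first-decomposition technique, existence conditions and a computation method for mean-variance optimal policies, successfully extending the mean-variance portfolio theory of the Nobel laureate in economics H. M. Markowitz to discrete-event dynamic systems; the results were published in the well-known international journal SIAM J. Control Optim. On content 3): we studied for the first time the average value-at-risk (AVaR) criterion and the risk probability criterion for continuous-time MCPs. For the AVaR criterion, by establishing the "positive deviation criterion", a new and powerful research technique, we not only proved the existence of optimal policies but also gave a value iteration algorithm for them; the results were published in the well-known international journal SIAM J. Optim. For the risk probability criterion of discounted continuous-time MCPs, by introducing a broader class of policies that include the reward level, we gave conditions under which the value function is the unique solution of the corresponding optimality equation, proved the existence of optimal policies, and proposed a value iteration algorithm; the results were published in the well-known international journal Discrete Event Dyn. Syst. On content 4): we introduced the concept of mixed policies and proved for the first time that an optimal mixed policy for the average criterion of constrained continuous-time MCPs is a convex combination of as many deterministic stationary policies as there are constraints, thereby characterizing the structure of optimal policies; the results were published in the well-known international journal Math. Oper. Res. In addition, we illustrated applications of the project's results with practical problems such as maintenance models, cash flow models, and birth-and-death systems. In summary, the project fully completed the planned research, advanced the theory and applications of MCPs, published 25 SCI-indexed papers in well-known international journals such as SIAM J. Optim., IEEE Trans. Automat. Control, and SIAM J. Control Optim., and trained 5 PhD graduates and 8 master's graduates.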
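To make the first-passage discounted criterion of content 1) more concrete, the following is a minimal sketch, not the project's published algorithm. It assumes a finite state and action space, a cost rate c(x, a), a strictly positive state-action dependent discount rate alpha(x, a), transition rates q(y | x, a), and a target set on which the value is zero. Under these assumptions the value outside the target set can be approximated by iterating the embedded-chain form V(x) = min_a [c(x, a) + sum over y != x of q(y | x, a) V(y)] / [alpha(x, a) + total exit rate]. The function name and the toy data are hypothetical.

```python
def first_passage_value_iteration(rates, cost, alpha, target,
                                  tol=1e-10, max_iter=10_000):
    """Value iteration for a finite first-passage discounted cost problem.

    rates[x][a][y]: transition rate from x to y under action a (y != x is used)
    cost[x][a]:     cost rate while in state x using action a
    alpha[x][a]:    discount rate in state x under action a (assumed > 0)
    target:         set of target states; the value there is fixed at 0
    """
    n = len(rates)
    V = [0.0] * n
    policy = [None] * n
    for _ in range(max_iter):
        new_V = [0.0] * n
        for x in range(n):
            if x in target:
                continue  # cost stops accruing once the target set is reached
            best_val, best_a = float("inf"), None
            for a in range(len(rates[x])):
                # Total rate of leaving x, and rate-weighted continuation value.
                out = sum(r for y, r in enumerate(rates[x][a]) if y != x)
                cont = sum(r * V[y] for y, r in enumerate(rates[x][a]) if y != x)
                val = (cost[x][a] + cont) / (alpha[x][a] + out)
                if val < best_val:
                    best_val, best_a = val, a
            new_V[x], policy[x] = best_val, best_a
        diff = max(abs(u - v) for u, v in zip(new_V, V))
        V = new_V
        if diff < tol:
            break
    return V, policy


# Hypothetical toy model: states 0 and 1 are operating states, state 2 is the target.
rates = [
    [[0.0, 1.0, 1.0], [0.0, 0.0, 3.0]],  # state 0: two actions
    [[2.0, 0.0, 1.0]],                   # state 1: one action
    [[0.0, 0.0, 0.0]],                   # state 2: target (unused)
]
cost = [[1.0, 4.0], [2.0], [0.0]]
alpha = [[0.5, 0.5], [0.2], [0.1]]

V, policy = first_passage_value_iteration(rates, cost, alpha, target={2})
print(V, policy)
```

In this toy model the slower but cheaper action is preferred in state 0, because the expensive action's cost rate outweighs the benefit of reaching the target sooner. Discount rates that depend on the state and action enter only through the denominator, which is why a varying discount factor fits into the same iteration.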
Data last updated: 2023-05-31