Markov Control Processes with Random Termination Time and Uncertain Discount Factors

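Basic Information

Grant number: 61374067

Project type: General Program

Funding amount: 82.00

Principal investigator: Guo Xianping (郭先平)

Subject classification:

Host institution: Sun Yat-sen University

Year approved: 2013

Year completed: 2017

Project period: 2014-01-01 - 2017-12-31

Project status: Completed

Project participants: Zhang Junyu (张俊玉), Li Wei (李炜), Sun Yimin (孙轶民), Huang Yonghui (黄永辉), Wei Qingda (魏清达), Zhang Wenzhao (张文钊), Wu Xiao (吴晓), Huang Xiangxiang (黄香香), Zou Xiaolong (邹小龙)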

Keywords:

Markov control process; optimality conditions; uncertain discount factor; computational methods; random termination time

Final Report Abstract

This proposal initiates a comprehensive research program on continuous-time Markov controlled processes (MCPs, also known as Markov decision processes) with a random horizon and uncertain discount factors, which generalize the standard MCP with a constant discount factor and a fixed finite horizon. The proposal is motivated by the following observations: a) the discount factor in an MCP, which corresponds to an uncertain interest rate in economic and financial systems, should be allowed to vary; b) in equipment maintenance one aims to minimize the total discounted cost incurred during the lifetime of the equipment, so the horizon of decision epochs is random; c) existing research on MCPs is mainly limited to the case of a constant discount factor and a fixed horizon. For continuous-time MCPs with varying discount factors and a random horizon, this proposal addresses how to optimally control a stochastic dynamic system under a given utility criterion by choosing the system's controls, such as equipment maintenance actions or portfolio strategies in finance. The proposal aims to accomplish the following technical objectives. 1) To give reasonable conditions for the existence and computation of optimal control policies for the first passage criterion of continuous-time MCPs with a random horizon and uncertain discount factors. 2) To find mild conditions for the existence and computation of a so-called mean-variance optimal control policy, which minimizes the variance over the set of control policies with a given utility value (rather than the optimal one). This not only extends the standard variance minimization problem for discounted continuous-time MCPs to the case of varying discount factors and a random horizon, but also generalizes H. M. Markowitz's mean-variance portfolio selection theory to continuous-time MCPs.
3) To find suitable conditions for the existence and computation of an optimal control policy for the probability criterion in continuous-time MCPs, which generalizes system reliability and has not yet been studied. 4) To analyze the structure and characterization of the optimal control policies obtained in items 1)-3), which are very important for applications, and to carry out simulations and applications to real-world situations. The contents of items 1)-4) are new and arise from real-world situations. Hence, the technical objectives 1)-4) show that this project seeks to provide new developments in both the theory and the applications of Markov controlled processes.

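This project proposes and studies Markov control processes (MCPs) with a random termination time and uncertain discount factors, which extend the existing MCPs with a fixed discount factor and a finite horizon. The project is motivated by the following facts: a) the termination time of a decision process may be random (e.g., the lifetime of a machine); b) the discount factor may be uncertain (e.g., a bank interest rate); c) existing research on MCPs has mainly addressed the case in which both the discount factor and the termination time are constants. For continuous-time MCPs with uncertain discount factors and a random termination time, the project studies how to design control policies (e.g., machine maintenance schemes or investment strategies in finance) based on the state of the controlled stochastic dynamic system, so that the system's performance before termination (e.g., measures such as system reliability and operating cost) is optimal. The research contents are: 1) conditions for the existence of first-passage discounted optimal control policies, and algorithms for computing them; 2) the existence and computation of first-passage "mean-variance" optimal control policies; 3) the existence and computation of optimal control policies under the probability criterion; 4) the structure of optimal control policies and applications to specific models. These topics are new for continuous-time MCPs and will advance the development of MCP theory.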
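
For orientation, the criteria named in items 1) and 2) can be written out in one common textbook form. This is only a sketch under assumed notation, not the project's own formulation: the state process $x_t$, action process $a_t$, reward rate $r$, state- and action-dependent discount rate $\alpha$, target set $B$, and prescribed mean level $v_0$ are all symbols introduced here for illustration.

```latex
% First passage time into the (assumed) target set B
\tau_B := \inf\{\, t \ge 0 : x_t \in B \,\}

% Expected first-passage discounted reward under policy \pi,
% with a discount rate that varies with state and action
V(x,\pi) := \mathbb{E}_x^{\pi}\!\left[ \int_0^{\tau_B}
    \exp\!\Big( -\int_0^{t} \alpha(x_s,a_s)\,ds \Big)\, r(x_t,a_t)\, dt \right]

% Mean-variance problem: among policies achieving the mean v_0,
% minimize the variance of the same discounted total reward
\min_{\pi \,:\, V(x,\pi) = v_0} \;
\operatorname{Var}_x^{\pi}\!\left[ \int_0^{\tau_B}
    \exp\!\Big( -\int_0^{t} \alpha(x_s,a_s)\,ds \Big)\, r(x_t,a_t)\, dt \right]
```

Setting $\alpha$ to a constant and replacing $\tau_B$ with a fixed time recovers the standard finite-horizon discounted model that the abstract describes as the special case being generalized.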

Project Abstract

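This project not only extends earlier work on MCPs with a fixed discount factor and a finite horizon, but is also driven by practical needs. The research contents are: 1) conditions for the existence of first-passage discounted optimal control policies, and algorithms for computing them; 2) the existence and computation of first-passage mean-variance optimal control policies; 3) the existence and computation of optimal control policies under the probability criterion; 4) the structure of optimal control policies and applications to specific models. The main results are as follows.

On item 1): we proposed and studied the first-passage criterion for continuous-time MCPs. By extending the Dynkin formula of E. B. Dynkin and introducing a new shifted-policy technique, we gave, for the first time, conditions for the existence of optimal policies together with algorithms for computing them. The results were published in the leading international journal IEEE Trans. Automat. Control.

On item 2): for first-passage models with varying discount factors on general state spaces, we used a first-time decomposition technique to give, for the first time, conditions for the existence of mean-variance optimal policies and a method for computing them. This extends the mean-variance portfolio theory of the Nobel laureate in economics H. M. Markowitz to discrete event dynamic systems. The results were published in the leading international journal SIAM J. Control Optim.

On item 3): we studied, for the first time, the average value-at-risk (AVaR) criterion and the risk probability criterion for continuous-time MCPs. For the AVaR criterion, by establishing the "positive deviation criterion" as a new and powerful technique, we proved the existence of optimal policies and gave a value iteration algorithm for computing them. These results were published in the leading international journal SIAM J. Optim. For the risk probability criterion in discounted continuous-time MCPs, by introducing a broader class of policies that depend on the reward level, we gave conditions under which the value function is the unique solution of the corresponding optimality equation, proved the existence of optimal policies, and proposed a value iteration algorithm. These results were published in the leading international journal Discrete Event Dyn. Syst.

On item 4): we introduced the concept of mixed policies and proved, for the first time, that an optimal mixed policy for constrained continuous-time MCPs under the average criterion is a convex combination of as many deterministic stationary policies as there are constraints, which characterizes the structure of optimal policies. The results were published in the leading international journal Math. Oper. Res. In addition, we illustrated applications of the project's results with practical problems such as maintenance models, cash-flow models, and birth-death systems.

In summary, the project successfully completed the planned research and advanced both the theory and the applications of MCPs. It produced 25 SCI-indexed papers in leading international journals including SIAM J. Optim., IEEE Trans. Automat. Control, and SIAM J. Control Optim., and trained 5 PhD graduates and 8 master's graduates.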
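
To make the idea of a value iteration algorithm for a first-passage discounted criterion concrete, the following is a minimal sketch for a finite-state, finite-action continuous-time model. It is not the algorithm from the project's papers. It assumes stationary policies and iterates the embedded-chain form of the optimality equation, V(x) = max_a [r(x,a) + sum_y q(y|x,a) V(y)] / [alpha(x,a) + q_x(a)] for x outside the target set, with V = 0 on the target set and q_x(a) the total exit rate. The function name, the data layout, and the toy maintenance model are all hypothetical.

```python
def first_passage_value_iteration(rates, reward, discount, target,
                                  tol=1e-10, max_iter=10_000):
    """Value iteration for a finite first-passage discounted model (sketch).

    rates[x][a]    -- dict {y: q(y|x,a)} of jump rates to states y != x
    reward[x][a]   -- reward rate r(x,a) earned while in x under action a
    discount[x][a] -- discount rate alpha(x,a) > 0 (may vary with x and a)
    target         -- set of target states B; rewards stop on first entry
    """
    value = {x: 0.0 for x in rates}
    policy = {}
    for _ in range(max_iter):
        new_value = {}
        for x in rates:
            if x in target:
                new_value[x] = 0.0  # nothing is earned after hitting B
                continue
            best_action, best = None, float("-inf")
            for a, jumps in rates[x].items():
                exit_rate = sum(jumps.values())
                # value[y] stays 0 for target states, so they add nothing here
                continuation = sum(q * value[y] for y, q in jumps.items())
                candidate = (reward[x][a] + continuation) / (
                    discount[x][a] + exit_rate)
                if candidate > best:
                    best_action, best = a, candidate
            new_value[x], policy[x] = best, best_action
        gap = max(abs(new_value[x] - value[x]) for x in rates)
        value = new_value
        if gap < tol:
            break
    return value, policy


# Hypothetical maintenance model: the machine earns reward until it fails.
rates = {
    "good": {"run": {"worn": 1.0}},
    "worn": {"run": {"failed": 2.0},
             "repair": {"good": 1.0, "failed": 0.5}},
    "failed": {},
}
reward = {"good": {"run": 2.75}, "worn": {"run": 2.0, "repair": 0.0}}
discount = {"good": {"run": 0.5}, "worn": {"run": 1.0, "repair": 1.0}}

value, policy = first_passage_value_iteration(rates, reward, discount, {"failed"})
print(value, policy)
```

In this toy instance the iteration settles on repairing in the worn state. Because every discount rate is strictly positive, each update is a contraction, which is why the plain fixed-point loop converges here; models with weaker assumptions would need the kind of conditions the abstract refers to.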

Data last updated: 2023-05-31

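Other Grants Held by Guo Xianping

Grant number: 60574002

Year approved: 2005

Funding amount: 23.00

Project type: General Program

Grant number: 60874004

Year approved: 2008

Funding amount: 30.00

Project type: General Program

Grant number: 10271120

Year approved: 2002

Funding amount: 20.00

Project type: General Program

Grant number: 61773411

Year approved: 2017

Funding amount: 67.00

Project type: General Program

Grant number: 19901038

Year approved: 1999

Funding amount: 5.00

Project type: Young Scientists Fund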

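Similar NSFC Grants

Finite-time control of uncertain singular Markov jump systems with time delays

Grant number: 61803186

Year approved: 2018

Principal investigator: Zhao Junjie (赵俊杰)

Subject classification: F0301

Funding amount: 25.00

Project type: Young Scientists Fund

Finite-time control of bandwidth-limited networked control systems based on Markov jump models with uncertain transition probabilities

Grant number: 61403258

Year approved: 2014

Principal investigator: Qiu Li (邱丽)

Subject classification: F0301

Funding amount: 25.00

Project type: Young Scientists Fund

Markov processes, stochastic point processes, and risk theory

Grant number: 10271062

Year approved: 2002

Principal investigator: Guo Junyi (郭军义)

Subject classification: A0209

Funding amount: 19.00

Project type: General Program

Finite-time control of cooperative control systems for autonomous vehicle platoons based on Markov jump models with uncertain communication probabilities

Grant number: 61703167

Year approved: 2017

Principal investigator: Gao Huanli (高焕丽)

Subject classification: F0301

Funding amount: 20.00

Project type: Young Scientists Fund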

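Other Grants Held by Guo Xianping

Optimal control of queueing systems and its applications

Advanced optimal control of stochastic dynamic systems

Optimal control of continuous-time Markov processes with general state spaces

Risk-sensitive continuous-time Markov decision processes

Theory and applications of Markov decision processes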
