Peer-reviewed article

The Existence of a Stationary $\varepsilon $-Optimal Policy for a Finite Markov Chain

1979; Society for Industrial and Applied Mathematics; Volume: 23; Issue: 2; Language: English

10.1137/1123033

ISSN

1095-7219

Authors

E. A. Fainberg

Topic(s)

Markov Chains and Monte Carlo Methods

Abstract

The Existence of a Stationary $\varepsilon$-Optimal Policy for a Finite Markov Chain
E. A. Fainberg
https://doi.org/10.1137/1123033

Journal: Theory of Probability & Its Applications, Volume 23, Issue 2, 1979
Article page range: pp. 297-313
Submitted: 02 March 1976
Published online: 17 July 2006
ISSN (print): 0040-585X
ISSN (online): 1095-7219
Publisher: Society for Industrial and Applied Mathematics
Copyright © 1979 Society for Industrial and Applied Mathematics

Reference(s)