Rémi Munos
2010 – today
- 2013
[j19]Mohammad Gheshlaghi Azar, Rémi Munos, Hilbert J. Kappen: Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model. Machine Learning 91(3): 325-349 (2013)
[i7]Amir Sani, Alessandro Lazaric, Rémi Munos: Risk-Aversion in Multi-armed Bandits. CoRR abs/1301.1936 (2013)
[i6]Odalric-Ambrym Maillard, Rémi Munos, Daniil Ryabko: Selecting the State-Representation in Reinforcement Learning. CoRR abs/1302.2552 (2013)
- 2012
[j18]Alessandro Lazaric, Rémi Munos: Learning with stochastic inputs and adversarial outputs. J. Comput. Syst. Sci. 78(5): 1516-1537 (2012)
[j17]Lucian Busoniu, Rémi Munos: Optimistic planning for Markov decision processes. Journal of Machine Learning Research - Proceedings Track 22: 182-189 (2012)
[j16]Alexandra Carpentier, Rémi Munos: Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit. Journal of Machine Learning Research - Proceedings Track 22: 190-198 (2012)
[c51]Emilie Kaufmann, Nathaniel Korda, Rémi Munos: Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis. ALT 2012: 199-213
[c50]Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos: Regret Bounds for Restless Markov Bandits. ALT 2012: 214-228
[c49]Alexandra Carpentier, Rémi Munos: Minimax Number of Strata for Online Stratified Sampling Given Noisy Samples. ALT 2012: 229-244
[c48]Mohammad Gheshlaghi Azar, Rémi Munos, Bert Kappen: On the Sample Complexity of Reinforcement Learning with a Generative Model. ICML 2012
[c47]Alexandra Carpentier, Rémi Munos: Adaptive Stratified Sampling for Monte-Carlo integration of Differentiable functions. NIPS 2012: 251-259
[c46]Joan Fruitet, Alexandra Carpentier, Rémi Munos, Maureen Clerc: Bandit Algorithms boost Brain Computer Interfaces for motor-task selection of a brain-controlled button. NIPS 2012: 458-466
[c45]Amir Sani, Alessandro Lazaric, Rémi Munos: Risk-Aversion in Multi-armed Bandits. NIPS 2012: 3284-3292
[i5]Emilie Kaufmann, Nathaniel Korda, Rémi Munos: Thompson Sampling: An Optimal Finite Time Analysis. CoRR abs/1205.4217 (2012)
[i4]Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos: Regret Bounds for Restless Markov Bandits. CoRR abs/1209.2693 (2012)
- 2011
[j15]Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári: X-Armed Bandits. Journal of Machine Learning Research 12: 1655-1695 (2011)
[j14]Odalric-Ambrym Maillard, Rémi Munos: Adaptive Bandits: Towards the best history-dependent strategy. Journal of Machine Learning Research - Proceedings Track 15: 570-578 (2011)
[j13]Odalric-Ambrym Maillard, Rémi Munos, Gilles Stoltz: A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences. Journal of Machine Learning Research - Proceedings Track 19: 497-514 (2011)
[j12]Sébastien Bubeck, Rémi Munos, Gilles Stoltz: Pure exploration in finitely-armed and continuous-armed bandits. Theor. Comput. Sci. 412(19): 1832-1852 (2011)
[c44]Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Peter Auer: Upper-Confidence-Bound Algorithms for Active Learning in Multi-armed Bandits. ALT 2011: 189-203
[c43]Matthew W. Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos: Regularized Least Squares Temporal Difference Learning with Nested ℓ2 and ℓ1 Penalization. EWRL 2011: 102-114
[c42]Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos, Matthew W. Hoffman: Finite-Sample Analysis of Lasso-TD. ICML 2011: 1177-1184
[c41]Rémi Munos: Optimistic Optimization of a Deterministic Function without the Knowledge of its Smoothness. NIPS 2011: 783-791
[c40]Alexandra Carpentier, Rémi Munos: Finite Time Analysis of Stratified Sampling for Monte Carlo. NIPS 2011: 1278-1286
[c39]Alexandra Carpentier, Odalric-Ambrym Maillard, Rémi Munos: Sparse Recovery with Brownian Sensing. NIPS 2011: 1782-1790
[c38]Mohammad Gheshlaghi Azar, Rémi Munos, Mohammad Ghavamzadeh, Hilbert J. Kappen: Speedy Q-Learning. NIPS 2011: 2411-2419
[c37]Odalric-Ambrym Maillard, Rémi Munos, Daniil Ryabko: Selecting the State-Representation in Reinforcement Learning. NIPS 2011: 2627-2635
- 2010
[j11]Odalric-Ambrym Maillard, Rémi Munos, Alessandro Lazaric, Mohammad Ghavamzadeh: Finite-sample Analysis of Bellman Residual Minimization. Journal of Machine Learning Research - Proceedings Track 13: 299-314 (2010)
[c36]Jean-Yves Audibert, Sébastien Bubeck, Rémi Munos: Best Arm Identification in Multi-Armed Bandits. COLT 2010: 41-53
[c35]
[c34]Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos: Analysis of a Classification-based Policy Iteration Algorithm. ICML 2010: 607-614
[c33]Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos: Finite-Sample Analysis of LSTD. ICML 2010: 615-622
[c32]Amir Massoud Farahmand, Rémi Munos, Csaba Szepesvári: Error Propagation for Approximate Policy and Value Iteration. NIPS 2010: 568-576
[c31]Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric-Ambrym Maillard, Rémi Munos: LSTD with Random Projections. NIPS 2010: 721-729
[c30]Odalric-Ambrym Maillard, Rémi Munos: Scrambled Objects for Least-Squares Regression. NIPS 2010: 1549-1557
[c29]Odalric-Ambrym Maillard, Rémi Munos: Online Learning in Adversarial Lipschitz Environments. ECML/PKDD (2) 2010: 305-320
[i3]Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári: X-Armed Bandits. CoRR abs/1001.4475 (2010)
2000 – 2009
- 2009
[j10]Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári: Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theor. Comput. Sci. 410(19): 1876-1902 (2009)
[c28]Sébastien Bubeck, Rémi Munos, Gilles Stoltz: Pure Exploration in Multi-armed Bandits Problems. ALT 2009: 23-37
[c27]
[c26]Jean-Yves Audibert, Peter Auer, Alessandro Lazaric, Rémi Munos, Daniil Ryabko, Csaba Szepesvári: Workshop summary: On-line learning with limited feedback. ICML 2009: 168
[c25]Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos: Sensitivity analysis in HMMs with application to likelihood maximization. NIPS 2009: 387-395
[c24]
- 2008
[j9]Rémi Munos, Csaba Szepesvári: Finite-Time Bounds for Fitted Value Iteration. Journal of Machine Learning Research 9: 815-857 (2008)
[j8]András Antos, Csaba Szepesvári, Rémi Munos: Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning 71(1): 89-129 (2008)
[c23]Raphaël Maîtrepierre, Jérémie Mary, Rémi Munos: Adaptive play in Texas Hold'em Poker. ECAI 2008: 458-462
[c22]
[c21]Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári: Online Optimization in X-Armed Bandits. NIPS 2008: 201-208
[c20]Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos: Particle Filter-based Policy Gradient in POMDPs. NIPS 2008: 337-344
[c19]Yizao Wang, Jean-Yves Audibert, Rémi Munos: Algorithms for Infinitely Many-Armed Bandits. NIPS 2008: 1729-1736
[e1]Sertan Girgin, Manuel Loth, Rémi Munos, Philippe Preux, Daniil Ryabko (Eds.): Recent Advances in Reinforcement Learning, 8th European Workshop, EWRL 2008, Villeneuve d'Ascq, France, June 30 - July 3, 2008, Revised and Selected Papers. Lecture Notes in Computer Science 5323, Springer 2008, ISBN 978-3-540-89721-7
[i2]Sébastien Bubeck, Rémi Munos, Gilles Stoltz: Pure Exploration for Multi-Armed Bandit Problems. CoRR abs/0802.2655 (2008)
- 2007
[j7]Rémi Munos: Analyse en norme Lp de l'algorithme d'itérations sur les valeurs avec approximations. Revue d'Intelligence Artificielle 21(1): 53-74 (2007)
[j6]Rémi Munos: Performance Bounds in Lp-norm for Approximate Value Iteration. SIAM J. Control and Optimization 46(2): 541-561 (2007)
[c18]Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári: Tuning Bandit Algorithms in Stochastic Environments. ALT 2007: 150-165
[c17]András Antos, Rémi Munos, Csaba Szepesvári: Fitted Q-iteration in continuous action-space MDPs. NIPS 2007
[c16]
[i1]
- 2006
[j5]Rémi Munos: Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation. Journal of Machine Learning Research 7: 413-427 (2006)
[j4]Rémi Munos: Policy Gradient in Continuous Time. Journal of Machine Learning Research 7: 771-791 (2006)
[c15]András Antos, Csaba Szepesvári, Rémi Munos: Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path. COLT 2006: 574-588
- 2005
[j3]Emmanuel Gobet, Rémi Munos: Sensitivity Analysis Using Itô-Malliavin Calculus and Martingales, and Application to Stochastic Optimal Control. SIAM J. Control and Optimization 43(5): 1676-1713 (2005)
[c14]
[c13]Rémi Munos: Geometric Variance Reduction in Markov Chains. Application to Value Function and Gradient Estimation. AAAI 2005: 1012-1017
[c12]
[c11]Csaba Szepesvári, Rémi Munos: Finite time bounds for sampling based fitted value iteration. ICML 2005: 880-887
- 2003
[c10]
- 2002
[j2]Rémi Munos, Andrew W. Moore: Variable Resolution Discretization in Optimal Control. Machine Learning 49(2-3): 291-323 (2002)
- 2001
[c9]
- 2000
[j1]Rémi Munos: A Study of Reinforcement Learning in the Continuous Case by the Means of Viscosity Solutions. Machine Learning 40(3): 265-299 (2000)
[c8]Rémi Munos, Andrew W. Moore: Rates of Convergence for Variable Resolution Schemes in Optimal Control. ICML 2000: 647-654
1990 – 1999
- 1999
[c7]Rémi Munos, Andrew W. Moore: Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems. IJCAI 1999: 1348-1355
- 1998
[c6]Rémi Munos: A General Convergence Method for Reinforcement Learning in the Continuous Case. ECML 1998: 394-405
[c5]Rémi Munos, Andrew W. Moore: Barycentric Interpolators for Continuous Space and Time Reinforcement Learning. NIPS 1998: 1024-1030
- 1997
[c4]Rémi Munos: Finite-Element Methods with Local Triangulation Refinement for Continuous Reinforcement Learning Problems. ECML 1997: 170-182
[c3]Rémi Munos: A Convergent Reinforcement Learning Algorithm in the Continuous Case Based on a Finite Difference Method. IJCAI (2) 1997: 826-831
[c2]Rémi Munos, Paul Bourgine: Reinforcement Learning for Continuous Stochastic Control Problems. NIPS 1997
- 1996
[c1]Rémi Munos: A Convergent Reinforcement Learning Algorithm in the Continuous Case: The Finite-Element Reinforcement Learning. ICML 1996: 337-345
Data released under the ODC-BY 1.0 license.
last updated on 2013-05-28 21:40 CEST by the dblp team



