| 2012 | ||
|---|---|---|
| 101 | Mahdi Milani Fard, Joelle Pineau, Csaba Szepesvári: PAC-Bayesian Policy Evaluation for Reinforcement Learning. CoRR abs/1202.3717 (2012) | |
| 100 | Sylvain Gelly, Levente Kocsis, Marc Schoenauer, Michèle Sebag, David Silver, Csaba Szepesvári, Olivier Teytaud: The grand challenge of computer Go: Monte Carlo tree search and extensions. Commun. ACM 55(3): 106-113 (2012) | |
| 99 | Yasin Abbasi-Yadkori, Dávid Pál, Csaba Szepesvári: Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits. Journal of Machine Learning Research - Proceedings Track 22: 1-9 (2012) | |
| 98 | Gergely Neu, András György, Csaba Szepesvári: The adversarial stochastic shortest path problem with unknown transition probabilities. Journal of Machine Learning Research - Proceedings Track 22: 805-813 (2012) | |
| 2011 | ||
| 97 | Jyrki Kivinen, Csaba Szepesvári, Esko Ukkonen, Thomas Zeugmann: Algorithmic Learning Theory - 22nd International Conference, ALT 2011, Espoo, Finland, October 5-7, 2011. Proceedings. Springer 2011 | |
| 96 | Jyrki Kivinen, Csaba Szepesvári, Esko Ukkonen, Thomas Zeugmann: Editors' Introduction. ALT 2011: 1-13 | |
| 95 | Csaba Szepesvári: Invited Talk: Towards Robust Reinforcement Learning Algorithms. EWRL 2011: 4 | |
| 94 | Pallavi Arora, Csaba Szepesvári, Rong Zheng: Sequential learning for optimal monitoring of multi-channel wireless networks. INFOCOM 2011: 1152-1160 | |
| 93 | Yasin Abbasi-Yadkori, Dávid Pál, Csaba Szepesvári: Improved Algorithms for Linear Stochastic Bandits. NIPS 2011: 2312-2320 | |
| 92 | Mahdi Milani Fard, Joelle Pineau, Csaba Szepesvári: PAC-Bayesian Policy Evaluation for Reinforcement Learning. UAI 2011: 195-202 | |
| 91 | András Antos, Gábor Bartók, Dávid Pál, Csaba Szepesvári: Toward a Classification of Finite Partial-Monitoring Games. CoRR abs/1102.2041 (2011) | |
| 90 | Yasin Abbasi-Yadkori, Dávid Pál, Csaba Szepesvári: Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems. CoRR abs/1102.2670 (2011) | |
| 89 | András Antos, Gábor Bartók, Csaba Szepesvári: Non-trivial two-armed partial-monitoring games are bandits. CoRR abs/1108.4961 (2011) | |
| 88 | Arash Afkanpour, Csaba Szepesvári, Michael H. Bowling: Alignment Based Kernel Learning with a Continuous Set of Base Kernels. CoRR abs/1112.4607 (2011) | |
| 87 | Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári: X-Armed Bandits. Journal of Machine Learning Research 12: 1655-1695 (2011) | |
| 86 | Yasin Abbasi-Yadkori, Csaba Szepesvári: Regret Bounds for the Adaptive Control of Linear Quadratic Systems. Journal of Machine Learning Research - Proceedings Track 19: 1-26 (2011) | |
| 85 | Gábor Bartók, Dávid Pál, Csaba Szepesvári: Minimax Regret of Finite Partial-Monitoring Games in Stochastic Environments. Journal of Machine Learning Research - Proceedings Track 19: 133-154 (2011) | |
| 84 | István Szita, Csaba Szepesvári: Agnostic KWIK learning and efficient approximate reinforcement learning. Journal of Machine Learning Research - Proceedings Track 19: 739-772 (2011) | |
| 83 | Amir Massoud Farahmand, Csaba Szepesvári: Model selection in reinforcement learning. Machine Learning 85(3): 299-332 (2011) | |
| 2010 | ||
| 82 | Csaba Szepesvári: Algorithms for Reinforcement Learning. Morgan & Claypool Publishers 2010 | |
| 81 | Gábor Bartók, Dávid Pál, Csaba Szepesvári: Toward a Classification of Finite Partial-Monitoring Games. ALT 2010: 224-238 | |
| 80 | Gergely Neu, András György, Csaba Szepesvári: The Online Loop-free Stochastic Shortest-Path Problem. COLT 2010: 231-243 | |
| 79 | István Szita, Csaba Szepesvári: Model-based reinforcement learning with nearly tight exploration complexity bounds. ICML 2010: 1031-1038 | |
| 78 | Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Richard S. Sutton: Toward Off-Policy Learning Control with Function Approximation. ICML 2010: 719-726 | |
| 77 | Liuyang Li, Barnabás Póczos, Csaba Szepesvári, Russell Greiner: Budgeted Distribution Learning of Belief Net Parameters. ICML 2010: 879-886 | |
| 76 | Yasin Abbasi-Yadkori, Joseph Modayil, Csaba Szepesvári: Extending rapidly-exploring random trees for asymptotically optimal anytime motion planning. IROS 2010: 127-132 | |
| 75 | Gergely Neu, András György, Csaba Szepesvári, András Antos: Online Markov Decision Processes under Bandit Feedback. NIPS 2010: 1804-1812 | |
| 74 | Dávid Pál, Barnabás Póczos, Csaba Szepesvári: Estimation of Rényi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs. NIPS 2010: 1849-1857 | |
| 73 | Amir Massoud Farahmand, Rémi Munos, Csaba Szepesvári: Error Propagation for Approximate Policy and Value Iteration. NIPS 2010: 568-576 | |
| 72 | Sarah Filippi, Olivier Cappé, Aurélien Garivier, Csaba Szepesvári: Parametric Bandits: The Generalized Linear Case. NIPS 2010: 586-594 | |
| 71 | Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári: X-Armed Bandits. CoRR abs/1001.4475 (2010) | |
| 70 | Dávid Pál, Barnabás Póczos, Csaba Szepesvári: Estimation of Rényi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs. CoRR abs/1003.1954 (2010) | |
| 69 | Gábor Bartók, Csaba Szepesvári, Sandra Zilles: Models of active learning in group-structured state spaces. Inf. Comput. 208(4): 364-384 (2010) | |
| 68 | Barnabás Póczos, Sergey Kirshner, Csaba Szepesvári: REGO: Rank-based Estimation of Rényi Information using Euclidean Graph Optimization. Journal of Machine Learning Research - Proceedings Track 9: 605-612 (2010) | |
| 67 | Péter Torma, András György, Csaba Szepesvári: A Markov-Chain Monte Carlo Approach to Simultaneous Localization and Mapping. Journal of Machine Learning Research - Proceedings Track 9: 852-859 (2010) | |
| 66 | András Antos, Varun Grover, Csaba Szepesvári: Active learning in heteroscedastic noise. Theor. Comput. Sci. 411(29-30): 2712-2728 (2010) | |
| 2009 | ||
| 65 | Hengshuai Yao, Shalabh Bhatnagar, Csaba Szepesvári: LMS-2: Towards an algorithm that is as cheap as LMS and almost as efficient as RLS. CDC 2009: 1181-1188 | |
| 64 | Barnabás Póczos, Yasin Abbasi-Yadkori, Csaba Szepesvári, Russell Greiner, Nathan R. Sturtevant: Learning when to stop thinking and do something! ICML 2009: 104 | |
| 63 | Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora: Fast gradient-descent methods for temporal-difference learning with linear function approximation. ICML 2009: 125 | |
| 62 | Jean-Yves Audibert, Peter Auer, Alessandro Lazaric, Rémi Munos, Daniil Ryabko, Csaba Szepesvári: Workshop summary: On-line learning with limited feedback. ICML 2009: 168 | |
| 61 | Alireza Farhangfar, Russell Greiner, Csaba Szepesvári: Learning to segment from a few well-selected training images. ICML 2009: 39 | |
| 60 | Amir Massoud Farahmand, Azad Shademan, Martin Jägersand, Csaba Szepesvári: Model-based and model-free reinforcement learning for visual servoing. ICRA 2009: 2917-2924 | |
| 59 | Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard S. Sutton: Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS 2009: 1204-1212 | |
| 58 | Hengshuai Yao, Richard S. Sutton, Shalabh Bhatnagar, Diao Dongcui, Csaba Szepesvári: Multi-Step Dyna Planning for Policy Evaluation and Control. NIPS 2009: 2187-2195 | |
| 57 | Yaoliang Yu, Yuxi Li, Dale Schuurmans, Csaba Szepesvári: A General Projection Property for Distribution Families. NIPS 2009: 2232-2240 | |
| 56 | Yuxi Li, Csaba Szepesvári, Dale Schuurmans: Learning Exercise Policies for American Options. Journal of Machine Learning Research - Proceedings Track 5: 352-359 (2009) | |
| 55 | Gergely Neu, Csaba Szepesvári: Training parsers by inverse reinforcement learning. Machine Learning 77(2-3): 303-337 (2009) | |
| 54 | Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári: Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theor. Comput. Sci. 410(19): 1876-1902 (2009) | |
| 2008 | ||
| 53 | András Antos, Varun Grover, Csaba Szepesvári: Active Learning in Multi-armed Bandits. ALT 2008: 287-302 | |
| 52 | Gábor Bartók, Csaba Szepesvári, Sandra Zilles: Active Learning of Group-Structured Environments. ALT 2008: 329-343 | |
| 51 | Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor: Regularized Fitted Q-Iteration: Application to Planning. EWRL 2008: 55-68 | |
| 50 | Volodymyr Mnih, Csaba Szepesvári, Jean-Yves Audibert: Empirical Bernstein stopping. ICML 2008: 672-679 | |
| 49 | Richard S. Sutton, Csaba Szepesvári, Hamid Reza Maei: A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation. NIPS 2008: 1609-1616 | |
| 48 | Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári: Online Optimization in X-Armed Bandits. NIPS 2008: 201-208 | |
| 47 | Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor: Regularized Policy Iteration. NIPS 2008: 441-448 | |
| 46 | Alejandro Isaza, Csaba Szepesvári, Vadim Bulitko, Russell Greiner: Speeding Up Planning in Markov Decision Processes via Automatically Constructed Abstraction. UAI 2008: 306-314 | |
| 45 | Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, Michael H. Bowling: Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping. UAI 2008: 528-536 | |
| 44 | Rémi Munos, Csaba Szepesvári: Finite-Time Bounds for Fitted Value Iteration. Journal of Machine Learning Research 9: 815-857 (2008) | |
| 43 | András Antos, Csaba Szepesvári, Rémi Munos: Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning 71(1): 89-129 (2008) | |
| 2007 | ||
| 42 | Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári: Tuning Bandit Algorithms in Stochastic Environments. ALT 2007: 150-165 | |
| 41 | Peter Auer, Ronald Ortner, Csaba Szepesvári: Improved Rates for the Stochastic Continuum-Armed Bandit Problem. COLT 2007: 454-468 | |
| 40 | Amir Massoud Farahmand, Csaba Szepesvári, Jean-Yves Audibert: Manifold-adaptive dimension estimation. ICML 2007: 265-272 | |
| 39 | István Bíró, Zoltán Szamonek, Csaba Szepesvári: Sequence Prediction Exploiting Similarity Information. IJCAI 2007: 1576-1581 | |
| 38 | András György, Levente Kocsis, Ivett Szabó, Csaba Szepesvári: Continuous Time Associative Bandit Problems. IJCAI 2007: 830-835 | |
| 37 | András Antos, Rémi Munos, Csaba Szepesvári: Fitted Q-iteration in continuous action-space MDPs. NIPS 2007 | |
| 36 | Gergely Neu, Csaba Szepesvári: Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods. UAI 2007: 295-302 | |
| 2006 | ||
| 35 | Levente Kocsis, Csaba Szepesvári, Mark H. M. Winands: RSPSA: Enhanced Parameter Optimization in Games. ACG 2006: 39-56 | |
| 34 | András Antos, Csaba Szepesvári, Rémi Munos: Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path. COLT 2006: 574-588 | |
| 33 | Levente Kocsis, Csaba Szepesvári: Bandit Based Monte-Carlo Planning. ECML 2006: 282-293 | |
| 32 | Péter Torma, Csaba Szepesvári: Local Importance Sampling: A Novel Technique to Enhance Particle Filtering. Journal of Multimedia 1(1): 32-43 (2006) | |
| 31 | Levente Kocsis, Csaba Szepesvári: Universal parameter optimisation in games based on SPSA. Machine Learning 63(3): 249-286 (2006) | |
| 2005 | ||
| 30 | Zoltán Szamonek, Csaba Szepesvári: X-mHMM: An Efficient Algorithm for Training Mixtures of HMMs When the Number of Mixtures Is Unknown. ICDM 2005: 434-441 | |
| 29 | Csaba Szepesvári, Rémi Munos: Finite time bounds for sampling based fitted value iteration. ICML 2005: 880-887 | |
| 2004 | ||
| 28 | Csaba Szepesvári: Shortest Path Discovery Problems: A Framework, Algorithms and Experimental Results. AAAI 2004: 550-555 | |
| 27 | Csaba Szepesvári, András Kocsor, Kornél Kovács: Kernel Machine Based Feature Extraction Algorithms for Regression Problems. ECAI 2004: 1091-1092 | |
| 26 | Péter Torma, Csaba Szepesvári: Enhancing Particle Filters Using Local Likelihood Sampling. ECCV (1) 2004: 16-27 | |
| 25 | András Kocsor, Kornél Kovács, Csaba Szepesvári: Margin Maximizing Discriminant Analysis. ECML 2004: 227-238 | |
| 24 | Csaba Szepesvári, William D. Smart: Interpolation-based Q-learning. ICML 2004 | |
| 2002 | ||
| 23 | M. French, Csaba Szepesvári, Eric Rogers: LQ performance bounds for adaptive output feedback controllers for functionally uncertain nonlinear systems. Automatica 38(4): 683-693 (2002) | |
| 22 | M. French, Csaba Szepesvári, Eric Rogers: An Asymptotic Scaling Analysis of LQ Performance for an Approximate Adaptive Control Design. MCSS 15(2): 145-176 (2002) | |
| 2001 | ||
| 21 | Csaba Szepesvári: Efficient approximate planning in continuous space Markovian Decision Problems. AI Commun. 14(3): 163-176 (2001) | |
| 20 | András Lörincz, György Hévízi, Csaba Szepesvári: Ockham's Razor Modeling of the Matrisome Channels of the Basal Ganglia Thalamocortical Loops. Int. J. Neural Syst. 11(2): 125-143 (2001) | |
| 2000 | ||
| 19 | György Balogh, Ervin Dobler, Tamás Gröbler, Béla Smodics, Csaba Szepesvári: FlexVoice: A Parametric Approach to High-Quality Speech Synthesis. TSD 2000: 189-194 | |
| 18 | Zsolt Kalmár, Csaba Szepesvári, András Lörincz: Modular Reinforcement Learning: A Case Study in a Robot Domain. Acta Cybern. 14(3): 507-522 (2000) | |
| 17 | Satinder P. Singh, Tommi Jaakkola, Michael L. Littman, Csaba Szepesvári: Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms. Machine Learning 38(3): 287-308 (2000) | |
| 1999 | ||
| 16 | Csaba Szepesvári, Michael L. Littman: A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms. Neural Computation 11(8): 2017-2060 (1999) | |
| 15 | Zsolt Kalmár, Zsolt Marczell, Csaba Szepesvári, András Lörincz: Parallel and robust skeletonization built on self-organizing elements. Neural Networks 12(1): 163-173 (1999) | |
| 14 | János Murvai, Kristian Vlahovicek, Endre Barta, Csaba Szepesvári, Cristina Acatrinei, Sándor Pongor: The SBASE protein domain library, release 6.0: a collection of annotated protein sequence segments. Nucleic Acids Research 27(1): 257-259 (1999) | |
| 1998 | ||
| 13 | Zoltán Gábor, Zsolt Kalmár, Csaba Szepesvári: Multi-criteria Reinforcement Learning. ICML 1998: 197-205 | |
| 12 | Csaba Szepesvári: Non-Markovian Policies in Sequential Decision Problems. Acta Cybern. 13(3): 305-318 (1998) | |
| 11 | Zsolt Kalmár, Csaba Szepesvári, András Lörincz: Module-Based Reinforcement Learning: Experiments with a Real Robot. Auton. Robots 5(3-4): 273-295 (1998) | |
| 10 | Zsolt Kalmár, Csaba Szepesvári, András Lörincz: Module-Based Reinforcement Learning: Experiments with a Real Robot. Machine Learning 31(1-3): 55-85 (1998) | |
| 1997 | ||
| 9 | Csaba Szepesvári: Learning and Exploitation Do Not Conflict Under Minimax Optimality. ECML 1997: 242-249 | |
| 8 | Zsolt Kalmár, Csaba Szepesvári, András Lörincz: Module Based Reinforcement Learning: An Application to a Real Robot. EWLR 1997: 29-45 | |
| 7 | Csaba Szepesvári: The Asymptotic Convergence-Rate of Q-learning. NIPS 1997 | |
| 6 | Csaba Szepesvári, Szabolcs Cimmer, András Lörincz: Neurocontroller using dynamic state feedback for compensatory control. Neural Networks 10(9): 1691-1708 (1997) | |
| 1996 | ||
| 5 | Csaba Szepesvári, András Lörincz: Inverse Dynamics Controllers for Robust Control: Consequences for Neurocontrollers. ICANN 1996: 791-796 | |
| 4 | Michael L. Littman, Csaba Szepesvári: A Generalized Reinforcement-Learning Model: Convergence and Applications. ICML 1996: 310-318 | |
| 3 | Tibor Fomin, Tamás Rozgonyi, Csaba Szepesvári, András Lörincz: Self-Organizing Multi-Resolution Grid for Motion Planning and Control. Int. J. Neural Syst. 7(6): 757- (1996) | |
| 2 | Csaba Szepesvári, András Lörincz: Approximate geometry representations and sensory fusion. Neurocomputing 12(2-3): 267-287 (1996) | |
| 1994 | ||
| 1 | Csaba Szepesvári, László Balázs, András Lörincz: Topology Learning Solved by Extended Objects: A Neural Network Model. Neural Computation 6(3): 441-458 (1994) | |
Last update Fri May 25 01:42:58 2012 CET by the DBLP Team.
Data released under the ODC-BY 1.0 license.