Approximate Reinforcement Learning

Fully autonomous agents that interact with their environment (such as humans and robots) pose challenges very different from those of classic machine learning. The agent must balance the future benefits of its actions against their costs, without the guidance of a teacher or prior knowledge of the environment. Moreover, the objective may not consist of expected rewards alone, but may be formulated as a trade-off between several criteria (for example, reward versus risk).
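
One classical way to make such a trade-off precise (an illustrative formalization, not necessarily the exact criterion studied in this project) is to replace the risk-neutral expected return by an exponential-utility objective:

    J(\pi) = \mathbb{E}_\pi\Big[\sum_{t=0}^{\infty} \gamma^t r_t\Big],
    \qquad
    J_\beta(\pi) = \frac{1}{\beta} \log \mathbb{E}_\pi\Big[\exp\Big(\beta \sum_{t=0}^{\infty} \gamma^t r_t\Big)\Big],

where \gamma \in [0,1) discounts future rewards and \beta sets the risk attitude: a Taylor expansion gives J_\beta \approx J + \frac{\beta}{2}\,\mathrm{Var}\big[\sum_t \gamma^t r_t\big], so \beta < 0 penalizes variance (risk-averse behaviour) and \beta \to 0 recovers the risk-neutral objective.
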
Exact solutions in reinforcement learning scale poorly with task complexity and are rarely applicable in practice. To close this gap between theory and reality, the project aims for approximate solutions that not only make favourable decisions but also avoid irrational behaviour and dead ends. Because the approximation is highly adaptive, it can be applied directly to the agent's sensor data, yielding a full sensor-actor control loop. Newly developed algorithms are tested in simulations and on robotic systems. Reinforcement and reward-based learning is also investigated as a means of understanding and modeling human decision making; for details, see the "Research" page "Perception and Decision Making in Uncertain Environments".
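
To illustrate what a value-function approximation applied directly to (preprocessed) sensor data can look like, the following Python sketch performs least-squares temporal difference (LSTD) policy evaluation with linear features. It is a minimal sketch under simplifying assumptions, not the project's published algorithms; all function and variable names are hypothetical.

    import numpy as np

    def lstd(phi, phi_next, rewards, gamma=0.99, ridge=1e-6):
        """LSTD policy evaluation: solve A w = b for the weights of a
        linear value function V(s) ~ w . phi(s).

        phi      -- (T, k) array, features of the visited states
        phi_next -- (T, k) array, features of the successor states
        rewards  -- (T,)   array, observed immediate rewards
        """
        A = phi.T @ (phi - gamma * phi_next)   # (k, k) LSTD matrix
        b = phi.T @ rewards                    # (k,)  projected rewards
        # A small ridge term keeps A well-conditioned when samples are few.
        return np.linalg.solve(A + ridge * np.eye(A.shape[0]), b)

    # Toy usage: random features standing in for preprocessed sensor data.
    rng = np.random.default_rng(0)
    phi = rng.normal(size=(1000, 8))
    phi_next = rng.normal(size=(1000, 8))
    rewards = rng.normal(size=1000)
    w = lstd(phi, phi_next, rewards)
    print("value weights:", w)

The quality of such an approximation hinges on the feature map phi; the publications below on approximation spaces and slow feature analysis address how suitable features can be constructed from raw sensor data.
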


Acknowledgements: This research is funded by the Deutsche Forschungsgemeinschaft (DFG), the Human-Centric Communication Cluster (H-C3), and Technische Universität Berlin.

Selected Publications:

Böhmer, W., Grünewälder, S., Shen, Y., Musial, M. and Obermayer, K. (2013). Construction of Approximation Spaces for Reinforcement Learning. Journal of Machine Learning Research, 14, 2067–2118.


Shen, Y., Stannat, W. and Obermayer, K. (2013). Risk-sensitive Markov Control Processes. SIAM Journal on Control and Optimization, 51, 3652–3672.


Böhmer, W. and Obermayer, K. (2013). Towards Structural Generalization: Factored Approximate Planning. ICRA Workshop on Autonomous Learning.


Böhmer, W., Grünewälder, S., Nickisch, H. and Obermayer, K. (2012). Generating Feature Spaces for Linear Algorithms with Regularized Sparse Kernel Slow Feature Analysis. Machine Learning, 89, 67–86.


Böhmer, W., Grünewälder, S., Nickisch, H. and Obermayer, K. (2011). Regularized Sparse Kernel Slow Feature Analysis. Lecture Notes in Computer Science. Springer-Verlag Berlin Heidelberg, 235–248.


Grünewälder, S. and Obermayer, K. (2011). The Optimal Unbiased Estimator and its Relation to LSTD, TD and MC. Machine Learning, 83, 289–330.

