
Approximate Reinforcement Learning


Fully autonomous agents that interact with their environment (like humans and robots) present challenges very different from those of classic machine learning. The agent must balance the future benefits of its actions against their costs, without the advantage of a teacher or prior knowledge of the environment. Moreover, the objective may not consist of expected rewards alone, but may be formulated as a trade-off between several criteria (for example, reward vs. risk).
Exact solutions in reinforcement learning scale poorly with task complexity and are rarely applicable in practice. To close this gap between theory and practice, the project aims at approximate solutions that not only make favourable decisions but also avoid irrational behaviour and dead ends. The approximation's highly adaptive nature allows it to be applied directly to the agent's sensor data, enabling a full sensor-actor control loop; a minimal sketch of this idea follows below. Newly developed algorithms are tested in simulations and on robotic systems. Reinforcement and reward-based learning is also investigated in the context of understanding and modeling human decision making; for details, see the "Research" page "Perception and Decision Making in Uncertain Environments".
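To make the approximate approach concrete, the following is a minimal sketch of Q-learning with a linear function approximator applied directly to a feature vector derived from sensor data. It illustrates value-function approximation in general, not the project's own algorithms; the feature map, parameter values, and dimensions are assumptions made for the example.

# A minimal sketch of Q-learning with a linear function approximator,
# one standard way to scale reinforcement learning beyond exact
# (tabular) solutions. The feature map, parameter values, and
# dimensions are illustrative assumptions, not the project's algorithm.
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 8    # dimensionality of the sensor feature vector phi(s)
N_ACTIONS = 4
ALPHA = 0.05      # learning rate
GAMMA = 0.95      # discount factor
EPSILON = 0.1     # exploration probability

# One weight vector per action: Q(s, a) = w[a] . phi(s)
w = np.zeros((N_ACTIONS, N_FEATURES))

def phi(raw_obs):
    """Feature map applied directly to raw sensor data (here: identity)."""
    return np.asarray(raw_obs, dtype=float)

def select_action(features):
    """Epsilon-greedy: balance exploring the environment against
    exploiting current estimates, since no teacher provides the answer."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(w @ features))

def td_update(features, action, reward, next_features, done):
    """One temporal-difference step on the linear Q-function."""
    target = reward + (0.0 if done else GAMMA * np.max(w @ next_features))
    td_error = target - w[action] @ features
    w[action] += ALPHA * td_error * features

In a full sensor-actor loop, the agent would call select_action on each new sensor reading, execute the chosen action, and feed the observed reward and next reading back through td_update.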


Acknowledgements: This research is funded by the Deutsche Forschungsgemeinschaft (DFG), the Human-Centric Communication Cluster (H-C3), and Technische Universität Berlin.

Selected Publications:

Regularized Sparse Kernel Slow Feature Analysis
Böhmer, W., Grünewälder, S., Nickisch, H., and Obermayer, K. (2011). In: Gunopulos, D., Hofmann, Th., Malerba, D., and Vazirgiannis, M. (eds.), Machine Learning and Knowledge Discovery in Databases, Lecture Notes in Computer Science (LNAI), vol. 6911, pp. 235–248. Springer-Verlag Berlin Heidelberg, September 2011.

Abstract: This paper develops a kernelized slow feature analysis (SFA) algorithm. SFA is an unsupervised learning method to extract features which encode latent variables from time series. Generative relationships are usually complex, and current algorithms are either not powerful enough or tend to over-fit. We make use of the kernel trick in combination with sparsification to provide a powerful function class for large data sets. Sparsity is achieved by a novel matching pursuit approach that can be applied to other tasks as well. For small but complex data sets, however, the kernel SFA approach leads to over-fitting and numerical instabilities. To enforce a stable solution, we introduce regularization to the SFA objective. Versatility and performance of our method are demonstrated on audio and video data sets.
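To make the SFA objective in the abstract concrete, the following is a minimal sketch of plain linear SFA, the special case that the paper extends with a sparse kernel expansion (via matching pursuit) and regularization. The toy data and all names are assumptions for illustration; the paper's kernelized algorithm is not reproduced here.

# A minimal sketch of plain *linear* slow feature analysis (SFA).
# The paper's sparse kernel expansion and regularization are not
# reproduced; this is the textbook linear special case on toy data.
import numpy as np
from scipy.linalg import eigh

def linear_sfa(X, n_features):
    """X: (T, d) time series. Returns the n slowest feature directions.

    Minimizes the temporal variation E[(w . x_dot)^2] under unit
    variance and decorrelation of the outputs, which reduces to the
    generalized eigenvalue problem A w = lambda B w with
    A = Cov(x_{t+1} - x_t) and B = Cov(x_t).
    """
    X = X - X.mean(axis=0)                    # zero-mean the signal
    X_dot = np.diff(X, axis=0)                # discrete time derivative
    A = X_dot.T @ X_dot / (len(X_dot) - 1)    # covariance of differences
    B = X.T @ X / (len(X) - 1)                # covariance of the signal
    eigvals, eigvecs = eigh(A, B)             # ascending eigenvalues
    return eigvecs[:, :n_features]            # smallest = slowest features

# Toy usage: recover a slow sine hidden among faster noise channels.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 20.0 * np.pi, 5000)
X = np.column_stack([np.sin(0.05 * t) + 0.1 * rng.standard_normal(t.size),
                     rng.standard_normal(t.size),
                     rng.standard_normal(t.size)])
slow = X @ linear_sfa(X, n_features=1)        # extracted slowest signal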
