# Learning on Structured Representations

Learning from examples in order to predict is one of the standard tasks in machine learning. Many techniques have been developed to solve classification and regression problems, but most of them were designed specifically for vectorial data. Vectorial data are convenient because of the structure imposed by the Euclidean metric. For many data sets (protein sequences, text, images, videos, chemical formulas, etc.), however, a vector-based description is not only inconvenient but may simply be wrong, and representations that consider relationships between objects or that embed objects in spaces with non-Euclidean structure are often more appropriate. Here we follow three approaches to extend learning from examples to non-vectorial data. The first focuses on an extension of kernel methods, leading to learning algorithms designed for relational data representations of a general form. The second, designed for objects that are naturally represented by finite combinatorial structures, explores embeddings into quotient spaces of a Euclidean vector space ("structure spaces"). The third considers representations of data in spaces with data-adapted geometries, i.e. using Riemannian manifolds as models for data spaces. In this context we are also interested in active learning schemes based on geometrical concepts. The developed algorithms have been applied in various application domains, including bio- and chemoinformatics (cf. "Research" page "Applications to Problems in Bio- and Chemoinformatics") and the analysis of multimodal neural data (cf. "Research" page "MRI, EM, Autoradiography, and Multi-modal Data").
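To make the idea of a quotient-space embedding more concrete, the following minimal sketch treats small weighted graphs as adjacency matrices and identifies all matrices that differ only by a relabelling of the nodes. The distance between two structures is then the smallest Euclidean distance over all relabellings. The function names (`relabel`, `euclidean`, `structure_distance`) and the toy graphs are illustrative assumptions, not part of any software mentioned on this page, and the brute-force search over all permutations is only feasible for very small graphs.

```python
from itertools import permutations
from math import sqrt


def relabel(adjacency, order):
    """Return the adjacency matrix with its nodes reordered according to `order`."""
    return [[adjacency[i][j] for j in order] for i in order]


def euclidean(x, y):
    """Euclidean (Frobenius) distance between two equally sized matrices."""
    return sqrt(sum((xv - yv) ** 2
                    for x_row, y_row in zip(x, y)
                    for xv, yv in zip(x_row, y_row)))


def structure_distance(x, y):
    """Smallest Euclidean distance between x and any relabelling of y.

    Assumption for this sketch: both graphs have the same number of nodes.
    The search is brute force over all n! permutations.
    """
    n = len(x)
    return min(euclidean(x, relabel(y, order))
               for order in permutations(range(n)))


# Two 3-node weighted graphs that differ only in how their nodes are numbered.
a = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 2.0],
     [0.0, 2.0, 0.0]]
b = relabel(a, (2, 0, 1))

print(euclidean(a, b))           # nonzero: the vector representations differ
print(structure_distance(a, b))  # 0.0: the underlying structures are identical
```

The plain Euclidean distance depends on an arbitrary choice of node numbering, whereas the distance in the quotient space does not, which is the motivation for working with structures rather than with their vector representations.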

Acknowledgement: This work was funded by the BMWA and by the Technical University of Berlin.

### Software:

The Potential Support Vector Machine (P-SVM)

### Selected Publications:

Citation key | Jain2009c |
---|---|
Author | Jain, B. and Obermayer, K. |
Pages | 2667–2714 |
Year | 2009 |
ISSN | 1532-4435 |
Journal | Journal of Machine Learning Research |
Volume | 10 |
Abstract | Finite structures such as point patterns, strings, trees, and graphs occur as "natural" representations of structured data in different application areas of machine learning. We develop the theory of structure spaces and derive geometrical and analytical concepts such as the angle between structures and the derivative of functions on structures. In particular, we show that the gradient of a differentiable structural function is a well-defined structure pointing in the direction of steepest ascent. Exploiting the properties of structure spaces, it will turn out that a number of problems in structural pattern recognition such as central clustering or learning in structured output spaces can be formulated as optimization problems with cost functions that are locally Lipschitz. Hence, methods from nonsmooth analysis are applicable to optimize those cost functions. |
Bibtex Type of Publication | Selected:main selected:structured selected:publications |
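
The abstract's remark that central clustering leads to locally Lipschitz cost functions can be illustrated with a toy computation of a sample mean of graphs. The cost below is a sum of squared structure distances, each of which is a minimum over node relabellings, so the cost is continuous but not differentiable wherever the best relabelling is not unique. The sketch alternates between aligning every graph to the current mean and averaging the aligned matrices. This is an illustrative alternating scheme under the assumption of equally sized graphs and brute-force alignment. It is not claimed to be the algorithm from the publication, and all names in it are hypothetical.

```python
from itertools import permutations


def relabel(adjacency, order):
    """Return the adjacency matrix with its nodes reordered according to `order`."""
    return [[adjacency[i][j] for j in order] for i in order]


def squared_distance(x, y):
    """Squared Euclidean distance between two equally sized matrices."""
    return sum((xv - yv) ** 2
               for x_row, y_row in zip(x, y)
               for xv, yv in zip(x_row, y_row))


def best_alignment(reference, graph):
    """Relabelling of `graph` that is closest to `reference` (brute force)."""
    candidates = (relabel(graph, order)
                  for order in permutations(range(len(graph))))
    return min(candidates, key=lambda g: squared_distance(reference, g))


def cost(mean, graphs):
    """Sum of squared structure distances from `mean` to all graphs."""
    return sum(squared_distance(mean, best_alignment(mean, g)) for g in graphs)


def sample_mean(graphs, max_iter=20):
    """Alternate between aligning the graphs to the mean and averaging them."""
    mean = graphs[0]  # assumption: start from the first graph
    n = len(mean)
    for _ in range(max_iter):
        aligned = [best_alignment(mean, g) for g in graphs]
        new_mean = [[sum(g[i][j] for g in aligned) / len(aligned)
                     for j in range(n)]
                    for i in range(n)]
        if new_mean == mean:  # alignments and average no longer change
            break
        mean = new_mean
    return mean


# Two noisy copies of the same 3-node path graph with different node numbering.
g1 = [[0.0, 1.0, 0.0],
      [1.0, 0.0, 2.0],
      [0.0, 2.0, 0.0]]
g2 = relabel([[0.0, 1.2, 0.0],
              [1.2, 0.0, 1.8],
              [0.0, 1.8, 0.0]], (2, 0, 1))
graphs = [g1, g2]

mean = sample_mean(graphs)
print(cost(g1, graphs), cost(mean, graphs))  # the cost decreases from start to mean
```

Each iteration cannot increase the cost: averaging is optimal for the fixed alignments, and re-aligning can only reduce the distances further. The scheme may still stop at a local minimum, which is one reason methods from nonsmooth optimization are of interest here.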