
# Learning Vector Quantization and Self-organizing Maps

Self-organizing maps, often termed Kohonen maps, are a versatile and widely used tool for exploratory data analysis. Here we were interested in mathematically characterizing the embedding properties of the self-organizing map. We proposed robust learning schemes using deterministic annealing, and we investigated extensions of the self-organizing map to relational data representations, which include pairwise data as a special case. Emphasis was given to formulations based on cost functions and optimization, and we investigated how the different variants of the self-organizing map relate to each other and to the original Kohonen map. We also studied prototype-based classifiers related to Learning Vector Quantization, with a particular focus on improved learning schemes. Self-organizing maps were also investigated in the context of understanding self-organization and pattern formation in neural development. For details, see the "Models of Neural Development" page under "Research".

Acknowledgement: Research was funded by the Technische Universität Berlin.

### Selected Publications:

Citation key | Graepel1997b |
---|---|
Author | Graepel, T. and Burger, M. and Obermayer, K. |
Pages | 3876–3890 |
Year | 1997 |
DOI | 10.1103/PhysRevE.56.3876 |
Journal | Physical Review E |
Volume | 56 |
Publisher | APS |
Abstract | We describe the development of neighborhood-preserving stochastic maps in terms of a probabilistic clustering problem. Starting from a cost function for central clustering that incorporates distortions from channel noise we derive a soft topographic vector quantization algorithm (STVQ) which is based on the maximum entropy principle and which maximizes the corresponding likelihood in an expectation-maximization (EM) fashion. Among other algorithms a probabilistic version of Kohonen's self-organizing map (SOM) is derived from STVQ as a computationally efficient approximation of the E-step. The foundation of STVQ in statistical physics motivates a deterministic annealing scheme in the temperature parameter $\beta$, and leads to a robust minimization algorithm of the clustering cost function. In particular, this scheme offers an alternative to the common stepwise shrinking of the neighborhood width in the SOM and makes it possible to use its neighborhood function solely to encode the desired neighborhood relations between the clusters. The annealing in $\beta$, which corresponds to a stepwise refinement of the resolution of representation in data space, leads to the splitting of an existing cluster representation during the "cooling" process. We describe this phase transition in terms of the covariance matrix C of the data and the transition matrix H of the channel noise and calculate the critical temperatures and modes as functions of the eigenvalues and eigenvectors of C and H. The analysis is extended to the phenomenon of the automatic selection of feature dimensions in dimension-reducing maps, thus leading to a "batch" alternative to the Fokker-Planck formalism for on-line learning. The results provide insights into the relation between the width of the neighborhood and the temperature parameter $\beta$: It is shown that the phase transition which leads to the representation of the excess dimensions can be triggered not only by a change in the statistics of the input data but also by an increase of $\beta$, which corresponds to a decrease in noise level. The theoretical results are validated by numerical methods. In particular, a quantity equivalent to the heat capacity in thermodynamics is introduced to visualize the properties of the annealing process. |
Bibtex Type of Publication | Selected:quantization |
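
The EM procedure with annealing in $\beta$ that the abstract describes can be illustrated with a minimal sketch. This is not the authors' code: the update formulas are what one obtains by assuming the cost $E = \frac{1}{2}\sum_t\sum_r m_{tr}\sum_s h_{rs}\,\lVert x_t - w_s\rVert^2$ with Gibbs-distributed soft assignments, and the names `stvq` and `neighborhood_matrix`, the Gaussian chain neighborhood, the geometric $\beta$ schedule, and all default values are hypothetical choices made for illustration.

```python
import numpy as np


def neighborhood_matrix(n_units, width):
    """Row-normalized matrix H on a 1-D chain of units.

    A Gaussian of fixed width is an assumption of this sketch; it stands
    in for the channel-noise transition probabilities h[r, s].
    """
    idx = np.arange(n_units)
    dist2 = (idx[:, None] - idx[None, :]) ** 2
    h = np.exp(-dist2 / (2.0 * width ** 2))
    return h / h.sum(axis=1, keepdims=True)


def stvq(x, n_units=10, width=1.0, beta_start=0.1, beta_end=100.0,
         beta_factor=1.2, em_steps=20, seed=0):
    """Soft topographic vector quantization with deterministic annealing.

    x has shape (n_samples, n_features). Returns the prototypes w with
    shape (n_units, n_features) and the soft assignments p with shape
    (n_samples, n_units). All parameter names are illustrative.
    """
    rng = np.random.default_rng(seed)
    h = neighborhood_matrix(n_units, width)
    # At low beta every prototype sits at the data mean.
    w = np.tile(x.mean(axis=0), (n_units, 1))
    p = np.full((x.shape[0], n_units), 1.0 / n_units)
    beta = beta_start
    while beta <= beta_end:
        # Tiny noise lets clusters split once beta passes a critical value.
        w = w + 1e-6 * rng.standard_normal(w.shape)
        for _ in range(em_steps):
            # d2[t, s] = ||x_t - w_s||^2
            d2 = ((x[:, None, :] - w[None, :, :]) ** 2).sum(axis=2)
            # e[t, r] = 1/2 * sum_s h[r, s] * d2[t, s]
            e = 0.5 * d2 @ h.T
            # E-step: Gibbs assignments p[t, r] proportional to exp(-beta * e[t, r])
            logits = -beta * e
            logits -= logits.max(axis=1, keepdims=True)  # numerical stability
            p = np.exp(logits)
            p /= p.sum(axis=1, keepdims=True)
            # M-step: w_s is the average of x_t weighted by
            # q[t, s] = sum_r p[t, r] * h[r, s]
            q = p @ h
            w = (q.T @ x) / np.maximum(q.sum(axis=0), 1e-300)[:, None]
        beta *= beta_factor  # "cooling": raise beta geometrically
    return w, p
```

The neighborhood width stays fixed throughout; only $\beta$ changes, which mirrors the abstract's point that annealing can replace the usual shrinking of the neighborhood. Calling `stvq(x)` on an array of samples returns the prototypes together with the final soft assignments.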