Deep Networks
Deep neural networks are highly successful in many application areas, yet it is still unclear why. In particular, the success of overparameterized neural networks contradicts the predictions of classical statistical learning theory. By analyzing the representations these networks learn, we try to gain new insights. We are interested in the following questions.
- Are (visual) tasks "related" and can we quantify the "closeness" of tasks?
- What are the contributions of data set (input statistics) vs. task demand (input-output statistics)?
- How can we efficiently mine these relationships?
- Does the concept of an intermediate-level representation help?
- Are there universal representations for (visual) data that allow many "everyday" tasks to be solved efficiently?
We currently meet every Thursday at 2 pm to discuss these questions and share new insights. If you are interested, don't hesitate to get in touch at deep.networks@ni.tu-berlin.de.
Selected Publications:
Goerttler, T. and Obermayer, K. (2021). Learning to Learn workshop at ICLR 2021. (Citation key: Goerttler2021)

Abstract: In past years model-agnostic meta-learning (MAML) has been one of the most promising approaches in meta-learning. It can be applied to different kinds of problems, e.g., reinforcement learning, but also shows good results on few-shot learning tasks. Besides their tremendous success in these tasks, it has still not been fully revealed yet, why it works so well. Recent work proposes that MAML rather reuses features than rapidly learns. In this paper, we want to inspire a deeper understanding of this question by analyzing MAML's representation. We apply representation similarity analysis (RSA), a well-established method in neuroscience, to the few-shot learning instantiation of MAML. Although some part of our analysis supports their general results that feature reuse is predominant, we also reveal arguments against their conclusion. The similarity-increase of layers closer to the input layers arises from the learning task itself and not from the model. In addition, the representations after inner gradient steps make a broader change to the representation than the changes during meta-training.
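The abstract above rests on representation similarity analysis (RSA): layer activations are summarized as representational dissimilarity matrices (RDMs), and two representations are compared by correlating their RDMs. The following is a minimal, hypothetical sketch of that computation, not the code used in the paper: it tracks the hidden layer of a toy NumPy network before and after a single MAML-style inner gradient step on a made-up regression task (network size, loss, and learning rate are illustrative assumptions).

```python
# Hypothetical sketch: RSA between a layer's representations before and after
# one inner-loop gradient step. Toy network and task; not the paper's setup.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

def rdm(acts):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the activation vectors of all pairs of inputs."""
    return 1.0 - np.corrcoef(acts)            # acts: (n_inputs, n_features)

def rsa(rdm_a, rdm_b):
    """RSA score: Spearman correlation of the upper-triangular RDM entries."""
    iu = np.triu_indices_from(rdm_a, k=1)
    return spearmanr(rdm_a[iu], rdm_b[iu]).correlation

# Toy inputs, targets, and a one-hidden-layer network (stand-ins for the
# few-shot setting analysed in the paper).
X = rng.normal(size=(32, 10))                  # 32 inputs, 10 features
y = rng.normal(size=(32, 1))
W1 = rng.normal(scale=0.5, size=(10, 64))
W2 = rng.normal(scale=0.5, size=(64, 1))

def hidden(X, W1):
    return np.tanh(X @ W1)                     # layer whose representation we track

# One inner gradient step on a squared-error loss (learning rate 0.1).
H = hidden(X, W1)
err = H @ W2 - y
grad_W2 = H.T @ err / len(X)
grad_H = err @ W2.T * (1 - H**2)               # backprop through tanh
grad_W1 = X.T @ grad_H / len(X)
W1_adapted = W1 - 0.1 * grad_W1

# Compare the hidden-layer representation before and after adaptation.
score = rsa(rdm(hidden(X, W1)), rdm(hidden(X, W1_adapted)))
print(f"RSA between pre- and post-adaptation representations: {score:.3f}")
```

A higher score means the gradient step changed that layer's representation less; comparisons of this kind, made layer by layer and for inner-loop adaptation versus meta-training, underlie the conclusions summarized in the abstract.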