[21 February, 13:00] Let's Talk ML

Radek Bartyzal - Born Again Neural Networks (slides)

Knowledge distillation is the process of training a compact model (the student) to approximate the outputs of a previously trained, more complex model (the teacher).
The authors of this paper took inspiration from that idea and trained a student of the same complexity as its teacher, finding that the student surpasses the teacher in many cases. They also experiment with training a student whose architecture differs from the teacher's, with interesting results.
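To make the idea concrete, below is a minimal sketch of a typical distillation loss in PyTorch: the student is trained against the teacher's softened predictions plus the usual cross-entropy on the labels. The temperature T and weight alpha are illustrative values, not taken from the paper.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft targets: match the student's softened distribution to the teacher's.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard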

This will be one longer (40 min) talk in which I will also describe the relevant architectures used in the paper (DenseNet, Wide ResNet).

