The paper studies the problem of training neural networks when little training data is available. The authors propose a method called data distillation, which trains a small student network by distilling knowledge from a large teacher network. The method is evaluated on several image classification tasks and achieves state-of-the-art results on some of them.
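The summary does not say which objective the student is trained with. As a rough illustration only, the sketch below assumes the common teacher-student formulation: a temperature-softened KL term between teacher and student outputs, mixed with the usual cross-entropy on the true label. The function names and the `temperature` and `alpha` parameters are hypothetical and are not taken from the paper.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_norm for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Assumed teacher-student loss for one example (not the paper's stated loss).

    alpha weights the soft-target term; (1 - alpha) weights the hard-label term.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # KL(teacher || student) on the softened distributions.
    soft = sum(math.exp(lt) * (lt - ls)
               for lt, ls in zip(log_p_teacher, log_p_student))

    # Standard cross-entropy of the student against the ground-truth label.
    hard = -log_softmax(student_logits)[label]

    # The T^2 factor is the conventional rescaling of the soft-target term.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


if __name__ == "__main__":
    teacher = [0.5, 1.0, 4.0]
    student = [1.0, 1.5, 2.0]
    print(distillation_loss(student, teacher, label=2))
```

Under these assumptions, the soft term vanishes when the student reproduces the teacher's logits exactly, so the loss then reduces to the weighted cross-entropy on the label.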
What were the main Axis Powers during World War II?
Who was the leader of the Axis Powers?
What was the outcome of World War II for the Axis Powers?