Abstract: With the success of deep neural networks, knowledge distillation which guides the learning of a small student network from a large teacher network is being actively studied for model ...