
Sparse Deep Neural Networks Through L(1,∞)-Weight Normalization

Date: 2018-10-10    Source: KLMM

Title:         Sparse Deep Neural Networks Through L(1,∞)-Weight Normalization

Speaker:       Prof. Zhouwang Yang (University of Science and Technology of China)

Time & Venue:  2018.10.24, 15:30, Room N205

Abstract:      We study L_{1,\infty}-weight normalization for deep neural networks to achieve sparse architectures. Empirical evidence suggests that inducing sparsity can relieve overfitting, and that weight normalization can accelerate algorithm convergence. In this paper, we theoretically establish generalization error bounds for both regression and classification under L_{1,\infty}-weight normalization. It is shown that the upper bounds are independent of the network width and depend on the network depth k only through a factor of sqrt(k); these are the best available bounds for networks with bias neurons. These results provide theoretical justification for the use of such weight normalization. We also develop an easily implemented gradient projection descent algorithm to practically obtain a sparse neural network. We perform various experiments to validate our theory and demonstrate the effectiveness of the resulting approach.
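For readers who want a concrete picture of the gradient projection step mentioned above, the following is a minimal NumPy sketch. It assumes the L_{1,\infty} norm of a weight matrix is taken as the maximum of the row-wise L1 norms, so that projection onto the L_{1,\infty} ball reduces to projecting each row onto an L1 ball (the standard sort-and-threshold procedure of Duchi et al., 2008). The function names and the fixed radius are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def project_l1_ball(v, radius):
    # Project a vector v onto the L1 ball of the given radius (Duchi et al., 2008).
    # If v is already inside the ball, it is returned unchanged.
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]                 # magnitudes sorted in descending order
    cssv = np.cumsum(u)                          # their cumulative sums
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * idx > (cssv - radius))[0][-1]
    theta = (cssv[rho] - radius) / (rho + 1.0)   # soft-thresholding level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def project_l1_inf_ball(W, radius):
    # Project a weight matrix onto the L(1,inf) ball: clip every row's L1 norm
    # to `radius`; rows that violate the constraint become (partially) sparse.
    return np.vstack([project_l1_ball(row, radius) for row in W])

def projected_gradient_step(W, grad, lr, radius):
    # One gradient projection descent step: gradient update, then projection.
    return project_l1_inf_ball(W - lr * grad, radius)
```

In training, such a step would replace the plain gradient update for each weight matrix, with the radius acting as the constraint level that controls how sparse the resulting network becomes.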
