3.We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We also provide comprehensive empirical evidence showing that these residual networks are easier to optimize and can gain substantial accuracy from considerably increased depth.
5.An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won first place in the ILSVRC 2015 classification task. We also present analyses on CIFAR-10 with networks of 100 to 1000 layers.
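As a brief restatement of the formulation mentioned above (notation follows the ResNet paper, with x the layer input, H the desired underlying mapping, and F the residual the stacked layers actually fit):

```latex
% Residual reformulation: instead of fitting H(x) directly,
% the stacked layers fit the residual F(x), and the original
% mapping is recovered by adding the input back.
\[
  \mathcal{F}(\mathbf{x}) := \mathcal{H}(\mathbf{x}) - \mathbf{x},
  \qquad
  \mathcal{H}(\mathbf{x}) = \mathcal{F}(\mathbf{x}) + \mathbf{x}
\]
```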
1.Deep networks naturally integrate low/mid/high-level features and classifiers in an end-to-end multilayer fashion, and the “levels” of features can be enriched by the number of stacked layers (depth).
2.A notorious obstacle is the problem of vanishing/exploding gradients, which hampers convergence from the beginning. This problem, however, has been largely addressed by normalized initialization and intermediate normalization layers, which enable networks with tens of layers to start converging for stochastic gradient descent (SGD) with backpropagation.
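To make the two remedies concrete, here is a minimal PyTorch-style sketch (my own illustration, not code from the paper; the helper name conv_bn_relu and the channel sizes are arbitrary) combining a normalized He/Kaiming weight initialization with an intermediate batch-normalization layer:

```python
import torch
import torch.nn as nn

def conv_bn_relu(in_ch: int, out_ch: int) -> nn.Sequential:
    """One conv -> batch-norm -> ReLU unit: the normalization layer sits between
    weight layers, and the conv weights use a normalized ("Kaiming") init."""
    conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False)
    # Normalized initialization keeps activation/gradient magnitudes roughly stable
    # across layers, mitigating vanishing/exploding gradients at the start of training.
    nn.init.kaiming_normal_(conv.weight, mode="fan_out", nonlinearity="relu")
    return nn.Sequential(conv, nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

# A plain stack of such units can start converging under SGD with backpropagation.
net = nn.Sequential(*[conv_bn_relu(16, 16) for _ in range(10)])
out = net(torch.randn(2, 16, 32, 32))  # forward pass on a dummy batch
```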
3.With the network depth increasing, accuracy gets saturated (which might be unsurprising) and then degrades rapidly. Unexpectedly, such degradation is not caused by overfitting, and adding more layers to a suitably deep model leads to higher training error, as reported in prior work and thoroughly verified by our experiments.
4.There exists a solution by construction to the deeper model: the added layers are identity mappings, and the other layers are copied from the learned shallower model. The existence of this constructed solution indicates that a deeper model should produce no higher training error than its shallower counterpart.
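A tiny sketch of this constructed solution (my own toy illustration, not the paper's code): the deeper network copies the learned shallower model and appends identity layers, so its outputs, and hence its training error, match the shallower model exactly.

```python
import copy
import torch
import torch.nn as nn

# Stand-in for a learned shallower model (weights would normally be trained).
shallow = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8))

# Constructed deeper model: copy the learned layers, then append identity mappings.
deeper = nn.Sequential(copy.deepcopy(shallow), nn.Identity(), nn.Identity())

x = torch.randn(4, 8)
# The added identity layers change nothing, so the deeper model can do no worse.
assert torch.allclose(shallow(x), deeper(x))
```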
5.In this paper, we address the degradation problem by introducing a deep residual learning framework. Instead of hoping each few stacked layers directly fit a desired underlying mapping H(x), we explicitly let these layers fit a residual mapping F(x) := H(x) - x, so the original mapping is recast as F(x) + x (see the implementation sketch after the list of advantages below).
advantages:
1) Our extremely deep residual nets are easy to optimize, but the counterpart “plain” nets (that simply stack layers) exhibit higher training error when the depth increases;
2) Our deep residual nets can easily enjoy accuracy gains from greatly increased depth, producing results substantially better than previous networks.
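The sketch below (a minimal PyTorch-style residual block of my own, not the paper's exact architecture; the class name ResidualBlock and the channel count are assumptions) shows the idea: the stacked layers learn the residual F(x), and an identity shortcut adds the input back, so the block realizes F(x) + x.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Two stacked conv layers fit the residual F(x); the block outputs F(x) + x."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = self.bn2(self.conv2(F.relu(self.bn1(self.conv1(x)))))  # F(x)
        return F.relu(residual + x)  # identity shortcut: F(x) + x

# Usage: stacking many such blocks gives a very deep net that remains easy to optimize.
net = nn.Sequential(*[ResidualBlock(16) for _ in range(20)])
y = net(torch.randn(2, 16, 32, 32))
```

The identity shortcut adds neither extra parameters nor noticeable computation, which is why stacking these blocks scales to great depth.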