Dropout
Notes
- The convergence properties of dropout can be understood in terms of stochastic gradient descent.[1]
 - The weights of the network will be larger than normal because of dropout.[2]
 - Therefore, before finalizing the network, the weights are first scaled down by the chosen dropout rate; equivalently, the layer outputs can be scaled down by the same factor at test time (see the weight-scaling sketch after this list).[2]
 - The rescaling can instead be folded into training itself; this is sometimes called “inverse dropout” and requires no modification of the weights at test time.[2]
 - Now that we know a little bit about dropout and the motivation, let’s go into some detail.[3]
 - If you just wanted an overview of dropout in neural networks, the above two sections would be sufficient.[3]
 - In dropout, we randomly shut down some fraction of a layer’s neurons at each training step by zeroing out the neuron values.[4]
 - The fraction of neurons to be zeroed out is known as the dropout rate.[4]
 - The two images represent dropout applied to a layer of 6 units, shown at multiple training steps.[4]
 - The dropout rate is 1/3, and the remaining 4 neurons at each training step have their values scaled by ×1.5, i.e. by 1/(1 − 1/3) (see the inverted-dropout sketch after this list).[4]
 - In this paper we conduct an empirical study to investigate the effect of dropout and batch normalization on training deep learning models.[5]
 - Section 3 systematically describes the depth calculation model based on adaptive dropout proposed in this paper.[6]
 - Finally, the value of the dropout rate for each layer needs to be in the interval (0, 1).[6]
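The weight-scaling sketch below contrasts the two conventions described in the notes: classic dropout, where the learned weights are scaled down before the network is finalized, and inverse dropout, where the rescaling happens during training so the weights are left untouched afterwards. It is a minimal NumPy illustration for a single linear layer; the names (`weights`, `keep_prob`, `rate`) and the use of NumPy are assumptions for the example, not taken from the cited sources.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(6, 4))   # weights of one hypothetical linear layer
x = rng.normal(size=6)              # one input vector
rate = 1 / 3                        # dropout rate (fraction of units dropped)
keep_prob = 1.0 - rate              # probability that a unit is retained

# Classic dropout: units were zeroed with a plain 0/1 mask during training,
# so the finalized weights are scaled down before being used at test time.
y_classic = x @ (weights * keep_prob)

# Inverse dropout: the surviving activations were already scaled up by
# 1/keep_prob during training, so the weights are used unchanged at test time.
y_inverted = x @ weights
```

Both routes aim at the same thing: keeping the expected activation seen at test time consistent with what the network saw during training.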
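The inverted-dropout sketch below reproduces the 6-unit example from the notes: each unit is kept with probability 2/3, and the survivors are scaled by 1/(1 − 1/3) = 1.5. It is a minimal NumPy sketch with assumed names (`inverted_dropout`, `rate`); note that standard implementations sample each unit independently, so exactly 2 of the 6 units being dropped at a given step holds only on average.

```python
import numpy as np

def inverted_dropout(activations, rate, rng):
    """Zero out roughly a `rate` fraction of units and scale the survivors
    by 1 / (1 - rate) so the expected activation is unchanged."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob  # Bernoulli keep-mask
    return activations * mask / keep_prob

rng = np.random.default_rng(42)
layer = np.ones(6)                                # a layer of 6 units
out = inverted_dropout(layer, rate=1 / 3, rng=rng)
print(out)  # surviving units show 1.5; dropped units show 0.0
```

Because each scaled activation equals its original value in expectation, no further rescaling is needed at test time.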
 
Sources
- [1] The dropout learning algorithm
 - [2] A Gentle Introduction to Dropout for Regularizing Deep Neural Networks
 - [3] Dropout in (Deep) Machine learning
 - [4] Dropout in Neural Networks
 - [5] Dropout vs. batch normalization: an empirical study of their impact to deep learning
 - [6] Medical Image Segmentation Algorithm Based on Optimized Convolutional Neural Network-Adaptive Dropout Depth Calculation