Page 1
Sonpvh
Page 2
1. Overfitting
2. Regularization
3. Validation
4. Model selection
Page 3
We can fit any function … But noise …
& not function
Page 4
• Overfitting: “fitting the data more than is warranted” [1]
• Fitting the noise
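Page 6
$y = f(x) + \epsilon(x) = \sum_{q=0}^{Q} \alpha_q x^q + \epsilon(x)$, where the noise $\epsilon(x)$ has variance $\sigma^2$
Observation = target function + noise
$Q$: target complexity (deterministic noise)
$\sigma^2$: noise level (stochastic noise)
$N$: sample size
Deterministic noise and stochastic noise both lead to overfitting.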
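As a rough illustration of how target complexity, noise level, and sample size interact, the sketch below draws a random polynomial target of degree Q, adds Gaussian noise of variance σ², and fits N points with a low-order and a high-order polynomial. The values of Q, σ², and N, the uniform input distribution, and the helper names `sample` and `errors` are assumptions chosen for illustration; they do not come from the slides.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def sample(alpha, n, sigma2, rng):
    """Draw n noisy observations y = f(x) + eps from a polynomial target."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = P.polyval(x, alpha) + np.sqrt(sigma2) * rng.standard_normal(n)
    return x, y

def errors(degree, x_train, y_train, x_test, y_test):
    """Least-squares polynomial fit; returns (E_in, estimated E_out) as MSEs."""
    coef = P.polyfit(x_train, y_train, degree)
    e_in = float(np.mean((P.polyval(x_train, coef) - y_train) ** 2))
    e_out = float(np.mean((P.polyval(x_test, coef) - y_test) ** 2))
    return e_in, e_out

rng = np.random.default_rng(0)
Q, sigma2, N = 10, 0.25, 15                 # illustrative values (assumed)
alpha = rng.standard_normal(Q + 1)          # random target coefficients
x_train, y_train = sample(alpha, N, sigma2, rng)
x_test, y_test = sample(alpha, 5000, sigma2, rng)   # large fresh sample for E_out

results = {d: errors(d, x_train, y_train, x_test, y_test) for d in (2, 10)}
for d, (e_in, e_out) in results.items():
    print(f"degree {d:2d}: E_in = {e_in:.4f}, E_out = {e_out:.4f}")
```

The higher-order fit can never have a larger in-sample error than the lower-order one, since the models are nested; whether its out-of-sample error is worse depends on Q, σ², and N, which is what the experiment lets you vary.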
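Page 7
$\mathbb{E}_D\left[E_{out}(g^{(D)})\right] = \mathbb{E}_{D,x}\left[(g^{(D)}(x) - \bar{g}(x))^2\right] + \mathbb{E}_x\left[(\bar{g}(x) - f(x))^2\right] + \mathbb{E}_{\epsilon,x}\left[(\epsilon(x))^2\right]$
Noise: the first term is the variance, the second is the deterministic noise, the third is the stochastic noise.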
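The terms of this decomposition can be estimated by simulation. The sketch below is one way to do it, assuming a target $f(x) = \sin(\pi x)$, inputs uniform on $[-1, 1]$, and least-squares polynomial fits; all of these, along with the function name `bias_variance`, are illustrative assumptions rather than choices made in the slides.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def bias_variance(degree, n=10, sigma2=0.1, n_datasets=500, seed=0):
    """Monte Carlo estimate of variance, deterministic noise and stochastic noise."""
    rng = np.random.default_rng(seed)
    f = lambda x: np.sin(np.pi * x)              # assumed target function
    x_grid = np.linspace(-1.0, 1.0, 201)         # grid standing in for E_x
    preds = np.empty((n_datasets, x_grid.size))
    for i in range(n_datasets):                  # one fitted g^(D) per data set D
        x = rng.uniform(-1.0, 1.0, size=n)
        y = f(x) + np.sqrt(sigma2) * rng.standard_normal(n)
        preds[i] = P.polyval(x_grid, P.polyfit(x, y, degree))
    g_bar = preds.mean(axis=0)                   # average hypothesis
    variance = float(np.mean((preds - g_bar) ** 2))
    det_noise = float(np.mean((g_bar - f(x_grid)) ** 2))
    return {
        "variance": variance,
        "deterministic_noise": det_noise,
        "stochastic_noise": sigma2,
        "e_out": variance + det_noise + sigma2,
        "mse_vs_target": float(np.mean((preds - f(x_grid)) ** 2)),
    }

for degree in (1, 3):
    print(degree, bias_variance(degree))
```

Here `mse_vs_target` is computed directly from the fitted curves, so it should agree with `variance + deterministic_noise` up to rounding, which gives a quick sanity check on the decomposition.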
Page 9
Use learning curves to detect overfitting
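A learning curve plots training and validation error against the number of training points. The sketch below computes the numbers behind such a curve for a polynomial fit; the target $\sin(\pi x)$, the noise level, the training sizes, and the function name `learning_curve` are assumptions made for illustration only.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def learning_curve(degree, sizes, sigma2=0.25, n_val=2000, n_trials=50, seed=0):
    """Return [(N, mean E_in, mean E_val)] for a polynomial fit of the given degree."""
    rng = np.random.default_rng(seed)

    def sample(n):
        x = rng.uniform(-1.0, 1.0, size=n)
        y = np.sin(np.pi * x) + np.sqrt(sigma2) * rng.standard_normal(n)
        return x, y

    x_val, y_val = sample(n_val)                 # fixed validation set
    curve = []
    for n in sizes:
        e_in = e_val = 0.0
        for _ in range(n_trials):                # average over fresh training sets
            x, y = sample(n)
            coef = P.polyfit(x, y, degree)
            e_in += np.mean((P.polyval(x, coef) - y) ** 2)
            e_val += np.mean((P.polyval(x_val, coef) - y_val) ** 2)
        curve.append((n, float(e_in / n_trials), float(e_val / n_trials)))
    return curve

for n, e_in, e_val in learning_curve(degree=5, sizes=(10, 20, 40, 80, 160)):
    print(f"N={n:4d}  E_in={e_in:.3f}  E_val={e_val:.3f}  gap={e_val - e_in:.3f}")
```

A large gap between validation and training error, which narrows as N grows, is the usual sign that the model is overfitting at small N.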
Page 10
▪ Definition (regularization): “any modification we make to a learning algorithm that is intended to reduce its generalization error but not its training error” [3]
$Q$: target complexity
$\sigma^2$: noise level
$N$: sample size
Overfitting
Page 11
1. Parameter Norm Penalties
2. Norm Penalties as Constrained Optimization
Page 12
L1: Lasso regression
L2: Ridge regression
Page 13
$\sum (y - \sum_i w_i x_i)^2 + \lambda \sum_i |w_i|$ (L1, Lasso) or $\sum (y - \sum_i w_i x_i)^2 + \lambda \sum_i (w_i)^2$ (L2, Ridge)
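The two penalized objectives can be written out directly. The sketch below evaluates both and, for the L2 case, uses the standard closed-form minimizer $(X^T X + \lambda I)^{-1} X^T y$; the L1 objective has no such closed form and is normally minimized iteratively. The function names and the synthetic data are assumptions for illustration.

```python
import numpy as np

def lasso_objective(w, X, y, lam):
    """Sum of squared errors plus lam * sum(|w_i|) (L1 penalty)."""
    return float(np.sum((y - X @ w) ** 2) + lam * np.sum(np.abs(w)))

def ridge_objective(w, X, y, lam):
    """Sum of squared errors plus lam * sum(w_i^2) (L2 penalty)."""
    return float(np.sum((y - X @ w) ** 2) + lam * np.sum(w ** 2))

def ridge_fit(X, y, lam):
    """Closed-form minimizer of ridge_objective."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Synthetic data (assumed): 30 points, 5 features, a few zero weights.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 5))
y = X @ np.array([2.0, -1.0, 0.0, 0.0, 0.5]) + 0.1 * rng.standard_normal(30)

for lam in (0.0, 1.0, 100.0):
    w = ridge_fit(X, y, lam)
    print(f"lambda={lam:6.1f}  ||w||={np.linalg.norm(w):.3f}  "
          f"objective={ridge_objective(w, X, y, lam):.3f}")
```

Increasing `lam` shrinks the norm of the ridge solution, trading a higher training error for a simpler fit.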
Page 14
Noise Fitting …
Page 17
$E_{out}(h) = E_{in}(h) + \text{overfitting penalty}$
Regularization estimates the overfitting penalty.
Validation estimates $E_{out}(h)$ directly.
TOP SECRET
Target complexity
Noise level
VALIDATION
Page 19
1. K-fold cross-validation
2. Holdout or train/test split
3. Stratified K-fold cross-validation
4. Repeated cross-validation
5. Leave-one-out cross-validation (LOOCV)
6. …
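Of the schemes listed above, stratified K-fold is perhaps the least self-explanatory: it builds folds so that each one keeps roughly the same class proportions as the full data set. A minimal sketch, assuming class labels are hashable and using a hypothetical helper named `stratified_folds`:

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, seed=0):
    """Assign each index to one of k folds, keeping class proportions roughly equal."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    offset = 0                                   # rotate so leftovers spread over folds
    for members in by_class.values():
        rng.shuffle(members)
        for pos, i in enumerate(members):        # deal each class round-robin
            folds[(offset + pos) % k].append(i)
        offset += len(members)
    return folds

labels = ["a"] * 6 + ["b"] * 3
for fold in stratified_folds(labels, k=3):
    print(sorted(fold), [labels[i] for i in sorted(fold)])
```

Each fold then serves once as the validation set while the remaining folds are used for training, exactly as in plain K-fold.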
Page 20
Train-test split
+ Simple, cheap
- Wastes data …
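A holdout split takes only a few lines, which is why it is cheap; the held-out points are never used for fitting, which is why it wastes data. The sketch below uses only the standard library; the function name, the default 20% test fraction, and the fixed seed are assumptions.

```python
import random

def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle a copy of `data` and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    items = list(data)
    random.Random(seed).shuffle(items)           # seeded for reproducibility
    n_test = max(1, round(len(items) * test_fraction))
    return items[n_test:], items[:n_test]

train, test = train_test_split(range(10), test_fraction=0.3)
print(len(train), len(test))
```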
Page 22
K-fold CV
+ Only wastes $N/K$ data
+ Only K times more expensive than train-test split
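Both points can be seen in a small implementation: each of the K rounds holds out about N/K points, and the model is fitted K times. The sketch below is a plain-Python version; `k_fold_indices`, `cross_val_mse`, and the toy model `fit_mean` are hypothetical names, and mean squared error is an assumed choice of metric.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Yield (train_idx, val_idx) pairs; every index is validated exactly once."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]        # fold sizes differ by at most 1
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]

def cross_val_mse(xs, ys, fit, k=5, seed=0):
    """Average validation MSE over k folds; `fit(xs, ys)` must return a predictor."""
    scores = []
    for train, val in k_fold_indices(len(xs), k, seed):
        predict = fit([xs[j] for j in train], [ys[j] for j in train])
        scores.append(sum((predict(xs[j]) - ys[j]) ** 2 for j in val) / len(val))
    return sum(scores) / k

def fit_mean(xs, ys):
    """Toy model for illustration: always predict the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean

xs = list(range(20))
ys = [2.0 * x + 1.0 for x in xs]
print(cross_val_mse(xs, ys, fit_mean, k=5))
```

Setting `k` equal to the number of points turns the same code into leave-one-out cross-validation.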
Page 23
[Figure: K-fold (K = 5); holdout or train-test split; early stopping; eval]
Page 24
[7]
Page 25
1. Yaser S. Abu-Mostafa, Malik Magdon-Ismail, Hsuan-Tien Lin. Learning From Data: A Short Course. AMLBook, 2012.
2. https://medium.com/greyatom/what-is-underfitting-and-overfitting-in-machine-learning-and-how-to-deal-with-it-6803a989c76
3. Ian Goodfellow, Yoshua Bengio, Aaron Courville. Deep Learning. Chapter 7: Regularization for Deep Learning.