Bias-Variance Trade Off

Cerca_Trova
4 min read · Oct 15, 2020

When it comes to the accuracy or performance of a machine learning model, it is important to understand the bias-variance trade-off. Most of us know the concepts of bias and variance in theory, but when it comes to building a model, a deeper understanding of this trade-off is a good thing to fall back on.

When we train a model, the main goal is to find the relationship between the dependent and independent variables in the form of a function. We will call the features X1, X2, …, Xn and the target variable Y. The relationship between them can be written as:

Y = f(X1, X2, …, Xn) + e

Where f is a fixed but unknown function of the X values and e represents the irreducible error. The goal is to find the best possible estimate of f(X), but a perfect estimate is not possible, and that gap shows up as error.

The error mainly comes in two types:

  1. Reducible Error
  2. Irreducible Error

Reducible errors are errors that can be reduced by parameter tuning, normalization and so on, which may improve the accuracy of the model by a small amount. However, we cannot build a model with 100% accuracy, and that is because of the irreducible error (e). In other words, the irreducible error is the information about Y that X cannot give.
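To make the irreducible error concrete, here is a minimal sketch using only the Python standard library. The sine function used for f and the noise level NOISE_SD are assumptions chosen for illustration; in a real problem f is unknown. The point is that even if we predicted with the true f itself, the MSE would not go to zero; it settles near the variance of e.

```python
import math
import random

random.seed(0)

NOISE_SD = 0.3  # assumed standard deviation of the irreducible error e


def f(x):
    # Hypothetical "true" function, chosen only for this illustration
    return math.sin(2 * math.pi * x)


# Draw samples from Y = f(X) + e
xs = [random.random() for _ in range(20000)]
ys = [f(x) + random.gauss(0, NOISE_SD) for x in xs]

# Predict with the true f: no model can be expected to do better than this
mse_true_f = sum((y - f(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

print(f"MSE when predicting with the true f: {mse_true_f:.4f}")
print(f"Variance of e:                       {NOISE_SD ** 2:.4f}")
```

The two printed numbers should be close to each other, which is why no amount of tuning removes this part of the error.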

When we want to find out how well the model fits the data, the main method that comes to mind is MSE (Mean Squared Error).

This is nothing but the average of the squared differences between the predicted outcome and the actual outcome. The expected value of this error on new data can be broken down into three important parts:

  1. Variance of the estimated f(X)
  2. Squared bias of the estimated f(X)
  3. The variance of the error term (e)
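The three parts listed above can be checked with a small simulation. This is only a sketch under stated assumptions: the sine function for f, the noise level, the straight-line model and all helper names (sample_training_set, fit_line) are made up for illustration. We train the same model on many training sets, measure squared bias and variance of its predictions, and compare their sum plus the noise variance with the expected test MSE.

```python
import math
import random

random.seed(1)
NOISE_SD = 0.3  # assumed standard deviation of e


def f(x):
    # Hypothetical true function, chosen only for this illustration
    return math.sin(2 * math.pi * x)


def sample_training_set(n=30):
    xs = [random.random() for _ in range(n)]
    ys = [f(x) + random.gauss(0, NOISE_SD) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Least-squares straight line; returns a prediction function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


test_xs = [i / 20 for i in range(21)]  # fixed test points in [0, 1]
n_sets = 2000

# preds[s][j] = prediction at test_xs[j] of the model trained on set s
preds = []
for _ in range(n_sets):
    model = fit_line(*sample_training_set())
    preds.append([model(x) for x in test_xs])

bias_sq = variance = mse = 0.0
for j, x0 in enumerate(test_xs):
    p = [row[j] for row in preds]
    mean_p = sum(p) / n_sets
    bias_sq += (mean_p - f(x0)) ** 2
    variance += sum((v - mean_p) ** 2 for v in p) / n_sets
    # Squared error against fresh noisy outcomes at x0
    mse += sum((f(x0) + random.gauss(0, NOISE_SD) - v) ** 2 for v in p) / n_sets

n_test = len(test_xs)
bias_sq, variance, mse = bias_sq / n_test, variance / n_test, mse / n_test
noise = NOISE_SD ** 2

print(f"bias^2 + variance + Var(e): {bias_sq + variance + noise:.4f}")
print(f"expected test MSE:          {mse:.4f}")
```

The two printed values should agree up to simulation noise. A straight line is too rigid for a sine curve, so in this setup most of the reducible error comes from the squared bias.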

Here come bias and variance, which play a big part in the accuracy or performance of the model.

Bias

Bias is the error introduced when the model approximates a complex real-world relationship with a simpler one. More generally, bias is the difference between the average prediction of our model and the correct value we are trying to predict. A model with high bias misses important relations between the dependent and independent variables, which leads to underfitting.

Variance

Variance, on the other hand, is how much the model's predictions change when it is trained on different data sets. Since the training data is used to estimate f(X), a different data set will give a different estimate. But the estimates should not vary too much; if they do, the model is likely overfitting.
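Both definitions can be measured directly at a single query point. The sketch below is an illustration under assumptions, not a standard recipe: f, the noise level, the query point X0 and the two toy models (mean_model, which ignores X, and one_nn_model, which copies the nearest training outcome) are all hypothetical choices. Each model is trained on many different data sets, and we look at the average prediction (for bias) and the spread of the predictions (for variance).

```python
import math
import random

random.seed(4)
NOISE_SD = 0.3  # assumed standard deviation of e
X0 = 0.25       # query point; the assumed true value there is f(X0) = 1


def f(x):
    # Hypothetical true function, chosen only for this illustration
    return math.sin(2 * math.pi * x)


def sample(n=40):
    xs = [random.random() for _ in range(n)]
    ys = [f(x) + random.gauss(0, NOISE_SD) for x in xs]
    return xs, ys


def mean_model(xs, ys, x0):
    # Very rigid model: ignores x0 and always predicts the training mean
    return sum(ys) / len(ys)


def one_nn_model(xs, ys, x0):
    # Very flexible model: copies the outcome of the nearest training point
    nearest = min(range(len(xs)), key=lambda j: abs(xs[j] - x0))
    return ys[nearest]


def bias_and_variance(model, n_sets=1000):
    preds = [model(*sample(), X0) for _ in range(n_sets)]
    avg = sum(preds) / n_sets
    bias = avg - f(X0)
    variance = sum((p - avg) ** 2 for p in preds) / n_sets
    return bias, variance


bias_rigid, var_rigid = bias_and_variance(mean_model)
bias_flex, var_flex = bias_and_variance(one_nn_model)

print(f"rigid model:    bias={bias_rigid:+.3f}  variance={var_rigid:.3f}")
print(f"flexible model: bias={bias_flex:+.3f}  variance={var_flex:.3f}")
```

With these assumptions the rigid model should show a large bias and a small variance (underfitting), and the flexible model a small bias and a larger variance (overfitting).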

In the above picture, the red line denotes the MSE, the green line denotes the bias and the yellow line denotes the variance, all with respect to the flexibility of the model. As we can see, when the flexibility increases, the bias decreases and at some point levels off, while the variance increases, which leads to a high MSE. From the figure we can see that the minimum MSE is reached at the point where bias and variance are traded off against each other. Ideally, a model with low bias and low variance is the better model. To understand the trade-off between these two, we can look at the picture below.

Here the red circle denotes the actual value. When the bias and variance change, the data points, which represent the predicted values, move away from the red circle. For example, when the model has both low bias and low variance, the predictions are very close to the actual outcome, but when the variance goes from low to high, the data points scatter, which increases their distance from the actual value.

So, in order to minimize the expected test error, we need to select a model that simultaneously achieves low variance and low bias.
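One way to look for that model in practice is to sweep the flexibility and watch the test MSE. The sketch below assumes a k-nearest-neighbour regressor as the model family (smaller k means more flexibility); that choice, the sine function for f, the noise level and the helper names are all assumptions made for illustration.

```python
import math
import random

random.seed(2)
NOISE_SD = 0.3  # assumed standard deviation of e


def f(x):
    # Hypothetical true function, chosen only for this illustration
    return math.sin(2 * math.pi * x)


def sample(n):
    xs = [random.random() for _ in range(n)]
    ys = [f(x) + random.gauss(0, NOISE_SD) for x in xs]
    return xs, ys


def knn_predict(train_x, train_y, x0, k):
    # Average the outcomes of the k training points closest to x0
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x0))[:k]
    return sum(y for _, y in nearest) / k


def expected_test_mse(k, n_sets=100, n_train=40, n_test=50):
    # Average the test MSE over many independent train/test draws
    total = 0.0
    for _ in range(n_sets):
        train_x, train_y = sample(n_train)
        test_x, test_y = sample(n_test)
        total += sum((y - knn_predict(train_x, train_y, x, k)) ** 2
                     for x, y in zip(test_x, test_y)) / n_test
    return total / n_sets


ks = [40, 25, 15, 9, 5, 3, 1]  # ordered from least to most flexible
mse_by_k = {k: expected_test_mse(k) for k in ks}
best_k = min(mse_by_k, key=mse_by_k.get)

for k in ks:
    marker = "  <- lowest test MSE" if k == best_k else ""
    print(f"k={k:2d}  test MSE={mse_by_k[k]:.3f}{marker}")
```

Under these assumptions the test MSE should be high at both ends (k=40 underfits, k=1 overfits) and lowest somewhere in between, which is the U-shaped curve behind the trade-off.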

General approaches for reducing variance and bias are:

1. Dimensionality reduction, to reduce variance.

2. Feature selection, to reduce variance.

3. Feature addition, which helps to reduce bias but may increase variance.
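As a rough illustration of the feature selection point, the sketch below fits ordinary least squares twice: once on a single relevant feature and once on that feature plus ten pure-noise features. Everything here is an assumption made for the example: the data-generating rule Y = 2·X1 + e, the number of noise features, the small Gaussian-elimination solver and the helper names. No intercept is fitted because the simulated data is centred at zero.

```python
import random

random.seed(3)
NOISE_SD = 1.0  # assumed standard deviation of e


def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def fit_ols(rows, ys):
    """Least squares through the normal equations (X^T X) w = X^T y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)


def prediction_variance(n_features, n_sets=300, n_train=30, n_noise=10):
    """Variance of the prediction at a fixed query point across training sets."""
    query = [1.0] * n_features
    preds = []
    for _ in range(n_sets):
        # Feature 0 is the only relevant one; the rest are pure noise
        rows = [[random.gauss(0, 1) for _ in range(1 + n_noise)]
                for _ in range(n_train)]
        ys = [2.0 * r[0] + random.gauss(0, NOISE_SD) for r in rows]
        w = fit_ols([r[:n_features] for r in rows], ys)
        preds.append(sum(wi * qi for wi, qi in zip(w, query)))
    mean = sum(preds) / n_sets
    return sum((p - mean) ** 2 for p in preds) / n_sets


var_selected = prediction_variance(n_features=1)   # relevant feature only
var_all = prediction_variance(n_features=11)       # plus 10 noise features

print(f"variance with selected feature: {var_selected:.3f}")
print(f"variance with all 11 features:  {var_all:.3f}")
```

In this setup the extra features carry no information about Y, so keeping them should only add variance to the predictions. If a dropped feature were actually relevant, removing it would instead add bias, which is the other side of the same trade-off.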

As each model has its own approach, and each data set can be used differently, broad hands-on experience with practical data will help us to understand and identify the trade-off.

Thank you all. This is a small piece of knowledge sharing from my side, as part of my learning journey. Please help me understand any mistakes, and otherwise please help promote the article.
