Regression

Linear Regression

Figure 1: 📖Source

Y = b_0 + b_1 x_1 + b_2 x_2 + \epsilon

The idea is to find the line or plane which best fits the data. Collectively, b_0, b_1, b_2 are called regression coefficients. \epsilon is the error term, the part of Y the regression model is unable to explain.

Figure 2: 📖Source
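
A minimal sketch (not from the original page) of fitting the two-predictor model above with scikit-learn; the data and coefficient values are made up for illustration:

```python
# Fit y = b0 + b1*x1 + b2*x2 + eps by ordinary least squares.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                      # two predictors x1, x2
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=200)

model = LinearRegression().fit(X, y)
print("b0 (intercept):", model.intercept_)
print("b1, b2 (coefficients):", model.coef_)
```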

Metrics

Once the model is fit, the next step is to measure how good the fit is; a sketch of some common metrics follows below.
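
A minimal sketch of widely used goodness-of-fit metrics computed with scikit-learn; the page's original list of metrics is not reproduced here, so the choice of R², MSE, RMSE, and MAE is an assumption:

```python
# Compare predictions against true values with standard regression metrics.
import numpy as np
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

y_true = np.array([3.0, 5.0, 7.5, 9.0])
y_pred = np.array([2.8, 5.3, 7.0, 9.4])

r2 = r2_score(y_true, y_pred)                  # proportion of variance explained
mse = mean_squared_error(y_true, y_pred)       # mean squared error
rmse = np.sqrt(mse)                            # root mean squared error
mae = mean_absolute_error(y_true, y_pred)      # mean absolute error
print(r2, mse, rmse, mae)
```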

Feature selection


Figure 3: LASSO vs Ridge. The red contours are those of the RSS, whereas the geometric shapes are the constraint regions of Ridge and Lasso. (📖Source)
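
A minimal sketch contrasting Ridge (L2) and Lasso (L1) on the same data; the alpha values and data are illustrative assumptions. Lasso drives the coefficients of irrelevant features to exactly zero, which is why it is useful for feature selection, while Ridge only shrinks them:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
# Only the first two features actually influence y.
y = 4.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.3, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)
print("ridge coefficients:", ridge.coef_)   # all shrunk, none exactly zero
print("lasso coefficients:", lasso.coef_)   # irrelevant features set to zero
```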

Assumptions

Non-Linear Regression

In some cases, the true relationship between the outcome and a predictor variable might not be linear. There are different ways to extend the linear regression model to capture these nonlinear effects; some of them are covered below.

Polynomial Regression

The polynomial regression equation takes a form like this:

Y = b_0 + b_1 x_1 + b_2 x_1^2 + \dots + b_n x_1^n

The degree of the polynomial is a hyperparameter, and we need to choose it wisely: a high degree tends to overfit the data, while a low degree tends to underfit, so we need to find the optimum value. Polynomial regression on datasets with high variability is especially prone to overfitting.
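
A minimal sketch of polynomial regression: expand the predictor into powers up to a chosen degree and fit an ordinary linear model on the expanded features. The choice of degree=3 and the data are assumptions for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(150, 1))
y = 1.0 + 0.5 * x[:, 0] - 0.8 * x[:, 0] ** 2 + rng.normal(scale=0.5, size=150)

# degree is the hyperparameter discussed above.
poly_model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
poly_model.fit(x, y)
print(poly_model.predict([[1.5]]))
```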

Regression Splines


In order to overcome the disadvantages of polynomial regression, we can use an improved regression technique which, instead of building one model for the entire dataset, divides the dataset into multiple bins and fits a separate model to each bin. Such a technique is known as a regression spline.

In polynomial regression, we generated new features by applying various polynomial functions to the existing features, which imposed a global structure on the dataset. To overcome this, we can divide the distribution of the data into separate portions and fit linear or low-degree polynomial functions on each portion. The points where the division occurs are called knots, and the functions used to model each piece/bin are known as piecewise functions. There are various piecewise functions that we can use to fit these individual bins.
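
A minimal sketch of a regression spline using scikit-learn's SplineTransformer, which builds a piecewise-polynomial basis between the knots and fits a linear model on top; the number of knots, degree, and data are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(200, 1))
y = np.sin(x[:, 0]) + rng.normal(scale=0.2, size=200)

spline_model = make_pipeline(
    SplineTransformer(n_knots=6, degree=3),  # 6 knots, cubic pieces
    LinearRegression(),
)
spline_model.fit(x, y)
print(spline_model.predict([[2.5]]))
```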

Generalized additive models

Generalized additive models do the same thing as regression splines but remove the need to specify the knots: they fit spline models with automated selection of knots.
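
A minimal sketch using the third-party pygam library (an assumption; the page does not name a specific package). The smooth term s(0) fits a penalized spline to feature 0 without hand-picked knots:

```python
import numpy as np
from pygam import LinearGAM, s

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=200)

# Knot placement and smoothing penalty are handled by the library.
gam = LinearGAM(s(0)).fit(X, y)
print(gam.predict(X[:5]))
```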

Questions