Polynomial Linear Regression: Comprehensive Notes

Polynomial regression is an extension of linear regression used when the relationship between the independent and dependent variables is non-linear. Despite its name, it is still a form of linear regression, because the model remains linear in the coefficients (b): even though the input x is transformed into polynomial features (x², x³, etc.), the regression equation is a linear combination of those features:
y = b₀ + b₁x + b₂x² + b₃x³ + ⋯ + bₙxⁿ

This flexibility allows polynomial regression to model more complex relationships than simple linear regression.
In this project, we aim to predict salaries based on position levels using polynomial regression. The dataset used is Position_Salaries.csv. Below is a step-by-step explanation along with the code for implementation:
1. Importing Libraries
First, we import the necessary libraries for numerical computations, data handling, and visualization:
import numpy as np
import pandas as pd
import matplotlib.pyplot as mp
2. Loading the Dataset
The dataset is loaded using pandas, and the independent (x) and dependent (y) variables are extracted. The independent variable x is reshaped into a 2D array for compatibility with scikit-learn models:
# Importing the dataset
dataset = pd.read_csv('Position_Salaries.csv')
x = dataset.iloc[:, 1].values.reshape(-1, 1) # Independent variable
y = dataset.iloc[:, -1].values # Dependent variable
3. Training the Linear Regression Model
A baseline linear regression model is trained on the entire dataset:
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(x, y)
4. Training the Polynomial Regression Model
To handle non-linear relationships, we transform x into polynomial features using the PolynomialFeatures class from scikit-learn. For this example, we use a polynomial of degree 4:
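The code for this step is missing above. Here is a minimal sketch of the transformation described, fitted on a small synthetic stand-in for Position_Salaries.csv (the variable names poly_reg and lin_reg_2 are illustrative choices, not taken from the original):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for Position_Salaries.csv: position levels 1-10
# with salaries that grow sharply, roughly like a high-degree curve.
x = np.arange(1, 11).reshape(-1, 1)
y = np.array([45000, 50000, 60000, 80000, 110000,
              150000, 200000, 300000, 500000, 1000000])

# Expand x into the feature columns [1, x, x^2, x^3, x^4].
poly_reg = PolynomialFeatures(degree=4)
x_poly = poly_reg.fit_transform(x)

# An ordinary linear regression fitted on the polynomial features.
lin_reg_2 = LinearRegression()
lin_reg_2.fit(x_poly, y)
```

Because the model is still linear in its coefficients, nothing beyond LinearRegression is needed once the features have been expanded.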
5. Visualization of Results
Linear Regression Results
We visualize the results of the linear regression model. The scatter plot shows the original data points in red, while the blue line represents the predictions from the linear regression model:
mp.scatter(x, y, color='red')
mp.plot(x, lin_reg.predict(x), color='blue')
mp.title('Truth or Bluff (Linear Regression)')
mp.xlabel('Position Level')
mp.ylabel('Salary')
mp.show()
Polynomial Regression Results
For polynomial regression, we use a finer grid of x values to create a smoother curve:
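The plotting code for this step is not present above. Below is a sketch of the smooth-curve visualization, again using a synthetic stand-in for the dataset (x_grid and the model names are illustrative assumptions):

```python
import numpy as np
import matplotlib.pyplot as mp
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for Position_Salaries.csv.
x = np.arange(1, 11).reshape(-1, 1)
y = np.array([45000, 50000, 60000, 80000, 110000,
              150000, 200000, 300000, 500000, 1000000])

poly_reg = PolynomialFeatures(degree=4)
lin_reg_2 = LinearRegression().fit(poly_reg.fit_transform(x), y)

# A finer grid in steps of 0.1, so the fitted curve looks smooth
# instead of a polyline joining the 10 training points.
x_grid = np.arange(x.min(), x.max() + 0.1, 0.1).reshape(-1, 1)

mp.scatter(x, y, color='red')
mp.plot(x_grid, lin_reg_2.predict(poly_reg.transform(x_grid)), color='blue')
mp.title('Truth or Bluff (Polynomial Regression)')
mp.xlabel('Position Level')
mp.ylabel('Salary')
mp.show()
```

Note that the grid must be passed through the same PolynomialFeatures transformer before prediction.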
6. Making Predictions
Linear Regression Prediction
We can make predictions using the linear regression model. For example:
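No example code was given here. A sketch using an arbitrary in-between position level of 6.5 (the value is an illustrative assumption), on the same synthetic stand-in data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for Position_Salaries.csv.
x = np.arange(1, 11).reshape(-1, 1)
y = np.array([45000, 50000, 60000, 80000, 110000,
              150000, 200000, 300000, 500000, 1000000])

lin_reg = LinearRegression().fit(x, y)

# Predict the salary for position level 6.5 with the straight-line model.
salary_linear = lin_reg.predict([[6.5]])
print(salary_linear)
```

On sharply curving data like this, a straight line badly misestimates intermediate levels, which is the point of the comparison with the polynomial model.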
Polynomial Regression Prediction
Using the polynomial regression model, we can obtain a far more accurate prediction for non-linear data like this:
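Again a sketch with assumed names (poly_reg, lin_reg_2) and synthetic stand-in data; the key detail is that the new input must go through the same transformer before prediction:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for Position_Salaries.csv.
x = np.arange(1, 11).reshape(-1, 1)
y = np.array([45000, 50000, 60000, 80000, 110000,
              150000, 200000, 300000, 500000, 1000000])

poly_reg = PolynomialFeatures(degree=4)
lin_reg_2 = LinearRegression().fit(poly_reg.fit_transform(x), y)

# The raw input [[6.5]] must be expanded with the same
# PolynomialFeatures transformer before calling predict.
salary_poly = lin_reg_2.predict(poly_reg.transform([[6.5]]))
print(salary_poly)
```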
Key Concepts
- Polynomial regression is linear in its coefficients; only the input features are transformed (x, x², x³, …).
- PolynomialFeatures from scikit-learn generates the polynomial feature matrix, which is then fitted with an ordinary LinearRegression.
- The degree controls the model's flexibility: degree 4 captures this salary curve well, while too high a degree risks overfitting the training points.
- scikit-learn expects the feature array x to be 2D, hence the reshape(-1, 1) when loading the data.

Final Notes
Polynomial regression is a simple but effective way to capture non-linear relationships while reusing the machinery of linear regression. Comparing the linear and polynomial fits on Position_Salaries.csv makes it clear why a straight line is inadequate for this salary curve.