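Key Property: In linear regression fitted by OLS with an intercept term, the residuals sum to zero, so the mean of the fitted values equals the mean of the observed values and the fitted line passes through the point of means (x̄, ȳ).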
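The zero-sum property can be checked numerically. The following is a minimal sketch on synthetic data (the data, seed, and variable names are made up for illustration). It uses NumPy's least-squares solver rather than scikit-learn, and it also fits a model without an intercept to show that the property depends on the intercept being present.

```python
import numpy as np

# Synthetic data (illustrative only): y = 3 + 2x + noise
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=50)
y = 3.0 + 2.0 * x + rng.normal(0, 1, size=50)

# OLS with an intercept: design matrix has a column of ones
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
residuals = y - fitted

# OLS without an intercept: line forced through the origin
slope_only, *_ = np.linalg.lstsq(x.reshape(-1, 1), y, rcond=None)
residuals_no_intercept = y - x * slope_only[0]

print("sum of residuals (with intercept):   ", residuals.sum())
print("sum of residuals (without intercept):", residuals_no_intercept.sum())
print("mean(y) equals mean(fitted):", np.isclose(y.mean(), fitted.mean()))
```

With the intercept, the residual sum is zero up to floating-point error; without it, the sum is generally nonzero.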
Regression vs. Classification
Regression: Predicts continuous real values (e.g., salary, price).
Classification: Predicts categorical outcomes (e.g., spam vs. not spam).
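As a rough illustration of the difference, the sketch below fits a regressor and a classifier on the same made-up inputs. The data and the `is_senior` label are hypothetical, and `LogisticRegression` is used here only as one example of a scikit-learn classifier.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Hypothetical inputs: years of experience, one feature per row
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])

# Continuous target (made-up salaries, in thousands)
salary = np.array([30.0, 35.0, 41.0, 44.0, 50.0, 56.0])
# Categorical target (made-up label: 0 = junior, 1 = senior)
is_senior = np.array([0, 0, 0, 1, 1, 1])

reg = LinearRegression().fit(X, salary)
clf = LogisticRegression().fit(X, is_senior)

reg_pred = reg.predict(X)  # real-valued outputs
clf_pred = clf.predict(X)  # outputs drawn from the label set {0, 1}

print("regression output dtype:", reg_pred.dtype)
print("classification labels seen:", set(clf_pred.tolist()))
```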
Implementing Regression Using scikit-learn
# Import the necessary libraries
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
# Create a Linear Regression instance
regressor = LinearRegression()
# Train the model (x_train must be 2-D, with shape (n_samples, n_features))
regressor.fit(x_train, y_train)
# Predict results
y_pred = regressor.predict(x_test)
# Visualize results
plt.scatter(x_train, y_train, color='red') # Training data points
plt.plot(x_train, regressor.predict(x_train), color='blue') # Regression line
plt.title('Years of Experience vs Salary (Training Set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()
# Visualize test set results
plt.scatter(x_test, y_test, color='red') # Test data points
plt.plot(x_test, regressor.predict(x_test), color='blue') # Regression line
plt.title('Years of Experience vs Salary (Test Set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()
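The snippet above uses x_train, x_test, y_train, and y_test without showing where they come from. The following is a minimal sketch of one way to produce them, assuming synthetic experience and salary data (the numbers, seed, and split size are invented for illustration) and scikit-learn's train_test_split.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative only): salary grows roughly linearly with experience
rng = np.random.default_rng(42)
years = rng.uniform(1, 10, size=30)
salary = 25_000 + 9_000 * years + rng.normal(0, 3_000, size=30)

# scikit-learn expects a 2-D feature array: one row per sample, one column per feature
x = years.reshape(-1, 1)
y = salary

# Hold out 10 of the 30 samples for testing; random_state makes the split repeatable
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=10, random_state=0
)

regressor = LinearRegression()
regressor.fit(x_train, y_train)
y_pred = regressor.predict(x_test)

print("train shape:", x_train.shape, "test shape:", x_test.shape)
print("number of predictions:", y_pred.shape[0])
```

With these arrays in place, the plotting code above can be run unchanged.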
Key Points
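fit() Method:
Estimates the intercept and coefficients from x_train and y_train; after fitting they are available as regressor.intercept_ and regressor.coef_.
Prediction:
regressor.predict(x_test) returns the predicted values for the inputs it is given, here the test set.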
Visualization:
matplotlib is used for scatter plots and regression lines.
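To make the fit and predict steps concrete, here is a small sketch on noise-free data generated from y = 2x + 1 (an invented example), so the fitted values can be checked by hand.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Noise-free training data generated from y = 2x + 1 (illustrative only)
x_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = 2.0 * x_train.ravel() + 1.0

regressor = LinearRegression()
regressor.fit(x_train, y_train)

# Parameters learned by fit()
slope = regressor.coef_[0]
intercept = regressor.intercept_

# predict() applies intercept + slope * x to new inputs
x_test = np.array([[5.0], [6.0]])
y_pred = regressor.predict(x_test)

print("slope:", slope, "intercept:", intercept)
print("predictions:", y_pred)
```

Because the data lie exactly on a line, the slope and intercept come out as approximately 2 and 1, and the predictions for 5 and 6 as approximately 11 and 13.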