Friday, May 16, 2025
News PouroverAI

What is Ridge Regression?

April 18, 2024
in AI Technology


Contributed by: Prashanth Ashok

What is Ridge regression?

Ridge regression is a model-tuning method used to analyze data that suffers from multicollinearity. It performs L2 regularization. When multicollinearity occurs, the least-squares estimates remain unbiased, but their variances are large, so the predicted values can be far away from the actual values.

The cost function for ridge regression:

Min(||Y – X(theta)||^2 + λ||theta||^2)

Lambda (λ) is the penalty term. In the ridge function, λ is exposed as the alpha parameter, so changing the value of alpha controls the strength of the penalty. The higher the value of alpha, the bigger the penalty and the smaller the magnitude of the coefficients. Ridge regression therefore mitigates the effects of multicollinearity and reduces model complexity through coefficient shrinkage.
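The shrinkage effect can be illustrated with a minimal sketch. The data here is synthetic and purely an assumption for illustration (two nearly identical predictors, x1 and x2); it is not the restaurant data set used later in this article. Only the alpha parameter of scikit-learn's Ridge is varied.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic, hypothetical data: x2 is almost a copy of x1 (strong multicollinearity)
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)
X = np.column_stack([x1, x2])
y = 3.0 * x1 + 2.0 * x2 + rng.normal(scale=1.0, size=n)

# Fit ridge models with increasing alpha and record the size of the coefficient vector
alphas = [0.01, 1.0, 100.0, 10000.0]
coef_norms = []
for alpha in alphas:
    model = Ridge(alpha=alpha).fit(X, y)
    coef_norms.append(float(np.linalg.norm(model.coef_)))
    print(f"alpha={alpha}: coefficients={model.coef_.round(3)}, L2 norm={coef_norms[-1]:.3f}")
```

The printed L2 norm of the coefficients gets smaller as alpha grows, which is the shrinkage described above.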

Ridge Regression Models

For any type of regression machine learning model, the usual regression equation forms the base which is written as:

Y = XB + e

Where Y is the dependent variable, X represents the independent variables, B is the vector of regression coefficients to be estimated, and e represents the errors, or residuals.

Once the λ penalty is added to this equation, the model accounts for variance that the general model does not evaluate. After the data is ready and identified as suitable for L2 regularization, there are steps that one can undertake.
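For reference, the effect of the penalty on the estimate can be written out. Using the notation Y = XB + e from above, minimizing the penalized sum of squares and setting the gradient with respect to B to zero gives the standard closed-form ridge estimate. The hat notation and the identity matrix I are introduced here for illustration.

```latex
\min_{B}\; \lVert Y - XB \rVert^{2} + \lambda \lVert B \rVert^{2}
\quad\Longrightarrow\quad
-2X^{\top}(Y - XB) + 2\lambda B = 0
\quad\Longrightarrow\quad
\hat{B}_{\text{ridge}} = \left(X^{\top}X + \lambda I\right)^{-1} X^{\top} Y
```

With λ = 0 this reduces to the ordinary least-squares estimate. For λ > 0, adding λI to the matrix being inverted keeps it well conditioned even when the columns of X are highly correlated.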

Standardization

In ridge regression, the first step is to standardize the variables (both dependent and independent) by subtracting their means and dividing by their standard deviations. This causes a challenge in notation since we must somehow indicate whether the variables in a particular formula are standardized or not. As far as standardization is concerned, all ridge regression calculations are based on standardized variables. When the final regression coefficients are displayed, they are adjusted back into their original scale. However, the ridge trace is on a standardized scale.
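A minimal sketch of this round trip, assuming synthetic data and illustrative variable names (none of them come from the article's data set): the variables are standardized, a ridge model is fitted without an intercept on the standardized scale, and the coefficients are then adjusted back to the original scale.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic, hypothetical data with predictors on very different scales
rng = np.random.default_rng(1)
X = rng.normal(loc=[10.0, 200.0], scale=[2.0, 50.0], size=(100, 2))
y = 5.0 + 1.5 * X[:, 0] - 0.02 * X[:, 1] + rng.normal(size=100)

# Standardize both the independent and the dependent variables
x_mean, x_std = X.mean(axis=0), X.std(axis=0)
y_mean, y_std = y.mean(), y.std()
Z = (X - x_mean) / x_std
y_z = (y - y_mean) / y_std

# Standardized variables have zero mean, so no intercept is needed
model = Ridge(alpha=1.0, fit_intercept=False).fit(Z, y_z)

# Adjust the coefficients back into the original scale
coef_original = model.coef_ * y_std / x_std
intercept_original = y_mean - np.sum(coef_original * x_mean)

# Both routes give the same predictions
pred_from_standardized = y_mean + y_std * model.predict(Z)
pred_from_original = intercept_original + X @ coef_original
print("Standardized coefficients:", model.coef_.round(4))
print("Original-scale coefficients:", coef_original.round(4))
```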

Bias and variance trade-off

The bias and variance trade-off is generally complicated when building ridge regression models on an actual dataset. However, the general trend to remember is:

  • The bias increases as λ increases.
  • The variance decreases as λ increases.
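This trend can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: a fixed synthetic design matrix, a known coefficient vector beta_true, and a hypothetical helper ridge_fit that solves the ridge problem in closed form. Many noisy responses are generated, and the squared bias and the variance of the coefficient estimates are measured for several values of λ.

```python
import numpy as np

rng = np.random.default_rng(42)
n, p, sigma, n_reps = 50, 3, 1.0, 2000
beta_true = np.array([2.0, -1.0, 0.5])  # assumed "true" coefficients
X = rng.normal(size=(n, p))             # fixed design matrix

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam * I) b = X'y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

lambdas = [0.01, 10.0, 1000.0]
noise = rng.normal(scale=sigma, size=(n_reps, n))  # same noise draws for every lambda
results = {}
for lam in lambdas:
    estimates = np.array([ridge_fit(X, X @ beta_true + e, lam) for e in noise])
    sq_bias = float(np.sum((estimates.mean(axis=0) - beta_true) ** 2))
    variance = float(np.sum(estimates.var(axis=0)))
    results[lam] = (sq_bias, variance)
    print(f"lambda={lam}: squared bias={sq_bias:.5f}, variance={variance:.5f}")
```

In this setup the squared bias rises and the variance falls as λ increases, matching the two bullet points above.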

Assumptions of Ridge Regression

The assumptions of ridge regression are the same as those of linear regression: linearity, constant variance, and independence. However, as ridge regression does not provide confidence limits, the errors need not be assumed to be normally distributed.

Now, let's take an example of a linear regression problem and see how ridge regression, if implemented, helps us to reduce the error. We shall consider a data set on food restaurants trying to find the best combination of food items to improve their sales in a particular region.

Import Required Libraries

# Numerical and data-handling libraries
import numpy as np
import pandas as pd
import os
# Plotting and modelling libraries
import seaborn as sns
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
import matplotlib.style
plt.style.use('classic')
# Suppress warnings to keep the output readable
import warnings
warnings.filterwarnings("ignore")
# Load the restaurant data set
df = pd.read_excel("food.xlsx")

After completing the EDA on the data and treating the missing values, we shall now go ahead with creating dummy variables, as the regression model cannot take categorical variables directly.

df = pd.get_dummies(df, columns=cat, drop_first=True)

Here, cat is the list of all the categorical variables in the data set.
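To make the effect of drop_first=True concrete, here is a minimal sketch on a tiny made-up frame. The toy data and its column values are assumptions for illustration only and are not taken from food.xlsx.

```python
import pandas as pd

# Tiny, hypothetical data frame with one categorical column
toy = pd.DataFrame({
    "cuisine": ["Indian", "Thai", "Italian", "Thai"],
    "home_delivery": [1, 0, 1, 1],
    "orders": [120, 80, 95, 60],
})
cat = ["cuisine"]  # list of categorical columns to encode

# One dummy column per category, minus the first (alphabetically) as the baseline
encoded = pd.get_dummies(toy, columns=cat, drop_first=True)
print(encoded.columns.tolist())
```

The category that is dropped ("Indian" in this toy example) becomes the baseline, so each remaining dummy coefficient is read relative to it. Dropping one level also avoids perfectly collinear dummy columns.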

After this, we need to standardize the data set for the Linear Regression method.

Scaling the variables

# Scales the data. Essentially returns the z-scores of every attribute

from sklearn.preprocessing import StandardScaler
std_scale = StandardScaler()
df['week'] = std_scale.fit_transform(df[['week']])
df['final_price'] = std_scale.fit_transform(df[['final_price']])
df['area_range'] = std_scale.fit_transform(df[['area_range']])

Train-Test Split

# Copy all the predictor variables into X dataframe
X = df.drop('orders', axis=1)
# Copy target into the y dataframe. The target variable is log-transformed.
y = np.log(df[['orders']])
# Split X and y into training and test set in 75:25 ratio
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)

Linear Regression Model

# Invoke the LinearRegression function and find the best-fit model on training data

regression_model = LinearRegression()
regression_model.fit(X_train, y_train)

# Let us explore the coefficients for each of the independent attributes

for idx, col_name in enumerate(X_train.columns):
    print("The coefficient for {} is {}".format(col_name, regression_model.coef_[0][idx]))

The coefficient for week is -0.0041068045722690814

The coefficient for final_price is -0.40354286519747384

The coefficient for area_range is 0.16906454326841025

…

…

…

The coefficient for home_delivery_1.0 is 1.026400462237632

The coefficient for night_service_1 is 0.0038398863634691582

#checking the magnitude of coefficients

from pandas import Series, DataFrame
predictors = X_train.columns
coef = Series(regression_model.coef_.flatten(), predictors).sort_values()
plt.figure(figsize=(10,8))
coef.plot(kind='bar', title="Model Coefficients")
plt.show()

Variables showing a positive effect on the regression model are food_category_Rice Bowl, home_delivery_1.0, food_category_Desert, food_category_Pizza, website_homepage_mention_1.0, food_category_Sandwich, food_category_Salad and area_range. These factors strongly influence our model.

Difference Between Ridge Regression Vs Lasso Regression

| Aspect | Ridge Regression | Lasso Regression |
|---|---|---|
| Regularization Approach | Adds penalty term proportional to square of coefficients | Adds penalty term proportional to absolute value of coefficients |
| Coefficient Shrinkage | Coefficients shrink towards but never exactly to zero | Some coefficients can be reduced exactly to zero |
| Effect on Model Complexity | Reduces model complexity and multicollinearity | Results in simpler, more interpretable models |
| Handling Correlated Inputs | Handles correlated inputs effectively | Can be inconsistent with highly correlated features |
| Feature Selection Capability | Limited | Performs feature selection by reducing some coefficients to zero |
| Preferred Usage Scenarios | All features assumed relevant or dataset has multicollinearity | When parsimony is advantageous, especially in high-dimensional datasets |
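The difference in coefficient shrinkage can be seen in a short sketch. The data is synthetic and assumed for illustration: ten predictors, of which only the first three actually influence the target. The alpha values are arbitrary choices, not tuned ones.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic, hypothetical data: only the first three features matter
rng = np.random.default_rng(7)
n_samples, n_features = 200, 10
X = rng.normal(size=(n_samples, n_features))
true_coef = np.zeros(n_features)
true_coef[:3] = [3.0, -2.0, 1.5]
y = X @ true_coef + rng.normal(scale=1.0, size=n_samples)

ridge = Ridge(alpha=10.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

# Count coefficients that are exactly zero under each penalty
n_zero_ridge = int(np.sum(ridge.coef_ == 0.0))
n_zero_lasso = int(np.sum(lasso.coef_ == 0.0))
print("Ridge coefficients:", ridge.coef_.round(3))
print("Lasso coefficients:", lasso.coef_.round(3))
print(f"Exactly-zero coefficients: ridge={n_zero_ridge}, lasso={n_zero_lasso}")
```

Ridge keeps every predictor with a small, non-zero weight, while lasso sets the weights of the irrelevant predictors to exactly zero, which is the feature-selection behaviour listed in the table.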

Ridge Regression in Machine Learning

Ridge regression is a key technique in machine learning for building robust models in scenarios prone to overfitting and multicollinearity. It modifies standard linear regression by adding a penalty term proportional to the square of the coefficients, which is particularly useful when the independent variables are highly correlated. Its primary benefits are that it reduces overfitting by penalizing complexity, manages multicollinearity by balancing effects among correlated variables, and improves generalization to unseen data.

Implementing ridge regression in practice involves the crucial step of selecting the right regularization parameter, commonly known as lambda. This selection, typically done with cross-validation, is vital for balancing the bias-variance trade-off. Ridge regression is supported by many machine learning libraries, with Python's scikit-learn being a notable example. There, implementation entails defining the model, setting the lambda value (the alpha parameter), and using the built-in functions for fitting and prediction. It is particularly useful in sectors like finance and healthcare analytics, where precise predictions and robust models are paramount. Its capacity to improve accuracy and handle complex datasets keeps it relevant in machine learning.
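The define, set, fit and predict steps can be sketched as follows. This is a minimal illustration on synthetic data generated with make_regression, not the workflow of the restaurant example; alpha=1.0 is an arbitrary assumed value, and the scaler is placed in a pipeline so that it is fitted on the training split only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, hypothetical regression data
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)

# Define the model and set the regularization strength (lambda is called alpha here)
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))

# Fit on the training data, then predict and score on the held-out data
model.fit(X_train, y_train)
predictions = model.predict(X_test)
r2 = model.score(X_test, y_test)
print(f"R^2 on the test set: {r2:.3f}")
```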

The higher the value of the beta coefficient, the higher the impact. Dishes like Rice Bowl, Pizza and Desert, together with facilities like home delivery and website_homepage_mention, play an important role in demand, that is, in the number of orders being placed at high frequency.

Variables showing a negative effect on the regression model for predicting restaurant orders are cuisine_Indian, food_category_Soup, food_category_Pasta and food_category_Other_Snacks. Final_price has a negative effect on the orders, as expected. Dishes in the Soup, Pasta, Other_Snacks and Indian food categories are associated with fewer orders being placed at restaurants, keeping all other predictors constant.

Some variables that hardly affect the model prediction for order frequency are week and night_service. Through the model, we can see that object-type (categorical) variables are more significant than continuous variables.

Regularization

Alpha is a hyperparameter of Ridge, which means that it is not automatically learned by the model; instead, it has to be set manually. We run a grid search to find the optimum alpha for…
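Since the original walkthrough is cut off at this point, the following is only a hedged sketch of what a grid search over alpha can look like with scikit-learn's GridSearchCV. The synthetic data, the candidate alpha values, the 5-fold setting and the scoring choice are all assumptions for illustration, not the article's actual configuration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic, hypothetical data standing in for the training set
X, y = make_regression(n_samples=300, n_features=8, noise=15.0, random_state=1)

# Candidate alpha values (assumed grid), evaluated with 5-fold cross-validation
param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}
search = GridSearchCV(Ridge(), param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)

best_alpha = search.best_params_["alpha"]
print("Best alpha:", best_alpha)
print("Cross-validated MSE at best alpha:", -search.best_score_)
```

After the search, search.best_estimator_ holds a Ridge model refitted on all the supplied data with the selected alpha.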


