Multiple Linear Regression - Predicting Revenue from Ad Spending

Learn how to apply multiple linear regression to your dataset

Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable. The goal of MLR is to model the linear relationship between the explanatory (independent) variables and the response (dependent) variable.
In essence, multiple regression extends simple linear regression, which is typically fitted by ordinary least squares (OLS), to more than one explanatory variable.
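
For reference, the model described above is usually written as follows. The symbols are notational assumptions rather than anything defined in this post: y is the response, x_1 through x_p are the explanatory variables, the betas are the coefficients to be estimated, and epsilon is the error term.

$$
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon
$$

OLS chooses the coefficients that minimize the sum of squared residuals over the n observations:

$$
\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \right)^2
$$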

In this project, we will apply multiple regression to our dataset. The dataset contains four columns: the first three list the amount of money spent on different advertising platforms, and the fourth contains the revenue.
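
As a rough illustration of what fitting such a model can look like, here is a minimal sketch using scikit-learn's `LinearRegression`. It does not use the actual dataset: the data is synthetic, and the column names (`platform_a`, `platform_b`, `platform_c`) and the coefficients used to generate it are hypothetical, chosen only so the example runs on its own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the dataset: three ad-spend columns and one revenue column.
# Names and coefficients are hypothetical, not taken from the real data.
rng = np.random.default_rng(0)
n = 200
spend = rng.uniform(0, 100, size=(n, 3))   # platform_a, platform_b, platform_c
true_coefs = np.array([0.05, 0.20, 0.0])   # platform_c has no effect by construction
revenue = 3.0 + spend @ true_coefs + rng.normal(0, 1.0, size=n)

# Fit revenue as a linear function of the three spending columns.
model = LinearRegression().fit(spend, revenue)

for name, coef in zip(["platform_a", "platform_b", "platform_c"], model.coef_):
    print(f"{name}: {coef:.3f}")
print(f"intercept: {model.intercept_:.3f}")
print(f"R^2: {model.score(spend, revenue):.3f}")
```

Because all three predictors here are in the same units (money spent), the fitted coefficients can be compared directly: the largest one points to the platform with the strongest estimated effect, and a coefficient near zero suggests spending that contributes little to revenue. With predictors on different scales, standardizing them first would be the safer comparison.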

We will try to answer the following questions:

1. Which advertising platform has the greatest impact on revenue?

2. Can we reduce spending on a particular advertising platform without much effect on revenue?

3. Is there an interaction effect (synergy) that better explains the relationship between the predictors and the dependent variable (revenue)?
Marketing synergy happens when multiple marketing initiatives combine to create an effect greater than the sum of their parts.
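
One common way to probe the third question is to add a product term between two predictors and check whether the fit improves. The sketch below is an assumption-laden illustration, not the analysis from the Colab document: it uses synthetic data for two hypothetical platforms, builds the synergy in by construction, and fits both models with plain NumPy least squares.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
a = rng.uniform(0, 100, n)   # spend on hypothetical platform A
b = rng.uniform(0, 100, n)   # spend on hypothetical platform B
# Revenue is generated WITH a synergy term (a * b) by construction.
revenue = 2.0 + 0.04 * a + 0.10 * b + 0.002 * a * b + rng.normal(0, 1.0, n)

def r_squared(X, y):
    """Fit OLS with an intercept via least squares and return in-sample R^2."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# Additive model: revenue ~ a + b
r2_additive = r_squared(np.column_stack([a, b]), revenue)
# Interaction model: revenue ~ a + b + a*b
r2_interaction = r_squared(np.column_stack([a, b, a * b]), revenue)

print(f"additive R^2:         {r2_additive:.4f}")
print(f"with interaction R^2: {r2_interaction:.4f}")
```

In-sample R^2 never decreases when a predictor is added, so a small gain by itself proves little. On real data, it is worth comparing adjusted R^2 or held-out error, and checking whether the interaction coefficient is statistically distinguishable from zero.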

I have created an interactive Google Colab document that describes each step of this project in detail. Feel free to experiment with the document below by changing different parameters and seeing how they affect the output. Have fun!

More information on the scikit-learn API can be found here.

Archana Tikayat Ray
atr@gatech.edu

My research interests include applied data science, machine learning, and Natural Language Processing.
