Multiple Regression Model Introduction

A multiple regression model (a regression with several variables) is a regression model with more than one predictor (independent or explanatory variable) used to explain a response (dependent) variable. A simple regression model uses one predictor to explain a single response, whereas a multiple (multivariable) regression model contains more than one predictor. Both simple and multiple (multivariable) regression models can be further categorized as linear or non-linear regression models.

Note that linearity does not depend on the predictors or on how many predictors are added to the simple regression model; it refers to the parameters (the coefficients attached to the predictors). If the response changes at a constant rate with respect to each parameter, that is, if the model is linear in its parameters, then the model is called a linear model, whether it is a simple regression model or a multiple (multivariable) regression model. The relationship between the variables is assumed to be linear, though this assumption can never be fully confirmed in the case of multiple linear regression. As a rule, however, it is better to look at bivariate scatter diagrams of the variables of interest and check that there is no curvature in the relationships.
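As an illustration of linearity in the parameters (the symbols below are chosen only for this example), the following model contains a squared predictor but is still a linear model, because each parameter enters additively with a fixed multiplier:

\[y=\alpha+\beta_1 x+\beta_2 x^2+\varepsilon\]

By contrast, a model such as the following is non-linear, because the parameter \(\beta\) sits inside the exponential and the effect of changing it is not constant:

\[y=\alpha e^{\beta x}+\varepsilon\]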

Multiple regression also lets us determine the overall fit of the model (known as the variance explained) and the relative contribution of each predictor to the total variance explained. For example, one may want to know how much of the variation in exam performance can be explained by revision time, test anxiety, lecture attendance, and gender “as a whole”, and also the “relative contribution” of each independent variable in explaining the variance.
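The idea of overall fit and relative contribution can be sketched in Python with NumPy. This is only a minimal illustration under stated assumptions: the data are synthetic, the variable names (`revision_time`, `test_anxiety`, `exam_score`) and the helper `r_squared` are made up for this example, and the drop in R-squared when a predictor is removed is just one simple way to gauge a predictor's contribution (it is not a unique decomposition when predictors are correlated).

```python
import numpy as np

def r_squared(X, y):
    """Proportion of variance in y explained by an OLS fit on X (with intercept)."""
    design = np.column_stack([np.ones(len(y)), X])  # add intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = np.sum(residuals ** 2)          # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variation
    return 1.0 - ss_res / ss_tot

# Synthetic data, invented purely for illustration.
rng = np.random.default_rng(1)
n = 300
revision_time = rng.normal(size=n)
test_anxiety = rng.normal(size=n)
exam_score = 3.0 * revision_time - 1.0 * test_anxiety + rng.normal(size=n)

names = ["revision_time", "test_anxiety"]
X = np.column_stack([revision_time, test_anxiety])

# Overall fit: variance explained by all predictors "as a whole".
full_r2 = r_squared(X, exam_score)

# One rough measure of relative contribution: how much R-squared falls
# when a single predictor is left out of the model.
drop = {
    name: full_r2 - r_squared(np.delete(X, j, axis=1), exam_score)
    for j, name in enumerate(names)
}

print("R-squared (full model):", round(full_r2, 3))
for name, value in drop.items():
    print(f"Drop in R-squared without {name}: {value:.3f}")
```

In this made-up data set, revision time has the larger coefficient, so leaving it out should reduce R-squared more than leaving out test anxiety.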

A multiple regression model has the form

\[y=\alpha+\beta_1 x_1+\beta_2 x_2+\cdots+\beta_k x_k+\varepsilon\]

Here y is a continuous variable and the x’s are the predictors, which may be continuous, categorical, or discrete; α is the intercept, the β’s are the regression coefficients, and ε is the random error term. The above model is referred to as a linear multiple (multivariable) regression model.
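A model of this form can be fitted by ordinary least squares. The sketch below is a minimal illustration, not a definitive implementation: it assumes NumPy is available, the function name `fit_multiple_regression` is hypothetical, and the data are synthetic with coefficients chosen only for the example.

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Estimate (alpha, betas) of y = alpha + X @ betas + error by least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept alpha.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]

# Synthetic data (invented): y = 1.0 + 2.0*x1 - 0.5*x2 + small random error.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

alpha, betas = fit_multiple_regression(X, y)
print("Estimated intercept:", alpha)
print("Estimated slopes:", betas)
```

Because the noise is small relative to the signal, the estimates should land close to the values used to generate the data (1.0 for the intercept, 2.0 and -0.5 for the slopes).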

For example, college GPA may be predicted using high school GPA, test scores, time given to study, and the rating of the high school as predictors.

Read Assumptions of Multiple Linear Regression Model

Muhammad Imdad Ullah

Currently working as Assistant Professor of Statistics at Ghazi University, Dera Ghazi Khan. Completed my Ph.D. in Statistics at the Department of Statistics, Bahauddin Zakariya University, Multan, Pakistan. I like Applied Statistics, Mathematics, and Statistical Computing. The statistical and mathematical software I use includes SAS, STATA, Python, GRETL, EVIEWS, R, SPSS, and VBA in MS-Excel. I like to use LaTeX for typesetting articles, theses, etc.

5 Responses

  1. Qassim Ahmad Yaseen says:

    Assalam alykum warahmatullah wabarakatuh
    I have the following questions, please help me.
    a) How do I determine the best-fitting regression equation for the data, mathematically as well as
    graphically, and interpret it?
    b) What percentage of the total variation in the number of arrests (Y) is explained by the equation?

    • haris khurram says:


      a) If you have more than one predictor and you want to model the situation, first check whether the model is linear or not; for that purpose use a scatter plot. If the model is linear, then you have to choose the best regression equation. There are a few model selection criteria such as R-squared, MSE, Mallows' Cp, and the likelihood. The best model can be selected on the basis of the values of these criteria. Another good way is to use subset selection methods, including best subset selection, forward selection, and backward elimination. The best subset selection method gives you a better choice of equation for the data. As for the graphical method, you can use the added-variable plot to decide whether to include a variable in the model or not.
      b) The percentage of total variation explained by the equation can be measured through R-squared for a linear model, but it has some limitations. You may also use the likelihood as a replacement, as it is used in pseudo R-squared measures.

      I hope the answer will be sufficient to solve your problem. For reference see: Applied Linear Regression by Sanford Weisberg.

      • Qassim Ahmad Yaseen says:

        May Allah bless and guide you akhii, I will come back to you if I encounter any problem with these.

  2. azhar hayat says:

    Nice work, sir

  3. Rana ZeeShan says:

    correction required in first line of the article
    “….having more the one….”
    it should be
    “…having more than one …”
