Simple Linear Regression Model (SLRM)

A simple linear regression model (SLRM) is based on a single independent (explanatory) variable. It fits a straight line such that the sum of squared residuals of the regression model (the squared vertical distances between the fitted line and the points of the data set) is as small as possible. The simple linear regression model (usually known as a statistical or probabilistic model) is

\begin{align*}
y_i &= \alpha + \beta x_i +\varepsilon_i\\
\text{OR} \quad y_i&=b_0 + b_1 x_i + \varepsilon_i\\
\text{OR} \quad y_i&=\beta_0 + \beta_1 x_i + \varepsilon_i
\end{align*}
where $y$ is the dependent variable and $x$ is the independent variable. In the regression context, $y$ is the regressand and $x$ is the regressor. The epsilon ($\varepsilon$) is unobservable and denotes the random error, or disturbance term, of the regression model. The random error $\varepsilon$ is included in the regression model for some specific reasons:

Importance of Error Term in Simple Linear Regression Model

  1. Random error ($\varepsilon$) captures the effect on the dependent variable of all variables that are not included in the model under study; such omitted variables may or may not be observable.
  2. Random error ($\varepsilon$) captures any specification error related to the assumed linear-functional form.
  3. Random error ($\varepsilon$) captures the effect of unpredictable random components present in the dependent variable.

We can say that $\varepsilon$ is the variation in the variable $y$ that is not explained by the independent variable $x$ included in the model.

In the above model, $\beta_0$ and $\beta_1$ are the parameters of the model, and our main objective is to obtain estimates of their numerical values, i.e. $\hat{\beta}_0$ and $\hat{\beta}_1$. Here $\beta_0$ is the intercept (regression constant) and $\beta_1$ is the slope (regression coefficient) of the model. The fitted line passes through ($\overline{x}, \overline{y}$), i.e. the center of mass of the data points, and the estimated slope is the correlation between the variables $x$ and $y$ multiplied by the ratio of their standard deviations (standard deviation of $y$ over standard deviation of $x$).
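The two properties described above can be checked numerically. The following is a minimal Python sketch, not a definitive implementation: the function name `fit_simple_linear_regression` and the data values are made up for illustration. It computes the slope as the correlation times the ratio of standard deviations, and the intercept so that the fitted line passes through the point of means.

```python
from statistics import mean, stdev


def fit_simple_linear_regression(x, y):
    """Return (b0, b1): least-squares estimates of the intercept and slope."""
    x_bar, y_bar = mean(x), mean(y)
    # Sums of squares and cross-products about the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    # Pearson correlation between x and y
    r = s_xy / (s_xx * s_yy) ** 0.5
    # Slope: correlation times (standard deviation of y / standard deviation of x)
    b1 = r * stdev(y) / stdev(x)
    # Intercept: chosen so the line passes through (x_bar, y_bar)
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Made-up illustrative data (an assumption, not from any real study)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_linear_regression(x, y)
print(round(b0, 2), round(b1, 2))  # 0.05 1.99
```

For these data the fitted value at $\overline{x} = 3$ is $0.05 + 1.99 \times 3 = 6.02$, which equals $\overline{y}$.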

The subscript $i$ denotes the $i$th value of the variable in the model. Without the error term, the model becomes
\[y=\beta_0 + \beta_1 x\]
This is a mathematical (deterministic) model, as all the variation in $y$ is due solely to changes in $x$; no other factors affect the dependent variable. If this is true, then all the pairs $(x, y)$ will fall on a straight line when plotted on a two-dimensional plane. For observed values, however, the plotted points may or may not fall on a straight line. A two-dimensional diagram with the points plotted as $(x, y)$ pairs is called a scatter diagram.
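The difference between the mathematical model and the statistical model can be illustrated with a small Python sketch. The parameter values, the seed, and the helper name `deviations` are hypothetical choices made only for this example.

```python
import random

beta0, beta1 = 2.0, 0.5  # hypothetical parameter values
x = [1, 2, 3, 4, 5, 6]

# Mathematical (deterministic) model: y is fully determined by x
y_exact = [beta0 + beta1 * xi for xi in x]

# Statistical model: a random error term is added to each observation
rng = random.Random(0)  # fixed seed so the run is repeatable
y_noisy = [beta0 + beta1 * xi + rng.gauss(0, 1) for xi in x]


def deviations(y):
    """Vertical distance of each point from the line beta0 + beta1 * x."""
    return [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]


print(deviations(y_exact))  # every deviation is zero: points lie on the line
print(deviations(y_noisy))  # non-zero deviations: points scatter around the line
```

Plotting `y_noisy` against `x` would give a scatter diagram in which the points do not fall exactly on a straight line.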

Simple Linear Regression Model scatter with regression line

FAQs about Simple Linear Regression Models

  1. What is a simple linear regression Model?
  2. What is a Probabilistic/ Statistical model?
  3. What is the equation of a simple linear regression model?
  4. Write about the importance of error terms in the regression model.
  5. What are the parameters in a simple linear regression model?
  6. What is the objective of estimating the parameters of a simple linear regression model?

Range Measure of Dispersion (2013)

A measure of central tendency provides a typical value for the data set, but it does not tell the whole story about the data. The mean, median, and mode are enough to get summary information about the center of the data, but these averages tell us nothing about the spread of the data. So, for more information about the data, we need some other measure, such as the range measure of dispersion (spread).

Range Measure of Dispersion

The spread of data can be measured by calculating the range of the data; the range tells us how far the data extend. The range is an absolute measure of dispersion that is found by subtracting the smallest value (called the lower bound) from the largest value (called the upper bound), i.e.

Range = Upper Bound – Lower Bound
OR
Range = Largest Value – Smallest Value

This absolute measure of dispersion has disadvantages: the range only describes the width of the data set (i.e. how far it is spread out), measured in the same units as the data, but it does not give the real picture of how the data are distributed. If the data have outliers, using the range to describe the spread can be very misleading, as the range is sensitive to outliers.
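The sensitivity of the range to outliers can be seen in a short Python sketch. The function name `data_range` and the data values are made up for illustration.

```python
def data_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


# Made-up illustrative data
scores = [12, 15, 11, 18, 14]
scores_with_outlier = scores + [95]  # one extreme value added

print(data_range(scores))               # 7
print(data_range(scores_with_outlier))  # 84
```

A single extreme value changes the range from 7 to 84, although the other five values are unchanged.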

We need to be careful in using the range measure of dispersion, as it does not give the full picture of what is going on between the highest and lowest values. It might give a misleading picture of the spread of the data because it is based only on the two extreme values. Therefore, the range is an unsatisfactory measure of dispersion.

However, the range measure of dispersion is widely used in applications such as statistical process control (control charts for manufactured products), daily temperatures, and stock prices, as it is very easy to calculate. It is an absolute measure of dispersion; its relative measure, known as the coefficient of dispersion, is defined by the relation

\[Coefficient\,\, of\,\, Dispersion = \frac{x_m-x_0}{x_m+x_0}\]
where $x_m$ is the largest value and $x_0$ is the smallest value in the data.

The coefficient of dispersion is a pure (dimensionless) number and is used for comparison purposes.
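A minimal Python sketch of why the coefficient is useful for comparison, assuming the usual definition (largest value minus smallest value, divided by largest value plus smallest value). The function name and the data are hypothetical.

```python
import math


def coefficient_of_dispersion(values):
    """(largest - smallest) / (largest + smallest); assumes the sum is not zero."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


# Made-up heights, recorded once in centimetres and once in millimetres
heights_cm = [150, 160, 170, 180]
heights_mm = [10 * h for h in heights_cm]

# The range depends on the unit of measurement ...
print(max(heights_cm) - min(heights_cm))  # 30
print(max(heights_mm) - min(heights_mm))  # 300

# ... but the coefficient of dispersion does not
same = math.isclose(
    coefficient_of_dispersion(heights_cm),
    coefficient_of_dispersion(heights_mm),
)
print(same)  # True
```

Changing the unit multiplies the range by 10 but leaves the coefficient unchanged, which is what makes it suitable for comparing data sets measured in different units.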


Introduction to Mathematica (2013)

Mathematica was created by Stephen Wolfram and is a product of Wolfram Research, Inc. Mathematica is available for different operating systems, such as SGI, Sun, NeXT, Mac, DOS, and Windows. This introduction to Mathematica will help you understand its use as a mathematical and programming language for numerical, symbolic, and graphical calculations.

Introduction to Mathematica

Mathematica can be used as:

  1. A calculator for arithmetic, symbolic, and algebraic calculations
  2. A language for developing transformation rules, so that general mathematical relationships can be expressed
  3. An interactive environment for the exploration of numerical, symbolic, and graphical calculations
  4. A tool for preparing input to other programs, or to process output from other programs

Getting Started with Mathematica

Starting Mathematica opens a fresh window, called a notebook, where we do all mathematical calculations and some graphics. Initially, the window’s title is “untitled-1”, which can be changed by saving the notebook under any desired name. A Mathematica notebook can contain text, graphics, and Mathematica input and output.

Entering Expressions

Type 1+1 in the notebook and press Shift+Enter (or the Enter key on the numeric keypad). You will get the answer on the next line of the work area. This is called evaluating or entering the expression. Note that Mathematica places the labels “In[1]:=” and “Out[1]=” (without quotation marks) next to 1+1 and 2, respectively. You will also see a set of brackets on the right side of the input and output. The innermost brackets enclose the input and the output, while the outer (larger) bracket groups the input and output together. Each bracket delimits a cell. Each time you enter or change the input, you will notice that the “In” and “Out” labels also change.

Basic Arithmetic

Mathematica can perform the basic operations of addition (+), subtraction (-), multiplication (*), division (/), exponentiation (^), etc. For example, enter the following lines for basic arithmetic in Mathematica:

2*3+4^2
5*6
2(3+4)
(2-3+1)(1+2/3)-5^(-1)
6!

Using Previous Results in Mathematica

Often we need the output of a previous calculation in our next computation. For this purpose, the % symbol can be used to refer to the output of the previous cell. For example,

2^5
% + 100

Here the result of 2^5 (that is, 32) is added to 100, giving 132.

%% refers to the second-last result (the result before the last one).

Exact vs Approximation

Mathematica gives exact results by default and approximate results when we need them. For example,

3^20/2^21 produces $\frac{3486784401}{2097152}$

We can force Mathematica to give approximate results in decimal form by putting a decimal point in any number in the expression, such as

3.0^20/ 2^21

When a number in an expression contains a decimal point, Mathematica treats it as an approximation rather than an exact number.
