Summation Operator Properties and Examples (2024)

The summation operator, denoted by $\Sigma$, is a mathematical notation used to represent the sum of numbers or terms. The summation is the total of all the terms added according to the specified range of values of the index.

Suppose we have information about the heights of students, such as 54, 55, 58, 60, 61, 45, 53.
Using variable-and-value notation, one can denote the heights of the students as follows:

  • First height in the information is $X_1$, that is, $X_1=54$
  • Second height in the information is $X_2$, that is, $X_2=55$
  • Last or $n$th height in the information is $X_n$, that is, $X_n=53$.

In general, the variable and its values can be denoted by $X_i$, where $i=1,2,3, \cdots, n$.

The sum of all numeric information (values of the variable $X_1, X_2, \cdots, X_n$) can be written as $X_1+X_2+\cdots+X_n$. The short and useful summation notation for this set of values is $\sum\limits_{i=1}^n X_i$, where the symbol $\Sigma$ is a Greek letter (capital sigma) that denotes the sum of all values ranging from $i=1$ (the start) to $i=n$ (the last value).


The number written on top of $\Sigma$ is called the upper limit (upper bound) of the sum. Below $\Sigma$, there are two additional components: the index and the lower limit (lower bound). To the right of $\Sigma$ is the term to be summed over all values of the index.


Consider the following examples of summing values using the summation operator.

\begin{align*}
X_1 + X_2 + X_3 + \cdots + X_n &= \sum\limits_{i=1}^{n} X_i\\
X_1Y_1 + X_2Y_2 + X_3Y_3 + \cdots + X_nY_n &= \sum\limits_{i=1}^{n} X_iY_i\\
X_1^2 + X_2^2 + X_3^2 + \cdots + X_n^2 &= \sum\limits_{i=1}^n X_i^2\\
(X_1 + X_2 + X_3 + \cdots + X_n)^2 &= \left( \sum\limits_{i=1}^{n} X_i \right)^2
\end{align*}

The following examples make use of the summation operator when a number (constant) and the values of the variable are both involved.

\begin{align}
a+a+a+ \cdots + a = na &= \sum\limits_{i=1}^{n}a\\
aX_1 + aX_2 + aX_3 + \cdots + aX_n &= a \sum\limits_{i=1}^n X_i\\
(X_1-a)+(X_2-a)+\cdots + (X_n-a) &= \sum\limits_{i=1}^n (X_i-a)\\
(X_1-a)^2+(X_2-a)^2+\cdots + (X_n-a)^2 &= \sum\limits_{i=1}^n (X_i-a)^2\\
[(X_1-a)+(X_2-a)+\cdots + (X_n-a)]^2 &= \left[\sum\limits_{i=1}^n (X_i-a)\right]^2
\end{align}
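These identities are easy to verify numerically. The following Python sketch (an illustration we add here, using the made-up heights from earlier and `numpy`) evaluates a few of them:

```python
import numpy as np

X = np.array([54, 55, 58, 60, 61, 45, 53], dtype=float)  # heights X_1, ..., X_n
n, a = len(X), 10.0                                       # a: an arbitrary constant

print(X.sum())                                   # sum of the X_i
print((X ** 2).sum())                            # sum of squares, sum of X_i^2
print(X.sum() ** 2)                              # square of the sum, (sum of X_i)^2
print(np.isclose(np.full(n, a).sum(), n * a))    # summing a constant n times gives n*a
print(np.isclose((a * X).sum(), a * X.sum()))    # a constant factors out of a sum
print(np.isclose(((X - a) ** 2).sum(),
                 (X ** 2).sum() - 2 * a * X.sum() + n * a ** 2))  # expand sum (X_i - a)^2
```

Note in particular that the sum of squares and the square of the sum (the second and third lines) are different quantities; conflating them is a common mistake.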

Properties of Summation Operator

The summation operator $\Sigma$ represents the sum of a collection of (data) values. The following properties are useful for the manipulation of the summation operator:

1) Multiplying a sum by a constant
$$c\sum\limits_{i=1}^n x_i = \sum\limits_{i=1}^n cx_i$$

2) Linearity: The summation operator is linear, meaning that it satisfies the following property for constants $a$ and $b$ and sequences $x_i$ and $y_i$.
$$\sum\limits_{i=1}^{n}(ax_i + by_i) = a \sum\limits_{i=1}^{n} x_i + b\sum\limits_{i=1}^{n} y_i$$

3) Splitting a sum into two sums (for $a \le c < n$)
$$\sum\limits_{i=a}^n x_i = \sum\limits_{i=a}^{c}x_i + \sum\limits_{i=c+1}^n x_i$$

4) Combining Summations: Multiple summations can be combined into a single summation:
$$\sum\limits_{i=1}^b x_i + \sum\limits_{i=b+1}^c x_i = \sum\limits_{i=1}^c x_i$$

5) Changing the order of individual sums in multiple sum expressions
$$\sum\limits_{i=1}^{m} \sum\limits_{j=1}^{n} a_{ij} = \sum\limits_{j=1}^{n}\sum\limits_{i=1}^{m} a_{ij}$$

6) Distributivity over Scalar Multiplication: The summation operator distributes over scalar multiplication
$$c\sum\limits_{i=1}^b x_i = \sum_{i=1}^b (cx_i)$$

7) Adding or Subtracting Sums
$$\sum\limits_{i=1}^a x_i \pm \sum_{i=1}^a y_i = \sum\limits_{i=1}^a (x_i \pm y_i)$$

8) Multiplying the Sums: the product of $k$ sums expands into a $k$-fold multiple sum:
$$\sum\limits_{i_1=a_1}^{n_1} x_{i_1} \times \cdots \times \sum\limits_{i_k=a_k}^{n_k} x_{i_k} = \sum\limits_{i_1=a_1}^{n_1} \cdots \sum\limits_{i_k=a_k}^{n_k} \left( x_{i_1}\times \cdots \times x_{i_k} \right)$$
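As a quick numerical check of the properties above, here is a minimal Python sketch (the data are made up for illustration):

```python
import numpy as np

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])
y = np.array([2.0, 6.0, 5.0, 3.0, 5.0, 8.0])
a, b, c = 2.0, -3.0, 3       # a, b: constants; c: split point for property 3

# Property 2 (linearity): sum(a*x_i + b*y_i) = a*sum(x_i) + b*sum(y_i)
print(np.isclose((a * x + b * y).sum(), a * x.sum() + b * y.sum()))

# Properties 3 and 4 (splitting/combining a sum at index c)
print(np.isclose(x.sum(), x[:c].sum() + x[c:].sum()))

# Property 5 (interchanging the order of double sums over a_ij = x_i * y_j)
A = np.outer(x, y)
print(np.isclose(A.sum(axis=1).sum(), A.sum(axis=0).sum()))

# Property 8 (a product of sums equals the double sum of products)
print(np.isclose(x.sum() * y.sum(), A.sum()))
```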


Properties of Correlation Coefficient (2024)

The coefficient of correlation is a statistic used to measure the strength and direction of the linear relationship between two quantitative variables.


Understanding these properties helps us interpret the correlation coefficient accurately and avoid misinterpretation. The following are some important properties of the correlation coefficient.

  • The correlation coefficient ($r$) between $X$ and $Y$ is the same as the correlation between $Y$ and $X$; that is, the correlation is symmetric with respect to $X$ and $Y$, i.e., $r_{XY} = r_{YX}$.
  • The $r$ ranges from $-1$ to $+1$, i.e., $-1\le r \le +1$.
  • There is no unit of $r$. The correlation coefficient $r$ is independent of the unit of measurement.
  • It is not affected by a change of origin and scale, i.e., $r_{XY}=r_{uv}$, where $u$ and $v$ are linear transformations of $X$ and $Y$ (see the theorem below). If a constant is added to each value of a variable, it is called a change of origin, and if each value of a variable is multiplied by a constant, it is called a change of scale.
  • The $r$ is the geometric mean of the two regression coefficients, i.e., $r=\pm\sqrt{b_{YX}\times b_{XY}}$.
    In other words, if the two regression lines of $Y$ on $X$ and $X$ on $Y$ are written as $Y=a+bX$ and $X=c+dY$ respectively, then $bd=r^2$.
  • The sign of $r_{XY}$, $b_{YX}$, and $b_{XY}$ depends on the covariance, which is common to all three, as given below:
$$r=\frac{Cov(X, Y)}{\sqrt{Var(X) Var(Y)}},\quad b_{YX} = \frac{Cov(Y, X)}{Var(X)}, \quad b_{XY}=\frac{Cov(Y, X)}{Var(Y)}$$

Hence, $r_{XY}$, $b_{YX}$, and $b_{XY}$ have the same sign.

  • If $r=-1$, the correlation is perfectly negative, meaning that as one variable increases, the other decreases proportionally.
  • If $r=+1$, the correlation is perfectly positive, meaning that as one variable increases, the other increases proportionally.
  • If $r=0$, there is no linear relationship between the variables. However, a non-linear relationship may exist, so $r=0$ does not necessarily mean that the variables are independent. Several of these properties are checked numerically in the sketch below.
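Here is a minimal Python sketch of those checks (the data are invented for illustration; `numpy` is assumed to be available):

```python
import numpy as np

X = np.array([54, 55, 58, 60, 61, 45, 53], dtype=float)   # heights (made-up data)
Y = np.array([3.1, 3.0, 3.4, 3.9, 4.0, 2.5, 2.9])         # a made-up related variable

r_xy = np.corrcoef(X, Y)[0, 1]
r_yx = np.corrcoef(Y, X)[0, 1]
print(r_xy, r_yx)                        # equal: correlation is symmetric, -1 <= r <= 1

# Regression coefficients b_YX = Cov/Var(X) and b_XY = Cov/Var(Y)
cov = ((X - X.mean()) * (Y - Y.mean())).sum()
b_yx = cov / ((X - X.mean()) ** 2).sum()
b_xy = cov / ((Y - Y.mean()) ** 2).sum()
print(np.sqrt(b_yx * b_xy), abs(r_xy))   # geometric mean of b_YX and b_XY equals |r|
```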

Theorem (Correlation is Independent of Origin and Scale): Show that the correlation coefficient is independent of origin and scale, i.e., $r_{XY}=r_{uv}$.

Proof: The formula for correlation coefficient is,

$$r_{XY}=\frac{\varSigma(X-\overline{X})(Y-\overline{Y})}{\sqrt{[\varSigma(X-\overline{X})^2][\varSigma(Y-\overline{Y})^2]}}$$

\begin{align*}
\text{Let}\quad u&=\frac{X-a}{h}\\
\Rightarrow X&=a+hu \Rightarrow \overline{X}=a+h\overline{u} \\
\text{and}\quad v&=\frac{Y-b}{K}\\
\Rightarrow Y&=b+Kv \Rightarrow \overline{Y}=b+K\overline{v}\\
\text{Therefore}\\
r_{XY}&=\frac{\varSigma (a+hu-a-h\overline{u}) (b+Kv-b-K\overline{v})} {\sqrt{[\varSigma(a+hu-a-h\overline{u})^2][\varSigma(b+Kv-b-K\overline{v})^2]}}\\
&=\frac{\varSigma(hu-h\overline{u})(Kv-K\overline{v})}{\sqrt{[\varSigma(hu-h\overline{u})^2][\varSigma(Kv-K\overline{v})^2]}}\\
&=\frac{hK\varSigma(u-\overline{u})(v-\overline{v})}{\sqrt{[h^2 \varSigma(u-\overline{u})^2] [K^2 \varSigma(v-\overline{v})^2]}}\\
&=\frac{hK\varSigma(u-\overline{u})(v-\overline{v})}{hK\,\sqrt{[\varSigma(u-\overline{u})^2] [\varSigma(v-\overline{v})^2]}}\\
&=\frac{\varSigma(u-\overline{u})(v-\overline{v}) }{\sqrt{[\varSigma(u-\overline{u})^2][\varSigma(v-\overline{v})^2]}}=r_{uv}
\end{align*}

Note that the cancellation of $hK$ in the last steps assumes $h>0$ and $K>0$; if either scale constant is negative, the sign of $r$ flips.
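The theorem can also be confirmed numerically; here is a minimal Python sketch with arbitrary (positive) choices of $a$, $h$, $b$, and $K$:

```python
import numpy as np

X = np.array([54, 55, 58, 60, 61, 45, 53], dtype=float)
Y = np.array([3.1, 3.0, 3.4, 3.9, 4.0, 2.5, 2.9])

a, h = 50.0, 2.0                 # arbitrary origin and (positive) scale for X
b, K = 3.0, 0.5                  # arbitrary origin and (positive) scale for Y
u = (X - a) / h                  # change of origin and scale
v = (Y - b) / K

print(np.corrcoef(X, Y)[0, 1])   # r_XY
print(np.corrcoef(u, v)[0, 1])   # r_uv: identical to r_XY
```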

[Figure: Correlation coefficient range]

Note that

  1. Non-causality: Correlation does not imply causation. If two variables are strongly correlated, it does not necessarily mean that changes in one variable cause changes in the other. This is because the correlation only measures the strength and direction of the linear relationship between two quantitative variables, not the underlying cause-and-effect relationship.
  2. Sensitive to Outliers: The correlation coefficient can be sensitive to outliers, as outliers can disproportionately influence the correlation calculation.
  3. Assumption of Linearity: The correlation coefficient measures the linear relationship between variables. It may not accurately capture non-linear relationships between variables.
  4. Scale Invariance: The correlation coefficient is independent of the scale of the data. That is, multiplying or dividing all the values of one or both variables by a positive constant will not affect the strength or direction of the correlation coefficient. This makes it useful for comparing relationships between variables measured in different units.
  5. Strength vs. Causation: A high correlation does not necessarily imply causation. Just because two variables are strongly correlated does not mean one causes the other; there might be a third, unknown factor influencing both variables. Correlation analysis is a good starting point for exploring relationships, but further investigation is needed to establish causality.

Layout of the Factorial Design: Two Factor $2^2$ (2024)

The layout of a factorial design is typically organized in a table format. Each row of the table represents an experimental run, while each column represents a factor or the response variable. The levels of factors are indicated by symbols such as + and – for high and low levels, respectively. The response variable values corresponding to each experimental condition are recorded in the form of a sign table.

Consider a simple example layout for a two-factor factorial design with factors $A$ and $B$.

Run | Factor A | Factor B | Response
----|----------|----------|---------
 1  |    –     |    –     | $Y_1$
 2  |    +     |    –     | $Y_2$
 3  |    –     |    +     | $Y_3$
 4  |    +     |    +     | $Y_4$
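For illustration, the sign table above can be generated programmatically; a minimal Python sketch (the formatting choices here are ours, not part of any standard design library):

```python
from itertools import product

# Print the standard-order sign table for a 2^2 factorial design
# (factor A varies fastest, as in the table above).
print(f"{'Run':<5}{'A':<4}{'B':<4}Response")
for run, (b_level, a_level) in enumerate(product("-+", repeat=2), start=1):
    print(f"{run:<5}{a_level:<4}{b_level:<4}Y_{run}")
```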

Layout of the Factorial Design: Two Factor in $n$ Replicates

Consider two factors, each with two levels, run in $n$ replicates. The layout of the factorial design for $n$ replicates is described below.

[Table: Layout of the two-factor, two-level factorial design in $n$ replicates]

$y_{111}$ is the response for the first factor at the low level and the second factor at the low level, in the first replicate of the trial. Similarly, $y_{112}$ represents the second replicate of the same trial, and so on up to $y_{11n}$, the $n$th replicate at the same levels of $A$ and $B$.

Geometrical Structure of Two-Factor Factorial Design

In the geometrical structure of the two factors (Factors $A$ and $B$), each factor has two levels, low ($-$) and high ($+$). Response 1 is at the low level of $A$ and the low level of $B$; similarly, response 2 is produced at the high level of $A$ and the low level of $B$. The third response is at the low level of $A$ and the high level of $B$, and the 4th response is at the high level of $A$ and the high level of $B$.

[Figure: Geometrical structure of the two-factor layout of the factorial experiment]

Real Life Example

Consider an experiment in which the concentration of a reactant and the amount of catalyst produce some response; the experiment has three replicates.

[Table: Layout of the two-factor real-life example]

Geometrical Structure of the Example

[Figure: Geometrical structure of the example]

Factor Effects

Using the treatment totals $(I)=80$, $a=100$, $b=60$, and $ab=90$ from the example, the factor effects are:

\begin{align}
A &=\frac{(a+ab)-((I)+b)}{2} = \frac{100+90-80-60}{2} = 25\\
B &= \frac{(b+ab)-((I)+a)}{2} = \frac{60+90-80-100}{2} = -15\\
AB&=\frac{((I)+ab)-(a+b)}{2} = \frac{80+90-100-60}{2}=5
\end{align}

The effect of $B$ is $-15$, which shows that changing the level of factor $B$ from low to high brings, on average, a decrease of 15 in the response.
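The same arithmetic can be done in a few lines of Python (a sketch assuming the treatment totals $(I)=80$, $a=100$, $b=60$, and $ab=90$ read off the example):

```python
# Treatment totals from the example: (I), a, b, ab
I_, a, b, ab = 80, 100, 60, 90

A_effect  = ((a + ab) - (I_ + b)) / 2    # main effect of factor A
B_effect  = ((b + ab) - (I_ + a)) / 2    # main effect of factor B
AB_effect = ((I_ + ab) - (a + b)) / 2    # interaction effect AB

print(A_effect, B_effect, AB_effect)     # 25.0 -15.0 5.0
```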

Reference

Montgomery, D. C. (2017). Design and Analysis of Experiments (9th ed.). John Wiley & Sons.
