Median of Ungrouped Data

Introduction to Median of Ungrouped Data

This post is about calculating the median of ungrouped data. The median is the middlemost value of a data set (set of observations), provided that the observations are arranged in ascending or descending order. The median divides the data into two equal parts. That is the main objective of the median.

It is important to note that the criteria for finding the median for grouped and ungrouped data are different.

The primary and secondary data can be defined as:

  1. Primary data, also called raw or ungrouped data, has not undergone any statistical procedure/method and is not in the form of a frequency distribution.
  2. Secondary data may also be called grouped data if it is in the form of a frequency distribution.

Let us discuss how to find the median for ungrouped data.

There are two cases for ungrouped data. These cases are based on the number of observations, $n$:

  • the number of observations $n$ is odd, and
  • the number of observations $n$ is even.

Median Calculations

The data below contain an odd number of observations.

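| Observation No. (Ascending Order) | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | 8th | 9th | 10th | 11th |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Data Values | 81 | 89 | 90 | 96 | 100 | 102 | 103 | 104 | 108 | 109 | 118 |
| Observation No. (Descending Order) | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 |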

Since the number of observations is odd ($n = 11$), the central value after arranging the data in ascending order is the 6th value, which is 102. That is, the median of the above data is 102.

The position of the median can be located mathematically, as follows:

\begin{align*}
\tilde{x} &= \left( \frac{n+1}{2} \right)th\,\, \text{value}\\
&=\frac{11+1}{2} = 6th\,\, \text{value}
\end{align*}

The value at the 6th position (from the sorted data) is 102. The symbol $\tilde{x}$, read as “x-tilde”, is the notation for the median.
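The odd case can be sketched in a few lines of Python. This is only an illustration using the eleven data values from the table above; the variable names are chosen here and are not part of the original post.

```python
# Data values from the odd-sized example (n = 11), deliberately unsorted
values = [96, 81, 118, 102, 89, 109, 90, 104, 100, 108, 103]

ordered = sorted(values)             # arrange in ascending order
n = len(ordered)                     # n = 11, an odd number
position = (n + 1) // 2              # (n + 1) / 2 gives the 6th value (1-based)
median_odd = ordered[position - 1]   # Python lists are 0-based

print(position, median_odd)  # 6 102
```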

Median for Even Numbers of Observations

Consider the following data that contains an even number of observations.

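| Observation No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Data Values | 81 | 100 | 96 | 108 | 90 | 102 | 104 | 103 | 109 | 89 |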

Data after sorting (either in ascending or descending order) is

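| Observation No. | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | 8th | 9th | 10th |
|---|---|---|---|---|---|---|---|---|---|---|
| $x$ | 81 | 89 | 90 | 96 | 100 | 102 | 103 | 104 | 108 | 109 |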

Since $n=10$ is even, the central position (that is, the median) lies between the 5th and 6th values. The median is the average of these two central observations (from the sorted data). The two central values are 100 and 102, so the median is their average.

$$Median = \frac{100+102}{2} = 101$$
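Both cases can be combined into one small function. The sketch below is a minimal illustration (the function name `median_ungrouped` is made up for this example); the result is compared with `statistics.median` from the Python standard library, which follows the same rule.

```python
import statistics

def median_ungrouped(data):
    """Median of raw data: the middle value (odd n) or the mean of the two middle values (even n)."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # single central value
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of the two central values

even_data = [81, 100, 96, 108, 90, 102, 104, 103, 109, 89]
print(median_ungrouped(even_data))   # 101.0
print(statistics.median(even_data))  # 101.0
```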

Median Formula for Large Data Sets

For data sets of any size, large or small, the position of the median can be found mathematically. The formula differs for odd and even numbers of observations.

The point to remember when computing the median is that

  • For an odd number of observations, the median is the centermost value after sorting the data
  • For an even number of observations, the median is the average of two central values after sorting the data

\begin{align*}
\tilde{x} &= \frac{1}{2} \left[ \left(\frac{n}{2}\right)th \,\, \text{value} + \left(\frac{n}{2}+1 \right)th \,\, \text{value} \right]\quad \quad \text{(when the number of observations is even)}\\
\tilde{x} &= \left(\frac{n+1}{2}\right)th \,\, \text{value} \quad \quad \text{(when the number of observations is odd)}
\end{align*}

The median formula can also be written in another way (figure: Median of ungrouped data).

Consider a data set containing 157 observations. To compute the median, first sort the data in either ascending or descending order. The position of the median for this data will be

$$\tilde{x} = \left(\frac{n+1}{2}\right)th \,\, \text{value} = \left(\frac{157+1}{2}\right)th \,\, \text{value} = 79th \,\, \text{value}$$

The 79th observation in the sorted data will be the median of the data.

If there is an even number of observations (say $n=396$), the median will be

\begin{align*}
\tilde{x} &= \frac{1}{2}\left[\left(\frac{n}{2}\right)th \,\, \text{value} + \left(\frac{n}{2}+1\right)th \,\, \text{value} \right]\\
&=\frac{1}{2} \left[\left(\frac{396}{2}\right)th \,\, \text{value} + \left(\frac{396}{2}+1\right)th \,\, \text{value} \right]\\
&= \frac{1}{2} \left[198th \,\, \text{value} + 199th \,\, \text{value}\right]
\end{align*}

The average of the 198th and 199th values from the sorted data will be the median of the data.
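For large data sets it can help to compute only the position(s) of the median before looking up the values. A minimal sketch follows; the helper name `median_positions` is hypothetical and returns 1-based positions in the sorted data.

```python
def median_positions(n):
    """Return the 1-based position(s) in the sorted data whose values define the median."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n % 2 == 1:
        return ((n + 1) // 2,)       # odd n: one central position
    return (n // 2, n // 2 + 1)      # even n: two central positions to average

print(median_positions(157))  # (79,)
print(median_positions(396))  # (198, 199)
```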

https://rfaqs.com

https://gmstat.com

Geometric Mean

Introduction to Geometric Mean

The geometric mean (GM) is a way of calculating an average, but instead of adding values like the regular (arithmetic) mean, it multiplies them and then takes a root. The geometric mean is defined as the $n$th root of the product of $n$ positive values.

If we have two observations, say 9 and 4, then the geometric mean is the square root of the product of these values, which is 6 ($\sqrt{9\times 4}=6$). If there are three values, say 3, 9, and 3, then the geometric average will be $\sqrt[3]{3\times 9 \times 3} = \sqrt[3]{81} \approx 4.33$. Following the same pattern, for $n$ observations ($x_1, x_2, \cdots, x_n$) the geometric average formula will be

$$GM = (x_1 \times x_2 \times x_3 \times \cdots \times x_n)^{\frac{1}{n} }$$
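The formula translates directly into code. The following is a minimal sketch (the function name `geometric_mean` is chosen for this example); it multiplies the values with `math.prod` and takes the $n$th root, rejecting non-positive values.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    if not values:
        raise ValueError("geometric mean of empty data is undefined")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.prod(values) ** (1 / len(values))

print(geometric_mean([9, 4]))               # 6.0
print(round(geometric_mean([3, 9, 3]), 2))  # 4.33
```

For long lists the raw product can overflow or lose precision with floating-point inputs, which is one reason the logarithmic form is often preferred in practice.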

Geometric Mean Example

Suppose we have the following set of values $x=32, 36, 36, 37, 39, 41, 45, 46, 48$. The Computation of Geometric Mean will be

\begin{align*}
GM &= (32\times 36 \times 36 \times 37 \times 39 \times 41 \times 45 \times 46 \times 48)^{\frac{1}{9}}\\
&=(243790484520960)^{\frac{1}{9}} = 39.7
\end{align*}
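The computation above can be reproduced as a quick check. This is only an illustrative sketch using the nine values from the example.

```python
import math

x = [32, 36, 36, 37, 39, 41, 45, 46, 48]

product = math.prod(x)          # exact integer product of the nine values
gm = product ** (1 / len(x))    # ninth root of the product

print(product)       # 243790484520960
print(round(gm, 1))  # 39.7
```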

For a large number of observations one can compute the GM by taking the log of all observations using the following formula:

$$GM = antilog \left[\frac{\sum\limits_{i=1}^n log\, x}{n} \right]$$

| $x$ | $log\, x$ |
|---|---|
| 32 | log 32 = 1.5051 |
| 36 | log 36 = 1.5563 |
| 36 | log 36 = 1.5563 |
| 37 | log 37 = 1.5682 |
| 39 | log 39 = 1.5911 |
| 41 | log 41 = 1.6128 |
| 45 | log 45 = 1.6532 |
| 46 | log 46 = 1.6628 |
| 48 | log 48 = 1.6812 |
| Total | 14.3870 |

\begin{align*}
GM &= antilog \left[ \frac{\sum\limits_{i=1}^n log\, x}{n} \right]\\
&= antilog \left[\frac{14.3870}{9}\right] = antilog [1.5986]\\
&= 39.7
\end{align*}
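The logarithmic method can be sketched as follows, assuming base-10 logarithms as in the table (the antilog is then a power of 10). The standard library function `statistics.geometric_mean` (Python 3.8+) is used only as a cross-check.

```python
import math
import statistics

x = [32, 36, 36, 37, 39, 41, 45, 46, 48]

log_sum = sum(math.log10(v) for v in x)  # sum of log x
mean_log = log_sum / len(x)              # divide by n
gm_log = 10 ** mean_log                  # antilog (base 10)

print(round(log_sum, 3))                       # 14.387
print(round(gm_log, 1))                        # 39.7
print(round(statistics.geometric_mean(x), 1))  # 39.7
```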

One important point that should be remembered is that if any value in the data set is zero or negative then the GM cannot be computed.

Geometric Mean for Grouped Data

The GM for grouped data can also be computed using the following formula:

$$GM = antilog \left[ \frac{\Sigma f\times log\, x}{\Sigma f} \right]$$

Suppose we have the following frequency distribution:

| Classes | Frequency |
|---|---|
| 65 to 84 | 9 |
| 85 to 104 | 10 |
| 105 to 124 | 17 |
| 125 to 144 | 10 |
| 145 to 164 | 5 |
| 165 to 184 | 4 |
| 185 to 204 | 5 |
| Total | 60 |

The GM of the above frequency distribution can be computed as follows, where $X$ is the midpoint of each class:

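| Classes | $f$ | $X$ | $log\, X$ | $f \times log\, X$ |
|---|---|---|---|---|
| 65-84 | 9 | 74.5 | log 74.5 = 1.8722 | 16.8494 |
| 85-104 | 10 | 94.5 | log 94.5 = 1.9754 | 19.7543 |
| 105-124 | 17 | 114.5 | log 114.5 = 2.0588 | 34.9997 |
| 125-144 | 10 | 134.5 | log 134.5 = 2.1287 | 21.2872 |
| 145-164 | 5 | 154.5 | log 154.5 = 2.1889 | 10.9446 |
| 165-184 | 4 | 174.5 | log 174.5 = 2.2418 | 8.9672 |
| 185-204 | 5 | 194.5 | log 194.5 = 2.2889 | 11.4446 |
| Total | 60 | | | 124.2471 |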

\begin{align*}
GM &= antilog \left[ \frac{124.2471}{60} \right]\\
&=antilog (2.0708) = 117.7
\end{align*}
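The grouped-data computation can be checked with a short sketch. It assumes, as in the table, that each class is represented by its midpoint and that logarithms are base 10; the tuple layout `(lower, upper, frequency)` is chosen just for this example.

```python
import math

# (lower limit, upper limit, frequency) for each class
classes = [(65, 84, 9), (85, 104, 10), (105, 124, 17), (125, 144, 10),
           (145, 164, 5), (165, 184, 4), (185, 204, 5)]

total_f = sum(f for _, _, f in classes)
# Sum of f * log(X), where X is the class midpoint
sum_f_log = sum(f * math.log10((lo + hi) / 2) for lo, hi, f in classes)
gm_grouped = 10 ** (sum_f_log / total_f)  # antilog of the weighted mean of logs

print(total_f)               # 60
print(round(sum_f_log, 2))   # 124.25
print(round(gm_grouped, 1))  # 117.7
```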

The GM is particularly useful when dealing with rates of change or ratios, such as growth rates in investments. That is because geometric mean considers how things are multiplied over time, rather than simply added.

Use and Application of Geometric Mean

Geometric Mean is useful in situations like:

  • Investment returns: When one looks at average investment growth, one wants to consider how much one’s money is multiplied over time, not just the change each year. That is why the GM is better suited for this scenario.
  • Rates of change: Similar to investment returns, if something is increasing or decreasing by a percentage each time, the GM is a more accurate measure of the overall change.
  • Growth Rates: When dealing with percentages or ratios that change over time (like investment returns or population growth), the geometric mean provides a more accurate picture of the overall change compared to the arithmetic mean.
  • Proportional Changes: It is helpful for situations where changes are multiplied, not added. For example, if a quantity doubles in one period and grows eightfold in the next, the geometric mean of the two factors, $\sqrt{2 \times 8} = 4$, is the constant factor per period that produces the same overall change.
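To illustrate the growth-rate use case, here is a small sketch with made-up yearly returns (+10%, -5%, +20%); the figures are hypothetical and not taken from the post. Each return is converted to a growth factor, and the geometric mean of the factors is the constant yearly factor that reproduces the total growth.

```python
import math

# Hypothetical yearly returns: +10%, -5%, +20%
returns = [0.10, -0.05, 0.20]
factors = [1 + r for r in returns]  # growth factors 1.10, 0.95, 1.20

total_growth = math.prod(factors)                # overall growth over three years
gm_factor = total_growth ** (1 / len(factors))   # geometric mean growth factor
am_factor = sum(factors) / len(factors)          # arithmetic mean growth factor

print(f"geometric mean growth per year:  {gm_factor - 1:.2%}")
print(f"arithmetic mean growth per year: {am_factor - 1:.2%}")

# Compounding the geometric mean factor reproduces the total growth
print(math.isclose(gm_factor ** len(factors), total_growth))  # True
```

The arithmetic mean of the factors is larger than the geometric mean here, so compounding it would overstate the actual growth.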

Weighted Average Real Life Examples

Introduction to Weighted Averages

The multipliers or sets of numbers that express more or less relative importance of various observations (data points) in a data set are called weights.

The weighted arithmetic mean (simply called weighted average or weighted mean) is similar to an ordinary arithmetic mean except that instead of each data point contributing equally to the final average, some data points contribute more than others. Weighted means are useful in a wide variety of scenarios. Weighted averages are used when there are a bunch of values, but some of those values are more important or contribute more to the overall result.

Example of Weighted Average

For example, a student may use a weighted mean to calculate his/her percentage grade in a course. In such an example, the student would multiply the weight of each assessment item in the course (e.g., assignments, exams, sessionals, quizzes, projects, etc.) by the grade obtained in that item and add up the results.

As an example, suppose a course carries a total of 60 marks, distributed as follows: Assignment-1 has a weightage of 10%, Assignment-2 has a weightage of 10%, the mid-term examination has a weightage of 30%, and the final term examination has a weightage of 50%. The scenario is described in the table below:

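| Assessment Item | Weight ($w_i$) | Grades ($x_i$) | Marks | Weighted Marks ($w_ix_i$) |
|---|---|---|---|---|
| Assignment # 1 | 10 % | 70 % | 6 | 7 % |
| Assignment # 2 | 10 % | 65 % | 6 | 6.5 % |
| Midterm Examination | 30 % | 70 % | 18 | 21 % |
| Final Term Examination | 50 % | 85 % | 30 | 42.5 % |
| Total | 100 % | 290 % | 60 | 77 % |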

Weighted Average Formula

Mathematically, the weighted average formula is given as

$$\overline{x}_w = \frac{\sum\limits_{i=1}^n w_i x_i}{\sum\limits_{i=1}^n w_i}$$
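The formula can be sketched as a small function and applied to the course-grade example above (weights 10, 10, 30, 50 and grades 70, 65, 70, 85). The function name `weighted_mean` is chosen for this illustration.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights w_i."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("sum of weights must not be zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

grades = [70, 65, 70, 85]    # grade (%) obtained in each assessment item
weights = [10, 10, 30, 50]   # weightage (%) of each assessment item

print(weighted_mean(grades, weights))  # 77.0
```

Because the formula divides by the sum of the weights, the weights do not need to add up to 100 or to 1.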

Another Example

Consider another example: suppose we have the monthly expenditures of a family on different items, together with the weight of each item:

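| Items | Weights ($w_i$) | Expenses ($x_i$) | Weighted Expenses ($w_ix_i$) |
|---|---|---|---|
| Food | 7.5 | 290 | 2175 |
| Rent | 2.0 | 54 | 108 |
| Clothing | 1.5 | 96 | 144 |
| Fuel and light | 1.0 | 75 | 75 |
| Misc | 0.5 | 75 | 37.5 |
| Total | 12.5 | 590 | 2539.5 |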

The average expenses will be: $AM = \frac{590}{5} = 118$.

However, the weighted average of the scenario will be $\overline{x}_w = \frac{\sum\limits_{i=1}^n w_i x_i}{\sum\limits_{i=1}^n w_i} = \frac{2539.5}{12.5}=203.16$

Taking the weights into account, the average monthly expense of the family is 203.16, not 118.
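The comparison between the simple and the weighted average for the expenses example can be reproduced with the following sketch (variable names are illustrative).

```python
weights = [7.5, 2.0, 1.5, 1.0, 0.5]   # Food, Rent, Clothing, Fuel and light, Misc
expenses = [290, 54, 96, 75, 75]

simple_mean = sum(expenses) / len(expenses)                        # 590 / 5
weighted_sum = sum(w * x for w, x in zip(weights, expenses))       # sum of w_i * x_i
weighted_mean_expense = weighted_sum / sum(weights)                # 2539.5 / 12.5

print(simple_mean)            # 118.0
print(weighted_sum)           # 2539.5
print(weighted_mean_expense)  # 203.16
```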

Note that in a frequency distribution, the computation of relative frequency (rf) is also related to the concept of weighted averages.

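| Classes | Frequency | Mid point ($X$) | rf | Percentage |
|---|---|---|---|---|
| 65-84 | 9 | 74.5 | $\frac{9}{60} = 0.15$ | 15 |
| 85-104 | 10 | 94.5 | $\frac{10}{60} = 0.17$ | 17 |
| 105-124 | 17 | 114.5 | $\frac{17}{60} = 0.28$ | 28 |
| 125-144 | 10 | 134.5 | $\frac{10}{60} = 0.17$ | 17 |
| 145-164 | 5 | 154.5 | $\frac{5}{60} = 0.08$ | 8 |
| 165-184 | 4 | 174.5 | $\frac{4}{60} = 0.07$ | 7 |
| 185-204 | 5 | 194.5 | $\frac{5}{60} = 0.08$ | 8 |
| Total | 60 | | | |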
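One way to see the connection is to treat the relative frequencies as weights that sum to 1. The sketch below computes them from the table and, as an illustration that goes slightly beyond the table, uses them to get the mean of the class midpoints.

```python
frequencies = [9, 10, 17, 10, 5, 4, 5]
midpoints = [74.5, 94.5, 114.5, 134.5, 154.5, 174.5, 194.5]

total = sum(frequencies)                    # 60
rf = [f / total for f in frequencies]       # relative frequencies (weights)

print([round(r, 2) for r in rf])  # [0.15, 0.17, 0.28, 0.17, 0.08, 0.07, 0.08]

# Weighted mean of the midpoints with rf as weights; the weights sum to 1,
# so no division by the sum of weights is needed
grouped_mean = sum(r * x for r, x in zip(rf, midpoints))
print(round(grouped_mean, 2))     # 122.5
```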

Some Real-World Examples of Weighted Averages

  • Calculating class grade: Different assignments might have different weights (e.g., exams worth more than quizzes). A weighted mean considers these weights to determine the overall grade.
  • Stock market performance: A stock index might use a weighted average to reflect the influence of large companies compared to smaller ones.
  • Customer Satisfaction: Finding the average customer satisfaction score when some customers’ feedback might hold more weight (e.g., frequent buyers).
  • Average Customer Spending: Finding the average spending per customer when some customers buy more frequently than others.
  • Expected Value: Determining the expected value of outcomes with different probabilities.

The following are some important questions. What is the importance of the weighted mean? Describe its advantages and disadvantages. What is an average? What are the qualities of a good average? What is the arithmetic mean? Describe the advantages and disadvantages of the arithmetic mean. In which situations do we apply the arithmetic mean?

Summation Operator Properties and Examples (2024)

The summation operator is denoted by $\Sigma$. The summation operator is a mathematical notation used to represent the sum of numbers or terms. The summation is the total of all the terms added according to the specified range of values for the index.

Suppose, we have information about the height of students, such as 54, 55, 58, 60, 61, 45, 53.
Using variable and value notation one can denote the height of the students like

  • First height in the information $X_1$, that is $X_1=54$
  • Second height in the information $X_2$, that is $X_2=55$
  • Last or nth information $X_n$, that is $X_n=53$.

In general, the variable and its values can be denoted by $X_i$, where $i=1,2,3, \cdots, n$.

The sum of all numeric information (values of the variable $X_1, X_2, \cdots, X_n$) can be totaled by $X_1+X_2+\cdots+X_n$. The short and useful summation for the set of values is $\sum\limits_{i=1}^n X_i$, where the symbol $\Sigma$ is a Greek letter and denotes the sum of all values ranging from $i=1$ (start) to $n$ (last) value.
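In code, the summation $\sum\limits_{i=1}^n X_i$ corresponds to a loop over the values (or to the built-in `sum`). A minimal sketch with the heights listed above:

```python
X = [54, 55, 58, 60, 61, 45, 53]   # heights X_1, ..., X_n
n = len(X)

# Explicit loop mirroring the index i = 1, ..., n
total = 0
for i in range(1, n + 1):
    total += X[i - 1]   # X_i is stored at 0-based position i - 1

print(n, total)   # 7 386
print(sum(X))     # 386, the built-in equivalent
```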

The number written on top of $\Sigma$ is called the upper limit (upper bound) of the sum. Below $\Sigma$, there are two additional components: the index and the lower limit (lower bound). To the right of $\Sigma$ is the term that is summed over all values of the index.

Consider the following example for the use of summing values using the Summation operator.

\begin{align*}
X_1 + X_2 + X_3 + \cdots + X_n &= \sum\limits_{i=1}^{n} X_i\\
X_1Y_1 + X_2Y_2 + X_3Y_3 + \cdots + X_nY_n &= \sum\limits_{i=1}^{n} X_iY_i\\
X_1^2 + X_2^2 + X_3^2 + \cdots + X_n^2 &= \sum\limits_{i=1}^n X_i^2\\
(X_1 + X_2 + X_3 + \cdots + X_n)^2 &= \left( \sum\limits_{i=1}^{n} X_i \right)^2
\end{align*}
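The difference between the sum of squares and the square of the sum is easy to overlook. The sketch below evaluates the four expressions for two small, made-up lists `X` and `Y` (hypothetical values chosen only for illustration).

```python
X = [1, 2, 3]   # hypothetical values X_1, X_2, X_3
Y = [4, 5, 6]   # hypothetical values Y_1, Y_2, Y_3

sum_x = sum(X)                                # sum of X_i
sum_xy = sum(x * y for x, y in zip(X, Y))     # sum of X_i * Y_i
sum_x_squared = sum(x ** 2 for x in X)        # sum of X_i^2
square_of_sum = sum(X) ** 2                   # (sum of X_i)^2

print(sum_x, sum_xy, sum_x_squared, square_of_sum)  # 6 32 14 36
```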

The following examples make use of the summation operator, when a number (constant) and values of the variable are involved.

\begin{align}
a+a+a+ \cdots + a = na&=\sum\limits_{i=1}^{n}a\\
aX_1 + aX_2 + aX_3 + \cdots + aX_n &= a \sum\limits_{i=1}^n X_i\\
(X_1-a)+(X_2-a)+\cdots + (X_n-a) &= \sum\limits_{i=1}^n (X_i-a)\\
(X_1-a)^2+(X_2-a)^2+\cdots + (X_n-a)^2 &= \sum\limits_{i=1}^n (X_i-a)^2\\
[(X_1-a)+(X_2-a)+\cdots + (X_n-a)]^2 &= \left[\sum\limits_{i=1}^n (X_i-a)\right]^2
\end{align}
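These identities can be checked numerically. The sketch uses the heights from the beginning of this section and an arbitrary constant `a = 50`, which is an assumption made only for the illustration.

```python
X = [54, 55, 58, 60, 61, 45, 53]   # heights from the earlier example
a = 50                              # arbitrary constant chosen for illustration
n = len(X)

sum_of_constant = sum(a for _ in X)              # a + a + ... + a = n * a
sum_scaled = sum(a * x for x in X)               # a * (sum of X_i)
sum_dev = sum(x - a for x in X)                  # sum of (X_i - a)
sum_sq_dev = sum((x - a) ** 2 for x in X)        # sum of (X_i - a)^2
sq_sum_dev = sum(x - a for x in X) ** 2          # [sum of (X_i - a)]^2

print(sum_of_constant == n * a)    # True
print(sum_scaled == a * sum(X))    # True
print(sum_dev, sum_sq_dev, sq_sum_dev)  # 36 360 1296
```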

Properties of Summation Operator

The following properties are useful for manipulating sums written with the summation operator $\Sigma$:

1) Multiplying a sum by a constant
$$c\sum\limits_{i=1}^n x_i = \sum\limits_{i=1}^n cx_i$$

2) Linearity: The summation operator is linear, meaning that it satisfies the following property for constants $a$ and $b$ and sequences $x_i$ and $y_i$.
$$\sum\limits_{i=1}^N(ax_i + by_i) = a \sum\limits_{i=1}^N x_i + b\sum\limits_{i=1}^N y_i$$

3) Splitting a sum into two sums
$$\sum\limits_{i=a}^n x_i = \sum\limits_{i=a}^{c}x_i + \sum_{i=c+1}^n x_i$$

4) Combining Summations: Multiple summations can be combined into a single summation:
$$\sum\limits_{i=1}^b x_i + \sum\limits_{i=b+1}^c x_i = \sum\limits_{i=1}^c x_i$$

5) Changing the order of individual sums in multiple sum expressions
$$\sum\limits_{i=1}^{m} \sum\limits_{j=1}^{n} a_{ij} = \sum\limits_{j=1}^{n}\sum\limits_{i=1}^{m} a_{ij}$$

6) Distributivity over Scalar Multiplication: The summation operator distributes over scalar multiplication
$$c\sum\limits_{i=1}^b x_i = \sum_{i=1}^b (cx_i)$$

7) Adding or Subtracting Sums
$$\sum\limits_{i=1}^a x_i \pm \sum_{i=1}^a y_i = \sum\limits_{i=1}^a (x_i \pm y_i)$$

8) Multiplying the Sums:
$$\sum\limits_{i_1=a_1}^{n_1} x_{i_1} \times \cdots \times \sum\limits_{i_n=a_n}^{n_n} x_{i_n} = \sum\limits_{i_1=a_1}^{n_1} \cdots \sum\limits_{i_n=a_n}^{n_n} x_{i_1}\times \cdots \times x_{i_n}$$
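Several of these properties can be verified numerically. The sketch below uses small integer lists and constants that are made up for the illustration (integers avoid floating-point rounding in the comparisons); the product-of-sums check is shown for two sums only.

```python
from itertools import product

x = [2, 4, 6, 8]     # hypothetical sequence x_i
y = [1, 3, 5, 7]     # hypothetical sequence y_i
a, b, c = 3, -2, 5   # hypothetical constants

# 1) Multiplying a sum by a constant
constant_multiple = c * sum(x) == sum(c * xi for xi in x)

# 2) Linearity
linearity = sum(a * xi + b * yi for xi, yi in zip(x, y)) == a * sum(x) + b * sum(y)

# 3) Splitting a sum into two sums (split after the first k terms)
k = 2
splitting = sum(x) == sum(x[:k]) + sum(x[k:])

# 5) Changing the order of summation in a double sum
matrix = [[1, 2, 3], [4, 5, 6]]   # entries a_ij
order_swap = sum(sum(row) for row in matrix) == sum(sum(col) for col in zip(*matrix))

# 7) Adding sums over the same index range
adding = sum(x) + sum(y) == sum(xi + yi for xi, yi in zip(x, y))

# 8) Multiplying two sums: the product equals the double sum of all pairwise products
multiplying = sum(x) * sum(y) == sum(xi * yj for xi, yj in product(x, y))

print(all([constant_multiple, linearity, splitting, order_swap, adding, multiplying]))  # True
```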

https://itfeature.com
