Chart and Graphics - Statistics for Data Science & Analytics

Cumulative Frequency Distribution and Polygon

Apr 4, 2025 / May 7, 2012 by Muhammad Imdad Ullah

Introduction

A cumulative frequency distribution (cumulative frequency curve or ogive) and a cumulative frequency polygon require cumulative frequencies. The cumulative frequency is denoted by CF, and for a class interval, it is obtained by adding the frequency of all the preceding classes, including that class. It indicates the total number of values less than or equal to the upper limit of that class. For comparing two or more distributions, relative cumulative frequencies or percentage cumulative frequencies are computed.

A cumulative frequency distribution shows the running total of frequencies up to a certain point in a dataset. It tells you how many values lie below (or above) a particular value or class interval. There are two types:

Less than cumulative frequency: Total frequency up to and including a class.
Greater than cumulative frequency: Total frequency from a class and above.
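
As a minimal sketch of the two types, the following R snippet computes both running totals with `cumsum()`. The frequencies in `freq` are hypothetical values chosen only for illustration; they are not taken from any table in this post.

```r
# Hypothetical class frequencies in ascending class order (illustration only)
freq <- c(4, 9, 15, 8, 4)

# Less than type: running total from the first class up to and including each class
less_than_cf <- cumsum(freq)

# Greater than type: running total from each class through the last class
greater_than_cf <- rev(cumsum(rev(freq)))

print(less_than_cf)
print(greater_than_cf)
```

The last entry of the less than series and the first entry of the greater than series both equal the total frequency.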

The relative cumulative frequencies, denoted by CRF, are the cumulative frequencies expressed as proportions of the total. They are obtained by dividing each cumulative frequency by the total frequency (the total number of observations). The CRF of a class can also be obtained by adding the relative frequencies (rf) of the preceding classes, including that class. Multiplying the relative cumulative frequencies by 100 gives the corresponding percentage cumulative frequency of a class.
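
The definitions above can be sketched in R as follows. The class labels and frequencies are hypothetical and serve only to illustrate the arithmetic; the column names (`CF`, `rf`, `CRF`, `PCF`) are labels chosen for this sketch.

```r
# Hypothetical grouped data (illustration only)
classes <- c("10-19", "20-29", "30-39", "40-49", "50-59")
f <- c(4, 9, 15, 8, 4)

n   <- sum(f)      # total number of observations
cf  <- cumsum(f)   # cumulative frequency (less than type)
rf  <- f / n       # relative frequency
crf <- cf / n      # relative cumulative frequency
pcf <- 100 * crf   # percentage cumulative frequency

cf_table <- data.frame(Class = classes, f = f, CF = cf,
                       rf = rf, CRF = crf, PCF = pcf)
print(cf_table)

# CRF can equivalently be obtained by accumulating the relative frequencies
print(all.equal(crf, cumsum(rf)))
```

The final check confirms that both routes to the CRF agree: dividing CF by the total, or adding up the relative frequencies.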

Method of Construction of Cumulative Frequencies

The method of construction of cumulative frequencies and cumulative relative frequencies is explained in the following table:

Plot a Cumulative Frequency Distribution

To plot a CF distribution, mark the upper class boundary of each class along the x-axis and the corresponding cumulative frequency along the y-axis (a "greater than" distribution is plotted against the lower class boundaries instead). For additional information, you can label the vertical axis on the left in units and the vertical axis on the right in percentages. The plotted points are then joined by straight line segments. A cumulative frequency polygon can be used to estimate the median, quartiles, deciles, percentiles, etc.
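
A minimal sketch of such a plot in base R is given below. The class boundaries and frequencies are hypothetical, and starting the polygon at the lower boundary of the first class with a cumulative frequency of zero is a common convention assumed here.

```r
# Hypothetical grouped data (illustration only)
upper <- c(19.5, 29.5, 39.5, 49.5, 59.5)   # upper class boundaries
f     <- c(4, 9, 15, 8, 4)
cf    <- cumsum(f)

# Start the polygon at the lower boundary of the first class with CF = 0
x <- c(9.5, upper)
y <- c(0, cf)

# Points joined by straight line segments; left axis in units
plot(x, y, type = "o", pch = 16,
     xlab = "Upper class boundary", ylab = "Cumulative frequency",
     main = "Cumulative Frequency Polygon (Ogive)")

# Right-hand axis showing the same scale in percentages
pct <- seq(0, 100, by = 25)
axis(side = 4, at = pct / 100 * sum(f), labels = paste0(pct, "%"))
```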


Figure: Cumulative frequency distribution and cumulative frequency polygon (ogive)

Uses of Cumulative Frequency Distribution

Understanding Data Distribution: It helps in visualizing how data accumulates and how it is distributed across intervals. For example, how many students scored less than 70 on a class test?
Identifying Percentiles, Quartiles, and Medians: Cumulative frequency is essential for calculating:
- Median (50th percentile)
- Lower Quartile ($Q_1$: 25th Percentile)
- Upper Quartile ($Q_3$: 75th Percentile)
  For example, in a class of 100 students, the score at cumulative frequency 50 (the $n/2$th value) gives the median score.
Comparing Groups: It makes it easy to compare two datasets over the same intervals, for example, the test scores of Class A vs. Class B, or the scores of two different years.
Visualizing with Ogive (Cumulative Frequency Graph): The ogive curve helps identify the median, quartiles, and general data trends visually. For example, plotting a cumulative graph of income groups shows how many people earn less than a certain amount.
Simplifying Large Data Sets: For large datasets, cumulative frequency helps summarize and compress information into a more understandable form.
Estimating Probabilities: It provides a rough idea of how likely a value or range of values is to occur. For example, what is the chance that a randomly selected customer is younger than 35?
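
Reading the median and quartiles off an ogive amounts to linear interpolation between the plotted points. The sketch below does this with `approx()` on hypothetical grouped data; the helper `ogive_quantile()` is a name introduced here for illustration, not a built-in R function.

```r
# Hypothetical grouped data (illustration only)
boundaries <- c(9.5, 19.5, 29.5, 39.5, 49.5, 59.5)  # class boundaries
f  <- c(4, 9, 15, 8, 4)
cf <- c(0, cumsum(f))   # cumulative frequency at each boundary, starting at 0
n  <- sum(f)

# Find the x value at which the cumulative frequency reaches p * n,
# interpolating linearly between the points of the ogive
ogive_quantile <- function(p) {
  approx(x = cf, y = boundaries, xout = p * n)$y
}

q1  <- ogive_quantile(0.25)   # lower quartile
med <- ogive_quantile(0.50)   # median
q3  <- ogive_quantile(0.75)   # upper quartile

print(c(Q1 = q1, Median = med, Q3 = q3))
```

The same helper returns any decile or percentile by changing `p`, for example `ogive_quantile(0.90)` for the 90th percentile.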

FAQs about Cumulative Frequency Distribution

What is Cumulative Frequency?
What is Ogive?
What is a cumulative frequency polygon?
What can be measured from a Distribution of Cumulative Frequency?
What can be visualized from ogive?
What are the two methods for the construction of cumulative frequency distribution?
What are relative cumulative frequencies?
How can one compare two data sets?

Pie Chart | Visual Display of Categorical Data

Mar 23, 2024 / Mar 4, 2012 by Muhammad Imdad Ullah

A pie chart is a way of summarizing a set of categorical data. It is a circle that is divided into segments/sectors. Each segment represents a particular category. The area of each segment is proportional to the number of cases in that category. It is a useful way of displaying the data where the division of a whole into parts needs to be presented. It can also be used to compare such divisions at different times.

Pie Chart

A pie chart is constructed by dividing the total angle of a circle of 360 degrees into different components. The angle A for each sector is obtained by the relation:

$$A=\frac{\text{Component Part}}{\text{Total}}\times 360$$

Each sector is shaded with different colors or marks so that they look separate from each other.

Pie Chart Example

Make an appropriate chart for the data available regarding the total production of urea fertilizer and its use on different crops. Let the total production of urea be about 200 thousand kg, and let its consumption for wheat, sugarcane, maize, and lentils be 75, 80, 30, and 15 thousand kg, respectively.

Solution:

The appropriate diagram seems to be a pie chart because we have to divide a whole into 4 parts. To construct a pie chart, we calculate the proportionate arc of the circle for each part, i.e.

Crops	Fertilizer (000 kg)	Proportionate arc of the circle
Wheat	75	$\frac{75}{200}\times 360=135$
Sugarcane	80	$\frac{80}{200}\times 360=144$
Maize	30	$\frac{30}{200}\times 360=54$
Lentils	15	$\frac{15}{200}\times 360=27$
Total	200	360

Now draw a circle of an appropriate radius, and mark off the angles clockwise or anticlockwise with the help of a protractor or any other device. For wheat make an angle of 135 degrees, for sugarcane an angle of 144 degrees, for maize an angle of 54 degrees, and for lentils an angle of 27 degrees, so that the circular region is divided into 4 sectors. Now shade each of the sectors with different colors or marks so that they look different from each other. This gives the pie chart of the above data.
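
The same construction can be sketched in R. The data are those of the example above; the chart title and label format are choices made for this sketch, and `pie()` computes the sector sizes itself from the raw values, so the angles are calculated here only to check them against the table.

```r
# Urea consumption by crop (thousand kg), as given in the example
fertilizer <- c(Wheat = 75, Sugarcane = 80, Maize = 30, Lentils = 15)

# Angle of each sector: component part / total * 360
angles <- fertilizer / sum(fertilizer) * 360
print(angles)

# Draw the sectors clockwise, labelling each with its crop and angle
pie(fertilizer,
    labels = paste0(names(fertilizer), " (", angles, " degrees)"),
    clockwise = TRUE,
    main = "Use of Urea Fertilizer by Crop")
```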


Figure: Pie chart example (favourite subjects)

Scatter Diagram

Apr 11, 2025 / Feb 8, 2012 by Muhammad Imdad Ullah

A scatterplot (also called a scatter graph or scatter diagram) is used to observe the strength and direction of the relationship between two quantitative variables. In statistics, quantitative variables are measured on the interval or ratio scale of measurement.

Scatter Diagram

Usually, in a scatter diagram, the independent variable (also called the explanatory, regressor, or predictor variable) is taken on the X-axis (the horizontal axis), while the dependent variable (also called the outcome variable) is taken on the Y-axis (the vertical axis), to measure the strength and direction of the relationship between the variables. However, it is not necessary to take explanatory variables on the X-axis and outcome variables on the Y-axis. The scatter diagram and Pearson’s correlation measure the mutual correlation (interdependencies) between the variables, not the dependence or cause and effect.

The diagram below describes some possible relationships between two quantitative variables ($X$ & $Y$). A short description is also given of each possible relationship.

Drawing Scatter Plot/ Diagram

A scatter diagram can be drawn between two quantitative variables. The length (number of observations) of both of the variables should be equal. Suppose we have two quantitative variables, $X$ and $Y$. We want to observe the strength and direction of the relationship between these two variables. It can be done in R language easily.

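# Two quantitative variables of equal length
x <- c(5, 7, 8, 7, 2, 2, 9, 4, 11, 12, 9, 6)
y <- c(99, 86, 87, 88, 111, 103, 87, 94, 78, 77, 85, 86)

# Draw the scatter diagram of y against x
plot(x, y, xlab = "X", ylab = "Y", main = "Scatter Diagram")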

From the above discussion, it is clear that the main objective of a scatter diagram is to visualize the linear or some other type of relationship between two quantitative variables. The visualization may also help to depict the trends, strength, and direction of the relationship between variables.

Limitations of Scatter Diagrams

Limited to Two Variables: Scatter plots can only depict the relationship between two variables at a time. If there are more than two variables, one might need to use other visualization techniques.
Strength of Correlation: While a scatter diagram shows the direction of a relationship and gives a visual impression of its strength, it does not quantify that strength. A correlation coefficient is needed to measure it.
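
To quantify the strength that the plot only suggests, one option is Pearson's correlation coefficient, which `cor()` computes by default. The sketch below reuses the `x` and `y` values from the plotting example in this post.

```r
# Data from the scatter diagram example
x <- c(5, 7, 8, 7, 2, 2, 9, 4, 11, 12, 9, 6)
y <- c(99, 86, 87, 88, 111, 103, 87, 94, 78, 77, 85, 86)

# Pearson's correlation coefficient (the default method of cor())
r <- cor(x, y)
print(r)
```

A value of `r` near -1 or +1 indicates a strong linear relationship, and its sign gives the direction. For these data the sign is negative, since `y` tends to fall as `x` rises.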

In conclusion, scatter diagrams are a powerful and versatile tool for exploring relationships between variables. By understanding how to create and interpret them, one can gain valuable insights from the data and inform decision-making processes across various disciplines.

Importance of Scatter Diagram

1. Identifies Relationships Between Variables

This diagram shows whether two quantitative variables are positively correlated (both increase together), negatively correlated (one increases while the other decreases), or not correlated.
It helps in detecting non-linear relationships (e.g., quadratic or exponential trends).

2. Detects Outliers

It is used to reveal unusual data points (observations) that deviate from the general trend, which may indicate errors or special cases.

3. Useful for Predictive Analysis

It helps in regression analysis by determining if a linear or other model fits the data well.
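
As a rough sketch of this use, the following R code fits a simple linear regression with `lm()` to the `x` and `y` values from the plotting example and overlays the fitted line on the scatter diagram. Whether a straight line is adequate is judged here by eye, which is an informal check rather than a formal test of fit.

```r
# Data from the scatter diagram example
x <- c(5, 7, 8, 7, 2, 2, 9, 4, 11, 12, 9, 6)
y <- c(99, 86, 87, 88, 111, 103, 87, 94, 78, 77, 85, 86)

# Fit the simple linear regression of y on x
fit <- lm(y ~ x)
print(coef(fit))   # intercept and slope

# Scatter diagram with the fitted line overlaid
plot(x, y, xlab = "X", ylab = "Y", main = "Scatter Diagram with Fitted Line")
abline(fit, col = "red")
```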

4. Visualizes Data Distribution

It is used to show the spread/scatteredness and clustering of data points, helping in understanding variability.

5. Supports Decision-Making

It is used in business, science, engineering, and healthcare to explore possible cause-and-effect relationships (e.g., marketing spend vs. sales, temperature vs. product defects), although a scatter diagram alone cannot establish causation.

6. Easy to Interpret

Scatter diagrams provide a simple, intuitive way to observe trends without complex statistics.

Example and Uses

Business: Analyzing sales vs. advertising expenditure.
Quality Control: Checking if machine speed affects defect rates.
Healthcare: Studying the relationship between exercise and blood pressure.
Psychology: Studying the relationship between anxiety score and depression score.

FAQs About Correlation Analysis

What is the coefficient of correlation?
What is the use of a scatter diagram?
What are the limitations of a scatter diagram?
How can a scatter diagram be used to assess the strength and direction of the relationship between variables?

