Describing Data Discover Story (2024)

Describing data effectively means summarizing its key characteristics and highlighting interesting patterns or trends. To extract information from a sample, one first needs to organize and summarize the collected data. The arrangement (organization) of data into a reduced form that is easy to understand, analyze, and interpret is known as the presentation of data.

Remember: our goal is to construct tables, charts, and graphs that quickly reveal the concentration and shape of the data. Graphical presentation of data helps in making wise decisions.
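As a minimal sketch of what "organizing and summarizing" can look like in practice, the snippet below reduces a raw sample to a frequency table. The blood-group sample and the helper name `frequency_table` are assumptions made up for illustration; they are not part of any particular data set or library.

```python
from collections import Counter

# Hypothetical raw sample: blood groups recorded for 12 people.
raw = ["A", "O", "B", "O", "AB", "A", "O", "O", "B", "A", "O", "A"]

def frequency_table(values):
    """Return (category, frequency, relative frequency) rows, most frequent first."""
    counts = Counter(values)
    n = len(values)
    return [(cat, f, f / n) for cat, f in counts.most_common()]

table = frequency_table(raw)
for cat, f, rel in table:
    # One row per category: label, count, and share of the sample.
    print(f"{cat:<3}{f:>3}{rel:>8.2%}")
```

Twelve raw observations are reduced to four rows, which is the kind of "reduced form" that tables, charts, and graphs are then built from.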

Visualizations: Describing Data Visually/Graphically

Charts and graphs are powerful tools for showcasing data patterns and trends. In this article, we will discuss bar graphs and histograms only.

Describing Data Using Bar Graph

Bar diagrams can be used to get an impression of the distribution of a discrete or categorical data set. They can also be used in exploratory data analysis (EDA) to compare groups and categories and to illustrate the major features of the data distribution in a convenient form.

A bar graph is a graphical representation in which the discrete classes are reported on the horizontal axis and the class frequencies on the vertical axis, with the heights of the bars proportional to the class frequencies. It is a way of summarizing a set of categorical data.

Note that a distinguishing characteristic of a bar chart is the distance or gap between the bars: because the variable of interest is qualitative, the bars are not adjacent to each other. Thus, a bar chart graphically describes a frequency table using a series of uniformly wide rectangles, where the height of each rectangle is the class frequency.
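The idea of a bar chart as a picture of a frequency table can be sketched with plain text. In this rough illustration the bars are drawn horizontally, so bar length plays the role of height; the commuter sample and the helper `text_bar_chart` are hypothetical names chosen for this example only.

```python
from collections import Counter

def text_bar_chart(values, symbol="#"):
    """Return one line per category; bar length equals the class frequency."""
    counts = Counter(values)
    width = max(len(str(c)) for c in counts)
    # Each category gets its own separate line, mirroring the gaps between bars.
    return [f"{str(c):<{width}} | {symbol * f} {f}" for c, f in sorted(counts.items())]

# Hypothetical sample: preferred mode of transport for 10 commuters.
modes = ["bus", "car", "bus", "bike", "car", "bus", "walk", "bus", "car", "bike"]
chart = text_bar_chart(modes)
print("\n".join(chart))
```

Every bar uses the same symbol width, so only its length changes with the frequency, which matches the "uniformly wide rectangles" described above.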

There are different versions of bar graphs such as clustered bar graphs, stacked bar graphs, horizontal bar graphs, and vertical bar graphs.
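Clustered and stacked bar graphs both start from a two-way frequency table of two categorical variables. The sketch below builds such a table and shows how the same counts feed each version; the survey data and the helper `cross_tab` are assumptions invented for illustration.

```python
from collections import Counter

# Hypothetical survey: (department, answer) pairs.
responses = [
    ("Sales", "Yes"), ("Sales", "No"), ("Sales", "Yes"),
    ("IT", "Yes"), ("IT", "No"), ("IT", "No"), ("IT", "No"),
]

def cross_tab(pairs):
    """Count each (group, category) pair and return {group: {category: count}}."""
    counts = Counter(pairs)
    groups = sorted({g for g, _ in pairs})
    cats = sorted({c for _, c in pairs})
    return {g: {c: counts[(g, c)] for c in cats} for g in groups}

table = cross_tab(responses)
# Clustered bars: one bar per inner count, drawn side by side within each group.
# Stacked bars: one bar per group whose total height is the sum of its inner counts.
stack_heights = {g: sum(row.values()) for g, row in table.items()}
print(table)
print(stack_heights)
```

Horizontal and vertical bar graphs differ only in orientation: the same frequencies are plotted along the other axis.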

Figure: Describing Data: Bar Graphs

Describing Data in Histogram

A histogram is a graphical representation similar to a bar graph. It is used to summarize quantitative data, i.e., data measured on an interval or ratio scale. Histograms are constructed from grouped data by taking class boundaries along the x-axis and the corresponding frequencies along the y-axis. When the classes have equal width, the heights of the bars represent the class frequencies.
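The construction from grouped data can be sketched as follows, assuming equal-width classes. The exam marks, the chosen boundaries, and the helper `class_frequencies` are hypothetical; the convention that each class includes its lower boundary and excludes its upper one (except the last class) is one common choice, not the only one.

```python
def class_frequencies(data, lower, width, k):
    """Count observations in k equal-width classes [b_i, b_i + width).

    The last class is closed on the right so the maximum value is counted.
    """
    boundaries = [lower + i * width for i in range(k + 1)]
    freqs = [0] * k
    for x in data:
        if not boundaries[0] <= x <= boundaries[-1]:
            raise ValueError(f"{x} lies outside the class boundaries")
        # Index of the class containing x; the top boundary falls in the last class.
        i = min(int((x - lower) // width), k - 1)
        freqs[i] += 1
    return boundaries, freqs

# Hypothetical sample: exam marks of 15 students.
marks = [42, 47, 51, 53, 55, 58, 60, 61, 63, 64, 66, 68, 72, 75, 80]
boundaries, freqs = class_frequencies(marks, lower=40, width=10, k=4)
for lo, hi, f in zip(boundaries, boundaries[1:], freqs):
    # Consecutive classes share a boundary, so the bars would touch when drawn.
    print(f"{lo:>3} - {hi:<3} {'#' * f} {f}")
```

The boundaries go on the x-axis and the frequencies on the y-axis; because each class ends where the next begins, there is no gap between neighboring bars.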

Note that the horizontal axis represents all possible values because the data are quantitative and usually measured on a continuous, not discrete, scale. That is why histogram bars are drawn adjacent to each other: to show the continuous nature of the data. A histogram is generally used for large data sets (more than 100 observations), for which stem-and-leaf plots become tedious to construct. It can also help in detecting unusual observations (outliers) or gaps in the data set.
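Gaps show up in a histogram as classes with zero frequency. The short sketch below lists such classes from grouped data; the boundaries, frequencies, and the helper `empty_classes` are made-up illustrations, and an isolated bar beyond a gap is only a visual hint of a possible outlier, not a formal test.

```python
def empty_classes(boundaries, freqs):
    """Return the (lower, upper) boundaries of classes with zero frequency."""
    return [(lo, hi) for lo, hi, f in zip(boundaries, boundaries[1:], freqs) if f == 0]

# Hypothetical grouped data: class boundaries and their frequencies.
boundaries = [0, 10, 20, 30, 40, 50, 60]
freqs = [3, 9, 7, 0, 0, 1]

gaps = empty_classes(boundaries, freqs)
print(gaps)
```

Here the classes from 30 to 50 are empty, and the single observation in the 50 to 60 class sits apart from the rest, so it would be worth a closer look.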

Figure: Describing Data: Histogram

Data (in its raw form) is a collection of numbers, characters, or observations that might seem overwhelming or meaningless. Describing data is the crucial step in unlocking its potential. In essence, describing data is like laying the groundwork for a building. It provides a clear understanding of the data’s characteristics, empowers informed decision-making, and paves the way for further analysis to extract valuable insights.
