
Histogram: a Useful Graphical Representation of Data

A histogram is similar to a bar chart, but it displays the frequency distribution of quantitative data, whereas a bar chart shows the distribution of qualitative data. It is a useful graphical representation that helps to visualize how the data are distributed.

The histogram is constructed from grouped data by taking the class boundaries (not the class limits) along the x-axis and the corresponding frequencies along the y-axis. For ungrouped data, a grouped frequency distribution has to be formed before making a histogram. The histogram consists of a set of bars (like a bar chart), but the bars are adjacent to each other and the area of each rectangle represents the frequency of the respective class. When the class intervals are equal, the rectangles all have the same width and their heights directly represent the class frequencies.

When the class intervals are not all equal, the height of the rectangle over an unequal class interval has to be adjusted, because it is the area and not the height that measures frequency. This means that the height of a rectangle must be decreased proportionally if the length of the corresponding class interval increases. For example, if the length of a class interval is doubled, the height of the rectangle is halved so that the area, the fundamental property of a rectangle in a histogram, remains unchanged. This rescaling is necessary to observe the correct pattern of the distribution.
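As a minimal sketch of this rescaling, assuming hypothetical class boundaries and frequencies (the data and the helper name `bar_heights` are illustrative only), the height of each bar can be computed as the frequency divided by the class width:

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [
    (0, 10, 8),
    (10, 20, 12),
    (20, 40, 12),   # twice as wide as the class before it, same frequency
]

def bar_heights(classes):
    """Return the height of each bar as frequency / class width."""
    return [freq / (upper - lower) for lower, upper, freq in classes]

heights = bar_heights(classes)

# The area of each bar (width x height) recovers the class frequency.
areas = [(upper - lower) * h for (lower, upper, _), h in zip(classes, heights)]

for (lower, upper, freq), h, a in zip(classes, heights, areas):
    print(f"{lower}-{upper}: frequency={freq}, height={h:.2f}, area={a:.1f}")
```

The last two classes have the same frequency, but the wider one gets a bar half as tall, so both bars cover the same area.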

A distinctive feature of a histogram is that there is no gap (space) between the vertical bars, because the variable plotted on the horizontal axis is quantitative and measured on an interval or ratio scale. Thus, the histogram provides an easily interpreted visual representation of a frequency distribution. Note that class midpoints may be used as labels for the classes.

The histogram allows us to analyze extremely large datasets by reducing them to a single graph that can show primary, secondary, and tertiary peaks in the data and give a visual indication of how pronounced those peaks are.

An alternative to the histogram is kernel density estimation, which uses a kernel to smooth the samples. This constructs a smooth estimate of the probability density function, which will, in general, reflect the distribution of the underlying variable more accurately.
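To illustrate the idea, here is a minimal sketch of a kernel density estimate with a Gaussian kernel, written with the standard library only. The sample values, the bandwidth of 0.5, and the helper name `gaussian_kde` are assumptions chosen for illustration; in practice the bandwidth has to be chosen with care.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x using a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per sample, each centered on that sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical sample and a hand-picked bandwidth (an assumption, not a rule).
samples = [2.1, 2.4, 2.9, 3.3, 3.4, 5.8, 6.1, 6.5]
density = gaussian_kde(samples, bandwidth=0.5)

# Evaluate on a grid and check that the curve integrates to roughly 1.
step = 0.01
grid = [i * step for i in range(-500, 1500)]   # covers -5 to 15
total_area = sum(density(x) * step for x in grid)
print(f"approximate area under the curve: {total_area:.3f}")
```

Unlike the bars of a histogram, the resulting curve does not depend on where the class boundaries are placed, only on the kernel and the bandwidth.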

Histogram for Continuous Grouped Data

To draw a histogram from the continuous grouped frequency distribution, the following steps are taken.

  1. Mark class boundaries of the classes along the x-axis.
  2. Mark frequencies along the y-axis.
  3. Draw a rectangle for each class such that the height of each rectangle is proportional to the frequency corresponding to that class. This is the case when classes are of equal width as they often are.
  4. If the classes are of unequal width, then the area, instead of the height, of each rectangle is proportional to the frequency corresponding to that class, and the height of each rectangle is obtained by dividing the frequency of the class by the width of that class.
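The steps above can be sketched for the equal-width case as follows. The class limits, the observations, and the helper names `class_boundaries` and `frequencies` are hypothetical; the sketch also assumes at least two classes with the same gap between consecutive class limits.

```python
# Hypothetical class limits (inclusive) for integer-valued measurements.
class_limits = [(10, 19), (20, 29), (30, 39), (40, 49)]

# Hypothetical ungrouped observations.
data = [12, 15, 21, 22, 24, 27, 29, 31, 33, 35, 36, 38, 41, 47]

def class_boundaries(limits):
    """Convert class limits to class boundaries by splitting the gap
    between the upper limit of one class and the lower limit of the next."""
    gap = limits[1][0] - limits[0][1]      # here 20 - 19 = 1
    half = gap / 2
    return [(lo - half, hi + half) for lo, hi in limits]

def frequencies(data, boundaries):
    """Count the observations falling in each [lower, upper) class."""
    return [sum(lower <= x < upper for x in data) for lower, upper in boundaries]

boundaries = class_boundaries(class_limits)   # step 1: x-axis marks
freqs = frequencies(data, boundaries)         # step 2: y-axis values

# Step 3: one rectangle per class; widths are equal, so height = frequency.
rectangles = [(lo, hi, f) for (lo, hi), f in zip(boundaries, freqs)]
for lo, hi, f in rectangles:
    print(f"{lo}-{hi}: {'#' * f} ({f})")
```

Because each upper boundary coincides with the next lower boundary, the rectangles touch, which is what distinguishes the histogram from a bar chart.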

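It may be noted that the area under a histogram can be calculated by adding up the areas of all the rectangles that constitute the histogram. When the classes are of equal width and the height of each rectangle equals the class frequency, the area of one rectangle is obtained by multiplying the width of the class by the corresponding frequency, i.e.

Area of a single rectangle = width of the class × frequency of the class

When the heights have been rescaled for unequal class widths (height = frequency / width), the area of each rectangle equals the class frequency itself.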
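As a small worked example, assuming hypothetical equal-width classes whose bar heights equal the frequencies, the total area is the sum of width times frequency over all classes:

```python
# Hypothetical equal-width classes: (lower boundary, upper boundary, frequency).
classes = [(0, 5, 3), (5, 10, 7), (10, 15, 6), (15, 20, 4)]

# Area of each rectangle when the height equals the frequency.
areas = [(upper - lower) * freq for lower, upper, freq in classes]
total_area = sum(areas)

# With equal widths, the total area is the common width times
# the total number of observations.
n = sum(freq for _, _, freq in classes)
width = classes[0][1] - classes[0][0]
print(total_area, width * n)
```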

Histogram for Discrete Data

Bar graphs are usually drawn for discrete and categorical data, but in some situations where an approximation is needed, a histogram may be constructed instead. To construct a histogram for discrete grouped data, the following steps are taken:

  1. Mark possible values on the x-axis.
  2. Mark frequencies along the y-axis.
  3. Draw a rectangle centered on each value, extending an equal width on each side, typically 0.5 on either side of the value.
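The steps above can be sketched as follows, assuming hypothetical integer observations. Each value gets a rectangle that extends 0.5 to either side, so neighbouring rectangles share an edge.

```python
from collections import Counter

# Hypothetical discrete observations, e.g. number of children per family.
observations = [0, 1, 1, 2, 2, 2, 3, 3, 4, 2, 1, 0, 2, 3]

counts = Counter(observations)

# One rectangle per value from min to max: (left edge, right edge, frequency).
# Values that never occur get a bar of height 0, so the bars stay adjacent.
rectangles = [
    (value - 0.5, value + 0.5, counts.get(value, 0))
    for value in range(min(observations), max(observations) + 1)
]

for left, right, freq in rectangles:
    print(f"{left} to {right}: {'#' * freq} ({freq})")
```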

Advantages:

The advantages of the histogram as compared to the unprocessed data are:

  1. It gives the range of the data.
  2. It gives the location of the data.
  3. It gives a clue about the skewness of the data.
  4. It gives information about out-of-control situations.
  5. Histograms are density estimates (they give a good impression of the distribution of the data).
  6. It can be compared to the normal curve.

Disadvantages:

  1. Exact values cannot be read from a histogram, because the data are grouped into classes and the individual values are lost in grouped data.
  2. It is more difficult to compare two data sets.
  3. It is used mainly for continuous data sets.

Scatter Diagram: Graphical Representation for Two Quantitative Variables

A scatterplot (also called a scatter graph or scatter diagram) is used to observe the strength and direction of the relationship between two quantitative variables. In statistics, quantitative variables are measured on the interval or ratio scale.

Usually, in a scatter diagram, the independent variable (also called the explanatory, regressor, or predictor variable) is taken on the X-axis (the horizontal axis), while the dependent variable (also called the outcome variable) is taken on the Y-axis (the vertical axis), to show the strength and direction of the relationship between the variables. However, it is not necessary to take the explanatory variable on the X-axis and the outcome variable on the Y-axis, because the scatter diagram and Pearson’s correlation measure the mutual correlation (interdependence) between the variables, not dependence or cause and effect.
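The symmetry of Pearson's correlation can be checked with a short sketch. The data and the helper name `pearson_r` are hypothetical; the function simply applies the usual formula, the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)   # axes swapped
print(f"r(x, y) = {r_xy:.4f}, r(y, x) = {r_yx:.4f}")
```

Swapping the two variables leaves the coefficient unchanged, which is why the choice of axis does not affect the measured strength or direction of the relationship.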

The diagram below describes some possible relationships between two quantitative variables (X and Y). A short description is also given for each possible relationship.

[Figure: Correlation]

