A standard normal table, also called the unit normal table or Z-table, is a table of values of Φ, the cumulative distribution function of the standard normal distribution. A standard normal distribution table is used to find the probability that a statistic is observed below, above, or between values on the standard normal distribution, and by extension, any normal distribution. Since probability tables cannot be printed for every normal distribution, as there is an infinite variety (families) of normal distributions, it is common practice to convert a normal variable to a standard normal one and then use the standard normal table to find the required probabilities (area under the normal curve).
The standard normal curve is symmetrical about zero, so the table can be used for both negative and positive values; for example, the area between 0 and $z=-0.45$ and the area between 0 and $z=0.45$ are both 0.1736.
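The conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for the table: the distribution $N(100, 15^2)$ and the cut-off values 85 and 130 are hypothetical, and `NormalDist` from the standard library stands in for the table lookup.

```python
from statistics import NormalDist

def to_z(x, mu, sigma):
    """Convert a value x from a normal distribution with mean mu and
    standard deviation sigma to a standard normal z-score."""
    return (x - mu) / sigma

std_normal = NormalDist(mu=0.0, sigma=1.0)  # mean 0, standard deviation 1

# Hypothetical example: X is normal with mean 100 and standard deviation 15
mu, sigma = 100.0, 15.0
z_low = to_z(85.0, mu, sigma)    # -1.0
z_high = to_z(130.0, mu, sigma)  # 2.0

p_below = std_normal.cdf(z_low)                              # P(X < 85)
p_above = 1.0 - std_normal.cdf(z_high)                       # P(X > 130)
p_between = std_normal.cdf(z_high) - std_normal.cdf(z_low)   # P(85 < X < 130)

print(round(p_below, 4), round(p_above, 4), round(p_between, 4))
# 0.1587 0.0228 0.8186
```

The three probabilities correspond to the "below", "above", and "between" cases; each one needs only the z-scores and the standard normal CDF.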
The Standard Normal distribution is used in various hypothesis testing procedures such as tests on single means, the difference between two means, and tests on proportions. The Standard Normal distribution has a mean of 0 and a standard deviation of 1.
The values inside the given table represent the areas under the standard normal curve for values between 0 and the relative z-score.
The table value for $Z$ is 1 minus the value of the cumulative normal distribution.
Standard Normal Table (Area Under the Normal Curve)
For example, the value for 1.96 is $P(Z>1.96) = 0.0250$.
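Printed tables come in different conventions (cumulative, area between 0 and $z$, and upper tail), and each can be derived from the others. The sketch below is only an illustration; the helper names `cumulative`, `zero_to_z`, and `upper_tail` are made up for this example, and `NormalDist` from the Python standard library replaces the table lookup.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def cumulative(z):
    """P(Z < z): the cumulative table convention."""
    return Z.cdf(z)

def zero_to_z(z):
    """Area between 0 and z; by symmetry the sign of z does not matter."""
    return Z.cdf(abs(z)) - 0.5

def upper_tail(z):
    """P(Z > z): one minus the cumulative value."""
    return 1.0 - Z.cdf(z)

print(f"{zero_to_z(0.45):.4f}")   # 0.1736
print(f"{zero_to_z(-0.45):.4f}")  # 0.1736
print(f"{upper_tail(1.96):.4f}")  # 0.0250
print(f"{cumulative(1.96):.4f}")  # 0.9750
```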
Standard Normal Table (Summary)
A table of values for the cumulative distribution function (CDF) of the standard normal distribution.
The standard normal distribution has a mean of 0 and a standard deviation of 1.
This table shows the probability that a standard normal variable will be less than a certain value (z-score).
FAQs about Standard Normal Table
What is a standard normal distribution table?
What is the value of mean and variance in standard normal distribution?
What is the cumulative distribution function of standard normal distribution?
What kind of values are in the standard normal distribution table?
Is the standard normal distribution curve symmetrical?
What is meant by the area under the normal curve?
What is the use of standard normal distribution?
The values of $Z$ inside the standard normal table range from 0 to what value?
A histogram is very similar to a bar chart, but it displays a frequency distribution based on quantitative data, whereas a bar chart shows the distribution of qualitative data. It is a useful graphical representation that helps to visualize the distribution of data.
Important Points to Draw a Histogram Graph
The histogram is constructed from the grouped data by taking the class boundaries (not class limits) along the x-axis and the corresponding frequencies along the y-axis. For ungrouped data, we have to form the grouped frequency distribution before making a histogram. It consists of a set of bars (like a bar chart), but these bars are adjacent to each other, and the height of each bar is proportional to the frequency of its class.
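Forming a grouped frequency distribution from ungrouped data can be sketched as follows. The data, the starting boundary 9.5, and the class width 10 are hypothetical, and the function `grouped_frequencies` is an illustrative helper, not a standard routine.

```python
def grouped_frequencies(data, lower, width, n_classes):
    """Count observations falling in each class [b_i, b_{i+1}),
    where the b_i are equally spaced class boundaries."""
    boundaries = [lower + i * width for i in range(n_classes + 1)]
    counts = [0] * n_classes
    for x in data:
        if boundaries[0] <= x < boundaries[-1]:
            counts[int((x - lower) // width)] += 1
    return boundaries, counts

# Hypothetical ungrouped data
data = [12, 15, 21, 22, 25, 28, 31, 33, 34, 38, 41, 47]
boundaries, counts = grouped_frequencies(data, lower=9.5, width=10, n_classes=4)

print(boundaries)  # [9.5, 19.5, 29.5, 39.5, 49.5]
print(counts)      # [2, 4, 4, 2]
```

The boundaries go on the x-axis and the counts give the bar heights. Choosing boundaries such as 9.5 and 19.5 for integer data avoids observations falling exactly on a boundary.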
The area of each rectangle represents the respective class frequency. When the class intervals are equal, the rectangles all have the same width and their heights directly represent the class frequencies. When the class intervals are not all equal, the height of the rectangle (bar) over an unequal class interval has to be adjusted, because it is the area and not the height that measures frequency. This means that the height of a rectangle must be proportionally decreased if the length of the corresponding class interval increases.
For example, if the length of a class interval is doubled, then the height of the rectangle is halved so that the area, the fundamental property of a histogram rectangle, remains unchanged. This sort of rescaling is necessary to observe the correct pattern of distribution.
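The rescaling for unequal class intervals can be illustrated with a short sketch. The class boundaries and frequencies below are hypothetical; the last class is twice as wide as the others and has the same frequency as the third class, so its bar comes out half as tall.

```python
def bar_heights(boundaries, frequencies):
    """Height = frequency / class width, so each bar's area equals its frequency."""
    widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
    return [f / w for f, w in zip(frequencies, widths)]

# Hypothetical classes: three of width 10 and one of width 20
boundaries = [0, 10, 20, 30, 50]
frequencies = [4, 8, 6, 6]

heights = bar_heights(boundaries, frequencies)
print(heights)  # [0.4, 0.8, 0.6, 0.3]

# Area of each bar (width x height) recovers the class frequency
widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
areas = [w * h for w, h in zip(widths, heights)]
```

Here the third and fourth classes both contain 6 observations, but the fourth bar is drawn at height 0.3 instead of 0.6 because its interval is twice as long.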
Important Features of Histogram
The important feature of a histogram is that there is no gap (space) between the vertical bars, because the variable plotted on the horizontal axis is quantitative and measured on an interval or ratio scale. Thus, it provides an easily interpreted visual representation of a frequency distribution. Note that class midpoints are used as labels for the classes.
It allows us to summarize very large datasets in a single graphical representation that shows primary, secondary, and tertiary peaks in the data and gives a visual impression of how prominent those peaks are.
Alternative to the Histogram
An alternative to the histogram is kernel density estimation, which uses a kernel to smooth samples. This will construct a smooth probability density function, which will, in general, more accurately reflect the underlying variable.
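A minimal sketch of kernel density estimation is given below, assuming a Gaussian kernel and a fixed bandwidth. The samples and the bandwidth of 0.5 are arbitrary choices for illustration; in practice the bandwidth has to be selected with care, and library implementations are usually preferable.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x by averaging
    Gaussian kernels centered on each sample."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

samples = [2.1, 2.4, 2.9, 3.3, 3.8, 5.0, 5.2]   # hypothetical data
density = gaussian_kde(samples, bandwidth=0.5)  # bandwidth chosen arbitrarily

# Crude Riemann-sum check over [-5, 10): the estimate should integrate to about 1
step = 0.01
total = sum(density(-5 + i * step) * step for i in range(int(15 / step)))
print(round(total, 3))
```

Unlike a histogram, the resulting curve does not depend on where class boundaries are placed, only on the kernel and the bandwidth.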
Histograms for Continuous Grouped Data
To draw a histogram graph from the continuous grouped frequency distribution, the following steps are taken.
Mark the class boundaries of the classes along the x-axis.
Mark frequencies along the y-axis.
Draw a rectangle for each class such that the height of each rectangle is proportional to the frequency corresponding to that class. This is the case when classes are of equal width as they often are.
If the classes are of unequal width, then the area instead of the height of each rectangle is proportional to the frequency corresponding to that class, and the height of each rectangle is obtained by dividing the frequency of the class by the width of that class.
It may be noted that the area under a histogram graph can be calculated by adding up the areas of all the rectangles that constitute the histogram. When the heights are plotted as class frequencies, the area of one rectangle is obtained by multiplying the width of the class by the corresponding frequency, i.e.,
Area of a single rectangle = width of the class × frequency of the class
Histogram for Discrete Data
Bar graphs are usually drawn for discrete and categorical data, but in some situations where an approximation is needed, histograms may be constructed instead. To construct a histogram graph for discrete grouped data, the following steps are taken:
Mark possible values on the x-axis.
Mark frequencies along the y-axis.
Draw a rectangle centered on each value with equal width on each side, for example 0.5 to either side of the value.
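The steps above can be sketched as follows. The data are hypothetical, and `discrete_bars` is an illustrative helper that returns the left edge, right edge, and frequency of each bar, using a half-width of 0.5.

```python
from collections import Counter

def discrete_bars(values, half_width=0.5):
    """Return (left edge, right edge, frequency) for a bar centered on
    each distinct observed value."""
    counts = Counter(values)
    return [(v - half_width, v + half_width, counts[v]) for v in sorted(counts)]

# Hypothetical discrete data, e.g. number of defects per item
values = [0, 1, 1, 2, 2, 2, 3, 3, 4]
bars = discrete_bars(values)

for left, right, freq in bars:
    print(f"[{left}, {right}): {'#' * freq}")
```

With integer values and a half-width of 0.5, neighbouring bars touch, which gives the histogram its gap-free appearance. Values that never occur get no bar in this sketch.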
The advantages of the histograms as compared to the unprocessed data are:
It gives a range of data.
It gives the location of the data.
It gives a clue about the skewness of the data.
It gives information about the out-of-control situation.
Histograms are density estimates (they give a good impression of the distribution of data).
It can be compared to the normal curve.
The disadvantages are:
Exact values cannot be read from a histogram graph because the data are grouped into classes, and the individuality of the data vanishes in grouped data.
It is more difficult to compare two data sets.
It is mainly suited to continuous data; for discrete data it serves only as an approximation.
FAQs about Histogram
What is a histogram graph?
What is the difference between a bar chart and a histogram?
What are the important features of histograms?
What are the advantages and disadvantages of histogram graphs?
How can one draw a histogram for a discrete data set?
How can one draw a histogram for a continuous data set?
The coefficient of correlation (r) measures the strength and direction of a linear relationship between two variables. In this post, we will discuss the coefficient of correlation and the coefficient of determination.
Correlation Coefficient Ranges
The correlation coefficient ranges from -1 to +1, where a value of +1 indicates a perfect positive correlation (as one variable increases, the other increases proportionally), a value of -1 indicates a perfect negative correlation (as one variable increases, the other decreases proportionally), and a value of 0 indicates no linear correlation (the variables may still be related in a non-linear way).
Values of the coefficient of correlation between -1 and +1 indicate the strength and direction of the relationship. The sign of r gives the direction, and the strength of correlation depends on the absolute value of r:
| Range of Correlation Value | Interpretation |
|---|---|
| 0.90 to 1.00 | Very strong correlation |
| 0.70 to 0.89 | Strong correlation |
| 0.40 to 0.69 | Moderate correlation |
| 0.10 to 0.39 | Weak correlation |
| 0.00 to 0.09 | No or negligible correlation |
The closer the value of the correlation coefficient is to ±1, the stronger the linear relationship.
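As an illustration, the following sketch computes the sample correlation coefficient from its definition and maps its absolute value to the interpretation ranges given above. The data are hypothetical, and the cut-offs in `strength` simply mirror those ranges; they are conventions rather than strict rules.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Label the strength of correlation from the absolute value of r."""
    a = abs(r)
    if a >= 0.90:
        return "very strong"
    if a >= 0.70:
        return "strong"
    if a >= 0.40:
        return "moderate"
    if a >= 0.10:
        return "weak"
    return "negligible"

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4), strength(r))  # 0.7746 strong
```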
Coefficient of Determination
The ratio of the explained variation to the total variation is called the coefficient of determination. It equals the square of the correlation coefficient and therefore lies between $0$ and $1$. Because this ratio is non-negative, it is denoted by $r^2$; thus
\[r^2=\frac{\text{Explained Variation}}{\text{Total Variation}}\]
It can be seen that if the total variation is all explained, the ratio $r^2$ (Coefficient of Determination) is one, and if the total variation is all unexplained, then the explained variation and the ratio $r^2$ are zero.
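The decomposition of the total variation can be checked numerically. The sketch below assumes a simple least-squares line fitted to hypothetical data; the variable names (`explained`, `unexplained`, `total`) are chosen for this illustration only.

```python
# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)

# Least-squares line: y_hat = intercept + slope * x
slope = sxy / sxx
intercept = my - slope * mx
y_hat = [intercept + slope * a for a in x]

total = sum((b - my) ** 2 for b in y)                     # total variation
explained = sum((f - my) ** 2 for f in y_hat)             # explained variation
unexplained = sum((b - f) ** 2 for b, f in zip(y, y_hat)) # unexplained variation

r_squared = explained / total
print(round(total, 4), round(explained, 4), round(unexplained, 4))  # 6.0 3.6 2.4
print(round(r_squared, 4))                                          # 0.6
```

For these data, 60% of the total variation in y is explained by the fitted line, and the explained and unexplained parts add up to the total.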
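The square root of the coefficient of determination is called the correlation coefficient, given by
\[r=\pm\sqrt{\frac{\text{Explained Variation}}{\text{Total Variation}}}\]
with the sign matching the direction of the relationship.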
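The bounds on the correlation coefficient can be shown using the standardized variables $X^*=\frac{X-\mu_X}{\sigma_X}$ and $Y^*=\frac{Y-\mu_Y}{\sigma_Y}$, for which $\rho(X,Y)=Cov(X^*,Y^*)$. We also know that the variance of any random variable is $\ge 0$, and it is zero, i.e., $Var(X)=0$, if and only if $X$ is a constant (almost surely). Therefore,
\[Var(X^* \pm Y^*)=Var(X^*)+Var(Y^*)\pm 2Cov(X^*,Y^*)\geq 0\]
As $Var(X^*)=1$ and $Var(Y^*)=1$, the right-hand side equals $2\pm 2Cov(X^*,Y^*)$, which would be negative if $Cov(X^*,Y^*)$ were either greater than 1 or less than -1. Hence
\[1\geq \rho(X,Y)=\rho(X^*,Y^*)\geq -1\]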
If $\rho(X,Y)=Cov(X^*,Y^*)=1$, then $Var(X^*-Y^*)=0$, making $X^*=Y^*$ almost surely. Similarly, if $\rho(X,Y)=Cov(X^*,Y^*)=-1$, then $X^*=-Y^*$ almost surely. In either case, $Y$ would be a linear function of $X$ almost surely.
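The argument can be checked numerically on sample data. The sketch below is only an illustration: it standardizes hypothetical observations using the population formulas (dividing by $n$), so that the covariance of the standardized values plays the role of $\rho$. The helper names are made up for this example.

```python
import math

def standardize(v):
    """Subtract the mean and divide by the (population) standard deviation."""
    n = len(v)
    m = sum(v) / n
    sd = math.sqrt(sum((a - m) ** 2 for a in v) / n)
    return [(a - m) / sd for a in v]

def cov(u, v):
    """Population covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / n

def var(u):
    return cov(u, u)

# Hypothetical paired observations
xs = standardize([1, 2, 3, 4, 5])
ys = standardize([2, 4, 5, 4, 5])

rho = cov(xs, ys)                                # covariance of standardized values
var_sum = var([a + b for a, b in zip(xs, ys)])   # equals 2 + 2*rho
var_diff = var([a - b for a, b in zip(xs, ys)])  # equals 2 - 2*rho

# Perfect linear relationship y = 3x + 1: standardized values coincide
ls = standardize([3 * a + 1 for a in [1, 2, 3, 4, 5]])
var_perfect = var([a - b for a, b in zip(xs, ls)])

print(round(rho, 4), round(var_sum, 4), round(var_diff, 4))
```

Both variances are non-negative, which is only possible because `rho` lies between -1 and +1. In the perfectly linear case the variance of the difference is zero up to rounding error.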