A scatterplot (also called a scatter graph or scatter diagram) is used to observe the strength and direction of the relationship between two quantitative variables. In statistics, quantitative variables are those measured on an interval or ratio scale.
Scatter Diagram
Usually, in a scatter diagram, the independent variable (also called the explanatory, regressor, or predictor variable) is plotted on the X-axis (the horizontal axis), while the dependent variable (also called the outcome variable) is plotted on the Y-axis (the vertical axis), so that the strength and direction of the relationship between the variables can be assessed. However, it is not necessary to put the explanatory variable on the X-axis and the outcome variable on the Y-axis, because the scatter diagram and Pearson’s correlation measure the mutual correlation (interdependence) between the variables, not dependence or cause and effect.
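The symmetry described above can be checked with a minimal sketch in R. The vectors `hours` and `score` are hypothetical illustrative values, not data from this article; the point is only that swapping the axes mirrors the picture and leaves Pearson’s correlation unchanged.

```r
# Hypothetical illustrative data (assumed values, not from the article)
hours <- c(1, 2, 3, 4, 5, 6)
score <- c(52, 55, 61, 64, 70, 74)

# Pearson's correlation is symmetric in its two arguments
r_xy <- cor(hours, score)
r_yx <- cor(score, hours)
same_r <- isTRUE(all.equal(r_xy, r_yx))
print(same_r)

# Draw both orientations side by side; the point cloud is simply mirrored
op <- par(mfrow = c(1, 2))
plot(hours, score, main = "score against hours")
plot(score, hours, main = "hours against score")
par(op)  # restore the previous graphics settings
```

Which variable goes on which axis therefore matters for interpretation and convention, not for the measured correlation.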
The diagram below describes some possible relationships between two quantitative variables ($X$ and $Y$), with a short description of each.
A scatter diagram can be drawn for any two quantitative variables, provided both have the same number of observations, because each point pairs one value of $X$ with one value of $Y$. Suppose we have two quantitative variables $X$ and $Y$ and want to observe the strength and direction of the relationship between them. This is easily done in R:
x <- c(5, 7, 8, 7, 2, 2, 9, 4, 11, 12, 9, 6)
y <- c(99, 86, 87, 88, 111, 103, 87, 94, 78, 77, 85, 86)
plot(x, y)
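The basic plot can be made easier to read by labelling the axes and overlaying a fitted straight line. The following is a sketch that reuses the same `x` and `y`; the title, point style, colour, and the choice of a least-squares line from `lm()` are illustrative assumptions rather than part of the original example.

```r
# Same data as in the example above
x <- c(5, 7, 8, 7, 2, 2, 9, 4, 11, 12, 9, 6)
y <- c(99, 86, 87, 88, 111, 103, 87, 94, 78, 77, 85, 86)

# Scatter diagram with axis labels and a title (labels are illustrative)
plot(x, y,
     xlab = "X", ylab = "Y",
     main = "Scatter diagram of Y against X",
     pch = 19)

# Overlay a least-squares line to make the direction of the trend visible
fit <- lm(y ~ x)
abline(fit, col = "red", lwd = 2)
```

For these data the fitted line slopes downward, which matches the visual impression that $Y$ tends to decrease as $X$ increases.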
From the above discussion, it is clear that the main objective of a scatter diagram is to visualize the linear or some other type of relationship between two quantitative variables. The visualization may also help to depict the trends, strength, and direction of the relationship between variables.
Limitations of Scatter Diagrams
- Limited to Two Variables: Scatter plots can only depict the relationship between two variables at a time. If there are more than two variables, one might need to use other visualization techniques.
- Strength of Correlation: While scatter diagrams show the direction of a relationship and give a visual impression of its strength, they do not quantify that strength. A correlation coefficient is needed to measure it numerically.
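Both limitations can be worked around in R. The sketch below first quantifies the strength of the relationship with `cor()`, which computes Pearson’s coefficient by default, and then draws a scatterplot matrix with `pairs()` for more than two variables. Using the built-in `mtcars` data set and the columns `mpg`, `wt`, and `hp` is an assumption made purely for illustration.

```r
# Same data as in the earlier example
x <- c(5, 7, 8, 7, 2, 2, 9, 4, 11, 12, 9, 6)
y <- c(99, 86, 87, 88, 111, 103, 87, 94, 78, 77, 85, 86)

# Quantify strength and direction with Pearson's correlation coefficient
r <- cor(x, y)
print(round(r, 2))

# More than two variables: a scatterplot matrix shows every pair at once.
# mtcars ships with R; the three columns are chosen only for illustration.
vars <- mtcars[, c("mpg", "wt", "hp")]
pairs(vars, main = "Scatterplot matrix of mpg, wt, and hp")

# The matching matrix of pairwise correlation coefficients
print(round(cor(vars), 2))
```

For the `x` and `y` above, the coefficient is roughly -0.93, a strong negative linear relationship that agrees with the downward pattern in the scatter diagram.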
In conclusion, scatter diagrams are a powerful and versatile tool for exploring relationships between variables. By understanding how to create and interpret them, one can gain valuable insights from the data and inform decision-making processes across various disciplines.