Effect Size Definition, Formula, Interpretation (2014)

Effect Size Definition

An effect size is a measure of the strength of a phenomenon: it conveys the estimated magnitude of a relationship without making any statement about whether the apparent relationship in the data reflects the true relationship. Effect size measures play an important role in meta-analysis and statistical power analysis. Reporting effect sizes in theses, papers, and research reports is considered good practice when presenting empirical findings, because the effect size measures the practical importance of a significant finding. Put simply, an effect size is a way of quantifying the size of the difference between two groups.

Effect size is usually computed after rejecting the null hypothesis in a statistical hypothesis testing procedure. If the null hypothesis is not rejected (which is not the same as accepting it), the effect size is usually of limited interest.

There are different formulas for different statistical tests to measure the effect size. In general, the effect size can be computed in two ways.

  1. As the standardized difference between two means
  2. As the effect size correlation (the correlation between the independent variable's classification and the individual scores on the dependent variable).

The Effect Size Dependent Sample T-test

The effect size for the paired sample t-test (dependent sample t-test) is known as Cohen’s d. It ranges from $-\infty$ to $\infty$ and evaluates the degree, measured in standard deviation units, to which the mean of the difference scores differs from zero. If d equals 0, the mean of the difference scores is equal to zero. The farther d is from 0, the larger the effect size.

Effect Size Formula for Dependent Sample T-test

The effect size for the dependent sample t-test can be computed by using

\[d=\frac{\overline{D}-\mu_D}{SD_D}\]

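Here $\overline{D}$ is the mean of the difference scores, $SD_D$ is their standard deviation, and $\mu_D$ is the hypothesized population mean difference (usually 0). Note that both the mean ($\overline{D}$) and the standard deviation ($SD_D$) of the differences are reported in the SPSS output under Paired Differences.

Suppose the effect size is $d = 2.56$. This means that the sample mean difference and the population mean difference are 2.56 standard deviations apart. The sign does not affect the size of an effect, i.e., -2.56 and 2.56 are equivalent effect sizes.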
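The formula above can be illustrated with a minimal Python sketch. The function name `cohens_d_paired`, its `mu_d` parameter, and the five pairs of scores are hypothetical and only serve as an illustration; the sample standard deviation (with $n-1$ in the denominator) is assumed for $SD_D$.

```python
import statistics

def cohens_d_paired(before, after, mu_d=0.0):
    """Cohen's d for paired data: (mean of differences - mu_d) / SD of differences."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    mean_d = statistics.mean(diffs)   # D-bar, the mean of the difference scores
    sd_d = statistics.stdev(diffs)    # SD_D, sample SD (n - 1 denominator)
    return (mean_d - mu_d) / sd_d

# Hypothetical scores for five subjects measured twice
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 14, 15]
d = cohens_d_paired(before, after)
print(round(d, 2))
```

For these made-up scores the differences are 2, 3, 1, 3, 2, so $\overline{D} = 2.2$ and $d$ is about 2.63.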

The $d$ statistic can also be computed from the obtained $t$ value and the number of paired observations $N$, following Ray and Shadish (1996):

\[d=\frac{t}{\sqrt{N}}\]
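As a quick check that this shortcut agrees with the definition of $d$, the sketch below computes the paired $t$ statistic by hand and converts it to $d$. The helper names `paired_t` and `d_from_t` and the difference scores are hypothetical, and $\mu_D = 0$ is assumed.

```python
import math
import statistics

def paired_t(diffs, mu_d=0.0):
    """t statistic of the paired t-test, computed from the difference scores."""
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return (statistics.mean(diffs) - mu_d) / standard_error

def d_from_t(t, n_pairs):
    """Effect size d = t / sqrt(N), where N is the number of pairs."""
    return t / math.sqrt(n_pairs)

diffs = [2, 3, 1, 3, 2]  # hypothetical difference scores
t = paired_t(diffs)
d = d_from_t(t, len(diffs))
d_direct = statistics.mean(diffs) / statistics.stdev(diffs)
print(math.isclose(d, d_direct))
```

Both routes give the same value because $t = \overline{D} / (SD_D / \sqrt{N})$ when $\mu_D = 0$, so dividing $t$ by $\sqrt{N}$ recovers $\overline{D} / SD_D$.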

The value of $d$ is usually categorized as small, medium, and large. With Cohen’s $d$:

  • $d=0.2$ to $0.5$, small effect
  • $d=0.5$ to $0.8$, medium effect
  • $d=0.8$ and higher, large effect.

Calculating Effect Size from $R^2$

Another method of computing the effect size is r-squared ($r^2$), where $df$ is the degrees of freedom of the t-test, i.e.

\[r^2=\frac{t^2}{t^2+df}\]

Effect size is categorized into small, medium, and large effects as

  • $r^2=0.01$, small effect
  • $r^2=0.09$, medium effect
  • $r^2=0.25$, large effect.
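The conversion and the benchmarks above can be sketched in Python as follows. The function names are hypothetical, and two choices are assumptions of this sketch rather than part of the benchmarks themselves: each benchmark is treated as a lower bound for its category, and values below 0.01 are labeled "negligible".

```python
def r_squared_from_t(t, df):
    """Effect size r^2 = t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

def label_r_squared(r2):
    """Label r^2 using the benchmarks 0.01, 0.09, and 0.25 as lower bounds (an assumption)."""
    if r2 >= 0.25:
        return "large"
    if r2 >= 0.09:
        return "medium"
    if r2 >= 0.01:
        return "small"
    return "negligible"  # label chosen for this sketch only

# Hypothetical paired t-test result: t = 2.0 with 25 pairs, so df = 24
r2 = r_squared_from_t(2.0, 24)
print(round(r2, 3), label_r_squared(r2))
```

With these hypothetical inputs, $r^2 = 4/28 \approx 0.143$, which falls between the medium and large benchmarks.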
A non‐significant t-test result indicates that we failed to reject the hypothesis that the two conditions have equal means in the population. A larger value of $r^2$ indicates a larger effect, and a large effect size with a non‐significant result suggests that the study should be replicated with a larger sample size.

A larger effect size computed from either method therefore indicates a larger effect, meaning that the means are likely to be very different.

Choosing the Right Effect Size Measure

The appropriate effect size measure depends on the type of analysis being conducted (for example, correlation or group comparison) and the scale of measurement of the data (continuous, binary, nominal, ordinal, interval, ratio, etc.). It is always good practice to report both the effect size and the statistical significance (p-value) to provide a more complete picture of your findings.

In conclusion, effect size is a crucial concept in interpreting statistical results. By understanding and reporting effect size, one can gain a deeper understanding of the practical significance of the research findings and contribute to a more comprehensive understanding of the field of study.

References:

  • Ray, J. W., & Shadish, W. R. (1996). How interchangeable are different estimators of effect size? Journal of Consulting and Clinical Psychology, 64, 1316–1325. (See also “Correction to Ray and Shadish (1996)”, Journal of Consulting and Clinical Psychology, 66, 532, 1998.)
  • Kelley, K., & Preacher, K. J. (2012). On effect size. Psychological Methods, 17(2), 137–152. doi:10.1037/a0028086.

Consistent Estimator: Easy Learning

A statistic is a consistent estimator of a population parameter if “as the sample size increases, it becomes almost certain that the value of the statistic comes close (closer) to the value of the population parameter”. If an estimator (statistic) is consistent, it becomes more reliable as the sample grows large ($n \to \infty$). This means that the distribution of the estimates becomes more and more concentrated near the value of the population parameter being estimated, such that the probability of the estimator being arbitrarily close to $\theta$ converges to one (a sure event).

Consistent Estimator

The estimator $\hat{\theta}_n$ is said to be a consistent estimator of $\theta$ if, for any positive $\varepsilon$,
\[\lim_{n \rightarrow \infty} P[|\hat{\theta}_n-\theta| \le \varepsilon]=1\]
or
\[\lim_{n\rightarrow \infty} P[|\hat{\theta}_n-\theta|> \varepsilon]=0\]

Here $\hat{\theta}_n$ denotes the estimator of $\theta$ calculated from a sample of size $n$.
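The definition can be illustrated with a small Monte Carlo sketch in Python. It estimates $P[|\hat{\theta}_n-\theta|> \varepsilon]$ for the sample mean of normal data at several sample sizes. The function name `miss_probability`, the normal population, the value $\varepsilon = 0.1$, the number of trials, and the seed are all assumptions made for this illustration.

```python
import random
import statistics

def miss_probability(n, eps, theta=0.0, sigma=1.0, trials=500, seed=1):
    """Estimate P(|sample mean - theta| > eps) by repeated sampling from N(theta, sigma^2)."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        if abs(statistics.fmean(sample) - theta) > eps:
            misses += 1
    return misses / trials

# The estimated probability of missing theta by more than eps shrinks as n grows
for n in (10, 100, 1000):
    print(n, miss_probability(n, eps=0.1))
```

The printed proportions should fall toward zero as $n$ increases, which is what the second limit in the definition expresses.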

Consistent Estimator
  • The sample median is a consistent estimator of the population mean if the population distribution is symmetrical; otherwise, the sample median would approach the population median, not the population mean.
  • The sample estimate of the standard deviation is biased but consistent, because the distribution of $\hat{\sigma}^2$ becomes more and more concentrated at $\sigma^2$ as the sample size increases.
  • A sample statistic can be an inconsistent estimator. A consistent statistic is unbiased in the limit, but an unbiased estimator may or may not be consistent.

Note that unbiasedness and consistency are not equivalent: (1) unbiasedness is a statement about the expected value of the sampling distribution of the estimator, while (2) consistency is a statement about “where the sampling distribution of the estimator is going” as the sample size increases.
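The difference between the two properties can be seen in a short simulation. The sketch below assumes that the biased estimator in question is the variance computed with $n$ (rather than $n-1$) in the denominator, and that the data are standard normal so that $\sigma^2 = 1$. The function name, trial counts, and seed are hypothetical choices.

```python
import random
import statistics

def mean_of_variance_estimates(n, trials, sigma=1.0, seed=7):
    """Average the divide-by-n variance estimate over many samples of size n."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        estimates.append(statistics.pvariance(sample))  # pvariance divides by n
    return statistics.fmean(estimates)

small = mean_of_variance_estimates(n=5, trials=2000)    # biased: expected (n-1)/n = 0.8
large = mean_of_variance_estimates(n=500, trials=200)   # bias nearly gone
print(round(small, 2), round(large, 2))
```

For $n = 5$ the average estimate sits near $(n-1)/n \cdot \sigma^2 = 0.8$, showing the bias, while for $n = 500$ it is close to 1, in line with consistency.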

The errors (variations) of a consistent estimator become negligible as the sample size increases indefinitely. More specifically, the probability that the error exceeds any given amount approaches zero as the sample size increases. In other words, the more data you collect, the closer the estimate is likely to be to the real population parameter you’re trying to measure. The sample mean ($\overline{X}$) and sample variance ($S^2$) are two well-known consistent estimators.

Application of Regression in Medical: A Quick Guide (2024)

Regression is a powerful statistical tool that is widely used in medical research to understand the relationship between variables. It helps identify risk factors, predict outcomes, and optimize treatment strategies.

As an application of regression analysis in the medical sciences, Chan et al. (2006) used multiple linear regression to estimate standard liver weight for assessing the adequacy of graft size in live donor liver transplantation and of the remnant liver in major hepatectomy for cancer. Standard liver weight (SLW) in grams, body weight (BW) in kilograms, gender (male=1, female=0), and other anthropometric data of 159 Chinese liver donors who underwent donor right hepatectomy were analyzed. The formula (fitted model)

 \[SLW = 218 + 12.3 \times BW + 51 \times gender\]

 was developed with a coefficient of determination $R^2=0.48$.

These results mean that, among Chinese people, each 1-kg increase in BW is associated with an average increase in SLW of about 12.3 g, and that men have, on average, a 51-g higher SLW than women of the same body weight. Unfortunately, SEs and CIs for the estimated regression coefficients were not reported. Using Formula 6 in their article, the SLW for Chinese liver donors can be estimated if BW and gender are known. About 48% (roughly half) of the variance of SLW is explained by BW and gender.
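The fitted model is simple enough to evaluate directly. The following Python sketch applies the formula as quoted from Chan et al. (2006); the function name `estimate_slw` and the 70-kg example donors are hypothetical, and the sketch is an illustration of the arithmetic, not a clinical tool.

```python
def estimate_slw(body_weight_kg, is_male):
    """Estimated standard liver weight in grams: SLW = 218 + 12.3 * BW + 51 * gender."""
    gender = 1 if is_male else 0  # coding used in the model: male = 1, female = 0
    return 218 + 12.3 * body_weight_kg + 51 * gender

# Hypothetical donors, both weighing 70 kg
print(round(estimate_slw(70, True)))   # male donor
print(round(estimate_slw(70, False)))  # female donor
```

For a 70-kg donor the model gives roughly 1130 g for a man and 1079 g for a woman; the 51-g gap is the gender coefficient.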

The regression analysis helps in:

  • Identifying risk factors: Determine which factors contribute to the development of a disease (for example, gender, age, smoking, and blood pressure for heart disease).
  • Predicting disease occurrence: Estimate the likelihood of a patient developing a disease based on specific risk factors. For example, logistic regression is used to predict the risk of diabetes based on factors like BMI, age, and family history.

The following types of regression models are widely used in medical sciences:

  • Linear regression: Used when the outcome variable is continuous (e.g., blood pressure, cholesterol levels).
  • Logistic regression: Used when the outcome variable is binary (e.g., disease present/absent, survival/death).
  • Cox proportional hazards regression: Used for survival analysis (time-to-event data).

Reference of Article

  • Chan SC, Liu CL, Lo CM, et al. (2006). Estimating liver weight of adults by body weight and gender. World J Gastroenterol 12, 2217–2222.

Using Mathematica Built-in Functions (2014)

Introduction to Mathematica Built-in Functions

There are thousands of Mathematica built-in functions. Knowing a few dozen of the more important ones makes it possible to do many neat calculations. Memorizing the names is not too hard because almost all built-in functions follow a naming convention (the name reflects what the function does): for example, Abs gives the absolute value, Cos the cosine, and Sqrt the square root of a number.

More important than memorizing the function names is remembering the syntax needed to use them. Knowing many of the built-in functions will not only make programs easier to follow but also improve your programming skills.

Important and Widely Used Mathematica Built-in Functions

The following is a short list of Mathematica built-in functions.

  • Sqrt[ ]: used to find the square root of a number
  • N[ ]: used for numerical evaluation of any mathematical expression, e.g. N[Sqrt[27]]
  • Log[ ]: used to find the natural logarithm (base e) of a number; use Log10[ ] for the base-10 logarithm, or Log[b, x] for the logarithm of x to base b
  • Sin[ ]: used to find the sine of an angle given in radians
  • Abs[ ]: used to find the absolute value of a number

Common Mathematica built-in functions include

  1. Trigonometric functions and their inverses
  2. Hyperbolic functions and their inverses
  3. Logarithmic and exponential functions

Every built-in function in Mathematica has two very important features

  • All Mathematica built-in functions begin with a capital letter: for the square root we use Sqrt, and for the inverse cosine we use the ArcCos built-in function.
  • Square brackets are always used to surround the input or argument of a function.

To compute the absolute value of -12, type Abs[-12] at the input prompt, not Abs(-12) or Abs{-12}; only Abs[-12] is a valid command for computing the absolute value of -12.

Note that:

In Mathematica, single square brackets [ ] enclose the arguments of a function, double square brackets [[ ]] extract parts (elements) of a list, parentheses ( ) group terms in algebraic expressions, and curly brackets { } delimit lists. In short, the three basic delimiters [ ], ( ), and { } are used for functions, algebraic expressions, and lists, respectively.
