The Akaike Information Criterion (AIC) is a method used in statistics and machine learning to compare the relative quality of different models for a given dataset. It helps select the best model from a set of candidates by penalizing models that are overly complex. In other words, the Akaike Information Criterion provides a means for comparing models, i.e. a tool for model selection.
- A too-simple model leads to a large approximation error.
- A too-complex model leads to a large estimation error.
AIC is a measure of the goodness of fit of a statistical model, developed by Hirotsugu Akaike under the name "an information criterion" and first published by him in 1974. It is grounded in the concept of information entropy and formalizes a trade-off between bias and variance in model construction, or equivalently between the accuracy and the complexity of the model.
The Formula of the Akaike Information Criterion
Given a dataset, several candidate models can be ranked according to their AIC values. From the AIC values one might infer, for example, that the top two models are roughly tied and the rest are far worse.
$$AIC = 2k - 2\ln(L)$$
where $k$ is the number of parameters in the model, and $L$ is the maximized value of the likelihood function for the estimated model.
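The formula above translates directly into code. A minimal sketch (the function name and the example numbers are illustrative, not from the article):

```python
import math  # not strictly needed here, but typical when computing ln(L) yourself


def aic(k, log_likelihood):
    """Akaike Information Criterion: AIC = 2k - 2*ln(L).

    k               -- number of estimated parameters in the model
    log_likelihood  -- maximized log-likelihood ln(L) of the fitted model
    """
    return 2 * k - 2 * log_likelihood


# A hypothetical model with 3 parameters and log-likelihood -120.5:
print(aic(3, -120.5))  # 2*3 - 2*(-120.5) = 247.0
```

Note that software packages often report the log-likelihood directly, so in practice you pass that value rather than $L$ itself.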
For a set of candidate models for the data, the preferred model is the one with the minimum AIC value. AIC estimates relative support for a model, which means that AIC scores are not meaningful by themselves.
Key characteristics of the Akaike Information Criterion:
- Balances fit and complexity: A model that perfectly fits the data might not be the best because it might be memorizing the data instead of capturing the underlying trend. AIC considers both how well a model fits the data (goodness of fit) and how complex it is (number of variables).
- A lower score is better: Models having lower AIC scores are preferred as they achieve a good balance between fitting the data and avoiding overfitting.
- Comparison tool: AIC scores are most meaningful when comparing models for the same dataset. The model with the lowest AIC score is considered the best relative to the other models being evaluated.
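The comparison described above can be sketched with a toy example. For a least-squares fit with Gaussian errors, AIC reduces (up to an additive constant shared by all models on the same data) to $n\ln(RSS/n) + 2k$, where $RSS$ is the residual sum of squares. The data, the candidate models, and the function names below are all illustrative assumptions:

```python
import math


def gaussian_aic(rss, n, k):
    """AIC for a least-squares fit with Gaussian errors, up to a constant
    shared by all models on the same data: n * ln(RSS / n) + 2k.
    k counts the estimated parameters (including the error variance)."""
    return n * math.log(rss / n) + 2 * k


# Toy data that is roughly linear (made up for illustration).
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9, 12.2, 13.8]
n = len(x)

# Candidate 1: intercept-only model (predict the mean); k = 2 (mean, sigma).
mean_y = sum(y) / n
rss_mean = sum((yi - mean_y) ** 2 for yi in y)

# Candidate 2: simple linear regression via closed-form least squares;
# k = 3 (slope, intercept, sigma).
mean_x = sum(x) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
rss_line = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))

aic_mean = gaussian_aic(rss_mean, n, k=2)
aic_line = gaussian_aic(rss_line, n, k=3)

# Despite its extra parameter, the linear model earns a lower AIC here,
# because its far better fit outweighs the complexity penalty.
print(aic_line < aic_mean)  # True
```

The extra `2k` term is what prevents the comparison from always favoring the more complex model: a richer model must improve the fit enough to pay for its additional parameters.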
Summary
The AIC score is a single number used as a model selection criterion. One cannot interpret an AIC score in isolation. However, one can compare the AIC scores of different models fitted to the same data. The model with the lowest AIC is generally considered the best choice.
The AIC is most useful as a model selection criterion when there are multiple candidate models to choose from. It works well for larger datasets. However, for smaller datasets, the corrected AIC (AICc) should be preferred. AIC is not perfect, and there can be situations where it fails to choose the optimal model.
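The small-sample correction mentioned above has a standard closed form: $AICc = AIC + \frac{2k(k+1)}{n-k-1}$, where $n$ is the sample size. It converges to plain AIC as $n$ grows. A minimal sketch (the function name and the example numbers are illustrative):

```python
def aicc(aic, n, k):
    """Small-sample corrected AIC: AICc = AIC + 2k(k+1) / (n - k - 1).

    aic -- the ordinary AIC value of the model
    n   -- number of observations
    k   -- number of estimated parameters
    The correction term vanishes as n grows, so AICc -> AIC for large samples.
    """
    return aic + (2 * k * (k + 1)) / (n - k - 1)


# Hypothetical model: AIC = 100.0, fitted on n = 20 points with k = 4 parameters.
print(aicc(100.0, 20, 4))  # 100 + 40/15, roughly 102.67
```

Because the penalty blows up as $k$ approaches $n$, AICc guards against selecting heavily over-parameterized models on small datasets.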
There are many other model selection criteria. For more detail read the article: Model Selection Criteria