Block Design, Incidence, and Concurrence Matrix (2018)

Block Design Properties

The necessary conditions that the parameters of a Balanced Incomplete Block Design (BIB design) must satisfy are

  • $bk = vr$, i.e. $r=\frac{bk}{v}$: each treatment has $r$ replications
  • no treatment appears more than once in any block
  • every unordered pair of treatments appears together in exactly $\lambda$ blocks (equi-concurrence),
    where $\lambda=\frac{r(k-1)}{v-1}=\frac{bk(k-1)}{v(v-1)}$ is often referred to as the concurrence parameter of a BIB design.
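The counting conditions above are easy to check mechanically. The following is a minimal sketch, assuming the design is specified by $(v, b, k)$; the function name `bibd_parameters` is hypothetical. The conditions are necessary but not sufficient, so a passing check does not prove that a design exists.

```python
def bibd_parameters(v, b, k):
    """Return (r, lam) if the necessary counting conditions hold, else None.

    v: number of treatments, b: number of blocks, k: block size.
    """
    if not (1 < k < v):          # incomplete blocks: fewer units than treatments
        return None
    if (b * k) % v != 0:         # r = bk / v must be an integer
        return None
    r = b * k // v
    if (r * (k - 1)) % (v - 1) != 0:   # lambda = r(k-1)/(v-1) must be an integer
        return None
    lam = r * (k - 1) // (v - 1)
    return r, lam


print(bibd_parameters(4, 4, 3))  # (3, 2)
print(bibd_parameters(7, 7, 3))  # (3, 1)
print(bibd_parameters(6, 5, 3))  # None, because bk/v is not an integer
```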

A design, say $d$, with parameters $(v, b, r, k, \lambda)$ can be represented by a $v \times b$ treatment-block incidence matrix (having $v$ rows and $b$ columns). Denote it by $N=(n_{ij})$, whose elements $n_{ij}$ give the number of units in block $j$ allocated to treatment $i$. The rows of the incidence matrix are labeled with the varieties (treatments) of the design and the columns with the blocks.

We put 1 in the ($i$, $j$)th cell of the matrix if variety $i$ is contained in block $j$ and 0 otherwise. Each row of the incidence matrix has $r$ 1’s, each column has $k$ 1’s, and each pair of distinct rows has 1’s in common in exactly $\lambda$ columns, leading to a useful matrix identity.
The matrix $NN’$ has $v$ rows and $v$ columns and is referred to as the concurrence matrix of design $d$; its entries, the concurrence parameters, are denoted by $\lambda_{dij}$. For a BIBD, $n_{ij}$ is either one or zero, so $n_{ij}^2= n_{ij}$.

Theorem: If $N$ is the incidence matrix of a $(v, b, r, k, \lambda)$-design then $NN’=(r-\lambda)I+\lambda J$ where $I$ is $v\times v$ identity matrix and $J$ is the $v\times v$ matrix of all 1’s.

Example: For the block design with blocks {1,2,3}, {2,3,4}, {3,4,1}, {4,1,2}, construct the incidence matrix.

[Figure: incidence matrix and concurrence matrix of the block design]
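As a sketch of how the example can be worked out, the code below builds the incidence matrix $N$ for the four blocks and then the concurrence matrix $NN’$, and compares the result with the theorem. The variable names are illustrative only.

```python
blocks = [{1, 2, 3}, {2, 3, 4}, {3, 4, 1}, {4, 1, 2}]
treatments = sorted(set().union(*blocks))
v, b = len(treatments), len(blocks)

# Incidence matrix N: rows are treatments, columns are blocks.
N = [[1 if t in block else 0 for block in blocks] for t in treatments]

# Concurrence matrix NN': entry (i, h) counts the blocks shared by treatments i and h.
NNt = [[sum(N[i][j] * N[h][j] for j in range(b)) for h in range(v)]
       for i in range(v)]

# Theorem: NN' = (r - lambda) I + lambda J, i.e. r on the diagonal, lambda elsewhere.
r, lam = NNt[0][0], NNt[0][1]
theorem = [[r if i == h else lam for h in range(v)] for i in range(v)]

for row in N:
    print(row)
print("r =", r, "lambda =", lam, "theorem holds:", NNt == theorem)
```

For this design every treatment appears in $r=3$ blocks and every pair of treatments appears together in $\lambda=2$ blocks.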


Denoting the elements of $NN’$ by $q_{ih}$, we see that $q_{ii}=\sum_j n_{ij}^2$ and $q_{ih}=\sum_j n_{ij} n_{hj}, (i \ne h)$. For a BIBD, $NN’$ (the treatment concurrence matrix) has diagonal elements $q_{ii}=r$ and off-diagonal elements $q_{ih}=\lambda, (i\ne h)$, where $q_{ih}$ is the number of times treatments $i$ and $h$ occur together within a block. In a balanced design, the off-diagonal entries of $NN’$ are all equal to a constant $\lambda$, i.e., the common replication for a BIBD is $r$ and the common pairwise treatment concurrence is $\lambda$.

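$N$ is a matrix of $v$ rows and $b$ columns, so its rank satisfies $\text{rank}(N)\le \min(b, v)$. For a BIBD, $|NN’|=(r-\lambda)^{v-1}rk \ne 0$ because $r>\lambda$, so $\text{rank}(N)=\text{rank}(NN’)=v$ and hence $v\le b$. If the design is symmetric, $b=v$ (so $r=k$) and $N$ is square; then $|NN’|=|N|^2$, so $(r-\lambda)^{v-1}r^2$ is a perfect square.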
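The perfect-square condition for symmetric designs can be tested directly with integer arithmetic. This is a small sketch; the function name is hypothetical and the parameter sets are chosen only for illustration (each satisfies the counting conditions with $b=v$ and $r=k$).

```python
import math


def symmetric_determinant_is_square(v, r, lam):
    """Check whether (r - lam)^(v-1) * r^2 is a perfect square (requires r > lam)."""
    det = (r - lam) ** (v - 1) * r ** 2
    return math.isqrt(det) ** 2 == det


print(symmetric_determinant_is_square(4, 3, 2))   # True:  1^3 * 9 = 9
print(symmetric_determinant_is_square(7, 3, 1))   # True:  2^6 * 9 = 576
print(symmetric_determinant_is_square(22, 7, 2))  # False: 5^21 * 49 is not a square
```

A `False` result rules the symmetric design out; a `True` result only means this particular necessary condition is met.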


Heteroscedasticity Tests and Remedies (2018)

This post is about tests for heteroscedasticity and the remedies available when it is present.

There is a set of heteroscedasticity tests and remedies that require an assumption about the structure of the heteroscedasticity if it exists. That is, to use these tests you must choose a specific functional form for the relationship between the error variance and the variables that you believe determine the error variance. The major difference between these tests is the functional form that each test assumes.

Heteroscedasticity Tests

Breusch-Pagan Test

The Breusch-Pagan test assumes the error variance is a linear function of one or more variables.
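To make the idea concrete, here is a minimal sketch of the studentized (Koenker) form of the Breusch-Pagan statistic for a single regressor: regress the squared OLS residuals on the variable and compute $n R^2$, which is compared with a chi-square critical value (3.84 at the 5% level with one degree of freedom). The helper names and the synthetic data are assumptions made for illustration; this is not the output of any statistical package.

```python
def ols_simple(x, y):
    """Fit y = b0 + b1*x by OLS and return (b0, b1, residuals)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return b0, b1, resid


def r_squared(x, y):
    """Coefficient of determination of the simple regression of y on x."""
    _, _, resid = ols_simple(x, y)
    ybar = sum(y) / len(y)
    sst = sum((yi - ybar) ** 2 for yi in y)
    sse = sum(e ** 2 for e in resid)
    return 1.0 - sse / sst


def breusch_pagan_lm(x, y):
    """LM statistic n * R^2 from the auxiliary regression of e^2 on x."""
    _, _, resid = ols_simple(x, y)
    squared = [e ** 2 for e in resid]
    return len(x) * r_squared(x, squared)


# Synthetic, deterministic data: the error spread grows with x in y_hetero only.
x = list(range(1, 21))
signs = [(-1) ** i for i in x]
y_hetero = [2 + 3 * xi + 0.5 * xi * s for xi, s in zip(x, signs)]
y_homo = [2 + 3 * xi + 0.5 * s for xi, s in zip(x, signs)]

print("LM (spread grows with x):", breusch_pagan_lm(x, y_hetero))
print("LM (constant spread):    ", breusch_pagan_lm(x, y_homo))
```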

Harvey-Godfrey Test

The Harvey-Godfrey test assumes the error variance is an exponential function of one or more variables. The variables are usually assumed to be one or more of the explanatory variables in the regression equation.
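The difference between the two functional forms can be written out explicitly. The notation below is an assumption made for illustration: $\sigma_i^2$ is the error variance of observation $i$ and $Z_{1i}, \ldots, Z_{pi}$ are the variables believed to determine it.

\begin{aligned}
\sigma_i^2 &= \alpha_0 + \alpha_1 Z_{1i} + \cdots + \alpha_p Z_{pi}; \quad \text{Breusch-Pagan (linear)}\\
\sigma_i^2 &= \exp(\alpha_0 + \alpha_1 Z_{1i} + \cdots + \alpha_p Z_{pi}); \quad \text{Harvey-Godfrey (exponential)}
\end{aligned}

In both cases, the null hypothesis of homoscedasticity is $\alpha_1=\cdots=\alpha_p=0$, so that $\sigma_i^2$ is the same constant for every observation.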

The White Test

The White test is a general test for detecting heteroscedasticity in a data set. It has the following advantages:

  1. It does not require you to specify a model of the structure of the heteroscedasticity if it exists.
  2. It does not depend on the assumption that the errors are normally distributed.
  3. It specifically tests if the presence of heteroscedasticity causes the OLS formula for the variances and the covariances of the estimates to be incorrect.

Remedies for Heteroscedasticity

Suppose that you find evidence of heteroscedasticity. If you use the OLS estimator, you will get unbiased but inefficient estimates of the parameters of the model. Also, the estimates of the variances and covariances of the parameter estimates will be biased and inconsistent, and as a result, hypothesis tests will not be valid. When there is evidence of heteroscedasticity, econometricians do one of two things:

  • Use the OLS estimator to estimate the parameters of the model. Correct the estimates of the variances and covariances of the OLS estimates so that they are consistent.
  • Use an estimator other than the OLS estimator to estimate the parameters of the model.

Many econometricians choose the first alternative. This is because the most serious consequence of using the OLS estimator when there is heteroscedasticity is that the estimates of the variances and covariances of the parameter estimates are biased and inconsistent. If this problem is corrected, then the only shortcoming of using OLS is that you lose some precision relative to some other estimator that you could have used.
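One common way to carry out the first alternative is White's heteroscedasticity-consistent (HC0) variance estimator. The sketch below, for a simple regression with one regressor, keeps the OLS slope and compares its conventional standard error with the HC0 standard error. The function name and the synthetic data are assumptions for illustration.

```python
import math


def slope_standard_errors(x, y):
    """Return (b1, conventional_se, hc0_se) for the OLS fit y = b0 + b1*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    dx = [xi - xbar for xi in x]
    sxx = sum(d * d for d in dx)
    b1 = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    # Conventional OLS variance: assumes one common error variance s^2.
    s2 = sum(e * e for e in resid) / (n - 2)
    conventional_var = s2 / sxx

    # HC0 variance: each squared residual stands in for its own error variance.
    hc0_var = sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2

    return b1, math.sqrt(conventional_var), math.sqrt(hc0_var)


# Synthetic data whose error spread grows with x.
x = list(range(1, 21))
y = [2 + 3 * xi + 0.5 * xi * (-1) ** xi for xi in x]
slope, se_ols, se_hc0 = slope_standard_errors(x, y)
print("slope:", slope)
print("conventional SE:", se_ols)
print("HC0 robust SE:  ", se_hc0)
```

The slope estimate is identical in both cases; only the standard error used for hypothesis tests changes.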


However, to get more precise estimates with an alternative estimator, you must know the approximate structure of the heteroscedasticity. If you specify the wrong model of heteroscedasticity, then this alternative estimator can yield estimates that are worse than the OLS estimates.
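Weighted least squares (WLS) is one such alternative estimator. The sketch below assumes, purely for illustration, that the error variance is proportional to $x^2$, so each observation is weighted by $1/x^2$; if that assumed structure is wrong, the caveat above applies. The function name is hypothetical.

```python
def wls_simple(x, y, weights):
    """Weighted least squares fit of y = b0 + b1*x; returns (b0, b1)."""
    sw = sum(weights)
    xbar = sum(w * xi for w, xi in zip(weights, x)) / sw   # weighted mean of x
    ybar = sum(w * yi for w, yi in zip(weights, y)) / sw   # weighted mean of y
    sxx = sum(w * (xi - xbar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - xbar) * (yi - ybar)
              for w, xi, yi in zip(weights, x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


# Synthetic data whose error spread grows with x.
x = list(range(1, 21))
y = [2 + 3 * xi + 0.5 * xi * (-1) ** xi for xi in x]

# Assumed variance structure: Var(error_i) proportional to x_i^2.
weights = [1.0 / xi ** 2 for xi in x]

print("WLS (b0, b1):", wls_simple(x, y, weights))
print("OLS (b0, b1):", wls_simple(x, y, [1.0] * len(x)))  # equal weights give OLS
```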


Cronbach’s Alpha Reliability Analysis of Measurement Scales

Cronbach’s Alpha Reliability Analysis

Cronbach’s Alpha Reliability analysis is used to study the properties of measurement scales (Likert scale questionnaire) and the items (questions) that make them up. The reliability analysis method computes several commonly used measures of scale reliability. The reliability analysis also provides information about the relationships between individual items in the scale. The intraclass correlation coefficients can be used to compute the interrater reliability estimates.

Suppose you want to know whether your questionnaire measures customer satisfaction in a useful way. For this purpose, you can use reliability analysis to determine the extent to which the items (questions) in your questionnaire are correlated with each other. It gives an overall index of the reliability, or internal consistency, of the scale as a whole. You can also identify problematic items that should be removed (deleted) from the scale.

As an example, open the data file “satisf.sav”, which is available in the SPSS sample files. To check the reliability of Likert scale items, follow the steps given below:

[Figure: Cronbach's Alpha Reliability Analysis dialog box]

Step 1: On the Menu bar of SPSS, Click Analyze > Scale > Reliability Analysis… option

Step 2: Select two or more variables that you want to test and move them from the left pane to the right pane of the Reliability Analysis dialog box. Note that multiple variables (items) can be selected by holding down the CTRL key and clicking each variable you want. Clicking the arrow button between the two panes moves the selected variables to the items pane (right pane).

Step 3: Click on the “Statistics” button to select additional statistics, such as descriptives (for item, scale, and scale if item deleted), summaries (for means, variances, covariances, and correlations), inter-item (for correlations and covariances), and the ANOVA table (none, F-test, Friedman chi-square, or Cochran chi-square).

[Figure: Reliability Statistics]

Click on the “Continue” button to save the current statistics options for the analysis. Then click the OK button in the Reliability Analysis dialog box to run the analysis on the selected items. The output will be shown in the SPSS output window.

Reliability Analysis Output

The Cronbach’s Alpha Reliability ($\alpha$) is about 0.827, which is good enough. Note that deleting the item “organization satisfaction” would increase the reliability of the remaining items to 0.860.
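For readers who want to see what the procedure computes, here is a minimal sketch of Cronbach's alpha using the standard formula $\alpha=\frac{k}{k-1}\left(1-\frac{\sum_j s_j^2}{s_T^2}\right)$, where $k$ is the number of items, $s_j^2$ the variance of item $j$, and $s_T^2$ the variance of the total score. The function names and the response data are hypothetical; the data are not the SPSS sample file.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (at least two items)."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item left out in turn (needs three or more items)."""
    k = len(rows[0])
    return [cronbach_alpha([row[:j] + row[j + 1:] for row in rows])
            for j in range(k)]


# Hypothetical responses: 6 respondents, 4 Likert items scored 1 to 5.
responses = [
    [4, 5, 4, 2],
    [3, 3, 3, 4],
    [5, 5, 4, 1],
    [2, 2, 3, 5],
    [4, 4, 5, 3],
    [1, 2, 1, 2],
]

print("alpha:", cronbach_alpha(responses))
print("alpha if item deleted:", alpha_if_item_deleted(responses))
```

An item whose deletion raises alpha noticeably is a candidate for removal from the scale.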

A rule of thumb for interpreting alpha for dichotomous items (questions with only two possible answers) or Likert scale items (questions with 3, 5, 7, or 9, etc. response options) is:

  • If Cronbach’s Alpha is $\ge 0.9$, the internal consistency of scale is Excellent.
  • If Cronbach’s Alpha is $0.90 > \alpha \ge 0.8$, the internal consistency of scale is Good.
  • If Cronbach’s Alpha is $0.80 > \alpha \ge 0.7$, the internal consistency of scale is Acceptable.
  • If Cronbach’s Alpha is $0.70 > \alpha \ge 0.6$, the internal consistency of scale is Questionable.
  • If Cronbach’s Alpha is $0.60 > \alpha \ge 0.5$, the internal consistency of scale is Poor.
  • If Cronbach’s Alpha is $0.50 > \alpha $, the internal consistency of scale is Unacceptable.
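The rule of thumb above translates directly into a small lookup. The function name `interpret_alpha` is hypothetical; the thresholds are exactly those listed.

```python
def interpret_alpha(alpha):
    """Map a Cronbach's alpha value to the rule-of-thumb label."""
    if alpha >= 0.9:
        return "Excellent"
    if alpha >= 0.8:
        return "Good"
    if alpha >= 0.7:
        return "Acceptable"
    if alpha >= 0.6:
        return "Questionable"
    if alpha >= 0.5:
        return "Poor"
    return "Unacceptable"


print(interpret_alpha(0.827))  # Good
print(interpret_alpha(0.65))   # Questionable
```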

However, the rules of thumb listed above should be used with caution since Cronbach’s Alpha reliability is sensitive to the number of items in a scale. A larger number of questions can result in a larger Alpha Reliability, while a smaller number of items may result in smaller $\alpha$.


Standard Deviation: A Measure of Dispersion (2017)

The standard deviation is a widely used concept in statistics and it tells how much variation (measure of spread or dispersion) is in the data set. It can be defined as the positive square root of the mean (average) of the squared deviations of the values from their mean.
To calculate the standard deviation one has to follow these steps:

Calculation of Standard Deviation

  1. First, find the mean of the data.
  2. Take the difference of each data point from the mean of the given data set (computed in step 1). Note that the sum of these differences must be equal to zero, or near zero due to rounding of numbers.
  3. Now compute the square of each difference obtained in step 2; each square is a non-negative quantity.
  4. Now add up all the squared quantities obtained in step 3. We call it the sum of squares of differences.
  5. Divide this sum of squares of differences (obtained in step 4) by the total number of observations in the data to calculate the population standard deviation ($\sigma$). To compute the sample standard deviation ($S$), divide the sum of squares of differences (obtained in step 4) by the total number of observations minus one ($n-1$), i.e. the degrees of freedom. Note that $n$ is the number of observations available in the data set.
  6. Find the square root (also known as under root) of the quantity obtained in step 5. The resultant quantity in this way is known as the standard deviation (SD) for the given data set.

The population SD ($\sigma$) and the sample SD ($S$) of a set of $n$ observations $X_1, X_2, \cdots, X_n$ are

\begin{aligned}
\sigma &=\sqrt{\frac{\sum_{i=1}^n (X_i-\overline{X})^2}{n}}; \quad \text{Population SD}\\
S&=\sqrt{ \frac{\sum_{i=1}^n (X_i-\overline{X})^2}{n-1}}; \quad \text{Sample SD}
\end{aligned}
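The six steps and the two formulas above can be sketched in a few lines. The function name and the small data set are illustrative assumptions.

```python
import math


def standard_deviation(data, sample=True):
    """Standard deviation computed step by step.

    sample=True divides by n - 1 (sample SD); sample=False divides by n (population SD).
    """
    n = len(data)
    mean = sum(data) / n                     # step 1: mean
    deviations = [x - mean for x in data]    # step 2: differences from the mean
    squared = [d ** 2 for d in deviations]   # step 3: squared differences
    sum_of_squares = sum(squared)            # step 4: sum of squares
    divisor = n - 1 if sample else n         # step 5: divide by n - 1 or n
    return math.sqrt(sum_of_squares / divisor)  # step 6: square root


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(data, sample=False))  # 2.0
print(standard_deviation(data, sample=True))
```

For the same data, the sample SD is always slightly larger than the population SD because the divisor is smaller.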

The standard deviation can also be computed as the positive square root of the variance.

For approximately normally distributed data, the standard deviation has a concrete interpretation: about 68% of the data values lie within the range $\overline{X} \pm \sigma$, i.e. within one standard deviation of the mean. Similarly, about 95% of the data values lie within $\overline{X} \pm 2 \sigma$ and about 99.7% within $\overline{X} \pm 3 \sigma$.

[Figure: Standard Deviation]

Examples

A large value of SD indicates more spread in the data set, which can be interpreted as inconsistent behavior of the collected data: the data points tend to lie far from the mean. With a smaller standard deviation, the data points tend to lie close to the mean, indicating consistent behavior of the data set.

The standard deviation and variance are used in finance to measure the risk of a particular investment. A mean of 15% and a standard deviation of 2% indicate that the investment is expected to earn a 15% return and, assuming approximately normally distributed returns, that there is about a 68% chance that the return will be between 13% and 17%. Similarly, there is about a 95% chance that the return will be between 11% and 19%.
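The percentages in the investment example can be checked with Python's `statistics.NormalDist`, under the assumption that returns are normally distributed with mean 15 and standard deviation 2. The helper name `prob_within` is hypothetical.

```python
from statistics import NormalDist

returns = NormalDist(mu=15, sigma=2)  # assumed distribution of returns, in percent


def prob_within(dist, k):
    """Probability that a value falls within k standard deviations of the mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)


print(round(prob_within(returns, 1), 4))  # 0.6827, return between 13% and 17%
print(round(prob_within(returns, 2), 4))  # 0.9545, return between 11% and 19%
print(round(prob_within(returns, 3), 4))  # 0.9973, return between 9% and 21%
```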
