Cronbach's alpha (Cronbach's ) or coefficient alpha (coefficient ), is a reliability coefficient and a measure of the internal consistency of tests and measures. It was devised by the American psychometrician Lee Cronbach. Today it enjoys such wide-spread usage that numerous studies warn against using Cronbach's alpha uncritically.
In his initial 1951 publication, Lee Cronbach described the coefficient as Coefficient alpha and included an additional derivation. Coefficient alpha had been used implicitly in previous studies, but his interpretation was thought to be more intuitively attractive relative to previous studies and it became quite popular.
To use Cronbach's alpha as an accurate estimate of reliability, the following conditions must be met:
However, under the definition of CTT, the errors are defined to be independent.
This is often a source of confusion for users who might consider some aspect of the testing process to be an "error" (rater biases, examinee collusion, self-report faking). Anything that increases the covariance among the parts will contribute to greater true score variance. Under such circumstances, alpha is likely to over-estimate the reliability intended by the user.
Reliability can be defined as one minus the error score variance divided by the observed score variance:
Cronbach's alpha is best understood as a direct estimate of this definitional formula with error score variance estimated as the sum of the variances of each "part" (e.g., items or testlets):
where:
The reason that the sum of the individual part variances estimates the error score variance is because
and the variance of a composite is equal to twice the sum of all covariances of the parts plus the sum of the variances of the parts: . Therefore estimates and estimates . It is much easier to compute alpha by summing the part variances (to estimate error score variance) than adding up all the unique part covariances (to estimate true score variance) .
Alternatively, alpha can be calculated through the following formula:
where:
Application of Cronbach's alpha is not always straightforward and can give rise to common misconceptions.
Many textbooks refer to as an indicator of homogeneity between items. This misconception stems from the inaccurate explanation of Cronbach (1951) that high values show homogeneity between the items. Homogeneity is a term that is rarely used in modern literature, and related studies interpret the term as referring to uni-dimensionality. Several studies have provided proofs or counterexamples that high values do not indicate uni-dimensionality. See counterexamples below.
in the uni-dimensional data above.
in the multidimensional data above.
The above data have , but are multidimensional.
The above data have , but are uni-dimensional.
Uni-dimensionality is a prerequisite for . One should check uni-dimensionality before calculating rather than calculating to check uni-dimensionality.
The term "internal consistency" is commonly used in the reliability literature, but its meaning is not clearly defined. The term is sometimes used to refer to a certain kind of reliability (e.g., internal consistency reliability), but it is unclear exactly which reliability coefficients are included here, in addition to . Cronbach (1951) used the term in several senses without an explicit definition. Cortina (1993) showed that is not an indicator of any of these.
Most psychometric software will produce a column labeled "alpha if item deleted" which is the coefficient alpha that would be obtained if an item were to be dropped. For good items, this value is lower than the current coefficient alpha for the whole scale. But for some weak or bad items, the "alpha if item deleted" value shows an increase over the current coefficient alpha for the whole scale.
Removing an item using "alpha if item deleted" may result in 'alpha inflation,' where sample-level reliability is reported to be higher than population-level reliability. It may also reduce population-level reliability. The elimination of less-reliable items should be based not only on a statistical basis but also on a theoretical and logical basis. It is also recommended that the whole sample be divided into two and cross-validated.