Can Researchers Ignore Clustering Effects on the Variance Estimation of the Estimator for Mokken’s Scalability Coefficient H?
Marcia Andrade, Pedro Luis Nascimento Silva, Cristiano Fernandes
In practice, the measurement of latent variables through Mokken scales rests on
the assumption that the vectors of responses provided to a specified set of
items in an instrument are realizations of independent and identically
distributed random vectors. However, this assumption is untenable when samples
of respondents result from clustered sampling designs. This Brazilian paper examines
the effect of a single-stage cluster sampling design (CS1) on the estimation of
the variance of the estimator for the coefficient H, which plays a key role in the construction and quality
assessment of a Mokken scale from the responses to the specified set of items. For
this purpose, a simulation study was carried out by sampling repeatedly from a fixed
finite population that consists of pupils enrolled in 9th grade in public primary schools in the urban area of the state of Rio de
Janeiro who participated in the “Prova Brasil 2007”.
The responses of all these pupils to a set of 10 dichotomized items that aim to measure
the ‘economic capital’ of the students’ families were used to calculate the population value of H. Then repeated samples were drawn from
the same population of pupils using two sampling designs: single-stage cluster
sampling (CS1) with clusters selected by simple random sampling without
replacement, and simple random sampling of pupils (SRS). The coefficient H was estimated from each sample, and
the variance of the estimator for H under each design
was estimated from the variation of these estimates across the sample replicates. The results show a strong
clustering effect on the variance of the estimator for the coefficient H when CS1 is used.
We suggest that if the clustering effect is not incorporated in the estimation of
the variance of the estimator for the scalability coefficient H,
the conclusions based on traditional Mokken scale analysis and standard
statistical tests for scalability coefficients will be incorrect.
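The replicate-based comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used in the study: the population is synthetic (responses generated from a Rasch-type logistic model with a school-level random effect) rather than the Prova Brasil 2007 data, H is computed in its covariance-ratio form for dichotomous items, and the function names `mokken_h`, `make_population` and `replicate_variances`, as well as all parameter values, are hypothetical.

```python
import numpy as np


def mokken_h(X):
    """Scale coefficient H for a 0/1 response matrix (rows = respondents).

    Uses the covariance-ratio form: the sum of observed inter-item
    covariances divided by the sum of their maxima given the item margins.
    """
    X = np.asarray(X, dtype=float)
    p = X.mean(axis=0)                      # item popularities
    joint = X.T @ X / X.shape[0]            # P(X_i = 1, X_j = 1)
    cov = joint - np.outer(p, p)
    cov_max = np.minimum.outer(p, p) - np.outer(p, p)
    iu = np.triu_indices(X.shape[1], k=1)   # item pairs with i < j
    return cov[iu].sum() / cov_max[iu].sum()


def make_population(n_clusters=200, cluster_size=40, n_items=10,
                    cluster_sd=1.0, seed=0):
    """Synthetic clustered population (an assumption, not the real data)."""
    rng = np.random.default_rng(seed)
    cluster_id = np.repeat(np.arange(n_clusters), cluster_size)
    cluster_effect = rng.normal(0.0, cluster_sd, n_clusters)
    # Latent trait = shared cluster (school) effect + individual deviation.
    theta = cluster_effect[cluster_id] + rng.normal(0.0, 1.0, cluster_id.size)
    difficulty = np.linspace(-1.5, 1.5, n_items)
    prob = 1.0 / (1.0 + np.exp(-(theta[:, None] - difficulty[None, :])))
    X = (rng.random(prob.shape) < prob).astype(int)
    return X, cluster_id


def replicate_variances(X, cluster_id, n_sampled_clusters=10,
                        n_reps=300, seed=1):
    """Variance of the H estimates across replicates under CS1 and SRS."""
    rng = np.random.default_rng(seed)
    clusters = np.unique(cluster_id)
    h_cs1, h_srs = [], []
    for _ in range(n_reps):
        # CS1: select whole clusters by SRS without replacement.
        chosen = rng.choice(clusters, n_sampled_clusters, replace=False)
        mask = np.isin(cluster_id, chosen)
        h_cs1.append(mokken_h(X[mask]))
        # SRS: select the same number of pupils directly.
        idx = rng.choice(X.shape[0], int(mask.sum()), replace=False)
        h_srs.append(mokken_h(X[idx]))
    return np.var(h_cs1, ddof=1), np.var(h_srs, ddof=1)


if __name__ == "__main__":
    X, cluster_id = make_population()
    v_cs1, v_srs = replicate_variances(X, cluster_id)
    print("H in the synthetic population:", mokken_h(X))
    print("variance ratio CS1 / SRS:", v_cs1 / v_srs)
```

The ratio printed at the end plays the role of a design effect for the estimator of H: a value well above 1 indicates that a variance estimate which assumes independent respondents would understate the sampling variability under CS1.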
Key words: variance estimation; complex sampling design; clustered sample; intracluster correlation; Monte Carlo studies; socioeconomic status.