Revisiting the Bahadur representation of sample quantiles for the standard error of kernel equating
Gabriel Wallin and Jorge Gonzalez
Session 2B, 12:45 - 14:15, VIA
In educational measurement, the comparability of test scores from different test versions is essential for the fair assessment of test takers. Statistical models for test score equating enable such comparisons through a functional parameter, the equating transformation, that maps scores from the scale of one test form into their equivalents on the scale of another. The standard error of equating (SEE) is one of the most common measures used when evaluating an equating transformation. In kernel equating (KE), the SEE has been derived using the delta method (von Davier, Holland, & Thayer, 2004), relying on the asymptotic normality of the maximum likelihood estimators of the score probabilities. Using the delta method to calculate the SEE is therefore theoretically valid only when the estimated score probabilities are at least approximately normally distributed. Because score probabilities in KE are commonly estimated after presmoothing the score distributions using maximum likelihood estimates from loglinear models, this issue has not been of great concern.
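For orientation, the KE equating transformation and its delta-method SEE can be sketched in the notation commonly associated with von Davier, Holland, and Thayer (2004). None of the symbols below appear in this abstract; they are assumptions introduced here for illustration only.

```latex
% Assumed notation (not from the abstract):
%   F_{h_X}, G_{h_Y} : continuized score CDFs of forms X and Y, bandwidths h_X, h_Y
%   J_{e_Y}          : Jacobian of e_Y with respect to the score probabilities
%   J_{DF}           : Jacobian of the design function of the data collection design
%   C                : factor with C C^{\top} equal to the asymptotic covariance
%                      of the presmoothed estimated score probabilities
\[
  e_Y(x) = G_{h_Y}^{-1}\bigl(F_{h_X}(x)\bigr),
  \qquad
  \mathrm{SEE}_Y(x) = \bigl\lVert J_{e_Y}\, J_{DF}\, C \bigr\rVert .
\]
```

The norm expression is a first-order (delta method) approximation, so it inherits the requirement that the estimated score probabilities be asymptotically normal with covariance $C C^{\top}$.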
In recent years, however, alternative methods of presmoothing have been suggested. Some of them do not necessarily lead to approximately normally distributed estimated score probabilities, which calls into question the validity of the current method for calculating the SEE. An alternative method that does not rely on normality and that uses the Bahadur representation of sample quantiles (Bahadur, 1966) was suggested by Liou, Cheng, and Johnson (1997). These authors derived SEE expressions for the nonequivalent groups with anchor test design, considering equating estimators that used either the Gaussian kernel or the uniform kernel for the continuization of the score distributions. To the best of the authors' knowledge, no comparison has been made between the delta method of computing the SEE and the method that uses the Bahadur representation of sample quantiles. In this study, the alternative method suggested by Liou, Cheng, and Johnson is expanded to: i) include expressions for the SEE that can be used regardless of the choice of kernel function, ii) include other data collection designs (e.g., the equivalent groups design), and iii) obtain an expression of the SEE for the chained equating transformation. With these new results at hand, the two methods of calculating the SEE are compared across different data collection designs, kernel functions, and presmoothing models, using both post-stratification equating and chained equating.
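As a reminder of the result this approach builds on, the classical Bahadur representation for an i.i.d. sample can be sketched as follows. The notation is an assumption made here for illustration, and the SEE expressions of Liou, Cheng, and Johnson (1997) and of this study are not reproduced.

```latex
% Assumed notation (not from the abstract):
%   X_1, ..., X_n : i.i.d. sample from a CDF F with density f
%   \xi_p         : p-th quantile of F, with f(\xi_p) > 0
%   F_n           : empirical CDF;  \hat{\xi}_p : p-th sample quantile
%   R_n           : remainder of smaller order than n^{-1/2}
\[
  \hat{\xi}_p = \xi_p + \frac{p - F_n(\xi_p)}{f(\xi_p)} + R_n ,
  \qquad
  \sqrt{n}\,\bigl(\hat{\xi}_p - \xi_p\bigr)
  \xrightarrow{d}
  N\!\left(0,\; \frac{p(1-p)}{f(\xi_p)^{2}}\right).
\]
```

Because the leading term is an average of indicator variables, the variance of the sample quantile follows from the behaviour of $F_n$ at a single point rather than from the asymptotic normality of the estimated score probabilities.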