Brown bag seminar: Utilizing Response Times to Inform the Treatment of Omitted Responses in Achievement Tests
Welcome to CEMO's weekly Brown Bag seminars in methodology. This week's seminar is on omitted responses in achievement tests, and will be given by Professor II Andreas Frey from the Centre for Educational Measurement.
Empirically obtained data from achievement tests typically contain missing item responses. Unfortunately, the treatment of these missing responses is far from trivial. Simple procedures such as listwise deletion, ignoring (and thus scoring as not administered), or treating as incorrect (and thus scoring as 0) rest on strong assumptions about the processes underlying the missing responses. Since these assumptions are often violated, unrealistic, or untestable, the simple approaches are prone to produce biased parameter estimates. With the advent of computer-based assessments, however, relevant information about the answering process can be collected and used to make better decisions regarding the scoring of missing item responses. One of these additional sources of information, the time used to answer an item, is the focus of this talk. Based on the classification system for missing data problems proposed by Rubin and colleagues (Rubin, 1976; Little & Rubin, 2002), a recursive method for handling omitted responses is presented. The method strives to find item-specific response time thresholds separating responses missing completely at random from responses missing not at random. In contrast to existing methods, both type-I and type-II errors are considered. The method is illustrated with an empirical data set stemming from a calibration study of a computerized adaptive test (N = 766). The application of the method with type-I error = type-II error = .10 for all items and response time levels resulted in response time thresholds ranging from 6 to 16 seconds. 650 omitted responses were re-coded; 81% of them were scored as incorrect and 19% as not administered. The re-coding resulted in substantial differences in item difficulty and ability estimates compared to considering the omitted items as not administered.
Compared with the more common approach of scoring omitted responses as incorrect, only small changes in item and ability parameter estimates were found at the aggregate level, although these differences reached up to 0.3 SD at the individual level. In conclusion, the suggested method seems to be a promising procedure for a statistically informed treatment of omitted responses when item response times are available. Additional studies planned for the future, as well as the method's limitations, are discussed.
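To make the core idea concrete, the threshold-based re-coding can be sketched as below. This is a minimal illustration, not the presented method itself: the function name, the data layout, and the direction of the threshold comparison (omissions at or above an item's threshold scored as incorrect, faster omissions as not administered) are illustrative assumptions.

```python
def recode_omissions(responses, times, thresholds):
    """Re-code omitted responses using item-specific response-time thresholds.

    responses:  dict item -> scored answer, with None marking an omission
    times:      dict item -> response time in seconds
    thresholds: dict item -> item-specific threshold in seconds
    """
    recoded = {}
    for item, answer in responses.items():
        if answer is None and times[item] >= thresholds[item]:
            recoded[item] = 0      # assumed: engaged but omitted -> incorrect
        elif answer is None:
            recoded[item] = None   # assumed: rapid omission -> not administered
        else:
            recoded[item] = answer # observed responses are kept as-is
    return recoded
```

For instance, an omission after 20 seconds on an item with a 10-second threshold would be scored 0, while an omission after 3 seconds on an item with an 8-second threshold would remain treated as not administered.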
The seminar is open to everybody. Participants bring along their lunch, listen to the presentation, discuss scientific topics, give feedback, and socialize.