Can Multistage Testing Bridge the Cultural Measurement Divide?
Session 1B, 10:30 - 12:00, VIA
Because of linguistic, geographical, and other cultural differences, international measurement across dozens of participating countries and educational systems is a marked challenge, both theoretically and operationally. A foundational problem is ensuring that the measured constructs are equivalent, particularly as tests are increasingly tailored to groups of countries (e.g., the easy booklet design in PISA). Although empirical evidence from recent rounds of one international survey, PISA, has shown a high degree of measurement equivalence across participating countries and educational systems (OECD, 2017), concerns persist over whether a common scale can be used to measure everyone (Kreiner & Christensen, 2014; Rutkowski, Rutkowski, & Liaw, 2017). In less economically developed countries, technology is an additional barrier as PISA and other assessments move to a computerized platform. These concerns are all the more pressing now that the OECD is weighing a move toward a multistage adaptive test (ETS, 2016), an innovation that offers both promise and peril. A key advantage of a multistage adaptive test (MAT) is that the test can be targeted more precisely to test takers’ proficiency while limiting the operational burden typically associated with fully adaptive tests. However, testing organizations must balance this benefit against potential risks to trend measurement, cross-country comparability, and the stability of parameter estimates. In the current paper, I address these issues in the context of meaningful and expanding cross-cultural measurement variation. In particular, I discuss what can reasonably be gained from a MAT when countries vary widely in proficiency; the degree to which existing (trend) item banks can be pressed into service; and whether items with characteristics typical of past PISA cycles can fulfill future MAT needs.
I take both an empirical and a simulation-based perspective to highlight several critical issues that would arise should a MAT be adopted for upcoming rounds of PISA and for new PISA instantiations (e.g., PISA for Development).