Embracing Heterogeneity in International Surveys: Optimal Test Design and Parameter Estimation
Although international educational assessments and surveys are useful tools for monitoring performance and progress, most current methods assume that a single set of questions is universally suitable for dozens of highly varied participating countries. Departures from this assumption have important consequences for scale score results and rankings. Disparate remedies, both model-based and design-based, have been implemented in a limited way; however, no unified approach exists, especially across international studies.
Over a 60-year history, modern international educational assessments and surveys have become influential educational policy tools, with rapid growth in the number of participating countries, measured content areas, and testing platforms. As international studies grow, so too do differences across participants, reflecting a larger and more diverse set of languages, cultures, geographies, and levels of economic development. Although it is reasonable to expect variation across cultures, apparent differences can also result from nothing more than the way constructs are measured, obscuring our understanding of what study participants know, think, and feel. To date, these measurement issues have been accounted for only in limited ways and only in PISA. Other studies have been slow to acknowledge and account for cross-cultural measurement differences, which can have important impacts on achievement and other scale score estimates.
Research question / Method
What are the optimal designs and methods to account for cross-cultural measurement differences in international educational assessments and surveys? To answer our research question, we use simulated data, the parameters of which we fully control, to develop and test improvements in assessment design and modeling. In this way, we can effectively operate in an experimental laboratory-like setting where the ‘true’ state of the subjects is known. Our innovations will be validated with both simulated and existing empirical data to test the degree to which our solutions work in practice. Although we will be using PISA as our test case, findings extend to any assessment that employs similar designs.
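To make the simulation idea concrete, the following is a minimal sketch of the kind of controlled experiment described above, not the project's actual design. It assumes a simple Rasch response model, and all specifics (two hypothetical countries, 20 items, a shift of 0.8 logits on five items, the function name simulate_rasch) are illustrative assumptions that do not come from the project. Both countries are given the same 'true' ability distribution, so any gap in observed performance is produced purely by the way the construct is measured.

```python
import numpy as np


def simulate_rasch(theta, b, rng):
    """Draw 0/1 responses where P(correct) = logistic(theta - b)."""
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
    return (rng.random(p.shape) < p).astype(int)


rng = np.random.default_rng(0)
n_persons, n_items, n_dif = 5000, 20, 5  # illustrative sizes (assumption)

# "True" item difficulties shared internationally (assumption).
b_common = np.linspace(-2.0, 2.0, n_items)

# Hypothetical country B: the first five items function differently
# (0.8 logits harder), e.g. because of translation or curriculum.
b_country_b = b_common.copy()
b_country_b[:n_dif] += 0.8

# Both countries share the same true ability distribution by construction.
theta_a = rng.normal(0.0, 1.0, n_persons)
theta_b = rng.normal(0.0, 1.0, n_persons)

x_a = simulate_rasch(theta_a, b_common, rng)
x_b = simulate_rasch(theta_b, b_country_b, rng)

# Per-item gap in proportion correct (country A minus country B).
item_gap = x_a.mean(axis=0) - x_b.mean(axis=0)
gap_dif = item_gap[:n_dif].mean()    # items with a measurement difference
gap_clean = item_gap[n_dif:].mean()  # items that function identically

print(f"mean gap on shifted items:   {gap_dif:.3f}")
print(f"mean gap on unshifted items: {gap_clean:.3f}")
print(f"overall gap in mean score:   {x_a.mean() - x_b.mean():.3f}")
```

Because the generating parameters are known, a scoring method that treats all items as universally equivalent can be judged directly: it would attribute the gap on the shifted items to lower achievement in country B, even though the two ability distributions are identical.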
Relevance to society
Improvements to current methods and designs will provide Norwegian policy makers and the public with a more accurate picture of educational achievement and its correlates, and of how Norwegian education compares internationally. In Norway this is especially relevant given the importance of international assessments in both public and policy debates around education. In addition, the study contributes to the CEMO mission of developing a research community within Norway that focuses on methodological issues around international assessments. This productive area of research will assist the government in meeting its goals of improving educational assessment research within the country. Finally, the international collaborative part of this project is aimed at knowledge sharing among some of the top educational assessment researchers in the world. This project will place Norway at the cutting edge of international assessment research and provide Norwegian educational stakeholders with a more accurate picture of achievement to better assess their standing in the global knowledge economy.