Performance assessment of learning in higher education (PAL)
Richard J. Shavelson, Olga Zlatkin-Troitschanskaia, Susanne Schmidt, & Klaus Beck
Session 5A, 13:00 - 14:30, HAGEN 2
The demand to measure higher education outcomes has gained worldwide momentum. While there are many approaches to measuring higher education learning outcomes, including self-reports of learning and multiple-choice tests, the PAL study focuses on performance assessment of learning, with particular emphasis on measuring so-called 21st-century (generic) skills such as critical thinking (CT) and critical reasoning (CR), using a task that simulates real-life decision-making and judgment situations (e.g., Shavelson, 2013; Shavelson et al., 2015, 2018).
The assessment framework is based on a performance task (PT) that demands CT and CR and resembles the myriad complex situations of everyday life. A real-world event is presented along with information of varying relevance to the problem. Solving the problem requires CT and CR: recognizing and evaluating the relevance, reliability, and validity of the given information, evaluating the problem, and finally making a decision. Information on decision making and thought processes, particularly the ability to deal with a large amount of partly irrelevant and unreliable information, was gathered in a semi-structured cognitive interview after completion of the PT (N = 30 undergraduate students).
The PT is delivered on a computer, and the information needed to solve the problem is presented both within the task itself and in full length on the internet (e.g., newspaper or Wikipedia articles). Computers provide substantial leeway both in delivering tasks and in their fidelity to the real world they are intended to emulate. The format is open-ended: students construct answers of varying length in response to a prompt inviting them to make a decision about the real-world event. The difficulty of the task is fine-tuned through the way the information is presented; the number of information sources and points to consider, including distractors (irrelevant information); their trustworthiness and relative strength compared to one another; and the time constraints and response requirements.
For the response ratings, analytic categories were developed based on the construct definition of CT and CR (Shavelson et al., 2018). These categories consider the students’ use of (un)reliable and (in)valid information as well as their reflection on, and avoidance of, heuristics that lead to errors in judgment and decision making. The students’ use of such information to justify decisions, solve problems, and/or recommend actions is evaluated. Moreover, argumentation, the use of evidence to support claims, and clarity of communication are rated. The cognitive interviews revealed additional aspects, in keeping with the study’s mixed-methods design. Based on the interview coding scheme (developed in line with grounded theory), we quantified the codes in accordance with the construct definition of CT and CR. These analyses indicated, for instance, that many students knew that Wikipedia is not a trustworthy reference, yet most of them nonetheless used it in the argumentation within their statements. These and further cognitive processes are modeled within a multilevel mixed model following the approach of Brückner and Pellegrino (2017).