"Really bad data graphics" competition

The master's programme in Assessment and Evaluation has gotten off to a good start, and the students have plenty to do. Recently they were challenged with an informal competition.

The new students found several examples of bad data graphics. Photo source: Unknown

The course MAE4000 Data Science is currently at its midpoint, and the students have been introduced to the core foundations that matter for anyone doing data analysis. These include data management, probability, descriptive statistics, data visualization, and the topic for this week: statistical inference. In parallel, the students have been training in the software environment R to wrangle data, compute statistics, visualize data, and even run small Monte Carlo simulations. Learning a programming language like R starts slowly, but with practice and persistence everyone is now getting up to speed.

“Really bad data graphics” competition

Professor Johan Braeken is responsible for the course in Data Science. Photo: Shane Colvin/Faculty of Educational Sciences.

As part of last week's topic on data visualization, Professor Johan Braeken organized a “really bad data graphics” competition. Every student entered at least one graphic they had found themselves into the pool of candidates, and then there was a big joint evaluation round in class to judge which ones were the worst of the worst! The evaluation system was based on guidelines from the literature and can be summarized in bullet points as follows:

  • Graphical data integrity & low score on lie factor
  • Less is more: No chart junk & high data-ink ratio
  • Annotation & stand-alone readability
  • Keep it simple: Decoding & operations
  • Gestalt principles & illusions

These bullet points were the ammunition for a lively discussion.
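The "lie factor" in the first criterion can be made concrete with a little arithmetic. The term is usually credited to Edward Tufte, who defines it as the size of the effect shown in the graphic divided by the size of the effect in the data, where each effect is a relative change. A value near 1 means the graphic is faithful to the numbers. The sketch below is only an illustration: the function names and the example values are hypothetical and are not taken from the course material or from any of the competition entries.

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from the first value to the second."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(data_first: float, data_second: float,
               drawn_first: float, drawn_second: float) -> float:
    """Effect shown in the graphic divided by the effect in the data."""
    data_effect = effect_size(data_first, data_second)
    if data_effect == 0:
        raise ValueError("the data values must differ")
    return effect_size(drawn_first, drawn_second) / data_effect


# Hypothetical example: the data rise from 10 to 15 (a 50% increase),
# but the bars are drawn 2 cm and 6 cm tall (a 200% increase).
print(lie_factor(10, 15, 2.0, 6.0))  # 4.0
```

In this made-up case the graphic exaggerates the change by a factor of four, which is the kind of distortion the first bullet point penalizes.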

Visual representation an important communication tool

Braeken explains what the figures show:

— Below is one of the figures that made our top 3. This one was very much appreciated for being ultra-low-effort using the exact same simple bar chart to represent a whole range of varying proportions without any link between the graphic and the numbers. 

— The other two figures that made the top 3 were a dreaded pie chart (of course) on “race distribution” in a particular region that forgot to label every piece of the pie, but those that were labeled were rather politically incorrect, and a promotional data figure that was contrasting number of jobs created under governments of two political parties in such a way that it scored very high on the lie factor and became entirely unreadable (we did get the clue though that we were supposed to be convinced to vote for one party).

The last of the contributions that made the top 3 (Source: city-data.com)
Another example of a bad visualization of data (Source: Unknown)

Braeken concludes:

— Visual representations of data & statistical modeling results are the communication tool par excellence in our field.  It is only logical that we should pay attention to develop proper visualizations that raise questions by facilitating comparisons and that provide answers by making the data stand out! Yet, recognizing what is bad practice is perhaps the easiest part, now we need to try and deliver good graphics ourselves.

Published Sep. 20, 2018 4:14 PM - Last modified Nov. 9, 2018 7:29 PM