How Much Statistics Should an Economist Learn?

2014 European elections

Statistics often appear to us to be objective, true or correct, but they can manipulate the viewer's perception and give the wrong impression through incorrect or distorted presentation of the data or lack of information.

Quotes on "lying with statistics":
  • "Don't trust any statistics that you haven't faked yourself!" (attributed to Winston Churchill; according to Wikipedia, invented by German propaganda during World War II)

  • "For me, statistics are the information tool of the mature and responsible. Those who know how to handle them are less easily manipulated. The saying 'You can prove anything with statistics' applies only to the complacent who cannot be bothered to look closely." (Elisabeth Noelle-Neumann)
These statements show how delicate the handling of statistics is, and that statistics can be used to support and spread false claims. It is therefore always appropriate to question statistics and, in some cases, to treat them with suspicion. Anyone familiar with statistical tools can often quickly unmask an error or a manipulation in a statistical presentation.

Collections of nonsensical or incorrect statistical representations and interpretations can help develop an eye for what to look for when reading articles with statistics, such as these:
  • On the website of the Institute for Applied Statistics at the University of Linz, Andreas Quatember presents current examples from the media under the heading "Nonsense in the media - careless handling of data" and explains what is wrong with the graphics and interpretations shown.
  • The Berlin psychologist Gerd Gigerenzer, the Bochum economist Thomas Bauer and the Dortmund statistician Walter Krämer take recently published figures and their interpretations and question them critically under the heading "Unstatistics of the month".
All of these examples quickly make it clear why it is important to take a critical look at statistics and charts.

Modern statistical falsification lies less in changing the numbers themselves than in, for example, how collected numbers are combined or in a misleading presentation. Lying with the help of statistics is child's play these days, because people are often dazzled by numbers and believe them unchecked.

But “lying” with statistics can also happen unintentionally.

Possible sources of error when creating and reading statistics are, for example:

  • Errors in collecting data: e.g. a wrong or unsuitable sample. For example, if you want to make a statement about the atmosphere in your own class, it is not enough to ask only the 10 best students; you would have to ask the entire class.

  • Errors in processing the data: e.g. distortions caused by how answers are grouped into classes. For example, if you collapse a 5-point scale into two new answer groups, combining three answer options in one group and only two in the other, the two groups appear comparable even though the answer options on the scale were not divided evenly.

  • Incorrect use of language when describing the numbers in the accompanying text: If, for example, the text says "41% of all respondents" but the percentages refer only to the group of female respondents, the wording is incorrect and leads to misunderstandings. Especially when describing numbers in crosstabs, pay attention to which base the percentages relate to.

  • Unclear statistical terms can also lead to incorrect use of language: What do they stand for? What do they say?
    • e.g. percentages:
      • can provide information about the relationship between a part and the whole,
      • can hide information,
      • often look more impressive, especially with small numbers.
    • Averages: median vs. arithmetic mean
  • Errors in interpreting the data: e.g. failing to distinguish between causality and correlation. A correlation does not necessarily establish a causal relationship. Consider the statements "Beer consumption per capita in Germany is significantly higher than in Finland" and "The PISA results in Finland are significantly better than in Germany". Of course, this does not mean that beer consumption is to blame for Germany's weaker performance in the PISA study.

  • Inadmissible generalizations or exaggerations of results: Results of the class survey cannot, for example, be transferred to the entire school or grade.

  • Misleading graphical presentation: The form of presentation is one of the most frequent sources of error. You will often find charts that reinforce a tendency, e.g. through an axis that does not start at 0, which makes increases look much larger than they really are.

  • Missing important additional information: If important additional information about the respondent group, the exact wording of the question(s), the time of the survey, etc. is withheld, whether accidentally or intentionally, the informative value of the statistic is reduced, which can lead to misinterpretations.
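
Several of the sources of error listed above can be checked with simple arithmetic. The following Python sketch illustrates four of them: percentages based on small numbers, median versus arithmetic mean, uneven grouping of a 5-point scale, and an axis that does not start at 0. All numbers and the helper names `percent` and `bar_height_ratio` are invented for illustration and do not come from any real survey.

```python
from statistics import mean, median

# All numbers below are invented for illustration only.

def percent(part, whole):
    """Return the share of `part` in `whole`, in percent."""
    return 100 * part / whole

def bar_height_ratio(a, b, axis_start=0):
    """How many times taller bar b is drawn than bar a for a given axis start."""
    return (b - axis_start) / (a - axis_start)

# 1) Percentages can hide small absolute numbers:
#    "an increase of 100%" here only means a change from 1 to 2.
complaints_before, complaints_after = 1, 2
increase_pct = percent(complaints_after - complaints_before, complaints_before)

# 2) Median vs. arithmetic mean: a single extreme value pulls the mean up,
#    while the median stays at the typical value.
pocket_money = [10, 12, 12, 15, 15, 18, 200]  # euros per month
mean_money = mean(pocket_money)
median_money = median(pocket_money)

# 3) Uneven grouping of a 5-point scale (1 = strongly agree ... 5 = strongly disagree).
answers = {1: 4, 2: 5, 3: 6, 4: 8, 5: 7}  # answer option -> number of respondents
total = sum(answers.values())
agree_uneven = percent(answers[1] + answers[2] + answers[3], total)  # three options
disagree_uneven = percent(answers[4] + answers[5], total)            # two options
agree_even = percent(answers[1] + answers[2], total)                 # two options
disagree_even = percent(answers[4] + answers[5], total)              # two options

# 4) Truncated axis: the same two values drawn with and without a zero baseline.
ratio_from_zero = bar_height_ratio(48, 52)
ratio_truncated = bar_height_ratio(48, 52, axis_start=46)

print(f"Increase in complaints: {increase_pct:.0f}% (from 1 to 2)")
print(f"Mean: {mean_money:.1f} euros, median: {median_money} euros")
print(f"Uneven groups: {agree_uneven:.0f}% vs. {disagree_uneven:.0f}%")
print(f"Even groups:   {agree_even:.0f}% vs. {disagree_even:.0f}%")
print(f"Bar height ratio from 0: {ratio_from_zero:.2f}, from 46: {ratio_truncated:.2f}")
```

In each case the underlying data are identical; only the choice of measure, grouping, or baseline changes the impression. That is why it helps to ask for the absolute numbers, the type of average, the original answer categories, and the axis range.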
Assignment:
  • Based on the text, write brief instructions for your classmates on what to pay attention to when critically examining a statistic, in order to avoid errors and detect fraud.