The Story of DataLand No. 1: The Trap of Averages and the Power of Medians – Changing How We See Data

Once upon a time, there was a strange and wondrous place called "DataLand," a nation filled with endless data points. Each citizen had their own unique data point. In DataLand, there were two remarkable cities, "Average City" and "Median City," each with its own way of measuring the well-being of its citizens and using this data to make policies.

In Average City, the mayor, Mr. Average, calculated the "average happiness" by adding up all the citizens' happiness scores and dividing by the number of people. For example, if the city had 100 residents and their total happiness was 800, then the average happiness would be 800 divided by 100, or 8. In Median City, however, the mayor, Mr. Median, took a different approach. He arranged all the happiness scores from lowest to highest and chose the score in the middle as the city’s "representative happiness." With an odd number of citizens, that is simply the middle person's score; with an even number such as 100, the middle falls between the 50th and 51st persons, and the "median" is conventionally the average of their two scores.
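The two mayors' methods can be sketched in a few lines of Python. The individual scores below are made up for illustration; the story only fixes the head count (100) and the total (800), so the particular spread of 6s, 8s, and 10s is an assumption.

```python
from statistics import median

# Hypothetical happiness scores for 100 residents, chosen so the total is 800.
scores = [6] * 25 + [8] * 50 + [10] * 25

# Mr. Average: add everything up and divide by the head count.
average_happiness = sum(scores) / len(scores)

# Mr. Median: sort the scores and take the middle of the list.
# With an even head count, statistics.median averages the two middle scores.
median_happiness = median(scores)

print(f"total = {sum(scores)}, residents = {len(scores)}")
print(f"average = {average_happiness}, median = {median_happiness}")
```

With this particular spread both mayors arrive at 8, which makes it a convenient starting point for seeing how each figure reacts when the data changes.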

One year, a major change swept through DataLand: a wealthy person from a neighboring country moved to Average City. This millionaire had an extremely high happiness score of 100, far higher than the other citizens. Because of this, the average happiness of Average City jumped from 8 to about 8.91 (the new total of 900 divided by 101 residents). Mr. Average was thrilled, thinking, "The people are happier than before!" But in reality, nothing had changed for the original 100 citizens; their happiness was the same as ever. The wealthy new resident’s score simply pushed up the city's average, making it look as though life had improved for everyone.
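The arithmetic behind the jump is easy to check. This is a minimal sketch; the individual resident scores are again hypothetical, and only the total of 800, the head count of 100, and the newcomer's score of 100 come from the story.

```python
# Hypothetical scores: 100 residents whose happiness totals 800.
residents = [6] * 25 + [8] * 50 + [10] * 25
newcomer = 100  # the new arrival's happiness score

mean_before = sum(residents) / len(residents)        # 800 / 100

everyone = residents + [newcomer]
mean_after = sum(everyone) / len(everyone)           # 900 / 101

print(f"average before: {mean_before:.2f}")
print(f"average after:  {mean_after:.2f}")
print(f"rise caused by one person: {mean_after - mean_before:.2f}")
```

Not a single original score changes between the two calculations; the whole rise comes from one added value.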

In Median City, on the other hand, the arrival of this wealthy person had almost no effect. Mr. Median’s calculation stayed essentially the same, because one new score at the top of the list moves the middle position by only half a step: at most, the median shifts to the next citizen's score. The people of Median City understood that, unlike in Average City, their lives hadn’t changed just because a wealthy newcomer arrived. No matter how high the millionaire’s happiness was, whether 100 or 1,000, its size made no difference to the median.
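To see why the size of the newcomer's score does not matter to the median, one can try ever larger values. The sketch below assumes the same hypothetical 100 scores as before; in this made-up data the two middle citizens share the same score, so the median does not move at all, whereas in other data it could shift to a neighboring score.

```python
from statistics import mean, median

# Hypothetical scores: 100 residents whose happiness totals 800.
residents = [6] * 25 + [8] * 50 + [10] * 25
median_before = median(residents)

# Try newcomers with increasingly extreme happiness scores.
medians_after = {}
means_after = {}
for newcomer in (100, 1_000, 1_000_000):
    everyone = residents + [newcomer]
    medians_after[newcomer] = median(everyone)
    means_after[newcomer] = mean(everyone)

for newcomer in medians_after:
    print(f"newcomer = {newcomer:>9}: "
          f"mean = {means_after[newcomer]:.2f}, "
          f"median = {medians_after[newcomer]}")
```

The mean grows without limit as the newcomer's score grows, while the median only cares that one more person now sits above the middle.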

This event taught the people of DataLand an important lesson: the "average" is easily influenced by extreme values. It gives an overall picture, but it can miss individual circumstances. For example, if only a few people have extremely high incomes, the "average income" may go up even though most people feel no benefit.

By contrast, the "median" is less affected by extreme values, so it reflects the "typical" situation for most citizens. The people of DataLand learned that both averages and medians have their uses. The average provides insight into overall trends, while the median reveals a more typical situation. Considering both can give a more balanced perspective.

Moreover, the people of DataLand realized the importance of understanding the story behind the numbers. Simply looking at numbers isn’t enough; understanding how the data was collected and what it means is key to true knowledge. Each number isn’t just a digit but a unique story waiting to be told.

Example: Measuring the Effect of an Education Program

One day, DataLand’s Ministry of Education introduced a new program to improve school performance. To measure its effect, they conducted tests in both Average City and Median City. After the program was introduced, the average test score in Average City rose by 10 points, while the median in Median City increased by 5 points. Seeing this, the Ministry of Education celebrated, declaring the program a success. But a closer look showed different stories in each city.

In Average City, a few high-achieving students scored exceptionally well, pushing up the overall average. However, most students’ scores hadn’t changed much. In Median City, by contrast, many students had improved their scores a bit, resulting in a balanced increase and a median rise of 5 points.
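The two patterns can be reproduced with small made-up classes. All scores below are hypothetical and chosen only so that the changes match the story: a 10-point rise in the average in Average City and a 5-point rise in the median in Median City.

```python
from statistics import mean, median

# Average City (hypothetical): two students jump by 50 points, eight stay put.
average_city_before = [50] * 10
average_city_after = [50] * 8 + [100, 100]

# Median City (hypothetical): every student improves by 5 points.
median_city_before = [40, 45, 50, 55, 60]
median_city_after = [score + 5 for score in median_city_before]

def change(before, after):
    """Return (change in mean, change in median) between two score lists."""
    return mean(after) - mean(before), median(after) - median(before)

for city, before, after in [
    ("Average City", average_city_before, average_city_after),
    ("Median City", median_city_before, median_city_after),
]:
    mean_change, median_change = change(before, after)
    print(f"{city}: mean {mean_change:+}, median {median_change:+}")
```

Reporting both figures for both cities exposes the difference: in this sketch Average City's median does not move at all, while in Median City the mean and the median rise together.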

This example shows the risks of relying on averages alone. The citizens of DataLand learned that to discover the "truth behind the data," it’s essential to use multiple metrics and interpret the data from different angles.

Explanation: The Story of DataLand No. 1 – The Trap of Averages and the Power of Medians

DataLand is a place where every citizen represents a unique data point. It has two special cities, Average City and Median City, each with its own approach to measuring happiness and shaping policies.

In Average City, Mayor Mr. Average calculates the "average happiness" by dividing the total happiness of all citizens by the number of people. For instance, if there are 100 citizens and their happiness totals 800, dividing 800 by 100 gives an average happiness of 8. In Median City, however, Mayor Mr. Median arranges the citizens’ happiness levels in order and picks the middle person’s score as the city’s "median" happiness, which reflects a representative level of happiness less influenced by extremes.

One year, an ultra-wealthy person moved to Average City with an unusually high happiness level of 100. This caused the city’s average happiness to jump, even though the everyday lives of most people hadn’t changed. For ordinary citizens, this "happiness rise" felt disconnected from their actual lives.

In Median City, however, this wealthy arrival had virtually no effect on the median. As Mr. Median explained, "no matter how happy the newcomer is, the size of that one score does not move the middle." Through this experience, the people of DataLand learned the risk of relying solely on averages when evaluating data.

Thus, the people of DataLand began to appreciate the difference between averages and medians and the different insights each can provide. They especially learned the importance of considering the "story behind the data" and how and why the numbers came to be.

For example, when evaluating a new program, the Ministry of Education now looks at averages, medians, and other metrics to gain a more realistic view. Through the story of DataLand, we learn the importance of approaching data interpretation with a "multi-dimensional" perspective.

Like the people of DataLand, we, too, should read the story behind the data and seek the truth behind the numbers. Data is more than just numbers on a page; each number has its unique story. Understanding these stories unlocks deeper insights and knowledge.
