Thursday 10 November 2016

Simpson's paradox

“There are three kinds of lies: lies, damned lies, and statistics.”
During my travels across the Internet, I recently stumbled upon a paradox that exemplifies the quote [1] above.

It is called Simpson's paradox [2], and I had never heard of it before.

From what I can gather, it means that a trend seen in aggregated data can point in the exact opposite direction from the trend seen in every group once the same data is partitioned by some other variable.
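To see the reversal with actual numbers, here is a minimal Python sketch. It uses the kidney stone treatment counts that are commonly quoted for this paradox (they also appear in the Wikipedia entry in the references). The names `data`, `partitioned_rates` and `aggregated_rates` are made up for this illustration, so treat it as a sketch rather than a reference implementation.

```python
from fractions import Fraction

# (successes, patients) per treatment and stone size.
# Assumption: these are the commonly quoted kidney stone counts.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def partitioned_rates(counts):
    """Success rate per treatment within each group."""
    return {
        treatment: {
            group: Fraction(successes, patients)
            for group, (successes, patients) in groups.items()
        }
        for treatment, groups in counts.items()
    }


def aggregated_rates(counts):
    """Success rate per treatment with all groups pooled together."""
    rates = {}
    for treatment, groups in counts.items():
        successes = sum(s for s, _ in groups.values())
        patients = sum(n for _, n in groups.values())
        rates[treatment] = Fraction(successes, patients)
    return rates


by_group = partitioned_rates(data)
overall = aggregated_rates(data)

# Report which treatment looks better in each view of the data.
for group in ("small", "large"):
    best = max(data, key=lambda t: by_group[t][group])
    print(f"{group} stones: treatment {best} has the higher success rate")

best_overall = max(data, key=lambda t: overall[t])
print(f"all stones pooled: treatment {best_overall} has the higher success rate")
```

Treatment A comes out ahead for small stones and for large stones, yet treatment B comes out ahead once the two groups are pooled. The reversal happens because the group sizes are very uneven: A was mostly given to the harder large-stone cases, and B mostly to the easier small-stone cases.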

Which of the two views should be preferred, in order to draw proper conclusions from the data, depends on the situation.

The Wikipedia entry in the references has some very good examples of Simpson's paradox and should be required reading for any budding statistician.

References

[1] Wikipedia - Lies, damned lies, and statistics
https://en.wikipedia.org/wiki/Lies,_damned_lies,_and_statistics
[2] Wikipedia - Simpson's paradox
https://en.wikipedia.org/wiki/Simpson%27s_paradox
