Thursday, 10 November 2016

Simpson's paradox

“There are three kinds of lies: lies, damned lies, and statistics.”
During my travels across the Internet, I recently stumbled upon a paradox that exemplifies the quote[1] above.

It is called Simpson's paradox[2], and I had never heard of it before.

From what I can gather, it means that a trend visible in aggregated data can point in the exact opposite direction once the same data is partitioned into groups by some other variable.
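
To make the reversal concrete, here is a minimal Python sketch. The counts are the figures from the widely cited kidney stone treatment comparison that is often used to illustrate the paradox; the names `COUNTS`, `rates_by_group` and `aggregated_rates` are my own and purely illustrative.

```python
# (successes, patients) per treatment and stone size.
# Counts follow the commonly cited kidney stone example; names are illustrative.
COUNTS = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rates_by_group(counts):
    """Success rate of each treatment within each group (partitioned view)."""
    return {
        treatment: {group: s / n for group, (s, n) in groups.items()}
        for treatment, groups in counts.items()
    }


def aggregated_rates(counts):
    """Success rate of each treatment with all groups pooled (aggregated view)."""
    result = {}
    for treatment, groups in counts.items():
        successes = sum(s for s, _ in groups.values())
        patients = sum(n for _, n in groups.values())
        result[treatment] = successes / patients
    return result


by_group = rates_by_group(COUNTS)
overall = aggregated_rates(COUNTS)

for group in ("small", "large"):
    print(group, {t: round(by_group[t][group], 3) for t in COUNTS})
print("overall", {t: round(r, 3) for t, r in overall.items()})
```

Treatment A has the higher success rate in both groups (roughly 93% vs 87% for small stones and 73% vs 69% for large ones), yet treatment B looks better once the groups are pooled (roughly 83% vs 78%). The reversal comes from the unequal group sizes: A was mostly used on the harder, large-stone cases.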

Which of the two views should be preferred depends on the situation, so it pays to check both before drawing conclusions from the data.

The Wikipedia entry in the references has some very good examples of Simpson's Paradox and should be required reading for any budding statistician.


