What are good examples of misleading statistics?

    Reuters published this graph showing the number of murders committed using firearms in Florida after the state enacted its ‘Stand Your Ground’ law.

    If you look at this graph closely, you will find that its vertical axis is inverted, which gives the illusion that gun deaths in Florida came down after the law was enacted. However, if you check the scale and read the graph as shown below, it tells a different story.

    The designer of the chart, Christine Chan, explained her decision on her Twitter feed, saying, “I prefer to show deaths in negative terms (inverted). It’s a preference really, can be shown either way.” So we don’t know whether the chart was intentionally designed to mislead.

    This is one of those graphs which used statistics to mislead people.


    Thanks for reading my answer. Please upvote if you like as it will motivate me to write more such helpful stuff.

    My favorite is survivorship bias.

    This is a sample of the bullet holes spread across American planes that returned from WWII missions:

    Most of the bullet holes in the sample were in the fuselage, and the fewest were around the engines. This posed an optimization problem: how to trade off armor against mobility, since more armor meant less mobility. Army officers concluded that armor should be concentrated on the spots where the planes were hit most, i.e. the fuselage, to improve their chances of survival. By armoring only those spots, they reasoned, the planes would get the same or better protection with less armor and more mobility.

    Abraham Wald of the Statistical Research Group (SRG) didn’t agree. He assumed that hits should be spread roughly evenly over the surface of a plane, and he looked for the answer not in the planes that came back to base but in those that didn’t. He argued that if returning planes had more hits in the fuselage, then those hits could be tolerated, since the planes made it back. We should be more concerned about the missing hits, the ones not shown in the picture above.

    He concluded that returning planes showed the most hits in the fuselage and the fewest around the engines because the planes that were hit in the engine didn’t return, which made the observed distribution of hits uneven!
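    Wald’s argument can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the section names, the one-hit-per-plane rule, and the loss probabilities are hypothetical and are not taken from any wartime data. Hits are drawn uniformly across sections, but planes hit in the engine are lost far more often, so the hits counted on returning planes look concentrated in the fuselage.

```python
import random


def simulate_returning_hits(n_planes=10000, seed=0):
    """Count hits per section on all planes and on returning planes only."""
    rng = random.Random(seed)
    sections = ["fuselage", "wings", "tail", "engine"]
    # Hypothetical chance that a plane is lost after a hit in each section.
    loss_prob = {"fuselage": 0.05, "wings": 0.05, "tail": 0.10, "engine": 0.80}

    all_hits = {s: 0 for s in sections}
    returned_hits = {s: 0 for s in sections}
    for _ in range(n_planes):
        # Assumption: each plane takes exactly one hit, uniform over sections.
        section = rng.choice(sections)
        all_hits[section] += 1
        # Only planes that survive the hit are ever inspected at the base.
        if rng.random() >= loss_prob[section]:
            returned_hits[section] += 1
    return all_hits, returned_hits


if __name__ == "__main__":
    all_hits, returned_hits = simulate_returning_hits()
    for section in all_hits:
        print(f"{section:9s} all={all_hits[section]:5d} "
              f"returned={returned_hits[section]:5d}")
```

    Looking only at the `returned` column would suggest the engine is rarely hit, while the `all` column shows it is hit about as often as any other section.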

    Another good example of this was my grandfather, who survived WW2. A bullet maimed his index finger, and he had shrapnel remnants in both legs. Most of his friends were in a similar situation, with peripheral injuries. This doesn’t mean that people weren’t hit in the chest, but that most of those who were didn’t survive.

    In modern life, many people are raised on success stories: entrepreneurs who break into a big market, musicians who become superstars, athletes who join major teams. What you don’t hear about are the ones who fail.

    One that I’ve used in class as an example of an unrepresentative sample is the following.

    Seemingly every insurance company makes the following claim in its advertisements: “Those who switched to our insurance company saved on average $X per month.” Of course that is going to be true. Think about it. If you would lose money by switching, would you switch? Of course not, and hence you will not be included in the sample of those who switched. That sample is pretty much guaranteed to include only people who saved a significant amount of money, or else why would they have switched? For all I know, 99% of the population would lose money by switching to company X, and yet it would still be true that those who switched saved a significant amount on average. Unless you tell me what percentage did not switch and what these people would have lost or gained on average, the cited statistic is meaningless.
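    A short simulation makes the selection effect concrete. It is a sketch with invented numbers: the distribution of potential savings (mean of -20 per month, standard deviation of 15) and the rule that a customer switches only if the saving is positive are assumptions for illustration, not data from any insurer.

```python
import random


def switching_stats(n_customers=100000, seed=1):
    """Compare the average saving of switchers with that of everyone."""
    rng = random.Random(seed)
    all_savings = []
    switcher_savings = []
    for _ in range(n_customers):
        # Hypothetical monthly saving if this customer switched to company X.
        # The mean is negative: most people would lose money by switching.
        saving = rng.gauss(-20.0, 15.0)
        all_savings.append(saving)
        # Assumption: only people who would actually save money switch.
        if saving > 0:
            switcher_savings.append(saving)

    population_mean = sum(all_savings) / len(all_savings)
    switcher_mean = sum(switcher_savings) / len(switcher_savings)
    switcher_share = len(switcher_savings) / n_customers
    return population_mean, switcher_mean, switcher_share


if __name__ == "__main__":
    population_mean, switcher_mean, switcher_share = switching_stats()
    print(f"average saving, whole population: {population_mean:.2f}")
    print(f"average saving, switchers only:   {switcher_mean:.2f}")
    print(f"share of customers who switched:  {switcher_share:.1%}")
```

    The advertised figure corresponds to the switchers-only average, which is positive by construction even though the typical customer in this toy population would lose money.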

    So someone shared a pictograph of the average female height per country, and everyone on social media was left scratching their heads.

    I mean..

    Why will they not?!!

    Have a look :

    You see, the graph, all painted in pink, showed symbols of ladies laid out according to their height in inches. But the proportions were so off that Indian women looked like tiny smurfs compared to Latvian giants.

    So an Indian woman named Sabah tweeted this ridiculous pictograph, and then it went viral:


    The height of Latvian women has certainly received a lot of attention, thanks to the viral pictograph. The researchers observed that both men and women have grown taller over the last century in every country around the world.

    Although the actual author of this pictograph is not known, it’s fair to say that what they tried to convey didn’t quite work out.

    The pictograph in general can be a very useful tool to make the reader understand the subject without reading the legend or the by-text. The choice of icons clearly plays the most important role here, since they function as a form of visual language that can be understood around the world by virtually anyone.

    A good pictograph needs to be arranged carefully and in a logical manner, which this example is clearly lacking.

    Although this pictograph clearly failed at what it was trying to convey, it is the best thing I’ve seen this month!!!

    😂😂(laughs in 5’4″).

    Probably this:

    This is a spurious correlation, with a bit of fucked up graph scales!

    Cancer screening & prevention services and abortions are not causally related. This is not a correlation graph, even though it is made to look like one.

    If you look carefully, you can also see that the graph itself is wrong. Look at the numbers: where are the scales, and why do the arrows cross?

    I will give some other funny examples:

    15 Insane Things That Correlate With Each Other

    “In statistics, a spurious relationship or spurious correlation is a mathematical relationship in which two or more events or variables are not causally related to each other, yet it may be wrongly inferred that they are, due to either coincidence or the presence of a certain third, unseen factor (referred to as a “common response variable”, “confounding factor”, or “lurking variable”).”

    Spurious relationship – Wikipedia

    Regression diagnostics, such as tests for homoscedasticity, may help flag some of these problems, although no single test can rule out a spurious relationship.

    Using rate of change instead of raw numbers.

    “Our user base grew by 100% last week” sounds a lot cooler than “10 new users signed up last week”. This is very well illustrated in this xkcd comic.[1]
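    The arithmetic behind this is simple enough to sketch. The function name and the example user counts below are made up for illustration. The same 10 new users give a 100% growth rate on a base of 10 users and a negligible one on a base of 100,000.

```python
def growth_summary(users_before, users_after):
    """Return (new_users, percent_growth) for one reporting period."""
    if users_before <= 0:
        raise ValueError("percent growth is undefined for an empty base")
    new_users = users_after - users_before
    percent_growth = 100.0 * new_users / users_before
    return new_users, percent_growth


if __name__ == "__main__":
    # Hypothetical user counts: the raw gain is identical in both cases.
    for before, after in [(10, 20), (100000, 100010)]:
        new_users, pct = growth_summary(before, after)
        print(f"{before} -> {after}: {new_users} new users, {pct:.2f}% growth")
```

    Reporting the percentage without the base hides which of these two situations you are in.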


    Correlation vs Causation

    The following conclusion was drawn from a study conducted at UPenn’s medical center and published in 1999.

    Young children who sleep with the light on are much more likely to develop myopia in later life. Therefore, sleeping with the light on causes myopia.[2]

    However, a later study at Ohio State University found a strong connection between parental myopia and the development of child myopia. It also noted that myopic parents were more likely to leave a light on in their children’s bedroom, which undermines the causal conclusion above.
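    The confounding described here can be reproduced in a toy simulation. Everything numeric in it is an assumption chosen for illustration and is not taken from either study. Parental myopia raises both the chance that a night light is left on and the chance that the child becomes myopic, while the light itself has no effect at all.

```python
import random


def simulate_myopia(n_children=200000, seed=2):
    """Return counts keyed by (parent_myopic, light_on) -> [children, myopic]."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_children):
        parent_myopic = rng.random() < 0.3
        # Hypothetical: myopic parents leave the light on more often.
        light_on = rng.random() < (0.6 if parent_myopic else 0.2)
        # Hypothetical: child myopia depends only on the parents, not the light.
        child_myopic = rng.random() < (0.4 if parent_myopic else 0.1)
        cell = counts.setdefault((parent_myopic, light_on), [0, 0])
        cell[0] += 1
        cell[1] += int(child_myopic)
    return counts


def myopia_rate(counts, light_on, parent_myopic=None):
    """Myopia rate for a light setting, optionally within one parent group."""
    children = myopic = 0
    for (parent, light), (n, k) in counts.items():
        if light == light_on and (parent_myopic is None or parent == parent_myopic):
            children += n
            myopic += k
    return myopic / children


if __name__ == "__main__":
    counts = simulate_myopia()
    print("all children:   light on", round(myopia_rate(counts, True), 3),
          "| light off", round(myopia_rate(counts, False), 3))
    for parent in (True, False):
        print(f"parent_myopic={parent}: light on",
              round(myopia_rate(counts, True, parent), 3),
              "| light off", round(myopia_rate(counts, False, parent), 3))
```

    Across all children the light looks harmful, but once the comparison is made within each parent group the gap largely disappears, which is the pattern a lurking variable produces.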

    Footnotes

    My favorite misleading statistic is the nearly perfect positive correlation between the divorce rate in Maine and per capita consumption of margarine in the US. Yes, you read that right: divorce rate and consumption of margarine.

    Outside of a statistics context, it is hard to imagine a reasonable relationship between divorce rate and consumption of margarine. This is an example of spurious correlation. A correlation coefficient does not show causation between two variables; it only shows how strongly they move together. In this case, the nearly perfect correlation can be caused by coincidence or by the presence of an unseen variable that affects both.
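    One common way such correlations arise is that both series simply trend over time. The sketch below uses synthetic data only: the slopes, noise levels, and series names are invented and do not reproduce the real divorce or margarine figures. Two series are generated independently, yet their shared downward drift yields a correlation close to 1, while their period-to-period changes are close to uncorrelated.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def trending_series(n, slope, noise, rng):
    """A straight-line trend plus independent uniform noise."""
    return [slope * t + rng.uniform(-noise, noise) for t in range(n)]


def differences(xs):
    """Period-to-period changes, which remove a linear trend."""
    return [b - a for a, b in zip(xs, xs[1:])]


def spurious_demo(n=1000, seed=3):
    rng = random.Random(seed)
    # Hypothetical series: generated independently, both drifting downward.
    series_a = trending_series(n, slope=-0.5, noise=0.3, rng=rng)
    series_b = trending_series(n, slope=-0.1, noise=0.05, rng=rng)
    raw_r = pearson(series_a, series_b)
    diff_r = pearson(differences(series_a), differences(series_b))
    return raw_r, diff_r


if __name__ == "__main__":
    raw_r, diff_r = spurious_demo()
    print(f"correlation of raw series: {raw_r:.3f}")
    print(f"correlation of changes:    {diff_r:.3f}")
```

    Differencing is used here only as a simple way to remove the trend; it is one possible check, not a general test for spuriousness.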

    The famous weird relationship between Maine’s divorce rate and per capita consumption of margarine is not the only example of spurious correlation. Here are my other weird favorites:

    1. Honey producing bee colonies & juvenile arrests for possession of marijuana
    2. Age of Miss America & murders by steam, hot vapours and hot objects
    3. Per capita consumption of cheese & number of people who died by becoming tangled in their bedsheets

    [1] Divorce rate in Maine correlates with Per capita consumption of margarine (US)

    [2] How divorce rates are linked to consumption of margarine

    Footnotes
