73.6% of all Statistics are Made Up --- How to Interpret Analysts Reports

The problem of skewing results:

(…) How is it skewed? There are so many ways to present data to tell the story you want that I can’t even list every way data is skewed. Here are some examples:

You ask a sample so small that the data isn’t statistically significant. This is often naiveté rather than malice.

You ask a group that is biased. (…) This type of statistical error is known as "selection bias."

Also common: you look at a large data set of answers to questions about consumer preferences. You pick out the answers that support your findings and leave the ones that don’t out of your report. This is an "error of omission."

You change the specific words asked in the survey when you report them, which subtly changes the meaning for the person reading your conclusions. Subtle changes in wording can totally change the way the reader interprets the results.

Also common is that the survey itself asks questions in a way that leads the respondent to a specific answer.

There is malicious data, such as on Yelp, where a competitor might type in bad results on your survey to bring you down (…)
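To put a number on the small-sample problem from the first example, here is a minimal sketch of how the margin of error on a survey percentage shrinks as the sample grows. It assumes a simple random sample and the usual normal approximation at 95% confidence; the function name `margin_of_error` is just an illustrative helper, not something from any report or library.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a surveyed proportion.

    Assumes a simple random sample and the normal approximation.
    proportion=0.5 is the worst case (widest margin).
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Compare a few hypothetical survey sizes.
for n in (30, 100, 1000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} percentage points")
# n=30: +/- 17.9 percentage points
# n=100: +/- 9.8 percentage points
# n=1000: +/- 3.1 percentage points
```

So a headline like "60% of users prefer X" based on 30 respondents could plausibly sit anywhere from the low 40s to the high 70s, and that is before any of the bias problems above come into play.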

(…) As my MBA statistics professor used to say, “seek disconfirming evidence.” That always stuck with me.

Here’s how to interpret data:

In the end, make sure you’re suspicious of all data. Ask yourself the obvious questions:

Who did the primary research on this analysis?

Who paid them? Nobody does this stuff for free. You’re either paid up front (“sponsored research”) or you’re paid on the back end by clients buying research reports.

What motives might these people have had?

Who was in the sample set? How big was it? Was it inclusive enough?

And the important thing about data for me … I ingest it religiously. I use it as one source for figuring out my version of the truth. And then I triangulate. I look for more sources if I want a truer picture. I always try to think to myself, “what would the opposing side of this data analysis use to argue its weaknesses?”