Statistics Denial Myths #5-6, Mischaracterizing Statistical Significance

This is the ninth instalment in the series of essays on Statistics Denial by Randy Bartlett, Ph.D.
Myth #5: For a large number of observations (Big Data: Volume), all the variables are significant so statistics does not work.
Myth #6: Statistics does not accommodate ‘consequential’ statistical significance.
Myth #5 builds upon the old confusion around significance testing that comprises the second, ‘ancient’ myth (#6). Suppose that you are building a predictive model based upon a billion observations (n) and using 500 variables (p). After you press the magic ‘make model’ button, all of the parameters are ‘significantly different’ … from zero. Conclusion: statistics does not work … WRONG.
Here is a quote typifying the misunderstanding:
‘“One big reason [why statistics does not work]… is that everything passes statistical tests with significance,” he says. “If you have a million records, everything looks like it’s good [significant].”’ According to the same person, ‘there’s a difference between statistical significance and what he calls operational [consequential] significance.’
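The effect described in the quote is easy to reproduce. Here is a minimal sketch, assuming a two-sided z-test of a mean against zero with a known standard deviation. The effect size (0.005 standard deviations) and the sample sizes are made-up values for illustration, not figures from the quote.

```python
import math

def two_sided_p_value(effect, sd, n):
    """P-value of a two-sided z-test of H0: effect == 0."""
    z = effect / (sd / math.sqrt(n))
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical numbers: the same tiny effect at three sample sizes.
for n in (1_000, 1_000_000, 1_000_000_000):
    p = two_sided_p_value(effect=0.005, sd=1.0, n=n)
    print(f"n={n:>13,}  p={p:.3g}")
```

The effect never changes; only n does. The test against zero is answering the question it was asked. It is simply not the question the modeler cares about.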
First, if you are building predictive models, then why are you using hypothesis testing?  Second, why are you using hypothesis testing?  Third, if you must use hypothesis testing, then try the right one.  Use the ‘new’ breakthrough from Neyman-Pearson (ca 1932), which addresses (consequential) statistical significance.  Now we will be more specific about these three disconnects.
First, if you are building predictive models, then why are you using hypothesis testing? 
For predictive models, whether coefficients are significantly different from zero is not the primary consideration. The point is whether the model predicts. We know, you seek parsimony by dropping parameters whose coefficients are not significantly different from zero. Hint: There are better statistical avenues to parsimony; use statistics designed for that task.
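One such avenue, offered here only as an illustration, is to judge a variable by what it does to out-of-sample prediction error. The sketch below assumes simulated data, a single predictor, and a simple 50/50 hold-out split; the function names, slopes, and sample size are all hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def mse(preds, ys):
    """Mean squared prediction error."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

def holdout_comparison(slope, n=4000, seed=0):
    """Return hold-out MSE for (intercept-only model, model using x)."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [slope * x + rng.gauss(0.0, 1.0) for x in xs]
    half = n // 2
    x_tr, y_tr, x_te, y_te = xs[:half], ys[:half], xs[half:], ys[half:]
    mean_y = sum(y_tr) / half          # intercept-only model
    a, b = fit_line(x_tr, y_tr)        # model that keeps x
    without_x = mse([mean_y] * len(y_te), y_te)
    with_x = mse([a + b * x for x in x_te], y_te)
    return without_x, with_x

print(holdout_comparison(slope=2.0))    # x carries real predictive signal
print(holdout_comparison(slope=0.001))  # x adds essentially nothing
```

With enough data, a slope of 0.001 would eventually test as ‘significantly different’ from zero, yet the hold-out error barely moves when x is kept. The comparison speaks to the question that matters for prediction.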
Recall that there are four modeling objectives: coefficient estimation, prediction, grouping, and ranking.  Hypothesis testing was conceived for decision making largely in the context of coefficient estimation.  As such, it is only an important sideshow to the main show of statistics—the logic of numbers with uncertainty.
Second, why are you using hypothesis testing?
Confidence intervals generally have more utility than hypothesis tests.  We know, sometimes you just want or need a hypothesis test, yet not for prediction.  Also, confidence regions nicely address multidimensional needs.
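To see why an interval can say more than a test, consider a minimal sketch assuming a normal-approximation 95% interval for a mean-like estimate. The estimate, standard deviation, and sample size are invented for illustration.

```python
import math

def confidence_interval_95(estimate, sd, n):
    """Normal-approximation 95% confidence interval."""
    half_width = 1.96 * sd / math.sqrt(n)
    return estimate - half_width, estimate + half_width

# Hypothetical values: a small estimate from a million observations.
low, high = confidence_interval_95(estimate=0.005, sd=1.0, n=1_000_000)
print(f"95% CI: ({low:.5f}, {high:.5f})")
```

The interval excludes zero, so a test against zero would reject. It also shows, at a glance, that the whole range of plausible values is tiny, which a bare p-value does not.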
Third, if you must use hypothesis testing, then try the right one.  Use the new breakthrough from Neyman-Pearson (ca 1932), which addresses (consequential) statistical significance. 
Now we have arrived at Statistics Denial Myth #6, the old confusion between the Fisherian school of hypothesis testing and Neyman-Pearson. Fisher was the first to address the matter of hypothesis testing and he developed a logical approach, which compares unknown parameters to zero. Neyman-Pearson hypothesis testing famously expands this work by insisting on an alternative hypothesis and adding a term, δ, as a cutoff for (consequential) statistical significance. This has been called practical significance, economic significance, etc. and now operational significance. This portmanteau hypothesis test allows coefficients to be compared to any value, δ.
For example, suppose that retaining a coefficient, β, is worthwhile only if it exceeds some consequential value δ. The hypotheses might take the following form:
    H0: β ≤ δ   versus   H1: β > δ
where δ is the cutoff for a consequential difference. Neyman would say that the alternative hypothesis, defined by δ, should represent the consequential scenario. (See ‘Encyclopedia of Research Design,’ Vol. I, SAGE (2010), p. 298). Hence, consequential significance is statistical significance. As always, see a professional for your advanced statistics needs.
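A test against δ rather than zero can be sketched in a few lines. This is a minimal illustration, assuming a one-sided z-test of H0: β ≤ δ versus H1: β > δ with a known standard error; the estimate, standard error, and cutoff are hypothetical values.

```python
import math

def one_sided_p_value(estimate, se, delta=0.0):
    """P-value for H0: beta <= delta versus H1: beta > delta (z-test)."""
    z = (estimate - delta) / se
    # Upper-tail probability of a standard normal: 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical values: estimate 0.005, standard error 0.001, cutoff 0.01.
p_vs_zero = one_sided_p_value(0.005, 0.001)
p_vs_delta = one_sided_p_value(0.005, 0.001, delta=0.01)
print(f"versus zero:  p={p_vs_zero:.3g}")
print(f"versus delta: p={p_vs_delta:.3g}")
```

The same coefficient is overwhelmingly ‘significant’ against zero and nowhere near significant against the consequential cutoff. A large n sharpens both answers; it does not turn an inconsequential coefficient into a consequential one.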
Confusion about hypothesis testing is completely understandable, yet not acceptable for self-professed experts.
While these misunderstandings have an amusing side, they also have an edge.  At the extreme, we have seen hucksters broadcasting mischaracterizations of statistics to better position their lesser qualifications or to blunt legitimate criticism of their blatant mistakes.  One common claim coming from hucksters innocent of statistics is that we do not need statistics anymore because we have access to them.
In Blogs 2 & 3, we discussed the harm caused by promotional hype extreme enough to adulterate statistics and circumvent best practice—our best tools for extracting the information.
We sure could use Deming, right now.  Many of us who embrace the explicit rigorous logic and protocols of these tenets of data analysis hang out in the new LinkedIn group, About Data Analysis.  Come see us.
