Below are the slides from my talk today at Columbia University at a session, Philosophy of Science and the New Paradigm of Data-Driven Science, at an American Statistical Association Conference on Statistical Learning and Data Science/Nonparametric Statistics. Todd was brave to sneak philosophy of science into an otherwise highly mathematical conference.
Philosophy of Science and the New Paradigm of Data-Driven Science: (Room VEC 902/903)
Organizer and Chair: Todd Kuffner (Washington U)
- Deborah Mayo (Virginia Tech) “Your Data-Driven Claims Must Still be Probed Severely”
- Ian McKeague (Columbia) “On the Replicability of Scientific Studies”
- Xiao-Li Meng (Harvard) “Conducting Highly Principled Data Science: A Statistician’s Job and Joy”
Posting the full thread that I originally posted on Twitter (https://twitter.com/orestistsinalis/status/1004097202887757824):
In data science, the metric optimised in an ML model’s cost function is frequently not what you *really* want to optimise for, because the problem you care about is usually only a function of the model’s metric (e.g. you optimise log loss in order to improve accuracy).
Therefore the “best” hypothesis H is really only the best *observed* H with respect to properties that are merely correlated with the optimised metric (e.g. accuracy is somewhat correlated with log loss) or, worse, accidental but desirable properties of the model.
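The gap between the optimised metric and the metric you care about can be made concrete with a toy example: the model selected by log loss need not be the model selected by accuracy. The numbers below are invented purely for illustration (a minimal sketch, not from the original thread):

```python
import math

def log_loss(y, p):
    """Average negative log-likelihood of the true binary labels."""
    return -sum(math.log(pi if yi == 1 else 1 - pi)
                for yi, pi in zip(y, p)) / len(y)

def accuracy(y, p, threshold=0.5):
    """Fraction of labels matched after thresholding the probabilities."""
    return sum((pi >= threshold) == (yi == 1)
               for yi, pi in zip(y, p)) / len(y)

# Hypothetical validation set and two models' predicted probabilities.
y = [1, 1, 0, 0]
p_a = [0.90, 0.45, 0.10, 0.10]   # mostly confident, but flips one label
p_b = [0.51, 0.51, 0.49, 0.49]   # barely confident, but all correct

print(f"A: log loss {log_loss(y, p_a):.3f}, accuracy {accuracy(y, p_a):.2f}")
print(f"B: log loss {log_loss(y, p_b):.3f}, accuracy {accuracy(y, p_b):.2f}")
# Selecting on log loss picks A; selecting on accuracy picks B.
```

Model A wins on the optimised metric (lower log loss) while Model B wins on the metric the problem actually cares about (accuracy), so the “best” model depends on which correlated property you happen to measure.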
To severely probe an ML model in the context of a *specific problem*, one needs to show that a change in the model was expected to influence the problem’s solution in a specific way and not in others.
People also tune hyperparameters to death via so-called ‘grid search’. In my opinion, this is a prototypical example of how *not* to learn from error. For me, a severe test of hyperparameter tuning is to show a plausible *path* of your hyperparameter search.
PS: Important position paper on machine learning practices: “On Pace, Progress, and Empirical Rigor” https://t.co/A2qMyMx204 https://t.co/M1XeC4ncbr
This conversation began with a tweet by Frank Harrell: