Intuitive Biostatistics: A Nonmathematical Guide to Statistical Thinking, 4th Edition, Harvey Motulsky


Intuitive Biostatistics

A Nonmathematical Guide to Statistical Thinking

HARVEY MOTULSKY, M.D.

GraphPad Software, Inc.

FOURTH EDITION

Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide.

Oxford New York

Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto

With offices in

Argentina Austria Brazil Chile Czech Republic France Greece

Guatemala Hungary Italy Japan Poland Portugal Singapore

South Korea Switzerland Thailand Turkey Ukraine Vietnam

Copyright © 2018 by Oxford University Press.

For titles covered by Section 112 of the US Higher Education Opportunity Act, please visit www.oup.com/us/he for the latest information about pricing and alternate formats.

Published by Oxford University Press

198 Madison Avenue, New York, New York 10016 www.oup.com

Oxford is a registered trademark of Oxford University Press.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of Oxford University Press.

CIP data is on file at the Library of Congress

ISBN 978-0-19-064356-0


Printed by LSC Communications, Inc. United States of America

I dedicate this book to my wife, Lisa, to my kids (Wendy, Nat, Joey, and Ruby), to readers who encouraged me to continue with a fourth edition, and to future scientists who I hope will avoid common mistakes in biostatistics.

PRAISE FOR INTUITIVE BIOSTATISTICS

Intuitive Biostatistics is a beautiful book that has much to teach experimental biologists of all stripes. Unlike other statistics texts I have seen, it includes extensive and carefully crafted discussions of the perils of multiple comparisons, warnings about common and avoidable mistakes in data analysis, a review of the assumptions that apply to various tests, an emphasis on confidence intervals rather than P values, explanations as to why the concept of statistical significance is rarely needed in scientific work, and a clear explanation of nonlinear regression (commonly used in labs; rarely explained in statistics books).

In fact, I am so pleased with Intuitive Biostatistics that I decided to make it the reference of choice for my postdoctoral associates and graduate students, all of whom depend on statistics and most of whom need a closer awareness of precisely why. Motulsky has written thoughtfully, with compelling logic and wit. He teaches by example what one may expect of statistical methods and, perhaps just as important, what one may not expect of them. He is to be congratulated for this work, which will surely be valuable and perhaps even transformative for many of the scientists who read it.

—Bruce Beutler, 2011 Nobel Laureate, Physiology or Medicine; Director, Center for the Genetics of Host Defense, UT Southwestern Medical Center

GREAT FOR SCIENTISTS

This splendid book meets a major need in public health, medicine, and biomedical research training—a user-friendly biostatistics text for non-mathematicians that clearly explains how to make sense of statistical results, how to avoid common mistakes in data analysis, how to avoid being confused by statistical nonsense, and (new in this edition) how to make research more reproducible. You may enjoy statistics for the first time!

—Gilbert S. Omenn, Professor of Medicine, Genetics, Public Health, and Computational Medicine & Bioinformatics, University of Michigan

I am entranced by the book. Statistics is a topic that is often difficult for many scientists to fully appreciate. The writing style and explanations of Intuitive Biostatistics make the concepts accessible. I recommend this text to all researchers. Thank you for writing it.

—Tim Bushnell, Director of Shared Resource Laboratories, University of Rochester Medical Center

GREAT FOR STUDENTS

After struggling with books that weren’t right for my class, I was delighted to find Intuitive Biostatistics. It is the best starting point for undergraduate students seeking to learn the fundamental principles of statistics because of its unique presentation of the important concepts behind statistics. Lots of books give you the “recipe” approach, but only Intuitive Biostatistics explains what it all means. It meticulously goes through common mistakes and shows how to correctly choose, perform, and interpret the proper statistical test. It is accessible to new learners without being condescending.

The University of Texas at Austin

This textbook emphasizes the thinking needed to interpret statistical analysis in published research over knowledge of the mathematical underpinnings. The basics of choosing tests and doing simpler analyses are covered very clearly and simply. The language is easy to understand yet accurate. It brings in the higher level of intuitive understanding that we hope students will have at the end of an honors undergraduate or MSc program, skipping over the mathematical details that are now handled by software anyway. It is the perfect approach and level for undergraduates beginning research.

—Janet E. Kübler, Biology Department, California State University at Northridge

I read many statistics textbooks and have come across very few that actually explain statistical concepts well. Yours is a stand-out exception. In particular, I think you’ve done an outstanding job of helping readers understand P values and confidence intervals, and yours is one of the very first introductory textbooks to discuss the crucial concept of false discovery rates. I have already recommended your text to postgraduate students and postdoctoral researchers at my own institute.

—Rob Herbert, Neuroscience Research Australia

GREAT FOR EVERYONE

I’ve read several statistics books but found that some concepts I was interested in were not mentioned and other concepts were hard to understand. You can ignore the “bio” in Intuitive Biostatistics, as it is the best applied statistics book I have come across, period. Its clear, straightforward explanations have allowed me to better understand research papers and select appropriate statistical tests. Highly recommended.

preface

part a Introducing Statistics

1. Statistics and Probability are not Intuitive

2. The Complexities of Probability

3. From Sample to Population

part b Introducing Confidence Intervals

4. Confidence Interval of a Proportion

5. Confidence Interval of Survival Data

6. Confidence Interval of Counted Data (Poisson Distribution)

part c Continuous Variables

7. Graphing Continuous Data

8. Types of Variables

9. Quantifying Scatter

10. The Gaussian Distribution

11. The Lognormal Distribution and Geometric Mean

12. Confidence Interval of a Mean

13. The Theory of Confidence Intervals

14. Error Bars

part d P Values and Statistical Significance

15. Introducing P Values

16. Statistical Significance and Hypothesis Testing

17. Comparing Groups with Confidence Intervals and P Values

18. Interpreting a Result that is Statistically Significant

19. Interpreting a Result that is Not Statistically Significant

20. Statistical Power

21. Testing for Equivalence or Noninferiority

part e Challenges in Statistics

22. Multiple Comparisons Concepts

23. The Ubiquity of Multiple Comparisons

24. Normality Tests

25. Outliers

26. Choosing a Sample Size

part f Statistical Tests

27. Comparing Proportions

28. Case-Control Studies

29. Comparing Survival Curves

30. Comparing Two Means: Unpaired t Test

31. Comparing Two Paired Groups

32. Correlation

part g Fitting Models to Data

33. Simple Linear Regression

34. Introducing Models

35. Comparing Models

36. Nonlinear Regression

37. Multiple Regression

38. Logistic and Proportional Hazards Regression

part h The Rest of Statistics

39. Analysis of Variance

40. Multiple Comparison Tests After ANOVA

41. Nonparametric Methods

42. Sensitivity, Specificity, and Receiver Operating Characteristic Curves

43. Meta-Analysis

part i Putting It All Together

44. The Key Concepts of Statistics

45. Statistical Traps To Avoid

46. Capstone Example

47. Statistics and Reproducibility

48. Checklists for Reporting Statistical Methods and Results

part j Appendices

references

index

CONTENTS

preface

Who is this Book for?

What makes the Book Unique?

What’s New?

Which Chapters are Essential?

Who Helped?

Who Am I?

part a Introducing Statistics

1. Statistics and Probability are not Intuitive

We Tend to Jump to Conclusions

We Tend to Be Overconfident

We See Patterns in Random Data

We Don’t Realize that Coincidences are Common

We Don’t Expect Variability to Depend on Sample Size

We Have Incorrect Intuitive Feelings about Probability

We Find It Hard to Combine Probabilities

We Don’t Do Bayesian Calculations Intuitively

We are Fooled by Multiple Comparisons

We Tend to Ignore Alternative Explanations

We are Fooled By Regression to the Mean

We Let Our Biases Determine How We Interpret Data

We Crave Certainty, but Statistics Offers Probabilities

Chapter Summary

Term Introduced in this Chapter

2. The Complexities of Probability

Basics of Probability

Probability as Long-Term Frequency

Probability as Strength of Belief

Calculations With Probabilities Can be Easier If You Switch to Calculating with Whole Numbers

Common Mistakes: Probability

Lingo

Probability in Statistics

Q & A

Chapter Summary

Terms Introduced in this Chapter

3. From Sample to Population

Sampling from a Population

Sampling Error and Bias

Models and Parameters

Multiple Levels of Sampling

What if Your Sample is the Entire Population?

Chapter Summary

Terms Introduced in this Chapter

part b Introducing Confidence Intervals

4. Confidence Interval of a Proportion

Data Expressed as Proportions

The Binomial Distribution: From Population to Sample

Example: Free Throws in Basketball

Example: Polling Voters

Assumptions: Confidence Interval of a Proportion

What Does 95% Confidence Really Mean?

Are You Quantifying the Event You Care About?

Lingo

Calculating the CI of a Proportion

Ambiguity if the Proportion is 0% or 100%

An Alternative Approach: Bayesian Credible Intervals

Common Mistakes: CI of a Proportion

Q & A

Chapter Summary

Terms Introduced in this Chapter

5. Confidence Interval of Survival Data

Survival Data

Censored Survival Data

Calculating Percentage Survival at Various Times

Graphing Survival Curves with Confidence Bands

Summarizing Survival Curves

Assumptions: Survival Analysis

Q & A

Chapter Summary

Terms Introduced in this Chapter

6. Confidence Interval of Counted Data (Poisson Distribution)

The Poisson Distribution

Assumptions: Poisson Distribution

Confidence Intervals Based on Poisson Distributions

How to Calculate the Poisson CI

The Advantage of Counting for Longer Time Intervals (Or in Larger Volumes)

Q & A

Chapter Summary

Term Introduced in this Chapter

part c Continuous Variables

7. Graphing Continuous Data

Continuous Data

The Mean and Median

Lingo: Terms Used to Explain Variability

Percentiles

Graphing Data to Show Variation

Graphing Distributions

Beware of Data Massage

Q & A

Chapter Summary

Terms Introduced in this Chapter

8. Types of Variables

Continuous Variables

Discrete Variables

Why It Matters?

Not Quite as Distinct as They Seem

Q & A

Chapter Summary

Terms Introduced in this Chapter

9. Quantifying Scatter

Interpreting a Standard Deviation

How it Works: Calculating SD

Why n – 1?

Situations in Which n Can Seem Ambiguous

SD and Sample Size

Other Ways to Quantify and Display Variability

Q & A

Chapter Summary

Terms Introduced in this Chapter

10. The Gaussian Distribution

The Nature of The Gaussian Distribution

SD and the Gaussian Distribution

The Standard Normal Distribution

The Normal Distribution does not Define Normal Limits

Why The Gaussian Distribution is so Central to Statistical Theory

Q & A

Chapter Summary

Terms Introduced in this Chapter

11. The Lognormal Distribution and Geometric Mean

The Origin of a Lognormal Distribution

Logarithms?

Geometric Mean

Geometric SD

Common Mistakes: Lognormal Distributions

Q & A

Chapter Summary

Terms Introduced in this Chapter

12. Confidence Interval of a Mean

Interpreting A CI of a Mean

What Values Determine the CI of a Mean?

Assumptions: CI of a Mean

How to Calculate the CI of a Mean

More about Confidence Intervals

Q & A

Chapter Summary

Terms Introduced in this Chapter

13. The Theory of Confidence Intervals

CI of a Mean Via the t Distribution

CI of a Mean Via Resampling

CI of a Proportion Via Resampling

CI of a Proportion Via Binomial Distribution

Q & A

Chapter Summary

Terms Introduced in this Chapter

14. Error Bars

SD Versus SEM

Which Kind of Error Bar Should I Plot?

The Appearance of Error Bars

How are SD and SEM Related to Sample Size?

Geometric SD Error Bars

Common Mistakes: Error Bars

Q & A

Chapter Summary

Terms Introduced in this Chapter

part d P Values and Statistical Significance

15. Introducing P Values

Introducing P Values

Example 1: Coin Flipping

Example 2: Antibiotics on Surgical Wounds

Example 3: Angioplasty and Myocardial Infarction

Lingo: Null Hypothesis

Why P Values are Confusing

One- Or Two-Tailed P Value?

P Values are Not Very Reproducible

There is much more to Statistics than P Values

Common Mistakes: P Values

Q & A

Chapter Summary

Terms Introduced in this Chapter

16. Statistical Significance and Hypothesis Testing

Statistical Hypothesis Testing

Analogy: Innocent Until Proven Guilty

Extremely Significant? Borderline Significant?

Lingo: Type I and Type II Errors

Tradeoffs When Choosing a Significance Level

What Significance Level Should You Choose?

Interpreting A CI, A P Value, and A Hypothesis Test

Statistical Significance vs. Scientific Significance

Common Mistakes: Statistical Hypothesis Testing

Q & A

Chapter Summary

Terms Defined in this Chapter

17. Comparing Groups with Confidence Intervals and P Values

CIs and Statistical Hypothesis Testing are Closely Related

Four Examples with CIs, P Values, and Conclusion about Statistical Significance

Q & A

Chapter Summary

18. Interpreting a Result that is Statistically Significant

Seven Explanations for Results that are “Statistically Significant”

How Frequently do Type I Errors (False Positives) Occur?

The Prior Probability Influences the FPRP (A Bit of Bayes)

Bayesian Analysis

Accounting for Prior Probability Informally

The Relationship Between Sample Size and P Values

Common Mistakes

Q & A

Chapter Summary

Terms Introduced in this Chapter

19. Interpreting a Result that is not Statistically Significant

Five Explanations For “Not Statistically Significant” Results

“Not Significantly Different” does not Mean “No Difference”

Example: α2-Adrenergic Receptors on Platelets

Example: Fetal Ultrasounds

How to Get Narrower CIs

What if the P Value is Really High?

Q & A

Chapter Summary

20. Statistical Power

What is Statistical Power?

Distinguishing Power From Beta and the False Discovery Rate

An Analogy to Understand Statistical Power

Power of the Two Example Studies

When does It Make Sense to Compute Power?

Common Mistakes: Power

Q & A

Chapter Summary

Terms Introduced in this Chapter

21. Testing for Equivalence or Noninferiority

Equivalence must be Defined Scientifically, not Statistically

If the Mean is Within the Equivalence Zone

If the Mean is Outside of the Equivalence Zone

Applying the Usual Approach of Statistical Hypothesis Testing to Testing for Equivalence

Noninferiority Tests

Common Mistakes: Testing for Equivalence

Q & A

Chapter Summary

Terms Introduced in this Chapter

part e Challenges in Statistics

22. Multiple Comparisons Concepts

The Problem of Multiple Comparisons

Correcting for Multiple Comparisons is not Always Needed

The Traditional Approach to Correcting for Multiple Comparisons

Correcting for Multiple Comparisons with the False Discovery Rate

Comparing the Two Methods of Correcting for Multiple Comparisons

Q & A

Chapter Summary

Terms Introduced in this Chapter

23. The Ubiquity of Multiple Comparisons

Overview

Multiple Comparisons in Many Contexts

When are Multiple Comparisons Data Torture or P-Hacking?

How to Cope with Multiple Comparisons

Q & A

Chapter Summary

Terms Introduced in this Chapter

24. Normality Tests

The Gaussian Distribution is an Unreachable Ideal

What A Gaussian Distribution Really Looks Like

QQ Plots

Testing for Normality

Alternatives to Assuming a Gaussian Distribution

Common Mistakes: Normality Tests

Q & A

Chapter Summary

Terms Introduced in this Chapter

25. Outliers

How do Outliers Arise?

The Need for Outlier Tests

Five Questions to Ask before Testing for Outliers

Outlier Tests

Is It Legitimate to Remove Outliers?

An Alternative: Robust Statistical Tests

Lingo: Outlier

Common Mistakes: Outlier Tests

Q & A

Chapter Summary

Terms Introduced in this Chapter

26. Choosing a Sample Size

Sample Size Principles

An Alternative Way to think about Sample Size Calculations

Interpreting a Sample Size Statement

Lingo: Power

Calculating the Predicted FPRP as Part of Interpreting a Sample Size Statement

Complexities when Computing Sample Size

Examples

Other Approaches to Choosing Sample Size

Common Mistakes: Sample Size

Q & A

Chapter Summary

Terms Introduced in this Chapter

part f Statistical Tests

27. Comparing Proportions

Example: Apixaban for Treatment of Thromboembolism

Assumptions

Comparing Observed and Expected Proportions

Common Mistakes: Comparing Proportions

Q & A

Chapter Summary

Terms Introduced in this Chapter

28. Case-Control Studies

Example: Does a Cholera Vaccine Work?

Example: Isotretinoin and Bowel Disease

Example: Genome-Wide Association Studies

How are Controls Defined?

How are Cases Defined?

Epidemiology Lingo

Common Mistakes: Case-Control Studies

Q & A

Chapter Summary

Terms Introduced in this Chapter

29. Comparing Survival Curves

Example Survival Data

Assumptions when Comparing Survival Curves

Comparing Two Survival Curves

Why Not Just Compare Mean or Median Survival Times or Five-Year Survival?

Intention to Treat

Q & A

Chapter Summary

Terms Introduced in this Chapter

30. Comparing Two Means: Unpaired t Test

Interpreting Results from an Unpaired t Test

Assumptions: Unpaired t Test

The Assumption of Equal Variances

Overlapping Error Bars and the t Test

How It Works: Unpaired t Test

Common Mistakes: Unpaired t Test

Q & A

Chapter Summary

Terms Introduced in this Chapter

31. Comparing Two Paired Groups

When to Use Special Tests for Paired Data

Example of Paired t Test

Interpreting Results from a Paired t Test

The Ratio Paired t Test

McNemar’s Test for a Paired Case-Control Study

Common Mistakes: Paired t Test

Q & A

Chapter Summary

Terms Introduced in this Chapter

32. Correlation

Introducing the Correlation Coefficient

Assumptions: Correlation

Lingo: Correlation

How It Works: Calculating the Correlation Coefficient

Common Mistakes: Correlation

Q & A

Chapter Summary

Terms Introduced in this Chapter

part g Fitting Models to Data

33. Simple Linear Regression

The Goals of Linear Regression

Linear Regression Results

Assumptions: Linear Regression

Comparison of Linear Regression and Correlation

Lingo: Linear Regression

Common Mistakes: Linear Regression

Q & A

Chapter Summary

Terms Introduced in this Chapter

34. Introducing Models

Lingo: Models, Parameters, and Variables

The Simplest Model

The Linear Regression Model

Why Least Squares?

Other Models and other Kinds of Regression

Common Mistakes: Models

Chapter Summary

Terms Introduced in this Chapter

35. Comparing Models

Comparing Models is a Major Part of Statistics

Linear Regression as a Comparison of Models

Unpaired t Test Recast as Comparing the Fit of Two Models

Common Mistakes: Comparing Models

Q & A

Chapter Summary

Terms Introduced in this Chapter

36. Nonlinear Regression

Introducing Nonlinear Regression

An Example of Nonlinear Regression

Nonlinear Regression Results

How Nonlinear Regression Works

Assumptions: Nonlinear Regression

Comparing Two Models

Tips for Understanding Models

Learn More About Nonlinear Regression

Common Mistakes: Nonlinear Regression

Q & A

Chapter Summary

Terms Introduced in this Chapter

37. Multiple Regression

Goals of Multivariable Regression

Lingo

An Example of Multiple Linear Regression

Assumptions

Automatic Variable Selection

Sample Size for Multiple Regression

More Advanced Issues with Multiple Regression

Common Mistakes: Multiple Regression

Q & A

Chapter Summary

Terms Introduced in this Chapter

38. Logistic and Proportional Hazards Regression

Logistic Regression

Proportional Hazards Regression

Assumptions: Logistic Regression

Common Mistakes: Logistic Regression

Q & A

Chapter Summary

Terms Introduced in this Chapter

part h The Rest of Statistics

39. Analysis of Variance

Comparing the Means of Three or More Groups

Assumptions: One-Way ANOVA

How It Works: One-Way ANOVA

Repeated-Measures One Way ANOVA

An Example of Two-Way ANOVA

How Two-Way ANOVA Works

Repeated Measures Two-way ANOVA

Common Mistakes: ANOVA

Q & A

Chapter Summary

Terms Introduced in this Chapter

40. Multiple Comparison Tests after ANOVA

Multiple Comparison Tests for the Example Data

The Logic Of Multiple Comparisons Tests

Other Multiple Comparisons Tests

How It Works: Multiple Comparisons Tests

When Are Multiple Comparisons Tests Not Needed?

Common Mistakes: Multiple Comparisons

Q & A

Chapter Summary

Terms Introduced in this Chapter

41. Nonparametric Methods

Nonparametric Tests Based on Ranks

The Advantages and Disadvantages of Nonparametric Tests

Choosing Between Parametric and Nonparametric Tests: Does It Matter?

Sample Size for Nonparametric Tests

Nonparametric Tests that Analyze Values (Not Ranks)

Common Mistakes: Nonparametric Tests

Q & A

Chapter Summary

Terms Introduced in this Chapter

42. Sensitivity, Specificity, and Receiver Operating Characteristic Curves

Definitions of Sensitivity and Specificity

The Predictive Value of a Test

Receiver-Operating Characteristic (ROC) Curves

Bayes Revisited

Common Mistakes

Q & A

Chapter Summary

Terms Introduced in this Chapter

43. Meta-Analysis

Introducing Meta-Analysis

Publication Bias

Results from a Meta-Analysis

Meta-Analysis of Individual Participant Data

Assumptions of Meta-Analysis

Common Mistakes: Meta-Analysis

Q & A

Chapter Summary

Terms Introduced in this Chapter

part i Putting It All Together

44. The Key Concepts of Statistics

Term Introduced in this Chapter

45. Statistical Traps to Avoid

Trap #1: Focusing on P Values and Statistical Significance Rather than Effect Size

Trap #2: Testing Hypotheses Suggested by the Data

Trap #3: Analyzing Without a Plan—“P-Hacking”

Trap #4: Making a Conclusion about Causation When the Data Only Show Correlation

Trap #5: Overinterpreting Studies that Measure a Proxy or Surrogate Outcome

Trap #6: Overinterpreting Data from an Observational Study

Trap #7: Making Conclusions about Individuals when the Data Were only Collected for Groups

Trap #8: Focusing Only on Means Without asking about Variability or Unusual Values

Trap #9: Comparing Statistically Significant with Not Statistically Significant

Trap #10: Missing Important Findings Because Data Combine Populations

Trap #11: Invalid Multiple Regression Analyses as a Result of an Omitted Variable

Trap #12: Overfitting

Trap #13: Mixing Up the Significance Level with the FPRP

Trap #14: Not Recognizing How Common False Positive Findings are

Trap #15: Not Realizing How Likely it is that a “Significant” Conclusion From a Speculative Experiment is a False Positive

Trap #16: Not Realizing That Many Published Studies have Little Statistical Power

Trap #17: Trying to Detect Small Signals When there is Lots of Noise

Trap #18: Unnecessary Dichotomizing

Trap #19: Inflating Sample Size by Pseudoreplication

Chapter Summary

Terms Introduced in this Chapter

46. Capstone Example

The Case of the Eight Naked IC50s

Look Behind the Data

Statistical Significance by Cheating

Using a t Test That Doesn’t Assume Equal SDs

Unpaired t Test as Linear or Nonlinear Regression

Nonparametric Mann–Whitney Test

Just Report the Last Confirmatory Experiment?

Increase Sample Size?

Comparing the Logarithms of IC50 Values

Sample Size Calculations Revisited

Is it Ok to Switch Analysis Methods?

The Usefulness of Simulations

Chapter Summary

47. Statistics and Reproducibility

The Reproducibility Crisis

Many Analyses are Biased to Inflate the Effect Size

Even Perfectly Performed Experiments are Less Reproducible than Most Expect

Summary

48. Checklists for Reporting Statistical Methods and Results

Reporting Methods Used for Data Analysis

Graphing Data

Reporting Statistical Results

part j Appendices

Appendix A: Statistics with GraphPad

Appendix B: Statistics with Excel

Appendix C: Statistics with R

Appendix D: Values of the t Distribution Needed to Compute CIs

Appendix E: A Review of Logarithms

Appendix F: Choosing a Statistical Test

Appendix G: Problems and Answers

references

index

PREFACE

My approach in this book is informal and brisk (at least I hope it is), not ceremonious and plodding (at least I hope it isn’t).

Intuitive Biostatistics provides a comprehensive overview of statistics without getting bogged down in the mathematical details. I’ve been gratified to learn that many people have found my approach refreshing and useful. Some scientists have told me that statistics had always been baffling until they read Intuitive Biostatistics. This enthusiasm encouraged me to write this fourth edition.

WHO IS THIS BOOK FOR?

I wrote Intuitive Biostatistics for three main audiences:

• Medical (and other) professionals who want to understand the statistical portions of journals they read. These readers don’t need to analyze any data, but they do need to understand analyses published by others and beware of common statistical mistakes. I’ve tried to explain the big picture without getting bogged down in too many details.

• Undergraduate and graduate students, postdocs, and researchers who analyze data. This book explains general principles of data analysis, but it won’t teach you how to do statistical calculations or how to use any particular statistical program. It makes a great companion to the more traditional statistics texts and to the documentation of statistical software.

• Scientists who consult with statisticians. Statistics often seems like a foreign language, and this text can serve as a phrase book to bridge the gap between scientists and statisticians. Sprinkled throughout the book are “Lingo” sections that explain statistical terminology and point out when ordinary words are given very specialized meanings (the source of much confusion).

I wrote Intuitive Biostatistics to be a guidebook, not a cookbook. The focus is on how to interpret statistical results, rather than how to analyze data. This book presents few details of statistical methods and only a few tables required to complete the calculations.
