There are three statisticians in Chapter 1. Which one of them is closest to your area? In other words, can you put yourself in a role similar to one of those scientists and apply your knowledge of statistics in your area? If there is another statistician or scientist in your field whom you would like to share with us, tell us whether you see yourself in his or her shoes. You know there are quite a few women scientists

Exploring Statistics: Tales of Distributions

12th Edition

Chris Spatz

Outcrop Publishers Conway, Arkansas

Exploring Statistics: Tales of Distributions 12th Edition Chris Spatz

Cover design: Grace Oxley
Answer Key: Jill Schmidlkofer
Webmaster & Ebook: Fingertek Web Design, Tina Haggard
Managers: Justin Murdock, Kevin Spatz

Copyright © 2019 by Outcrop Publishers, LLC All rights reserved. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the publisher, except in the case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted by copyright law. For permission requests, contact [email protected] or write to the publisher at the address below.

Outcrop Publishers
615 Davis Street
Conway, AR 72034
Email: [email protected]
Website: outcroppublishers.com
Library of Congress Control Number: [Applied for]

ISBN-13 (hardcover): 978-0-9963392-2-3
ISBN-13 (ebook): 978-0-9963392-3-0
ISBN-13 (study guide): 978-0-9963392-4-7

Examination copies are provided to academics and professionals to consider for adoption as a course textbook. Examination copies may not be sold or transferred to a third party. If you adopt this textbook, please accept it as your complimentary desk copy.

Ordering information:
Students and professors – visit exploringstatistics.com
Bookstores – email [email protected]

Photo Credits – Chapter 1
Karl Pearson – Courtesy of Wellcomeimages.org
Ronald A. Fisher – R.A. Fisher portrait, 0006973, Special Collections Research Center, North Carolina State University Libraries, Raleigh, North Carolina
Jerzy Neyman – Paul R. Halmos Photograph Collection, e_ph 0223_01, Dolph Briscoe Center for American History, The University of Texas at Austin
Jacob Cohen – New York University Archives, Records of the NYU Photo Bureau

Printed in the United States of America by Walsworth ®

Online study guide available at http://exploringstatistics.com/studyguide.php

About The Author

Chris Spatz is at Hendrix College where he twice served as chair of the Psychology Department. Dr. Spatz’s undergraduate education was at Hendrix and his PhD in experimental psychology is from Tulane University in New Orleans. He subsequently completed postdoctoral fellowships in animal behavior at the University of California, Berkeley, and the University of Michigan. Before returning to Hendrix to teach, Spatz held positions at The University of the South and the University of Arkansas at Monticello.

Spatz served as a reviewer for the journal Teaching of Psychology for more than 20 years. He co-authored a research methods textbook, wrote several chapters for edited books, and was a section editor for the Encyclopedia of Statistics in Behavioral Science.

In addition to writing and publishing, Dr. Spatz enjoys the outdoors, especially canoeing, camping, and gardening. He swims several times a week (mode = 3). Spatz has been an opponent of high textbook prices for years, and he is happy to be part of a new wave of authors who provide high-quality textbooks to students at affordable prices.


Dedication

With love and affection,

this textbook is dedicated to

Thea Siria Spatz, Ed.D., CHES

CHAPTER 1

Introduction

OBJECTIVES FOR CHAPTER 1

After studying the text and working the problems in this chapter, you should be able to:

1. Distinguish between descriptive and inferential statistics
2. Define population, sample, parameter, statistic, and variable as they are used in statistics
3. Distinguish between quantitative and categorical variables
4. Distinguish between continuous and discrete variables
5. Identify the lower and upper limits of a continuous variable
6. Identify four scales of measurement and distinguish among them
7. Distinguish between statistics and experimental design
8. Define independent variable, dependent variable, and extraneous variable and identify them in experiments
9. Describe statistics’ place in epistemology
10. List actions to take to analyze a data set
11. Identify a few events in the history of statistics

We begin our exploration of statistics with a trip to London. The year is 1900. Walking into an office at University College London, we meet a tall, well-dressed man about 40 years old. He is Karl Pearson, Professor of Applied Mathematics and Mechanics. I ask him to tell us a little about himself and why he is an important person. He seems authoritative, glad to talk about himself. As a young man, he says, he wrote essays, a play, and a novel, and he also worked for women’s suffrage. These days, he is excited about this new branch of biology called genetics. He says he supervises lots of data gathering.

Karl Pearson

Pearson, warming to our group, lectures us about the major problem in science—there is no agreement on how to decide among competing theories. Fortunately, he just published a new statistical method that provides an objective way to decide among competing theories, regardless of the discipline. The method is called chi square.¹ Pearson says, “Now, arguments will be much fewer. Gather a thousand data points and calculate a chi square test. The result gives everyone an objective way to determine whether or not the data fit the theory.”

Exploration Notes from a student: Exploration off to good start. Hit on a nice, easy-to-remember date to start with, visited a founder of statistics, and had a statistic called chi square described as a big deal.

Our next stop is Rothamsted Experiment Station just north of London. Now the year is 1925. There are fields all around the agricultural research facility, each divided into many smaller plots. The growth in the fields seems quite variable.

When we arrive at the office, the atmosphere is congenial. The staff is having tea. There are two topics—a new baby and a new book. We get introduced to Ronald Fisher, the chief statistician. Fisher is a small man with thick glasses and red hair.

He tells us about his new child² and then motions to a book on the table. Sneaking a peek, we read the title: Statistical Methods for Research Workers. Fisher becomes focused on his book, holding forth in an authoritative way.

He says the book explains how to conduct experiments and that an experiment is just a comparison of two or more conditions. He tells us we don’t need a thousand data points. He says that small samples, randomly selected, are the way for science to progress. “With an experiment and my technique of analysis of variance,” he exclaims, “you can determine why that field out there”—here he waves toward the window—“is so variable. We can find out what makes some plots lush and some mimsy.” Analysis of variance,³ he says, works in any discipline, not just agriculture.

Exploration Notes: Looks like statistics had some controversy in it.⁴ Also looks like progress. Statistics is used for experiments, too, and not just for testing theories. And Fisher says experiments can be used to compare anything. If that’s right, I can use statistics no matter what I major in.

1 Chi square, which is explained in this book in Chapter 14, has been called one of the 20 most important inventions in the 20th century (Hacking, 1984).
2 (in what will become a family with eight children).
3 explained in Chapters 11-13
4 The slight sniping I’ve built into this story is just a hint of the strong animosity between Fisher and Pearson.

Ronald A. Fisher


Next we go to Poland to visit Jerzy Neyman at his office at the University of Warsaw. It is 1933. As we walk in, he smiles, seems happy we’ve arrived, and makes us feel completely welcome.

Motioning to an envelope on his desk, he tells us it holds a manuscript that he and Egon Pearson⁵ wrote. “The problem with Fisher’s analysis of variance test is that it focuses exclusively on finding a difference between groups. Suppose the statistical test doesn’t detect a difference. Does that prove there is no difference? No, of course not. It may be that the test was just not sensitive enough to detect the difference. Right?”

At his question, a few of us nod in agreement. Seeing uncertainty, he notes, “Maybe a larger sample is needed to find the difference, you see? Anyway, what we’ve done is expand statistics to cover not just finding a difference, but also what it means when the test doesn’t find a difference. Our approach is what you people in your time will call null hypothesis significance testing.”

Exploration Notes: Statistics seems like a work in progress. Changing. Now it is not just about finding a difference but also about what it means not to find a difference. Also, looks like null hypothesis significance testing is a phrase that might turn up on tests.

Our next trip is to libraries, say, anytime between 1940 and 2000. For this exploration, the task is to examine articles in professional journals published in various disciplines. The disciplines include anthropology, biology, chemistry, defense strategy, education, forestry, geology, health, immunology, jurisprudence, manufacturing, medicine, neurology, ophthalmology, political science, psychology, sociology, zoology, and others. I’m sure you get the idea—the whole range of disciplines that use quantitative measures in their research. What this exploration produces is the discovery that all of these disciplines rely on a data analysis technique called null hypothesis significance testing (NHST).⁶ Many different statistical tests are employed. However, for all the tests in all the disciplines, the phrase, “p < .05” turns up frequently.

Exploration Notes: It seems that all that earlier controversy has subsided and scientists in all sorts of disciplines have agreed that NHST is the way to analyze quantitative data. All of them seem to think that if there is a comparison to be made, applying NHST is a necessary step to get correct conclusions. All of them use “p < .05,” so I’ll have to be sure to find out exactly what that means.

5 Egon Pearson was Karl Pearson’s son.
6 Null hypothesis significance testing is first explained in Chapters 9 and 10.

Jerzy Neyman


Our next excursion is a 1962 visit with Jacob Cohen at New York University in New York City. He is holding his article about studies published in the Journal of Abnormal and Social Psychology, a leading psychology journal. He tells us that the NHST technique has problems. Also, he says we should be calculating an effect size statistic, which will show whether the differences observed in our experiments are large or small.

Exploration Notes: The idea of an effect size index makes a lot of sense. Just knowing there is a difference isn’t enough. How big is the difference? Wonder what “problems with NHST” is all about.

Back to the library for a final excursion to check out recent events. We come across a 2014 article by Geoff Cumming on the “new statistics.” We find things like, “avoid NHST and use better techniques” (p. 26) and “we should not trust any p value” (p. 13). This seems like awfully strong advice. Are researchers taking this advice? Looking through more of today’s research in journals in several fields, we find that most statistical analyses use NHST and there are many instances of “p < .05.”

Exploration Notes, Conclusion: These days, it looks like statistics is in transition again. There’s a lot of controversy out there about how to analyze data from experiments. The NHST approach is still very common, though, so it’s clear I must learn it. But I want to be prepared for changes. I hope knowing NHST will be helpful for the future.⁷

Welcome to statistics at a time when the discipline is once again in transition. A well-established tradition (null hypothesis significance testing) has been in place for almost a century but is now under attack. New ways of thinking about data analysis are emerging, and along with them, a collection of statistics that do not include the traditional NHST approach. As for the immediate future, though, NHST remains the method most widely used by researchers in many fields. In addition, much of the thinking required for NHST is required for other approaches.

7 Not only helpful, but necessary, I would say.

Jacob Cohen

Our exploration tour is over, so I’ll quit supplying notes; they are your responsibility now. As your own experience probably shows, making up your own summary notes improves retention of what you read. In addition, I have a suggestion. Adopt a mindset that thinks growth. A student with a growth mindset expects to learn new things. When challenges arise, as they inevitably do, acknowledge them and figure out how to meet the challenge. A growth mindset treats ability as something to be developed (see Dweck, 2016). If you engage yourself in this course, you can expect to use what you learn for the rest of your life.

The main title of this book is “Exploring Statistics.” Exploring conveys the idea of uncovering something that was not apparent before. An attitude of searching, wondering, checking, and so forth is what I want to encourage. (Those who object to traditional NHST procedures are driven by this exploration motivation.) As for this book’s subtitle, “Tales of Distributions,” I’ll have more to say about it as we go along.

Disciplines that Use Quantitative Data

Which disciplines use quantitative data? The list is long and more variable than the list I gave earlier. The examples and problems in this textbook, however, come from psychology, biology, sociology, education, medicine, politics, business, economics, forestry, and everyday life. Statistics is a powerful method for getting answers from data, and this makes it popular with investigators in a wide variety of fields.

Statistics is used in areas that might surprise you. As examples, statistics has been used to determine the effect of cigarette taxes on smoking among teenagers, the safety of a new surgical anesthetic, and the memory of young school-age children for pictures (which is as good as that of college students). Statistics show which diseases have an inheritance factor, how to improve short-term weather forecasts, and why giving intentional walks in baseball is a poor strategy. All these examples come from Statistics: A Guide to the Unknown, a book edited by Judith M. Tanur and others (1989). Written for those “without special knowledge of statistics,” this book has 29 essays on topics as varied as those above.

In American history, the authorship of 12 of The Federalist papers was disputed for a number of years. (The Federalist papers were 85 short essays written under the pseudonym “Publius” and published in New York City newspapers in 1787 and 1788. Written by James Madison, Alexander Hamilton, and John Jay, the essays were designed to persuade the people of the state of New York to ratify the Constitution of the United States.) To determine authorship of the 12 disputed papers, each was graded with a quantitative value analysis in which the importance of such values as national security, a comfortable life, justice, and equality was assessed. The value analysis scores were compared with value analysis scores of papers known to have been written by Madison and Hamilton (Rokeach, Homant, & Penner, 1970). Another study, by Mosteller and Wallace, analyzed The Federalist papers using the frequency of words such as by and to (reported in Tanur et al., 1989). Both studies concluded that Madison wrote all 12 essays.
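The word-frequency idea behind the second study can be sketched in a few lines of Python. This is only an illustration of counting marker words per 1,000 words of text; the function name `rate_per_thousand` and the sample sentence are made up for this sketch and are not taken from the study or from The Federalist papers.

```python
import re
from collections import Counter


def rate_per_thousand(text, words):
    """Return occurrences of each marker word per 1,000 words of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return {w: 1000 * counts[w] / len(tokens) for w in words}


# Made-up sentence standing in for an essay (not from The Federalist).
sample_text = "The power to tax is granted by the people to the government by consent."

rates = rate_per_thousand(sample_text, ["by", "to"])
print(rates)
```

Comparing such rates for a disputed essay with the rates in essays of known authorship is the general idea; an actual authorship analysis involves far more text and a formal statistical model.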

Here is an example from law. Rodrigo Partida was convicted of burglary in Hidalgo County, a border county in southern Texas. A grand jury rejected his motion for a new trial. Partida’s attorney filed suit, claiming that the grand jury selection process discriminated against Mexican-Americans. In the end (Castaneda v. Partida, 430 U.S. 482 [1977]), Justice Harry Blackmun of the U.S. Supreme Court wrote, regarding the number of Mexican-Americans on grand juries, “If the difference between the expected and the observed number is greater than two or three standard deviations, then the hypothesis that the jury drawing was random (is) suspect.” In Partida’s case, the difference was approximately 12 standard deviations, and the Supreme Court ruled that Partida’s attorney had presented prima facie evidence. (Prima facie evidence is so good that one side wins the case unless the other side rebuts the evidence, which in this case did not happen.) Statistics: A Guide to the Unknown includes two essays on the use of statistics by lawyers.

Inferential statistics Method that uses sample evidence and probability to reach conclusions about unmeasurable populations.

Descriptive statistic A number that conveys a particular characteristic of a set of data.

Mean Arithmetic average; sum of scores divided by number of scores.
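Justice Blackmun’s comparison of expected and observed counts in standard-deviation units can be sketched as a short calculation. The three input figures below are illustrative assumptions chosen so that the result lands near the 12 standard deviations mentioned above; they are not given in this text. The sketch also assumes that random drawing is modeled with a binomial distribution.

```python
import math

# Illustrative figures only (assumptions for this sketch, not given in the text).
population_share = 0.791   # assumed proportion of Mexican-Americans in the county
jurors_summoned = 220      # assumed number of people called for grand jury duty
observed = 100             # assumed number of Mexican-Americans among them

# Under random drawing, the count is modeled as a binomial variable.
expected = jurors_summoned * population_share
std_dev = math.sqrt(jurors_summoned * population_share * (1 - population_share))

# Distance between expected and observed counts, in standard-deviation units.
z = (expected - observed) / std_dev

print(f"expected = {expected:.1f}, standard deviation = {std_dev:.2f}")
print(f"difference = {z:.1f} standard deviations")
```

With these assumed inputs the difference comes out a little above 12 standard deviations, far beyond the “two or three” that the quotation treats as suspect.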

Gigerenzer et al. (2007), in their public interest article on health statistics, point out that lack of statistical literacy among both patients and physicians undermines the information exchange necessary for informed consent and shared decision making. The result is anxiety, confusion, and undue enthusiasm for testing and treatment.

Whatever your current interests or thoughts about your future as a statistician, I believe you will benefit from this course. A successful statistics course teaches you to identify questions a set of data can answer; determine the statistical procedures that will provide the answers; carry out the procedures; and then, using plain English and graphs, tell the story the data reveal.

The best way for you to acquire all these skills (especially the part about telling the story) is to engage statistics. Engaged students are easily recognized; they are prepared for exams, are not easily distracted while studying, and generally finish assignments on time. Becoming an engaged student may not be so easy, but many have achieved it. Here are my recommendations. Read with the goal of understanding. Attend class. Do all the assignments (on time). Write down questions. Ask for explanations. Expect to understand. (Disclaimer: I’m not suggesting that you marry statistics, but just engage for this one course.)

Are you uncertain about whether your background skills are adequate for a statistics course? For most students, this is an unfounded worry. Appendix A, Getting Started, should help relieve your concerns.

What Do You Mean, “Statistics”?

The Oxford English Dictionary says that the word statistics came into use almost 250 years ago. At that time, statistics referred to a country’s quantifiable political characteristics—characteristics such as population, taxes, and area. Statistics meant “state numbers.” Tables and charts of those numbers turned out to be a very satisfactory way to compare different countries and to make projections about the future. Later, tables and charts proved useful to people studying trade (economics) and natural phenomena (science). Statistical thinking spread because it helped. Today, two different techniques are called statistics.

Descriptive statistics⁸ produce a number or a figure that summarizes or describes a set of data. You are already familiar with some descriptive statistics. For example, you know about the arithmetic average, called the mean. You have probably known how to compute a mean since elementary school—just add up the numbers and divide the total by the number of entries. As you already know, the mean describes the central tendency of a set of numbers. The basic idea of descriptive statistics is simple: They summarize a set of data with one number or graph. This book covers about a dozen descriptive statistics.

8 Boldface words and phrases are defined in the margin and also in Appendix D, Glossary of Words.
9 A summary of this study can be found in Ellis (1938). The complete reference and all others in the text are listed in the References section at the back of the book.

The other statistical technique is inferential statistics. Inferential statistics use measurements from a sample to reach conclusions about a larger, unmeasured population. There is, of course, a problem with samples.

Samples always depend partly on the luck of the draw; chance helps determine the particular measurements you get.

If you have the measurements for the entire population, chance doesn’t play a part—all the variation in the numbers is “true” variation. But with samples, some of the variation is the true variation in the population and some is just the chance ups and downs that go with a sample. Inferential statistics was developed as a way to account for the effects of chance that come with sampling. This book will cover about a dozen and a half inferential statistics.
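The “luck of the draw” can be seen in a small simulation. The sketch below builds a made-up population of 10,000 scores (the mean of 100 and standard deviation of 15 are arbitrary assumptions), computes the population mean, and then draws five random samples of 25 scores each. Every sample mean is a descriptive statistic, yet each one misses the population mean by a different chance amount.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up scores.
population = [random.gauss(100, 15) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw several small random samples and compute each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 25))
    for _ in range(5)
]

print(f"population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean: {m:.2f} (differs by {m - population_mean:+.2f})")
```

The differences printed on each line are the chance ups and downs that inferential statistics is designed to take into account.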

Here is a textbook definition: Inferential statistics is a method that takes chance factors into account when samples are used to reach conclusions about populations. Like most textbook definitions, this one condenses many elements into a short sentence. Because the idea of using samples to understand populations is perhaps the most important concept in this course, please pay careful attention when elements of inferential statistics are explained.

Inferential statistics has proved to be a very useful method in scientific disciplines. Many other fields use inferential statistics, too, so I selected examples and problems from a variety of disciplines for this text and its auxiliary materials. Null hypothesis significance testing, which had a prominent place in our exploration tour, is an inferential statistics technique.

Here is an example from psychology that uses the NHST technique. Today, there is a lot of evidence that people remember the tasks they fail to complete better than the tasks they complete. This is known as the Zeigarnik effect. Bluma Zeigarnik asked participants in her experiment to do about 20 tasks, such as work a puzzle, make a clay figure, and construct a box from cardboard.⁹ For each participant, half the tasks were interrupted before completion. Later, when the participants were asked to recall the tasks they worked on, they listed more of the interrupted tasks (average about 7) than the completed tasks (about 4).

One good question to start with is, “Did interrupting make a big difference or a small difference?” In this case, interruption produced about three additional memory items compared to the completion condition. This is a 75% difference, which seems like a big change, given our experience with tests of memory. The question of “How big is the difference?” can often be answered by calculating an effect size index.
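The 75% figure follows directly from the two approximate averages reported above. The short calculation below spells out the arithmetic; note that it is a simple percent difference, not one of the standardized effect size indexes.

```python
interrupted_mean = 7  # approximate average recall of interrupted tasks (from the text)
completed_mean = 4    # approximate average recall of completed tasks (from the text)

# Raw difference, then the difference as a percentage of the completed-task average.
difference = interrupted_mean - completed_mean
percent_difference = 100 * difference / completed_mean

print(f"difference: {difference} tasks")
print(f"percent difference: {percent_difference:.0f}%")
```

Running it prints a difference of 3 tasks and a percent difference of 75%.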


So, should you conclude that interruption improves memory? Not yet. It might be that interruption actually has no effect but that several chance factors happened to favor the interrupted tasks in Zeigarnik’s particular experiment. One way to meet this objection is to conduct the experiment again. Similar results would lend support to the conclusion that interruption improves memory. A less expensive way to meet the objection is to use inferential statistics such as NHST.

NHST begins with the actual data from the experiment. It ends with a probability—the probability of obtaining data like those actually obtained if it is true that interruption has no effect on memory. If the probability is very small, you can conclude that interruption does affect memory. For Zeigarnik’s data, the probability was tiny.
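The logic of starting with data and ending with a probability can be sketched with a simple randomization test. Everything specific here is an assumption made for illustration: the recall counts for 10 participants are invented (they are not Zeigarnik’s data), and the sign-flipping procedure is just one convenient way to estimate the probability, not the analysis used in the original study.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical recall counts for 10 participants (made up for illustration).
interrupted = [7, 8, 6, 7, 9, 6, 7, 8, 5, 7]
completed = [4, 5, 4, 3, 5, 4, 4, 5, 4, 2]

differences = [i - c for i, c in zip(interrupted, completed)]
observed = statistics.mean(differences)

# If interruption has no effect, each difference is as likely to be
# negative as positive, so flip the signs at random many times and see
# how often chance alone produces a mean at least as large as the observed one.
n_shuffles = 10_000
at_least_as_large = 0
for _ in range(n_shuffles):
    shuffled = [d * random.choice((1, -1)) for d in differences]
    if statistics.mean(shuffled) >= observed:
        at_least_as_large += 1

p_value = at_least_as_large / n_shuffles
print(f"observed mean difference: {observed}")
print(f"estimated probability under 'no effect': {p_value:.4f}")
```

Because every invented difference is positive, chance sign-flipping almost never matches the observed mean, so the estimated probability is very small. That is the pattern that leads to the conclusion that interruption does affect memory.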

Now for the conclusion. One version might be, “After completing about 20 tasks, memory for interrupted tasks (average about 7) was greater than memory for completed tasks (average about 4). The approximate 75% difference cannot be attributed to chance because chance by itself would rarely produce a difference between two samples as large as this one.” The words chance and rarely tell you that probability is an important element of inferential statistics.

My more complete answer to what I mean by “statistics” is Chapter 6 in 21st Century Psychology: A Reference Handbook (Spatz, 2008). This 8-page chapter summarizes in words (no formulas) the statistical concepts usually covered in statistics courses. This chapter can orient you as you begin your study of statistics and later provide a review after you finish your course.

clue to the future

The first part of this book is devoted to descriptive statistics (Chapters 2–6) and the second part to inferential statistics (Chapters 7–15). Inferential statistics is the more comprehensive of the two because it combines descriptive statistics, probability, and logic.

clue to the future

Calculating effect size indexes is first addressed in Chapter 5. It is also a topic in Chapters 9-14.

Statistics: A Dynamic Discipline

Many people continue to think of statistics as a collection of techniques that were developed long ago, that have not changed, and that will be the same in the future. That view is mistaken. Statistics is a dynamic discipline characterized by more than a little controversy. New techniques in both descriptive and inferential statistics continue to be developed. Controversy continues too, as you saw at the end of our exploration tour. To get a feel for the issues when the controversy entered the mainstream, see Dillon (1999) or Spatz (2000) for nontechnical summaries. For more technical explanations, see Nickerson (2000). To read about current approaches, see Erceg-Hurn and Mirosevich (2008), Kline (2013), or Cumming (2014).

In addition to controversy over techniques, attitudes toward data analysis shifted in recent years. The shift has been toward the idea of exploring data to see what it reveals and away from using statistical analyses to nail down a conclusion. This shift owes much of its impetus to John Tukey (1915–2000), who promoted Exploratory Data Analysis (Lovie, 2005). Tukey invented techniques such as the boxplot (Chapter 5) that reveal several characteristics of a data set simultaneously.

Today, statistics is used in a wide variety of fields. Researchers start with a phenomenon, event, or process that they want to understand better. They make measurements that produce numbers. The numbers are manipulated according to the rules and conventions of statistics. Based on the outcome of the statistical analysis, researchers draw conclusions and then write the story of their new understanding of the phenomenon, event, or process. Statistics is just one tool that researchers use, but it is often an essential tool.

Some Terminology

Family incomes of college students in the fall of 2017
Weights of crackers eaten by obese male students
Depression scores of Alaskans
Gestation times for human beings
Memory scores of human beings¹⁰

Population All measurements of a specified group.

Sample Measurements of a subset of a population.

Like most courses