### NSTA WebNews Digest

### Science Scope : Science Sampler


## Using Simple Statistics to Ensure Science-Fair Success

**4/1/2007 - Wilson Gonzalez-Espada**

As a science-fair judge in several school-level, regional, and state competitions in Arkansas, I look forward to science-fair season each year. Judging science fairs is a way to see students’ enthusiasm for science, their courage as they overcome nervousness to share findings with a real scientist, and their pride as they show their hard work condensed into a nicely prepared display. Many students show an excellent mastery of the science content they researched and the rationale for the science processes they followed. Science-fair projects offer a unique opportunity for students to engage in hands-on science experiences, some of them lasting for years.

Obviously, not all science projects are of the same quality. Despite teachers' best efforts, I still judge experimental projects where students fail to control for confounding variables; that is, variables that might affect the experiment but are not the variable of interest. An example of this is a project in which a student aims to find what type of soil helps seeds grow better by planting different types of seeds in potting soil, backyard soil, and sandy soil. Because there is the chance that different types of seeds grow at different rates, measurements of the plants’ height will be affected by more than one factor, making meaningful height comparison all but impossible.

I also continue to see and judge projects with no experimental treatment. Examples of these are exclusively encyclopedia-type research projects and demonstration projects like the baking soda volcano simulation or the tornado simulation. As a judge who recognizes the importance of experimental research in science without downplaying other ways of knowing in science, I consider nonexperimental projects to be at a disadvantage compared with experimental projects in the same category.

Projects with a sample size of one also continue to make an appearance. An example of this is when a student aims to find what type of soil helps seeds grow better by planting a seed in one cup each of potting soil, backyard soil, and sandy soil. In this case, simple seed variability might throw off any meaningful conclusions. On more than one occasion, students have told me that the one plant assigned to a given treatment died and they had to modify their project.

Occasionally, however, only one thing separates a science-fair project from excellence: the lack of simple statistical analysis. Other aspects of the project may look fine—thorough background research, focused purpose, well-worded hypothesis, detailed procedure, excellent description of efforts to control variables other than the one of interest, measurements in SI units, multiple subjects for each treatment group, and use of a control group. This type of project is especially painful to judge. Students argue passionately that the average for one or several treatments is better than the average for the control group, or students are fully convinced that the average for treatment A is better than the average for other treatments. A closer analysis of the data suggests that the variability in the raw data from which the averages were calculated is greater than the difference in the averages. In other words, the averages appear to be different but the difference is due to chance rather than treatment effect. Students are especially heartbroken when this fundamental misinterpretation is pointed out. How can students avoid the trap of apparent differences between treatments or between control and experimental treatments? The answer is statistics.
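For teachers comfortable with a little Python, the trap can be demonstrated with a minimal sketch (the plant heights here are made up, not data from any real project): both pairs of groups below differ by the same 0.6 cm on average, but only the low-variability pair produces a *t* statistic large enough to rule out chance.

```python
# Sketch with hypothetical data: the same 0.6 cm gap in group averages can be
# convincing or meaningless depending on how spread out the raw data are.
from math import sqrt
from statistics import mean, stdev

def t_statistic(a, b):
    """Unpaired (pooled) t statistic for two equal-sized groups."""
    n = len(a)
    sp2 = (stdev(a) ** 2 + stdev(b) ** 2) / 2  # pooled variance
    return (mean(a) - mean(b)) / sqrt(sp2 * 2 / n)

# Tight data: heights cluster closely around each group's average.
tight_control = [6.0, 6.1, 6.2, 6.1, 6.0, 6.2, 6.1, 6.0, 6.1, 6.2]
tight_treated = [6.6, 6.7, 6.8, 6.7, 6.6, 6.8, 6.7, 6.6, 6.7, 6.8]
# Noisy data: same averages (6.1 and 6.7 cm), much larger spread.
noisy_control = [4.0, 8.2, 5.1, 7.0, 6.3, 5.5, 7.8, 4.9, 6.4, 5.8]
noisy_treated = [4.6, 8.8, 5.7, 7.6, 6.9, 6.1, 8.4, 5.5, 7.0, 6.4]

t_tight = abs(t_statistic(tight_control, tight_treated))
t_noisy = abs(t_statistic(noisy_control, noisy_treated))
print(f"tight data: gap = 0.60 cm, |t| = {t_tight:.1f}")  # very large: real difference
print(f"noisy data: gap = 0.60 cm, |t| = {t_noisy:.1f}")  # small: could be chance
```

The averages alone look identical in both cases; only the spread of the raw measurements separates a real effect from noise.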

Before you flash back to mathematical algorithms and formulas from college classes and stop reading, keep this in mind: Statistical software available on the internet makes the use of statistics very user-friendly. The *t*-test is one of the simplest types of statistical tests and one of the easiest to use via the internet. It also happens to be the test that is most applicable to many experimental science-fair projects.

The *t*-test was developed by the chemist William S. Gosset around 1908 to determine to what extent samples deviated from the standard formula at the Guinness brewery where he worked. This test can be applied to determine whether the difference in the averages for two treatments is mostly caused by the experimental treatment or whether the difference can be explained by random variation (Weinberg and Goldberg 1990). What do students need to use this test in the context of a school science-fair project?

- Two or more comparison groups (control and one treatment, or two or more treatments).
- A sample size of 10 or more for each experimental group.
- Numerically measured data (no categories, even if labeled with numbers).

For example, assume a student is interested in determining whether the amount of water added to a plant influences its growth after four weeks. In this case, the student should have at least 10 cups where a seed will receive 15 mL of water (control), 10 cups where a seed will receive 30 mL of water (treatment A), and 10 cups where a seed will receive 45 mL of water (treatment B). The student will keep other variables (such as lighting, temperature, type of soil, amount of soil, type of seed, and seed depth when planted) constant through the experiment. The only variable that will change (usually referred to as the *independent variable*) is the amount of water the seeds will receive. After four weeks, the plant height (the *dependent variable* in this case) will be determined for each plant in each group (see Figure 1 for data that will be used as an example here).
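The bookkeeping for such a design is simple: one list of 10 measurements per group. The heights below are hypothetical stand-ins (Figure 1 itself is not reproduced here), chosen only so the group averages loosely match the 6.1, 6.8, and 6.7 cm values quoted in the text.

```python
# Hypothetical plant heights (cm) after four weeks, one list per watering group.
from statistics import mean, stdev

heights = {
    "control (15 mL)":     [5.8, 6.3, 5.9, 6.4, 6.0, 6.2, 5.7, 6.5, 6.1, 6.1],
    "treatment A (30 mL)": [6.5, 7.0, 6.6, 7.1, 6.7, 6.9, 6.4, 7.2, 6.8, 6.8],
    "treatment B (45 mL)": [6.3, 6.9, 6.5, 7.1, 6.6, 6.8, 6.2, 7.3, 6.7, 6.6],
}

# Summarize each group before any testing: sample size, average, and spread.
for group, data in heights.items():
    print(f"{group}: n={len(data)}, mean={mean(data):.2f} cm, s={stdev(data):.2f} cm")
```

Reporting the standard deviation alongside each average is what makes the next question (are the averages really different?) answerable at all.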

The question is, are the averages about the same? Averages of 6.8 cm and 6.7 cm are close. Are they basically the same number or is the difference significant? How about 6.1 cm and 6.7 cm? Are they really different or not? What about 6.1 cm and 6.8 cm?

The student might argue that the best plant growth will occur when 30 mL of water is added to the plant. However, the small difference between the treatments is hard to ignore. The only reliable way to say that any two averages are close to each other or way apart (significantly different) is with a statistical test. The student cannot ballpark or guesstimate it just by looking at the data. This is where the online *t*-test comes in handy (the *t*-test I commonly use is available at http://graphpad.com/quickcalcs/ttest1.cfm).

Keeping the default settings (“enter up to 50 rows” and “unpaired *t*-test”), if you enter the 10 height values for the control group and the 10 height values for treatment A and click on calculate, you will obtain the *P-value*, or probability value. This value falls between 0 and 1, and it indicates the probability that a difference in averages at least as large as the observed one could have arisen by chance alone, with no real treatment effect. For a school science-fair project, a *P*-value of more than 0.10, or 10%, suggests the averages are most likely not different from each other (many sources, including the online *t*-test software suggested here, use 0.05, or 5%, as the cutoff point; but unless the assumptions of the *t*-test are fully met, a more flexible cutoff point, such as 0.10, or 10%, is suggested). This can be interpreted as a lack of effect of the variable of interest on the data. A *P*-value of less than 0.10 suggests the averages are most likely different from each other. This can be interpreted as an effect of the variable of interest on the data.
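The same decision can be sketched without the website, using only the Python standard library. Computing the exact *P*-value requires the *t* distribution (the online calculator, or `scipy.stats.ttest_ind`, supplies it); this sketch instead compares the *t* statistic to the tabled two-sided critical value for the 0.10 cutoff at 10 + 10 - 2 = 18 degrees of freedom. The heights are hypothetical, not taken from Figure 1.

```python
# Stdlib sketch of the unpaired t-test decision rule at the 0.10 cutoff.
from math import sqrt
from statistics import mean, stdev

def unpaired_t(a, b):
    """Pooled t statistic for two equal-sized groups."""
    n = len(a)
    sp2 = (stdev(a) ** 2 + stdev(b) ** 2) / 2  # pooled variance
    return (mean(a) - mean(b)) / sqrt(sp2 * 2 / n)

T_CRIT_10PCT_DF18 = 1.734  # two-sided critical value from a t table, df = 18

control     = [5.8, 6.3, 5.9, 6.4, 6.0, 6.2, 5.7, 6.5, 6.1, 6.1]  # mean 6.1 cm
treatment_a = [6.5, 7.0, 6.6, 7.1, 6.7, 6.9, 6.4, 7.2, 6.8, 6.8]  # mean 6.8 cm

t = unpaired_t(control, treatment_a)
if abs(t) > T_CRIT_10PCT_DF18:
    print(f"t = {t:.2f}: averages differ (P < 0.10)")
else:
    print(f"t = {t:.2f}: difference could be chance (P > 0.10)")
```

For classroom use, the online calculator remains the simpler option; the sketch just shows that nothing mysterious happens behind the "calculate" button.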

After analyzing the sample data shown in Figure 1, the *P*-value between the control group and treatment A is 0.0033, or approximately 0.3%, suggestive of a real difference in plant height between the plants that received 15 mL of water and those that received 30 mL of water. To compare the data between the control group and treatment B and between treatments, a new page is opened on the website and the data are entered. The data show that there is a difference between the control group and treatment B (*P*-value is 0.012, or 1.2%). With a *P*-value of 0.85, or 85%, it appears that there is no real difference between treatments A and B. Note that only two groups can be compared at a time. This is not the recommended course of action for more formal research (when more than two groups are compared, an ANOVA with post-hoc analysis is suggested, because repeated pairwise *t*-tests inflate the chance of a false positive), but it will work fine for student projects.

The use of a statistical test does not guarantee that the conclusions cannot be challenged. For example, a cutoff *P*-value of 0.10, or 10%, means that even a result flagged as significant still carries up to a 10% chance that the difference between group averages arose from random variation rather than from the variable manipulation. Ten percent is a small number, but it is not zero. Even a *P*-value of 0.001, or 0.1%, fails to rule out chance as a possible (though unlikely) explanation. Random and systematic experimental errors are always present and must not be ignored during the discussion of the findings.
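One way to see what the 0.10 cutoff builds in is a short simulation: draw two groups of 10 from the same population over and over, and count how often the *t*-test still declares a "significant" difference. With no real effect present, roughly one comparison in ten should be a false alarm. The population values below are arbitrary, and the random seed only makes the run reproducible.

```python
# Simulated false-alarm rate of the t-test at the two-sided 0.10 cutoff.
import random
from math import sqrt
from statistics import mean, stdev

random.seed(42)      # seeded so the sketch is reproducible
T_CRIT = 1.734       # two-sided 0.10 critical value, df = 10 + 10 - 2 = 18

def unpaired_t(a, b):
    """Pooled t statistic for two equal-sized groups."""
    n = len(a)
    sp2 = (stdev(a) ** 2 + stdev(b) ** 2) / 2
    return (mean(a) - mean(b)) / sqrt(sp2 * 2 / n)

trials = 2000
false_positives = 0
for _ in range(trials):
    group1 = [random.gauss(6.5, 0.5) for _ in range(10)]
    group2 = [random.gauss(6.5, 0.5) for _ in range(10)]  # same population!
    if abs(unpaired_t(group1, group2)) > T_CRIT:
        false_positives += 1

rate = false_positives / trials
print(f"{rate:.1%} of same-population comparisons still looked 'significant'")
```

The fraction hovers around 10%, which is exactly the risk the cutoff accepts in exchange for not missing real but modest effects.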

What type of projects can be addressed using a *t*-test? Many options come to mind:

- *Do students of different grade levels have different reaction times?* For this project, students will measure reaction time for at least 10 students per grade level.
- *Who can solve a puzzle faster, boys or girls?* This project requires timing how long at least 10 boys and 10 girls take to solve a puzzle.
- *What brand of battery will last longer?* For this project, 10 batteries of each brand are connected to a light bulb or a personal fan and timed to see how long they take to go dead.
- *How many pennies can different brands of paper towels hold?* This project requires holding a sheet of paper towel by the corners and adding pennies one at a time until the paper towel breaks, repeated at least 10 times for each brand of paper towel.

From the perspective of an experienced science-fair judge, good data analysis is important in science-fair projects. The *t*-test is one of the most accessible choices for teachers and students, especially when online statistical software is used. Even in less than ideal circumstances, students who recognize the need for a statistical test and use one demonstrate a better understanding of the experimental process. The website mentioned above includes many other statistical tests with detailed descriptions. Some of them, especially the chi-square, may also be suitable for science-fair projects.

Science teachers and students do not need extensive knowledge of statistical tests to use them. We use computers, cars, and other sophisticated equipment without necessarily being aware of their inner workings. All students need to know is that there is a way to help them determine whether average differences are real or apparent.

I am delighted to see projects with statistical analysis and students with the ability to explain the importance of a *t*-test or to describe what statistical test was the most appropriate for their project. Regardless of how simple or complex the project, students who use statistics demonstrate an increased knowledge of science content and processes. These projects will likely be judged as top in their category.

*Wilson Gonzalez-Espada* (wilson.gonzalezespad@atu.edu) *is an assistant professor of Physical Science/Science Education at Arkansas Tech University in Russellville, Arkansas.*

### Reference

Weinberg, S.L., and K.P. Goldberg. 1990. *Statistics for the behavioral sciences*. Cambridge: Cambridge University Press.