Exploring harmful algal blooms with real-world data
By AMY HAMMETT AND CHAD DORSEY
The modern world is practically defined by data. Students in classrooms today will enter an age in which fundamental interactions with the world—from consumer purchases to climate decisions, political and community actions, and even behavioral interactions—will intertwine with data analysis, representation, and understanding. How can we prepare today’s learners to become data-fluent? How do we do so while readying them for a future we can scarcely predict? Attempting to navigate toward answers while awash in data can challenge even the most seasoned data scientist. Nevertheless, a few tips and examples can serve as guideposts to creating experiences that equip students to thrive in a world of messy data.
To learn with data, students need data to explore. This can be deceptive—data-rich experiences typically involve much more than a straightforward science lab. Solving real problems with data means identifying authentic questions that are meaningful to students and provide a foundation for deep inquiry. Such situations often lend themselves well to project-based learning approaches and are great opportunities for integration across subject areas. Perhaps most critically, they begin with a question of genuine interest to learners and genuine connection to the world at large. For purposes of illustration, we choose one such topic—the frequency of harmful algal blooms (HABs) in waters within and around the United States.
Algal blooms have increased significantly as temperatures have risen, and while some blooms may be simply visual nuisances, a surprising number pose dangers to humans and animal life (Figures 1 and 2). Such harmful algal blooms can modify food webs, alter the taste or quality of seafood or drinking water, or even produce cyanotoxins harmful to both water and land organisms (U.S. Geological Survey n.d.). These toxins have been implicated in human and animal illness and death in at least 43 states (USGS n.d.), and more than 80% of samples throughout all 50 U.S. states and Canada have tested positive for the most potent class of the toxins, called microcystins.
The topic of HABs is a good example of a subject primed for data-centered inquiry. Problems caused by HABs are highly relevant across the United States, and thus are of potential interest to students in many different locales. The causes of HABs are a complex blend of physical, chemical, biological, hydrological, and meteorological conditions (USGS n.d.). Moreover, despite the problem’s prevalence, many important unanswered scientific questions surround the functioning and occurrence of these toxins, making them an ideal launching pad for authentic investigation.
Most importantly for our purposes, research on HABs requires the use of data. These data can be produced by students through investigation of local lakes and ponds or even drinking water from the tap. Data sets from existing scientific surveys provide a second source for potential exploration. Additionally, data can be produced by and fed into statistical models, allowing students to generate and test ideas about the scientific mechanisms at play through mathematical modeling.
Ready to launch into your first data-rich investigation? Some basic guidelines can help ensure that you and your students are headed in the right direction and make progress toward interesting and meaningful learning.
As students first begin investigating a topic, their relationship with the topic and its data is a crucial consideration. Ideal topics support deep investigation and are relevant and motivating to students. By focusing on HABs in our example, we’ve chosen an anchoring phenomenon that inherently grabs students’ interest right where they live. (After all, how could scummy green algae growing in your drinking water not grab anyone’s interest?)
Additionally, with the HAB investigation, students begin by producing their own data from local sources. When beginning data investigations such as this one, students benefit enormously when they are able to leverage hands-on understanding of the processes used to produce the data. For example, our students collect freshwater from their local drinking water reservoir for water quality testing in their school lab (Figure 3). Back at the lab, they filter the water using a standard vacuum pump to quantify nitrogen compounds (NO3 and NH4); total suspended solids (TSS); chlorophyll; and phycocyanin, the pigment in blue-green cyanobacteria (Figure 4).
After measuring light penetration with a Secchi disk (Figure 5), students use sensors, microscopes, and chemical tests to measure turbidity, pH, dissolved oxygen, temperature, light, phosphorus compounds, nitrogen compounds, chlorides, and phytoplankton.
Experiencing and investigating a data-related phenomenon firsthand in this way, before looking at larger, messier, and more abstract data, allows students to first examine self-collected data with the same data tools they will also use to explore larger data sets. (See “Using CODAP to explore HAB data.”)
The absence of a preformed curriculum map in such investigations is both a liberating opportunity and an inherent pedagogical challenge. A systems lens can be a powerful tool to bring along for the journey. When moving into this new territory, invite students to first define the system under investigation—in particular, to specify its boundaries. Such boundaries supply scaffolds as students define their investigations and determine qualities of relevant data sets. Have students consider how the phenomenon is interconnected within and outside the defined system, like how the hydrosphere, geosphere, atmosphere, and biosphere interconnect in this student-defined system (Figure 6).
Then, with this systems view of the landscape, co-create a plan to investigate these systems and how they may or may not be changing. Both their driving questions (Figure 7) and their systems models outline a variety of approaches that your group of students might take to investigate and to generate claims and evidence about the phenomenon (Figure 8).
Preparing students for a future drenched in data means dipping them in the data pool—but not drowning them! Identifying the right level of complexity for the data sets they investigate is an important aspect of any data experience. Several guidelines can be helpful here. The first question is one of size itself. Helping students get a feel for “big data” does indeed mean ensuring their data sets contain enough data points (“cases”) and parameters (“attributes”) such that they can make interesting and original discoveries. However, it doesn’t mean sending them into a forest with no hope of return. The goal should be to find a “just right” data set (Rubin 2019), one with plentiful variation and enough variables to provide interesting avenues of investigation. How do you know you’ve found the right one? While it’s a bit of an art, ideal data sets should be just overwhelming enough to encourage open investigation, but not so complicated that students throw up their hands in despair at first contact.
For our HAB example, data sets from the USGS Water Data for the Nation website (see Figure 9 and “On the web”) can be narrowed to a specific body of water of interest and truncated further to emphasize specific parameters. However, the more comprehensive Water Quality Samples for the Nation (see “On the web”) can be accessed as investigations grow deeper and more complex.
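As a rough illustration of what narrowing and truncating a data set can look like in practice, the minimal Python sketch below filters a small table down to one body of water and a few chosen attributes. The column names, site names, and values are invented placeholders, not the actual layout or contents of a USGS download; a real export would need its own column names substituted.

```python
import csv
import io

# Hypothetical export: column names and values are invented for illustration
# and do not reflect the real USGS file format.
SAMPLE_CSV = """site_name,date,water_temp_c,ph,dissolved_oxygen_mg_l,turbidity_ntu
Lake A,2019-06-01,18.2,7.9,8.4,3.1
Lake B,2019-06-01,20.5,8.3,7.1,5.6
Lake A,2019-06-08,19.6,8.1,8.0,4.0
Lake B,2019-06-08,21.3,8.6,6.5,7.2
"""


def narrow(rows, site, keep):
    """Keep only the rows for one site and only the named columns."""
    return [
        {column: row[column] for column in keep}
        for row in rows
        if row["site_name"] == site
    ]


# Read the table into a list of dictionaries, one per row (case).
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Narrow to a single body of water and three attributes of interest.
lake_a = narrow(rows, "Lake A", ["date", "water_temp_c", "ph"])

for row in lake_a:
    print(row)
```

Starting from a narrowed table like this and widening the list of kept columns later is one way to let the data set grow along with students' questions.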
Building data-rich investigations also means considering other facets of data complexity. Melissa Kjelvik and Elizabeth Schultheis (2019) have proposed a variety of features to consider when engaging students with authentic scientific data explorations. These include considering
In the HAB data, these various features of authentic data are front and center. When producing their own data, for example, students have the opportunity to determine variables to measure and make some choices about techniques they might use to measure them. This selection can be limited by providing more or less scaffolding or by pre-determining measuring instruments. Similarly, obtaining professional HAB data directly from the USGS website is a meaningful exercise in true-life data messiness; to succeed, students must call upon data fluency skills that go far beyond making graphs and extend into computational tools, file types, and data structures. Making different choices, such as providing students with pre-curated data sets, avoids the need for this wrangling, but deprives students of personal agency and the opportunity to build data science chops. The right choice is one you must determine as you weigh these considerations.
When students first investigate data, they benefit from starting with a smaller data set, like this short timeframe of thermal stratification (Figure 10) from their student-constructed thermistor chain (Figure 11). Doing so allows them to appreciate the individual attributes of the data and to think about the relationships involved. With a smaller data set, students can begin to make sense of the data set’s context and explore its variation, both essential aspects of gaining familiarity with a data-focused scenario.
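One relationship students might look for in a small thermistor-chain data set is where temperature changes most quickly with depth. The Python sketch below is only an illustration of that idea: the depths and temperatures are invented, and the function name `steepest_layer` is hypothetical rather than part of any tool the students used.

```python
# Invented profile: one temperature reading per thermistor depth.
depths_m = [0, 1, 2, 3, 4, 5]
temps_c = [22.0, 21.8, 21.5, 17.0, 15.5, 15.0]


def steepest_layer(depths, temps):
    """Return (upper depth, lower depth, gradient) for the pair of
    neighboring sensors with the largest temperature change per meter."""
    best = None
    for i in range(len(depths) - 1):
        # Temperature change per meter between two neighboring sensors.
        gradient = (temps[i + 1] - temps[i]) / (depths[i + 1] - depths[i])
        if best is None or abs(gradient) > abs(best[2]):
            best = (depths[i], depths[i + 1], gradient)
    return best


upper, lower, gradient = steepest_layer(depths_m, temps_c)
print(f"Sharpest change between {upper} m and {lower} m: {gradient} °C per meter")
```

With only a handful of readings, students can check the result by eye against their own plot before trusting the same approach on a longer record.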
When investigating HAB data, students may start with only the data they have produced through a personal investigation. A few key prompts help them evaluate this new “data landscape.” What attributes are involved in their data set? What are their maximum and minimum values? What visualizations are appropriate to employ in examining the data set? Are there any typical or expected values? What is the “shape” of the data? Once students have gained an understanding of their personal data and can describe the representations they are generating and seeing, moving to a larger data set brings them into a new world of investigation (Figure 12). As they enter that world, their previous experience acts as a map for redoubled sense-making.
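The prompts above (attributes, minimum and maximum values, typical values, and shape) can also be explored with a few lines of code. The following Python sketch is one possible approach under stated assumptions: the readings are invented stand-ins for student-collected data, and the helper names `describe` and `bin_counts` are hypothetical.

```python
import statistics

# Invented readings standing in for student-collected data.
measurements = {
    "water_temp_c": [18.2, 19.6, 20.1, 19.0, 21.4, 18.8],
    "ph": [7.9, 8.1, 8.0, 8.4, 8.2, 7.8],
}


def describe(values):
    """Summarize one attribute: its range and two 'typical' values."""
    return {
        "min": min(values),
        "max": max(values),
        "median": statistics.median(values),
        "mean": round(statistics.mean(values), 2),
    }


def bin_counts(values, bin_width):
    """Count how many values fall in each bin, a rough look at 'shape'."""
    counts = {}
    for value in values:
        bin_start = (value // bin_width) * bin_width
        counts[bin_start] = counts.get(bin_start, 0) + 1
    return dict(sorted(counts.items()))


# What attributes are involved, and what are their extremes and typical values?
summary = {name: describe(values) for name, values in measurements.items()}
for name, stats in summary.items():
    print(name, stats)

# What is the "shape" of the temperature data, in 1-degree bins?
temp_shape = bin_counts(measurements["water_temp_c"], 1.0)
for bin_start, count in temp_shape.items():
    print(f"{bin_start:>5} | {'*' * count}")
```

The same questions can be answered graphically in a tool such as CODAP; the code simply makes each prompt explicit.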
Once students are familiar with their data, true inquiry can begin. With robust data related to a core anchoring phenomenon at the helm, the full realm of science and engineering practices lies open for the journey.
That’s the joy, and often the problem, too. As a teacher, helping lead students through this new land can sometimes feel uncertain. If there’s anything to know about the world of data science, it’s that big, messy data sets harbor as many answers as there are question-askers. With an interesting data set, the chances of not finding something worth exploring are extremely slim. Of course, that’s the peril as well: taking stock of students’ status and helping guide them on their journey are both important, but not as important as giving your learners some latitude to find and investigate their own questions and the freedom to follow where the data lead.
Recall the purpose of the endeavor. As teachers of science, our role is not to be the leaders, but the guides. And, as we prepare students for an unknown, data-filled future, our role is to do whatever it takes to ensure that they are so bold and so tenacious that they are empowered to seek answers along trails they blaze themselves. We do well to recall the words of Jean Piaget when he professed that the principal goal of education is to “create men and women who are capable of doing new things, [and] not simply repeating what other generations have done” (1964). Guiding students through data-rich investigations is the first step in building these capabilities for years to come.
Duckworth E. 1964. Piaget rediscovered: A report of the conference on cognitive studies and curriculum development. Paper presented at the Cognitive Studies and Curriculum Development Conference, March. Ithaca, NY: Cornell University.
Kjelvik M., and Schultheis E. 2019. Getting messy with authentic data: Exploring the potential of using data from scientific research to support student data literacy. CBE—Life Sciences Education 82 (2): 1–8.
National Research Council. 2012. A framework for K–12 science education: Practices, crosscutting concepts, and core ideas. Washington, DC: National Academies Press.
Rubin A. 2019. Facebook or Instagram? Teens explore data about technology use. Hands On! Spring/Summer 2019.
U.S. Environmental Protection Agency. 2019. Cyanobacteria Assessment Network (CyAN) Mobile Application in the Google Play™ Store [Press release].
U.S. Geological Survey. n.d. NWQP Research on Harmful Algal Blooms (HABs)