Why do assessment?
In recent years, a number of National Science Foundation and National Research Council reports have advocated the need to improve undergraduate science instruction and enhance science literacy for all students (e.g., NRC 1996; 2002). Many undergraduates, especially women and traditionally underrepresented groups, avoid higher-level science and mathematics. Several students who switch from science and mathematics majors in college report “poor teaching by faculty” as a significant reason for switching (Seymour and Hewitt 1997, p. 32). Examples of poor teaching in the science fields at the undergraduate level include an emphasis on memorizing facts, lack of application of concepts, dullness, and failure to encourage connections among concepts (Kardash and Wallace 2001).
One effort to improve the learning of science is through case study teaching. Teachers use realistic or true narratives to provide opportunities for students to integrate multiple sources of information in an authentic context, and may engage students with ethical and societal problems related to their discipline (Lundeberg, Levin, and Harrington 1999; Herreid 1994). Although such methods are being used in some university-level courses in the fields of science, mathematics, business, and education, relatively little empirical research has examined whether and how these case-based teaching approaches have the desired effects of promoting deep understanding, enabling transfer of ideas to new contexts, and making learning more motivating or valuable for certain student populations, especially traditionally underrepresented groups (Lundeberg, Levin, and Harrington 1999; Lundeberg et al. 2002). What happens in classes that use case study teaching? Do students learn more in case-based science courses? Are they able to make more connections among concepts? Can they apply these concepts to real-life situations? How might case study teaching in science promote scientific literacy in students?
We propose that empirical research, particularly well-designed classroom experiments, has the potential to lead to a strong line of research regarding case-based teaching in science. In this article, we illustrate how such experiments might be done, describing what needs to be measured and how this approach is superior to much of the current research on case study teaching. We urge faculty to think systematically about principles of scientific inquiry in education (Shavelson, Towne, and the Committee on Scientific Principles for Education Research 2002), to use a research design that will provide valid and reliable evidence for claims they want to make and/or processes they want to better understand, to carefully consider measurement issues, and to avoid common problems in evaluation.
Why do investigations in classrooms?
Shavelson, Towne, and the Committee on Scientific Principles for Education Research (2002, p. 2) propose that the basic core of scientific inquiry is the same in all fields, including education, and that the scientific enterprise is guided by the following set of norms, or principles, that shape inquiry:
- Pose significant questions that can be investigated empirically.
- Link research to relevant theory.
- Use methods that permit direct investigation of the question.
- Provide a coherent and explicit chain of reasoning.
- Replicate and generalize across studies.
- Disclose research to encourage professional scrutiny and critique.
Linking classroom investigations of case study teaching to prior research and relevant theories is essential for posing significant questions that lead to quality research. Research questions fall into three categories:
- Description—What is happening?
- Cause—Is there a systematic effect?
- Process or mechanism—Why or how is it happening? (Shavelson, Towne, and the Committee on Scientific Principles for Education Research 2002, p. 99)
Ideally, research questions drive the research design, and methods for investigation are chosen to answer the questions and eliminate alternative explanations. However, one crucial difference in educational research is the importance of context in inquiry. Although scientific research in the lab generally follows a particular paradigm and is carefully controlled to advance or dispute a specific theory, classroom research presents complexity that is neither readily illuminated by a single theory nor measured by a single method. As Pressley, Graham, and Harris (Forthcoming) put it, “Most educational interventions can probably be captured better with diverse theorizing and multiple methods…. Educational intervention research should be diversely theoretical because single theories never fully capture complex phenomena.”
Berliner (2002, p. 18) called education research the “hardest science of all.” He pointed out that educational researchers face problems that are context sensitive, which limits generalization and theory building. Any teaching situation has many variables that influence the interactions among teachers and students. Such variables include students’ motivation to learn, their socioeconomic status, whether they have been given choices among cases, and their academic major. Indeed, research in education needs to “account for influential contextual factors within the process of inquiry and in understanding the extent to which findings can be generalized” (Shavelson, Towne, and the Committee on Scientific Principles for Education Research 2002, p. 80).
Strong research depends on selecting insightful questions that fill a gap in the knowledge base and are grounded in prior research, relevant theory, and classroom practice (Shavelson, Towne, and the Committee on Scientific Principles for Education Research 2002). Researchers studying case-based science teaching might ask multiple questions in conducting rigorous evaluation research. For example, to understand whether case-based teaching had an effect (that is, whether it caused improvements in an outcome of interest, such as motivation or student understanding), a classroom experiment that minimizes confounded comparisons would add to the knowledge base. However, to capture the context of how cases were used in the classrooms under investigation, detailed descriptions (“What is happening?”) and investigations into how case-based teaching influenced outcomes (“How is it happening?”) are also necessary (Shavelson, Towne, and the Committee on Scientific Principles for Education Research 2002).
Classroom-based experiments generally have more external validity than randomized experiments in education; university settings rarely permit random assignment to courses, and research conducted within course settings may have very different results than educational research conducted within a laboratory setting (Lundeberg and Fox 1991). However, we caution researchers interested in designing classroom experiments to pay close attention to internal validity. Internal validity refers to how confidently we can conclude that the change in the dependent variable resulted from the treatment (the independent variable) and not from some extraneous variable. Campbell and Stanley (1966) explain eight extraneous variables that can threaten a study’s internal validity: history, maturation, testing, instrumentation, statistical regression, selection, experimental mortality, and selection interactions. Pressley and Allington (1999, p. 4) also raised valid questions that might help researchers assess whether internal validity criteria have been met:
- Did the control group receive an intervention?
- Were the control participants exposed to the same material as the intervention participants?
- Was there counterbalancing of instructors and experimental condition?
- Were the treatment conditions explicitly described so that replicability studies could be carried out?
- Was there equivalent instructional time in the intervention and control treatments?
- Was there an evaluation of whether the intervention and control treatments were delivered as intended?
- Was the sample size sufficiently large to detect meaningful effects?
- Was the false-positive error rate controlled appropriately and were the analyses appropriate?
- Were the effect sizes reported?
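The questions above about sample size and effect sizes can be made concrete with a small calculation. The following is a minimal sketch, not a method prescribed by the sources cited here: it assumes a simple two-group comparison of test scores, uses Cohen's d with a pooled standard deviation as the effect size, and estimates sample size with a normal approximation. The function names `cohens_d` and `approx_n_per_group` are hypothetical, and the default `alpha` and `power` values are conventional choices rather than requirements.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean, variance


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


def approx_n_per_group(d, alpha=0.05, power=0.80):
    """Approximate students needed per group to detect effect size d.

    Uses the normal approximation for a two-sided, two-group comparison;
    a t-based calculation would give a slightly larger number.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the false-positive rate
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)
```

Calling `approx_n_per_group` with the smallest effect you would consider educationally meaningful gives a rough sense of whether two class sections are large enough, or whether the study should be replicated across sections or institutions.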
Rules of thumb for designing a good study
To investigate the potential of emerging case-based approaches, it is important to develop theoretical understandings of case-based teaching in science, including theories of learning and problem solving in classroom settings. Researchers are advised to design a classroom experiment using counterbalancing to avoid confounding comparisons. A powerful classroom experiment also requires a careful consideration of measurement issues.
One example of a good study design
A research design for a study involving two instructors teaching the same course to two different classes might look something like Table 1.
In this design, the two science classes are counterbalanced for the use of case-based instruction to teach units A and B. Thus, the instructor teaching Class 1 will teach Unit A (Infectious Disease) using the case-based method, while the instructor teaching Class 2 will use a traditional lecture-based method (or another control such as a lab) to teach Unit A. When the two instructors teach Unit B (Genetics), they will switch teaching methods, with the students in Class 2 experiencing the case-based teaching and Class 1 being taught with a traditional method. Ideally, this would be replicated one more time during the semester with units C and D to provide additional data. Thus, the units are balanced for the method of teaching (case-based vs. lecture-based) to minimize potential instructor bias, balanced over two or more units to control for subject effects, and balanced over two classes containing different students to control for student effects. Such a research design helps isolate the impact of case-based instruction on students’ conceptual understanding and may help avoid any instructor bias that might have been present had there been only one instructor.
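The counterbalancing described above can be written out as a small schedule. This is a sketch based only on the description in the text, not a reproduction of Table 1: the function name `counterbalanced_design` and the labels "case-based" and "traditional" are assumptions made for illustration, and the alternation rule simply swaps the two methods between classes from one unit to the next.

```python
def counterbalanced_design(classes, units, methods=("case-based", "traditional")):
    """Assign a teaching method to every (class, unit) pair.

    Methods alternate across units and are offset by class, so each class
    receives each method and each unit is taught with each method.
    """
    design = {}
    for class_index, class_name in enumerate(classes):
        for unit_index, unit_name in enumerate(units):
            method = methods[(class_index + unit_index) % len(methods)]
            design[(class_name, unit_name)] = method
    return design


classes = ["Class 1", "Class 2"]
units = ["Unit A", "Unit B", "Unit C", "Unit D"]
design = counterbalanced_design(classes, units)

# Print one row per class, one column per unit.
for class_name in classes:
    row = [design[(class_name, unit_name)] for unit_name in units]
    print(class_name, row)
```

With two classes and four units, Class 1 receives case-based instruction for units A and C and Class 2 for units B and D, which matches the replication with units C and D suggested above.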
How and what should we measure?
Research with case-based instruction goes beyond the research design. We need to consider how to assess the impact of case-based instruction on student understanding and be confident that our measures assess what we want to assess. The development and use of methodologically sophisticated qualitative methods (interviews, journal entries, observations, and case studies of particular students) should be considered as complements to performance data, such as standardized objective tests or constructed case analysis tests. Such methods would allow us to access students’ cognition and understanding of science concepts, and to understand more about how and why cases affect student understanding in science.
To assess the impact of case-based instruction, familiarize yourself with educational theories of student learning, and think carefully about what you want to know about how cases are potentially affecting students. Do you want to assess instructional outcomes such as student performance, and/or other outcomes such as motivation, persistence, and attendance? Do you wish to measure students’ conceptual understanding and their ability to transfer their understanding to new contexts, rather than their ability to remember facts and figures? These kinds of assessment tools may not be readily available and are unlikely to be typical course exams provided by textbook companies. If you are going to develop open-ended assessments, then you must use a reliable, valid rubric, score pre- and postassessments blind to the time of testing, and collect data demonstrating the reliability of the rating system. Use multiple measures to provide richer evidence about whether, and how, cases affect student performance. Be sure the measures can be replicated and are free of ceiling and floor effects.
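One way to screen a measure for ceiling and floor effects is to look at how many students score at the extremes of the scale. The sketch below is an illustration under stated assumptions: the function name `ceiling_floor_rates` is hypothetical, scores are assumed to be numeric with a known minimum and maximum, and any cutoff for "too many" students at an extreme is a judgment call that the source does not specify.

```python
def ceiling_floor_rates(scores, min_score, max_score):
    """Return (floor_rate, ceiling_rate): the fractions of scores at the
    lowest and highest possible values of the measure."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_rate = sum(score <= min_score for score in scores) / n
    ceiling_rate = sum(score >= max_score for score in scores) / n
    return floor_rate, ceiling_rate
```

A high ceiling rate on a pretest suggests the measure leaves little room to show improvement, and a high floor rate on a posttest suggests the items may be too difficult to distinguish among students.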
We support the advice provided by Pressley, Graham, and Harris (Forthcoming): “As a general rule of thumb, the more implementation and process data collected, the better the chance of explaining why interventions work relative to comparison conditions, when they do, and, why they fail, when they do not. Without such data, the researcher can only conjecture about why the intervention produces the effects it produces….”
We think that the most informative educational intervention research programs of the future are going to use multiple research approaches, with analyses complementing one another to provide information about various aspects and impacts of the intervention.
Let’s consider the previous example of assessing the impact of case study teaching in science on students’ understanding of Infectious Disease (Unit A) and Genetics (Unit B). Measures of student understanding in this study might include pre- and posttests of conceptual understanding (e.g., typical problems) and two additional cases, one for each unit, to measure transfer of knowledge to new situations not discussed in class. Thus, the students from Classes 1 and 2 would be given a case on infectious disease (different from the one Class 1 discussed in class), and this case would serve as a transfer case. The students would then be asked to solve the problem posed in the transfer case. Because only Class 1 uses case-based instruction for infectious disease principles, such a measure would assess student understanding of infectious disease concepts and, through comparison with the control group (Class 2), the impact of case-based instruction on students’ conceptual understanding. Similar procedures would be used to assess whether students acquired conceptual understanding of the genetics concepts and the impact of case-based instruction.
Ideally, the conceptual tests as well as the transfer tests should be scored by both instructors, who would be blind to whether the tests were pretests or posttests. Moreover, the instructors would compute inter-rater reliability to ensure that their scoring was consistent. The instructors might videotape the case discussions and conduct focus-group interviews with the students to gain additional insight into how and what students were learning from the case situations. They might trade classes to conduct the focus-group interviews, since students might be more honest with a person other than their course instructor. An additional source of data might be to give students their own pre- and posttests to compare and ask them to analyze and reflect on changes in their performance.
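The text does not name a particular inter-rater reliability statistic. As one possible illustration, the sketch below computes Cohen's kappa, which corrects the raw agreement between two raters for the agreement expected by chance. The function name `cohens_kappa` is hypothetical, and the choice of kappa is an assumption; for ordinal rubric scores, a weighted kappa or an intraclass correlation may be more appropriate.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters scoring the same tests.

    rater_a and rater_b are equal-length sequences of rubric categories,
    one entry per test, in the same order.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of tests")
    n = len(rater_a)
    # Proportion of tests on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected if each rater assigned categories independently
    # at their own observed rates.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[category] * counts_b[category] for category in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)
```

Each instructor would score the same set of anonymized tests independently, and the two lists of scores would be passed to the function. Low values would signal that the rubric needs clarification before the scores are analyzed.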
We encourage science professors to collaborate with education professors and/or educational psychologists to investigate the use of case study teaching in science. Such collaborations lead to stronger designs and the accumulation of research that will strengthen the field. Finally, we encourage professors to collaborate across sections of courses and across institutions to investigate the efficacy of case study teaching in science.
Mary A. Lundeberg (mlunde@msu.edu) is a professor in the Department of Teacher Education and Aman Yadav is a doctoral candidate in the Learning, Technology, and Culture program at Michigan State University in East Lansing, Michigan.
This is Part I of a two-part series dealing with the assessment of case study teaching. Part II will look at the problems with previous studies and what we know in spite of the difficulties.
Acknowledgment
This material is based upon work supported by the NSF under Grant No. 0341279. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.
References
Berliner, D.C. 2002. Educational research: The hardest science of all. Educational Researcher 31 (2): 18–20.
Campbell, D.T., and J.C. Stanley. 1966. Experimental and quasi-experimental designs for research. Chicago: Rand McNally.
Herreid, C.F. 1994. Case studies in science—A novel method in science education. Journal of College Science Teaching 23 (4): 221–29.
Kardash, C.M., and M.L. Wallace. 2001. The perceptions of science classes survey: What undergraduate science reform efforts really need to address. Journal of Educational Psychology 93 (1): 199–210.
Lundeberg, M.A., and P.W. Fox. 1991. Do laboratory findings on text expectancy generalize to classroom outcomes? Review of Educational Research 61 (1): 94–106.
Lundeberg, M.A., B.B. Levin, and H.L. Harrington. 1999. Where do we go from here? Reflections on methodologies and future directions. In Who learns what from cases and how: The research base for teaching and learning with cases, eds. M.A. Lundeberg, B.B. Levin, and H.L. Harrington. Mahwah, NJ: Lawrence Erlbaum Associates.
Lundeberg, M.A., K. Mogen, M. Bergland, K. Klyczek, D. Johnson, and E. MacDonald. 2002. Fostering ethical awareness about human genetics through multimedia-based cases. Journal of College Science Teaching 32 (1): 64–69.
National Research Council (NRC). 1996. National science education standards. Washington, DC: National Academy Press.
National Research Council (NRC). 2002. Investigating the influence of standards. Washington, DC: National Academy Press.
Pressley, M., and R. Allington. 1999. What should reading instructional research be research of? Issues in Education 5 (1): 1–35.
Pressley, M., S. Graham, and K. Harris. Forthcoming. The state of educational intervention research as viewed through the lens of literacy intervention. British Journal of Educational Psychology.
Seymour, E., and N.M. Hewitt. 1997. Talking about leaving: Why undergraduates leave science. Boulder, CO: Westview Press.
Shavelson, R.J., L. Towne, and the Committee on Scientific Principles for Education Research, eds. 2002. Scientific research in education. Washington, DC: National Academy Press.