
Research & Teaching

Exemplar Teaching Practices in STEM Courses in U.S. Universities

Journal of College Science Teaching—March/April 2022 (Volume 51, Issue 4)

By Corbin M. Campbell

Based on the largest multi-institutional observational study of undergraduate courses in the United States, this article describes exemplar teaching practices in engineering courses as an interdisciplinary science field. The College Educational Quality (CEQ) research project studied 587 courses in nine different U.S. colleges and universities. This article reports on findings from engineering courses in the study. The article describes in-depth subject-matter teaching, using students’ prior knowledge and cognitive complexity, and also discusses the course contexts (e.g., size, faculty, mode) in which there is a greater likelihood of using exemplary practices in U.S. engineering courses.


College teaching is one of the most important influences on student outcomes in higher education (Mayhew et al., 2016), yet it receives comparatively little attention in the higher education literature base (Pallas & Neumann, 2019). Additionally, there is evidence that many courses in science, technology, engineering, and mathematics (STEM) do not use best educational practices, which contributes to students dropping out of STEM and creates pipeline problems for STEM occupations. In the United States, the National Science Foundation (NSF) has created funding streams to improve teaching in STEM courses in universities.

Although there have been many studies of students’ views of college teaching through course evaluation surveys and national surveys, such as the National Survey of Student Engagement, few studies have examined college teaching from the perspective of expert observers. Observational studies demonstrate that student and faculty surveys typically overestimate teaching effectiveness compared to expert observers’ ratings (Campbell et al., 2018).

The purpose of this article is two-fold. First, this article describes four foundational teaching practices in higher education that have applications to STEM courses. Second, the article describes an observational study of college teaching and how engineering courses fare compared to non-engineering courses with regard to use of the exemplary practices, considering engineering as an interdisciplinary STEM field.

Four exemplary teaching practices

The four exemplary teaching practices described in the following sections have been cited in prior literature as important for student learning in university courses.

In-depth subject-matter knowledge

Exemplary university teachers focus on the core ideas of the discipline and study these in depth during their teaching (Neumann, 2014). An expert teacher understands the subject-matter ideas that are core to the disciplinary knowledge for the course. This teacher is able to “map” how these core ideas build on each other and therefore appropriately sequences the course ideas for students to develop subject-matter understanding. In addition to sequencing, the professor focuses on teaching the most important core ideas in depth rather than focusing on breadth (Shulman, 2004). The instructor carefully and intentionally selects the core ideas, then teaches these core ideas using multiple representations, engaging students in different ways with this content, such as by teaching the same core idea in different ways (e.g., lecture followed by group work or pair and share), with different kinds of examples, and with different media. Finally, an exemplary university teacher will connect the core ideas of the course to how these ideas play out in the field and in the discipline (Neumann, 2014).

Using students’ prior knowledge

There is evidence that students learn subject-matter knowledge better when a teacher connects the core ideas of the course to students’ prior knowledge and understandings (Neumann, 2014; Shulman, 2004). Students’ prior knowledge is both what they know about the subject matter before the course and how they understand that subject matter (i.e., the assumptions that undergird their knowledge). This prior knowledge and understanding can come from learning in prior courses and schools as well as from lived and cultural experiences that could provide knowledge from home and community and that may be based in students’ identities (e.g., race and gender). An exemplary university teacher will surface and explore in depth both what students already know about the subject matter and how they understand it. They will then use students’ prior knowledge instructively in the course. For example, students who have grown up with parents who are engineers may have a different prior knowledge of the subject matter based on their lived experiences in their household.

Supporting learning and changing understandings

An exemplary university teacher will use students’ prior knowledge as a bridge to understanding new ideas (Neumann, 2014). Ideally, students’ prior knowledge can be a springboard for them to understand new ideas. For example, students’ prior experiences with falling can help them understand gravity. However, students’ prior knowledge will sometimes conflict with the new learned content. For example, if a student has grown up in a family that does not believe in climate change, understanding global warming will be a challenging concept. Likewise, for girls who have grown up in households or societies that are biased against women in STEM, the very nature of STEM learning may threaten their prior understandings and conceptions of themselves. In these cases, an exemplary teacher who already understands the students’ prior knowledge will help support the student through the dissonance they may experience with the new ideas. The professor will do this both cognitively and emotionally. For example, the professor may slowly, intentionally, and methodically reveal inaccuracies in prior thinking (cognitive) and provide a nonjudgmental, accepting, and supportive environment for students to come to terms with the new ideas.

Cognitive complexity

The fourth exemplary teaching practice is cognitive complexity. An exemplary university teacher will build up to higher levels of cognitive complexity in the course. The revised Bloom’s taxonomy contends that academic work can engage students in six increasingly complex levels: remember, understand, apply, analyze, evaluate, and create (Anderson & Krathwohl, 2001). There is evidence that connects higher-order cognitive assignments and several important student outcomes at the university level (Nelson Laird et al., 2014). In this way, teachers who require rote memorization will not be as effective as those who help students understand content, apply it to real-world settings, analyze its usefulness in the field, and then hypothesize revisions to the content for better use.


The College Educational Quality (CEQ) research study used data from a multi-institutional quantitative observational study of university classrooms across nine institutions and 587 courses (engineering courses = 120; non-engineering = 467). By quantitative observation, we mean an observation protocol that uses a closed-ended, highly structured rubric and coding scheme, with raters specifically trained to rate according to the conceptual framework used in this study (Stallings & Mohlman, 1988).


The CEQ research team recruited institutions through contacts with two institutional consortia: the Higher Education Data Sharing (HEDS) Consortium and the American Association of State Colleges and Universities. This recruitment process yielded 18 institutions, and we purposefully selected nine to participate in this study. While this sample is not representative of U.S. higher education institutions, we included several different institutional types (e.g., private, public, research, liberal arts) to understand teaching and course rigor in engineering courses across different institutional types.

We employed stratified random sampling of undergraduate courses within each institution, stratifying by faculty category (tenure line), class size, and discipline (division). We selected 350 courses from each institution, and faculty teaching the sampled courses were invited to consent to observation. More than one third (34.3%) of invited faculty agreed to participate for one or more courses (similar to national faculty survey response rates). Observed courses were representative of course mode (online versus on-site) but slightly over-represented tenure-line faculty.


Site teams of between 7 and 10 observers visited each institution for 1 week during the middle of the semester. While some observers rated across sites, many were placed on only one site team—as such, our design was not fully crossed. The raters completed an extensive training procedure, which included approximately 30 hours of training on the theory and evidence supporting each teaching construct, how to rate using the rubric according to the conceptual frameworks, the logistics of rating, and practice ratings. Raters were required to pass a test on their knowledge of the conceptual frameworks for the study and the observer procedures, as well as an inter-rater reliability certification. Each class was observed by two raters (with some exceptions due to scheduling conflicts, in which case one rater observed). The raters rated the entire duration of one class period in the middle of the semester.

Data sources

We created rubrics to assess teaching and academic rigor, assisted by both content experts (in college teaching and academic rigor) and methodological experts (in survey and rubric design) who tested the rubrics for content and response process validity.

Inter-rater reliability

To calculate the inter-rater reliability of the observation data, we used a one-way, absolute, average-measure, mixed-effects intra-class correlation (ICC) calculation (Hallgren, 2012). Across all items, the ICC was 0.705, with sub-scale ICCs ranging from 0.664 to 0.787 (Cicchetti’s [1994] cut-off values: 0.60–0.74 good; ≥ 0.75 excellent).
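A one-way, average-measure ICC of this kind can be sketched in a few lines. The function below is an illustration of the general ICC(1,k) formula, not the study’s actual analysis code, and the sample ratings are hypothetical.

```python
from statistics import mean

def icc_1k(ratings):
    """One-way, absolute-agreement, average-measure ICC (ICC(1,k)).

    `ratings` is a list of [rater1, rater2, ...] scores, one row per
    observed class, with the same number of raters (k) per row.
    """
    n = len(ratings)     # number of rated targets (classes)
    k = len(ratings[0])  # number of raters per target
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]

    # Between-target and within-target mean squares
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / ms_between

# Hypothetical pairs of observer ratings for five classes
pairs = [[3, 4], [2, 2], [4, 5], [1, 2], [5, 5]]
print(round(icc_1k(pairs), 3))  # → 0.935
```

Perfect agreement between the two raters yields an ICC of 1.0; greater within-class disagreement pulls the value down.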

Scale creation

The CEQ study created three teaching practices scales (subject matter knowledge, prior knowledge, and supporting changing views) and one academic rigor scale (cognitive complexity; Campbell et al., 2019). The scales were validated by confirmatory factor analyses (CFA) for construct validity and relationships among the four constructs in this study and one additional construct (not included in this article). Model fit indices indicated excellent fit of a five-factor intercorrelated model (RMSEA = 0.049, CI [0.041, 0.057], CFI = 0.965, TLI = 0.956, SRMR = 0.047). Each item adequately tapped the latent construct of interest, as evidenced by standardized loadings ranging from 0.566 to 0.983. The constructs were highly reliable (coefficient H ranged from 0.809 to 0.970; Hancock & Mueller, 2006).

To create the scales used in this study, the ratings of the two observers were averaged, and we excluded any item scores where the discrepancy between the two raters was greater than two response options. Scale scores are means of the individual items that were scored within each construct. For example, if a class had lecture and instructor questions but no class discussion or activities, the cognitive complexity scale score would have been calculated as the mean of the lecture and instructor-question scores.
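As a sketch of this scoring rule (the item ratings are hypothetical and the CEQ study’s actual scoring code is not published), averaging paired ratings while dropping high-discrepancy items might look like:

```python
def scale_score(item_ratings, max_gap=2):
    """Average two observers' item ratings into a scale score.

    `item_ratings` is a list of (rater_a, rater_b) tuples, one per
    rubric item. Items whose ratings differ by more than `max_gap`
    response options are excluded, mirroring the rule described above.
    """
    kept = [(a + b) / 2 for a, b in item_ratings if abs(a - b) <= max_gap]
    if not kept:
        return None  # no usable items for this construct
    return sum(kept) / len(kept)

# Hypothetical ratings: the middle item (2 vs. 5) is discarded
print(scale_score([(3, 4), (2, 5), (4, 4)]))  # → 3.75
```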


Given that the research questions explored whether engineering courses had different in-class teaching and rigor than other courses, descriptive differences rather than causal inference were of interest, and t-tests (p < 0.05) were employed. First, engineering courses were compared to all other courses on the effectiveness of the teaching practices (i.e., subject-matter knowledge, prior knowledge, and supporting changing views) and the level of academic rigor (i.e., cognitive complexity). Second, we examined whether there were differences in effectiveness of the teaching practices within engineering courses across faculty category (tenure track or non-tenure track) and class size.
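An independent-samples comparison of this kind can be sketched with a pooled-variance t-test. The function and the scale scores below are illustrative only, not the study’s data or code.

```python
from statistics import mean, variance

def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance (equal-variance form).

    Returns the t statistic and its degrees of freedom.
    """
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = (pooled * (1 / na + 1 / nb)) ** 0.5
    return (mean(sample_a) - mean(sample_b)) / se, na + nb - 2

# Hypothetical scale scores for engineering vs. non-engineering classes
eng = [2.5, 2.7, 2.6, 2.8, 2.4]
non_eng = [3.4, 3.6, 3.5, 3.3, 3.7]
t, df = pooled_t(eng, non_eng)
print(round(t, 2), df)  # → -9.0 8
```

Here |t| = 9.0 on 8 degrees of freedom far exceeds the two-tailed 0.05 critical value of about 2.306, so the (hypothetical) group difference would be judged significant.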


Comparing engineering to other disciplines

All four exemplar teaching practices in this study were rated significantly higher in non-engineering courses than in engineering courses (p < 0.01). Means that show the differences can be found in Table 1.

Engineering courses by faculty category

Within engineering courses, certain teaching practices showed differences according to faculty category. Using students’ prior knowledge and supporting learning and changing understandings were both rated significantly higher for tenure-track than non-tenure-track faculty (p < 0.01 and p < 0.05, respectively). By contrast, in-depth subject-matter knowledge and the level of cognitive complexity did not differ significantly across faculty category. Means can be found in Table 2.

Engineering courses by class size

Supporting learning and changing understandings was the only exemplar teaching practice that differed significantly between small classes (25 students or fewer) and medium and large classes (more than 25 students), with small courses having a higher mean (p < 0.01). Means can be found in Table 3.


In general, this study found that the engineering courses included in this study largely were not using the four exemplary teaching practices effectively. Means for the first three exemplar teaching practice scales in engineering courses ranged from 2.5 to 3.4. In the rubric rating for these practices, 2 was “ineffective,” 3 was “somewhat effective,” and 4 was “effective.” The fourth exemplar practice (cognitive complexity) had a mean of 3.2 in engineering courses, indicating that, on average, the highest level of cognitive complexity reached during the class session was “apply.” Little analyzing and hypothesizing took place in these courses; instead, the majority of class time was spent “understanding” the content.

Additionally, across all four scales, engineering courses scored lower on the four exemplar practices than non-engineering courses. These two pieces of evidence (the absolute ratings in engineering and the comparative ratings against all courses) seem to indicate that engineering courses in U.S. universities that were included in this study could improve in the use of the exemplar practices.

This finding is significant because the United States has a gap in fulfilling the STEM pipeline—not enough undergraduate students are enrolling in, remaining in, and graduating from STEM degree programs. Given that teaching practices have an important influence on student outcomes in higher education (Mayhew et al., 2016), improving teaching in engineering courses could improve the STEM pipeline in U.S. universities. STEM courses in the United States have often used didactic, professor-centered rather than student-centered models of teaching and have relied on passive rather than active learning techniques. The four exemplar practices discussed in this article require more focus on the student and the way the student understands and interacts with the subject matter. Changing these teaching practices may require a paradigm shift in STEM education that has important implications for all undergraduate science courses (Michel et al., 2018).

Likewise, the CEQ study found that within engineering courses, there are some course characteristics that may support the use of these practices. The tenure process is associated with greater use of students’ prior knowledge and supporting learning and changing understandings in engineering courses. Perhaps these practices require more involvement in the university and understanding of the student populations, which may be facilitated by a tenure-track or tenured faculty member.

One surprising finding is that class size was largely not associated with the use of the four exemplary practices. The only exception to this was supporting learning and changing understandings, where smaller class size was associated with stronger use of this exemplar practice. Of the four exemplar teaching practices, this practice requires knowing students the most (Bransford et al., 2000; Ladson-Billings, 1995). It requires not only knowledge of students’ understandings and experiences but also use of this knowledge and bridging students’ knowledge with the new course ideas. This practice may require a smaller class size to facilitate. Likewise, the kind of emotional support that some students may need to grasp the new course ideas may be difficult to facilitate in larger courses.

However, the finding that three exemplar practices (in-depth subject-matter knowledge, using students’ prior knowledge, and the level of cognitive complexity) did not vary by class size is important. Some science professors may assume that student-centered active learning techniques are more feasible with a smaller class size. This study found that engineering faculty enacted several exemplar practices in classes with more than 25 students in similar ways to faculty teaching smaller courses. As such, practices such as creating multiple representations of the course content or teaching the core ideas of the course in depth may be enacted equally well in courses with small and large enrollments. Likewise, use of students’ prior knowledge and the level of cognitive complexity were similar across class sizes, which may indicate that faculty can successfully enact these practices in larger courses. This finding warrants further study with different thresholds for class size, given that small and large courses may be defined differently across disciplines and institution types.


Engineering as a field is interdisciplinary in nature and therefore may reflect broader trends in undergraduate science education. Therefore, the CEQ study contributes to understanding undergraduate engineering education, but because the study is a large, multi-institutional observational one, insights may be gleaned for broader science education as well. This study demonstrates that there is room for improvement and that STEM faculty in particular may need training on these important teaching techniques. Further research is required to understand the difference in teaching for non-majors. Additionally, this study is limited in that it only studied four exemplar practices, when other practices may also be important for STEM courses, such as active learning, collaborative learning, and lab practices.


This work was supported by a fellowship from the National Academy of Education and the Spencer Foundation.

Corbin M. Campbell is associate dean of academic affairs and associate professor in the School of Education at American University in Washington, DC.


Anderson, L. W., & Burns, R. B. (1989). Research in classrooms: The study of teachers, teaching and instruction. Pergamon Press.

Anderson, L. W., & Krathwohl, D. R. (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom’s taxonomy of educational objectives. Addison-Wesley Longman.

Bransford, J. D., Brown, A. L., & Cocking, R. R. (Eds.). (2000). How people learn: Brain, mind, experience, and school. National Academies Press.

Campbell, C. M., Jimenez, M., & Arrozol, C. A. (2019). Education or prestige: The teaching and rigor of courses in prestigious and non-prestigious institutions in the U.S. Higher Education, 77(4), 717–738.

Campbell, C. M., Michel, J. O., Cervantes, D., & Wang, D. (2018, Nov.). Whose view? Comparing student survey, faculty survey and observers in research on college teaching [Roundtable presentation]. Association for the Study of Higher Education Annual Meeting, Orlando, FL, United States.

Cicchetti, D. V. (1994). Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment, 6(4), 284–290.

Hallgren, K. A. (2012). Computing inter-rater reliability for observational data: An overview and tutorial. Tutorials in Quantitative Methods for Psychology, 8(1), 23–34.

Hancock, G. R., & Mueller, R. O. (2006). Structural equation modeling: A second course. Information Age Publishing.

Ladson-Billings, G. (1995). But that’s just good teaching: The case for culturally relevant pedagogy. Theory into Practice, 34(3), 159–165.

Mayhew, M. J., Rockenbach, A. N., Bowman, N. A., Seifert, T. A., Wolniak, G. C., Pascarella, E. T., & Terenzini, P. T. (2016). How college affects students: Vol. 3. 21st-century evidence that higher education works. Jossey-Bass.

Michel, J. O., Campbell, C. M., & Dilsizian, K. (2018). Is STEM too hard? Using Biglan to understand academic rigor and teaching practices across disciplines. Journal of the Professoriate, 9(2), 28–56.

Nelson Laird, T. F., Seifert, T. A., Pascarella, E. T., Mayhew, M. J., & Blaich, C. F. (2014). Deeply affecting first-year students’ thinking: Deep approaches to learning and three dimensions of cognitive development. Journal of Higher Education, 85(3), 402–432.

Neumann, A. (2014). Staking a claim on learning: What we should know about learning in higher education and why. The Review of Higher Education, 37(2), 249–267.

Pallas, A., & Neumann, A. (2019). Convergent teaching: Tools to spark deeper learning in college. Johns Hopkins University Press.

Shulman, L. S. (2004). The wisdom of practice: Essays on learning, teaching, and learning to teach. Jossey-Bass.

Stallings, J. A., & Mohlman, G. G. (1988). Classroom observation techniques. In J. P. Keeves (Ed.), Educational research methodology, and measurement: An international handbook (pp. 469–474). Pergamon.
