Research and Teaching
Journal of College Science Teaching—January/February 2020 (Volume 49, Issue 3)
By David J. Weiss, Patrick McGuire, Wendi Clouse, and Raphael Sandoval
Although clicker use in general chemistry courses has increased, many chemistry faculty still do not believe that clickers improve student learning, and clickers are not used in the majority of courses (Emenike & Holme, 2012; Gibbons et al., 2017; Terrion & Aceti, 2012). Gibbons et al. (2017) recently reported that 75% of chemistry faculty do not use clickers and that clickers are most commonly used in large courses of 300 to 500 students. Various approaches to using clickers in chemistry courses have been reported with mixed results. Holme (1998) reported that students did better on in-class clicker questions when they talked with each other and used peer-assisted learning, defined as cooperative learning in which students work in groups to solve a problem. Students seemed to do better in the course overall when they worked in groups. Other authors investigated the effect of students using clickers individually compared with students not using clickers: King and Joshi (2008) found a weak correlation between exam performance and using clickers individually, suggesting that more quantitative research was needed.
Pearson (2017) evaluated the effectiveness of students working individually with clickers compared with students working in an assigned group that shared a single clicker. He found that pharmacy students learning chemistry performed better, increasing their course grade by 3.5%, when working in assigned groups rather than individually. Stockwell, Stockwell, and Jiang (2017) noted that in a biochemistry course, students who worked together on clicker questions at the end of the class period performed better on quizzes than students who worked individually. Warfa (2016) also investigated the effect of working cooperatively in groups and found that students working together performed better in the course compared with independent traditional learning. In general, research has supported that group work has some benefits over working individually on problems during lecture.
However, not all of the chemistry literature has demonstrated that clickers are beneficial to learning. Morice, Michinov, Delaval, Sideridou, and Ferrières (2015) found no significant difference in performance between students who worked individually and students who engaged in peer learning in a chromatography course. Moreover, Liu et al. (2017) suggested that the learning improvement from clickers is unclear in chemistry courses and that more research needs to be conducted. MacArthur and Jones (2008) also reviewed the literature on the use of clickers in chemistry courses and indicated that although clickers might improve engagement, the results were mixed on whether they improved student performance in the course. It is generally agreed that engagement increases when clickers are used in chemistry courses; however, it is less clear that there is overall learning improvement (Emenike & Holme, 2012; Hoekstra, 2008; King & Joshi, 2008; Liu et al., 2017).
Research into clickers outside of chemistry has focused on the effect on student learning in different disciplines. For example, McDonough and Foote (2015) compared students working independently with clickers to students working in a group sharing one clicker in an English course. They found that students working together in a group with one clicker performed better on clicker questions in class than those who worked by themselves; this finding is similar to Pearson’s (2017) work in chemistry courses. Research has also been performed to determine whether using clickers improved course performance in general. Blasco-Arcas, Buil, Hernández-Ortega, and Sese (2013) investigated the effect of adding clickers to a business course and found some correlation between student learning and cooperative group work, but the correlation was statistically weak. These authors suggested that further studies compare a group of nonclicker users with clicker users. Lopez, Love, and Watters (2014) reported that the discipline may affect how clickers are used and how effective they are, and they recommended that more studies be done before drawing a conclusion on clickers’ effectiveness. Liu et al. (2017) reported a similar result in the biosciences, where some literature suggested an improvement in student learning but other literature did not. In general, little empirical work has been done relating clicker use to student course and exam performance over multiple semesters.
Clearly, there is no agreement regarding which clicker procedures are best to incorporate in one’s teaching strategies in chemistry courses or whether these strategies are effective in improving student learning. Most work compares individual clicker work with group work, or group work with traditional lecture, using only a few semesters’ worth of data. To address this research gap, we report a larger comparison of multiple semesters of data from over 1,500 students to explore the group differences. In addition, we compare groups that students form themselves in class (unassigned groups) with formally assigned groups, a comparison that has not been previously reported in the literature.
More specifically, in this work we evaluated whether clickers were effective in a General Chemistry I course of around 100 students by comparing three teaching strategies: (a) traditional lecture, (b) clickers used in unassigned peer-learning groups where students work either in groups or individually, and (c) required clicker participation with assigned peer-learning groups where students work together on all clicker questions during lecture. The research questions driving this study are: (a) Is there a statistical difference among student performance in traditional lecture, in unassigned groups using clickers, and in assigned peer-learning groups using clickers; and (b) In what ways do the different teaching and learning approaches impact the number of DFWs?
Data for this study were collected over a 13-year time frame (2001–2014) across 26 semesters of General Chemistry I. Although the course was offered over a 13-year period, the general course composition (including primary learning objectives, content coverage, quizzes, and major assessments) remained consistent. The same professor taught all sections of the course analyzed in this study. Class sizes for general chemistry ranged from 63 to 104 students during fall and spring semester courses and from 10 to 26 students during summer semester courses. Fall, spring, and summer terms were included, and the total contact time and the material covered were commensurate across terms. No summer terms were taught using the traditional instruction method. A comparative description of each instructional framework is included in Table 1.
Table 1  


General Chemistry I focuses on atoms, molecules and ions, stoichiometry, reactions in aqueous solutions, thermodynamics, atomic structure and periodicity, electron configuration, bonding and structure, hybridization and molecular orbital theory, gases, intermolecular forces, and colligative properties. General Chemistry I sections were taught three times a week for 15 weeks at 75 minutes each, with a lab associated with the course that was worth 20% of the course grade. Student performance during each course was quantified using three midterm exams, four short quizzes, and one final cumulative examination. We used two editions of the same textbook, which were very similar over this time period (Kotz & Triechel, 1999; Kotz, Triechel, & Weaver, 2006).
Our research plan involved analyzing the course across three different teaching and learning frameworks over a decade. Initially, the course was taught for multiple semesters in the traditional manner where the faculty lectured and did some example problems during the lecture, asking the class questions along the way. Other semesters involved asking students to form their own groups, which we are calling “unassigned groups.” The professor observed that some students worked together on clickers and others worked individually, with perhaps half of the students working in changing groups as students sat in different seats throughout the semester. Finally, multiple semesters were taught where the faculty assigned students into groups of four based on where they were sitting in the room and asked them not to move seats during the semester. Clicker questions were the same for each course whether students were in unassigned groups or assigned groups, and the course grading was the same, with the same grading scale used in comparison of course grades. The same number of hourly and final exams, quizzes, and homework sets with similar difficulty were assigned.
Univariate analysis of variance (ANOVA) is a hypothesis-testing method that compares the significance of mean differences on the dependent variable between or among several treatment groups (Agresti & Finlay, 1997). Because we were interested in the group differences in course outcomes based on the different teaching strategies and were testing one quantitative dependent variable for each research question with one categorical independent variable, ANOVA, as used in Crossgrove and Curran (2008), was the appropriate measure. The question of which teaching strategy resulted in the best learning outcomes was germane to the analysis; therefore, in addition to the primary ANOVA, we completed pairwise comparison procedures (Tukey honestly significant difference [HSD]) to determine which groups were different from other groups (e.g., does one instructional framework result in a higher course score than the others?). All tests were performed in SPSS Statistics 25.
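As a rough illustration of the test described above, the following sketch computes a one-way ANOVA F statistic from first principles using only the Python standard library. The three score lists are invented placeholder values, not data from this study, and the function name `one_way_anova` is hypothetical; the analysis reported here was run in SPSS, not with this code.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples.

    F is the ratio of the between-groups mean square to the
    within-groups mean square.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder course scores (percent) for three instructional frameworks.
# These numbers are made up for illustration only.
traditional = [68.0, 72.0, 76.0]
unassigned = [70.0, 75.0, 80.0]
assigned = [74.0, 78.0, 82.0]

f_stat, df_b, df_w = one_way_anova([traditional, unassigned, assigned])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A significant F only indicates that at least one group mean differs; a pairwise procedure such as Tukey HSD is still needed to say which groups differ.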
For appropriate use of any inferential statistical test results, certain data assumptions must be met. In the case of univariate ANOVA, these assumptions include independence of samples, normal distributions, and homogeneity of variance. Within the literature, we find that one-way ANOVA is not sensitive to violations of the normality and homogeneity of variance assumptions (Gravetter & Wallnau, 1999; Kennedy & Bush, 1985); therefore, screening methods focused on missing data and independence of samples. Our research design includes observations of different individuals in each group, and there are no participants in more than one group at each time; therefore, we can assume that the data fulfill the criteria for independence of observations.
Course grading records from 2001 to 2014 were merged with student preparation data, which are housed in central university databases. The resulting population included 1,856 cases spanning general chemistry courses over 27 semesters. Of the 1,856 cases in the population, 305 were removed from the analysis because the student withdrew from the course before the official institutional census date, which left 1,551 active cases. Of the remaining cases, 176 were missing ACT scores from the master data set. Because the Math ACT scores were only used for demographic screening to ensure populations were similar, no additional steps were taken to infer missing values. We believe the demographics provide an accurate snapshot of our student body and their capabilities. To ensure that the student population did not vary over time, a one-way ANOVA was used to screen demographics for differences between the instructional groups. Variables used for screening were incoming student academic credentials (high school GPA and ACT math scores). Table 2 contains descriptive statistics for the student population.
Table 2  


ANOVA screening for demographic differences determined that there were no statistically significant differences at the p < .05 level in high school GPA between groups, F(2, 1345) = 0.90, p = .403, or mean ACT Math Score, F(2, 1372) = 2.958, p = .052. Based on this analysis, we concluded the student population did not change significantly over time and student level of preparation would not confound the final ANOVA results.
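The degrees of freedom reported above can be cross-checked against the case counts given earlier. For a one-way ANOVA with k groups and N cases, the within-groups degrees of freedom equal N − k. The sketch below (the helper name is hypothetical) recovers N from each reported test, assuming k = 3 instructional groups.

```python
def cases_from_df_within(df_within, n_groups):
    """Invert df_within = N - k to recover the number of cases N."""
    return df_within + n_groups


N_GROUPS = 3  # traditional, unassigned groups, assigned groups

active_cases = 1856 - 305      # cases left after pre-census withdrawals
with_act = active_cases - 176  # active cases that have an ACT score

n_gpa = cases_from_df_within(1345, N_GROUPS)  # from F(2, 1345)
n_act = cases_from_df_within(1372, N_GROUPS)  # from F(2, 1372)

print(active_cases, with_act, n_gpa, n_act)  # prints: 1551 1375 1348 1375
```

The ACT Math test therefore rests on 1,375 cases, which matches the 1,551 active cases minus the 176 with missing ACT scores. By the same arithmetic, the high school GPA test would imply that 1,348 cases had a GPA on record.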
To compare student performance, we analyzed the distribution of letter grades across each instructional framework. All course grades are reported on the same grading scale using the following percentage ranges: 0%–55% = F; 56%–60% = D; 61%–64% = C; 65%–69% = C+; 70%–74% = B−; 75%–79% = B; 80%–84% = B+; 85%–89% = A−; 90%–100% = A. In Table 3 we include the total percentage of students who earned an A through C grade (grouped together to demonstrate success) as well as the percentage of students who earned a D or an F, along with the percentage of students who dropped (W) after the institutional official census date (usually after 6 out of 15 weeks).
Table 3  


The percentage of students earning letter grades of A, B, or C increased from 72% in traditional lecture sections to approximately 77% in the sections using peer-assisted clickers with assigned groups, an increase of nearly 5 percentage points in students receiving passing grades, which is larger than the 3.63-point difference observed between the traditional and unassigned-group sections. We observed a large increase in the percentage of A and B grades and a precipitous drop in C grades as we moved from traditional lecture to peer-assisted learning with unassigned groups and then assigned groups, as well as a decrease in the percentage of Ds in the course. The drop rate decreased from roughly 14% to 12%, and the rate of withdrawal (W) or incomplete fell as the level of active learning rose (i.e., more active learning, in the form of unassigned or assigned peer-assisted learning, corresponded to a lower drop rate). Additionally, the percentage of students who earned an F decreased from the traditional lecture format to the framework of peer-assisted learning with clickers in groups. The effect persisted in the DFW range in general: the percentage of students who earned a D, F, or W decreased from 27.8% (traditional) to 23.5% (peer-assisted learning with clickers and assigned groups).
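Because rates like these can be compared either as a difference in percentage points or as a relative change, a short sketch may help keep the two apart. It uses only the two DFW rates quoted above; the function names are hypothetical.

```python
def percentage_point_change(before, after):
    """Absolute difference between two rates, in percentage points."""
    return after - before


def relative_change(before, after):
    """Change expressed as a percentage of the starting rate."""
    return (after - before) / before * 100.0


dfw_traditional = 27.8  # DFW rate (%) in traditional lecture sections
dfw_assigned = 23.5     # DFW rate (%) with clickers and assigned groups

points = percentage_point_change(dfw_traditional, dfw_assigned)
relative = relative_change(dfw_traditional, dfw_assigned)

print(f"{points:.1f} percentage points")  # prints: -4.3 percentage points
print(f"{relative:.1f}% relative change")  # prints: -15.5% relative change
```

In other words, a drop of 4.3 percentage points corresponds to roughly 15% fewer DFW outcomes relative to the traditional sections.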
If course grades are affected on average, we would expect performance on assessments to change with teaching strategy as well. A comparison of course performance measures by instructional strategy shows that student performance increases over four learning assessments (homework, quiz, midterm exams, final exam) when the course approach was changed from traditional lecture to unassigned groups, and finally to assigned groups. We observed an increase in performance in homework and quiz scores going from traditional lecture to peer-assisted learning with either unassigned or assigned groups. In both areas, the mean difference between the traditional lecture and the assigned group was statistically significant (p < .05); however, the mean differences in homework and quiz scores between the unassigned group and the assigned group were not statistically significant. The most substantive improvement between the traditional lecture strategy and the peer-assisted learning with assigned groups was with midterm exams and the final examination. In those cases, the percentage change between the traditional lecture courses and the clickers with assigned groups was +11.2% for final examination scores and +5.61% for midterm examination scores, both statistically significant increases (p < .05) over traditional lecture.
To evaluate the impact of the instructional approaches on student performance, we evaluated final course grades in each course section. Table 4 presents the results of average course grades across each instructional approach, with an average course grade of 72.30% for the traditional sections, improving to 75.02% in the sections for clickers with unassigned groups, and to 76.97% for clickers with assigned groups.
Table 4  


We can see that the overall course grade was better (Table 2, as denoted by total points); however, to test whether these improvements were statistically significant, a one-way between-subjects ANOVA was conducted to compare the effect of instructional framework on overall course grade. We observed a significant effect of instructional framework on overall course grade at the p < .05 level for the three conditions, F(2, 1537) = 6.135, p = .002. Post hoc comparisons using the Tukey HSD test indicated that the mean course outcome scores were significantly different. The mean course score for the traditional instruction group (M = 73.31, SD = 19.92) was significantly lower than the mean course score for the peer-assisted learning with assigned groups category (M = 76.97, SD = 18.94). The post hoc testing also indicated that there was not a significant difference between the traditional lecture framework and the peer-assisted learning with unassigned groups. It may be that students working independently or in unassigned groups on clicker questions do not improve their performance on quantitative questions such as those in our course. Our analysis suggests that students need to work with other students consistently in the assigned groups to achieve a real impact on their grades. Stockwell et al. (2017) recently observed that students working in groups performed better on their course assessments when working individually to solve more complex problems. Perhaps we are also observing something related here: deeper learning or better course performance from working with other students and the ability to use that learning to approach new problems.
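The reported p value can be checked by hand. With three groups the numerator degrees of freedom are 2, and in that special case the upper-tail probability of the F distribution has a closed form, p = (1 + 2F/df_within)^(−df_within/2). The sketch below applies that identity to the reported statistics; the function name is hypothetical, and the result is a sanity check rather than a reproduction of the SPSS output.

```python
def f_upper_tail_df1_2(f_value, df_within):
    """Upper-tail p value for an F(2, df_within) statistic.

    Valid only when the numerator degrees of freedom equal 2,
    where the F survival function reduces to
    (1 + 2 * F / df_within) ** (-df_within / 2).
    """
    return (1.0 + 2.0 * f_value / df_within) ** (-df_within / 2.0)


# Overall course grade by instructional framework: F(2, 1537) = 6.135.
p_course_grade = f_upper_tail_df1_2(6.135, 1537)

# ACT Math screening test reported earlier: F(2, 1372) = 2.958.
p_act_math = f_upper_tail_df1_2(2.958, 1372)

print(round(p_course_grade, 3))  # prints: 0.002
print(round(p_act_math, 3))      # prints: 0.052
```

Both values agree with the reported p = .002 and p = .052 to three decimal places.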
This study includes several limitations. First, because it was conducted over more than a decade, it was impossible to use identical questions on all exams and quizzes across all semesters. However, questions of the same difficulty, with different compounds and numbers, were used on quizzes, midterm exams, and final exams, and the overall grading criteria for the course remained consistent. Second, fall, spring, and summer semesters were included in these results, and we did not analyze the effect of the term (fall, spring, or summer) on performance. Third, this study only analyzed the effects of teaching strategies on one course (General Chemistry I). Finally, the same lecture notes and clicker questions were used for all courses, but there might have been some growth in the faculty member’s teaching ability. However, students did not report experiencing this, as faculty evaluations were consistent for the faculty member over the entire time frame of the study. Future work should investigate the effects of instructional frameworks across different contexts such as different courses, class sizes, and course levels.
The results of this study demonstrate how active learning can have a positive impact on undergraduate student course grades in a general chemistry course, particularly when embedded in peer-assisted/cooperative learning environments. In future work, we plan to further enhance opportunities for cooperative learning by integrating peer leaders into the classroom (Jenay, Lewis, Oueini, & Mapugay, 2016). Peer leaders will provide support to their peers during the problem-based learning sessions of the course, effectively reducing the student-to-teacher ratio and providing targeted support for struggling classmates. We will measure the impact of peer leaders on undergraduate students’ course grades and retention. Additional studies will be run to investigate the effects of instructional framework on instructor performance and student perceptions.
This shift from passive to active learning in moving from traditional lecture to using clickers did not result in a change in lecture material in our courses. We used the same examples in the traditional class as in the courses where clickers were used, but in the clicker courses they were presented as clicker questions. Therefore, the change in instructional approach to clickers did not mean that the course material and assessments needed to be changed, and this new course approach did not require a complete revision of the material presented. This report demonstrates the effect of different instructional approaches, and we expect that clickers with assigned groups can be implemented in biology, physics, and math courses at the college level as well, with similar effect. Based on these results, simply adding clickers to traditional lecture is not enough to improve student performance in a statistically significant manner.
As we changed from traditional lecture to more active learning with clickers and assigned groups, we spent more of the class time on problem-solving and lectured only in small snapshots of material between problems. However, we have found that students wanted a mixture of lecture and peer-assisted learning. We recommend that faculty keep some lecture in the course because students still need some additional introduction to the material, and in many cases may need help with the mathematical steps of some problems. Ultimately, we also found that some lecture was needed to supplement the pre-lecture materials of the textbook and skeletal lecture notes.
In this article, we demonstrate that peer-assisted learning in assigned groups with clickers resulted in improved student performance. We did not have to sacrifice any lecture material that we would have covered in the traditional lecture approach. This is the first report, to our knowledge, of a direct comparison in the same course of student performance with traditional lecture, unassigned groups with clickers, and peer-assisted learning using assigned groups for clickers. We found that students performed better on course assessments when clickers were added to the course, and their course grades and retention within the course were highest when clickers and peer-assisted learning with assigned groups were used together. Simply adding clickers to the lecture, which may improve course grades somewhat, does not seem to result in a meaningful improvement in course performance according to our statistical analysis. It is critical that students work in the assigned groups to make this an active learning technique with significant impact. Finally, we believe that the active learning components described in this report are easily translatable to other STEM courses such as biochemistry, biology, math, and physics and that these techniques can be used to improve student learning and retention in those courses in a similar fashion.
David J. Weiss (dweiss@uccs.edu) is an associate professor in the Department of Chemistry and Biochemistry, Patrick McGuire is an associate professor in the Department of Teaching and Learning, Wendi Clouse is a senior research analyst in the Office of Institutional Research, and Raphael Sandoval was a student in the Department of Chemistry and Biochemistry, all at the University of Colorado in Colorado Springs, Colorado. Mr. Sandoval is now a science teacher at Cheyenne Mountain High School in Colorado Springs.