In a 1967 monograph, Michael Scriven (1967) suggested that it was useful to distinguish between two kinds of curriculum evaluation process. In the first, he suggested that evaluation “may have a role in the on-going improvement of the curriculum” (p. 41) while in the second, “the evaluation process may serve to enable administrators to decide whether the entire finished curriculum, refined by the use of the evaluation process in its first role, represents a sufficiently significant advance on the available alternatives to justify the expense of adoption by a school system” (pp. 41-42). He also proposed that it would be worthwhile “to use the terms ‘formative’ and ‘summative’ to qualify evaluation in these roles.”
Although Scriven had intended these terms to apply only to the evaluation of curricula, the following year, in his work on ‘mastery learning’, Benjamin Bloom (1968) extended the use of the terms ‘formative’ and ‘summative’ to the evaluation (i.e., assessment) of individual students. In a subsequent paper, he explained the difference:
Quite in contrast is the use of “formative evaluation” to provide feedback and correctives at each stage in the teaching-learning process. By formative evaluation we mean evaluation by brief tests used by teachers and students as aids in the learning process. While such tests may be graded and used as part of the judging and classificatory function of evaluation, we see much more effective use of formative evaluation if it is separated from the grading process and used primarily as an aid to teaching. (Bloom, 1969)
While Bloom’s proposals about mastery learning were influential, and regular, frequent assessment was widely regarded as essential to effective instruction, the term ‘formative assessment’ was not widely used in primary and secondary schools. However, in higher education, and particularly in the UK, many universities introduced what they called “formative assessments” into their courses. These were typically assessments designed to mimic the assessments that students would take at the end of a course, and which allowed students to gauge their progress. While such assessments did, sometimes, provide insights into what a student might do to improve, the emphasis was on indicating the extent of progress towards an educational goal. More importantly, whether an assessment was described as formative or not depended primarily on its location in a sequence of instructional activities—“any assessment before ‘the big one’” (Wiliam, 2010, p. 36) as it were. While such assessments were also, sometimes, claimed to provide insights that might improve learning, there is little evidence that they did so.
Although the term “formative assessment” was not in widespread use, in the second half of the 1980s, a number of research reviews appeared that indicated that classroom evaluation processes could have a substantial positive, or negative, influence on learning. Some of these, such as reviews by Natriello (1987) and Crooks (1988), focused more on the negative aspects of classroom assessments, particularly in terms of impact on motivation. Others, such as those by Bangert-Drowns, Kulik, Kulik, and Morgan (1991) and Bangert-Drowns, Kulik, and Kulik (1991), showed the substantial benefits of regular classroom testing for long-term recall through the effects of priming, and what we would now call retrieval practice (Karpicke & Blunt, 2011). Still others looked at the way that regular classroom assessment might support teachers in making instructional adjustments, in the way envisaged by Bloom. In particular, Fuchs and Fuchs (1986) found that when the results were used to adjust instruction, especially when teachers used a pre-determined rule to decide what to do in the case of a given assessment outcome, there was a large positive impact on student learning.
When, some years later, Paul Black and I sought to update these reviews (Black & Wiliam, 1998) we realized that using the term ‘formative’ to describe the position an assessment occupied in a course of study, or the assessment itself, represented what Gilbert Ryle (1949) called a “category mistake”—ascribing to something a property it cannot have—since the same assessment procedure could yield evidence that could be used summatively or formatively (Wiliam & Black, 1996). There is, therefore, no such thing as a formative assessment. There are, however, assessments whose results can be used formatively. If, following Cronbach (1971), we define an assessment as a procedure for drawing inferences, then we can use the terms formative and summative to describe the kinds of inferences that we make from assessment results. When the inferences we make are about an individual’s level of achievement, or her or his suitability for a particular programme of study, then the assessment is functioning summatively. When the inferences are about how to improve an individual’s learning then the assessment is functioning formatively.
This insight provides a clear basis for distinguishing between the terms “assessment for learning” and “formative assessment.” As defined by Black, Harrison, Lee, Marshall, and Wiliam (2004), assessment for learning is “any assessment for which the first priority in its design and practice is to serve the purpose of promoting students’ learning. It thus differs from assessment designed primarily to serve the purposes of accountability, or of ranking, or of certifying competence” (p. 10). Assessment for learning would therefore include the use of assessment to motivate students, or to provide retrieval practice. Such assessment becomes formative assessment when the evidence elicited by the assessment is interpreted and used to improve instructional decisions.
In the 20 years since Black and Wiliam’s review of the impact of classroom assessment processes on learning appeared, the evidence on the value of classroom assessment processes—especially if they focus on providing retrieval practice and supporting instructional adjustments—has accumulated (see chapter 4 of Wiliam, 2016 for a summary of the research). However, many issues remain unresolved. Some of the most significant of these include the magnitude of the effects of such assessment processes on student learning (Bennett, 2011; Kingston & Nash, 2011, 2015), the skills and knowledge that teachers need to effectively implement such assessment (Heitink, Van der Kleij, Veldkamp, Schildkamp, & Kippers, 2016), and the domain-specificity of formative assessment practices (Andrade, Bennett, & Cizek, 2018). Perhaps most significantly, little is known about the best ways of implementing such formative assessment practices at scale.
Further reading
Bennett, R. E. (2011). Formative assessment: A critical review. Assessment in Education: Principles, Policy and Practice, 18(1), 5-25.
Wiliam, D. (2011). What is assessment for learning? Studies in Educational Evaluation, 37(1), 2-14.
Wiliam, D. (2016). Leadership for teacher learning: Creating a culture where all teachers improve so that all learners succeed. West Palm Beach, FL: Learning Sciences International.
References
Andrade, H. L., Bennett, R. E., & Cizek, G. J. (Eds.). (2018). Handbook of formative assessment in the disciplines. New York, NY: Routledge.
Bangert-Drowns, R. L., Kulik, C.-L. C., Kulik, J. A., & Morgan, M. (1991). The instructional effect of feedback in test-like events. Review of Educational Research, 61(2), 213-238.
Bangert-Drowns, R. L., Kulik, J. A., & Kulik, C.-L. C. (1991). Effects of frequent classroom testing. Journal of Educational Research, 85(2), 89-99.
Bennett, R. E. (2011). Formative assessment: A critical review. Assessment in Education: Principles, Policy and Practice, 18(1), 5-25.
Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2004). Working inside the black box: assessment for learning in the classroom. Phi Delta Kappan, 86(1), 8-21.
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy and Practice, 5(1), 7-74.
Bloom, B. S. (1968). Learning for mastery. Evaluation Comment, 1(2), 1-12.
Bloom, B. S. (1969). Some theoretical issues relating to educational evaluation. In R. W. Tyler (Ed.), Educational evaluation: New roles, new means (Vol. 68(2), pp. 26-50). Chicago, IL: University of Chicago Press.
Cronbach, L. J. (1971). Test validation. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 443-507). Washington, DC: American Council on Education.
Crooks, T. J. (1988). The impact of classroom evaluation practices on students. Review of Educational Research, 58(4), 438-481.
Fuchs, L. S., & Fuchs, D. (1986). Effects of systematic formative evaluation: A meta-analysis. Exceptional Children, 53(3), 199-208.
Heitink, M. C., Van der Kleij, F. M., Veldkamp, B. P., Schildkamp, K., & Kippers, W. B. (2016). A systematic review of prerequisites for implementing assessment for learning in classroom practice. Educational Research Review, 17, 50-62. doi: http://dx.doi.org/10.1016/j.edurev.2015.12.002
Karpicke, J. D., & Blunt, J. R. (2011). Retrieval practice produces more learning than elaborative studying with concept mapping. Science, 331(6018), 772-775.
Kingston, N. M., & Nash, B. (2011). Formative assessment: A meta-analysis and a call for research. Educational Measurement: Issues and Practice, 30(4), 28-37.
Kingston, N. M., & Nash, B. (2015). Erratum. Educational Measurement: Issues and Practice, 34(1), 55.
Natriello, G. (1987). The impact of evaluation processes on students. Educational Psychologist, 22(2), 155-175.
Ryle, G. (1949). The concept of mind. London, UK: Hutchinson.
Scriven, M. (1967). The methodology of evaluation. In R. W. Tyler, R. M. Gagne, & M. Scriven (Eds.), Perspectives of curriculum evaluation (pp. 39-83). Chicago, IL: Rand McNally.
Wiliam, D. (2010). An integrative summary of the research literature and implications for a new theory of formative assessment. In H. L. Andrade & G. J. Cizek (Eds.), Handbook of formative assessment (pp. 18-40). New York, NY: Taylor & Francis.
Wiliam, D. (2016). Leadership for teacher learning: Creating a culture where all teachers improve so that all learners succeed. West Palm Beach, FL: Learning Sciences International.
Wiliam, D., & Black, P. J. (1996). Meanings and consequences: A basis for distinguishing formative and summative functions of assessment? British Educational Research Journal, 22(5), 537-548.