Be careful about using the results of standards-based assessments to evaluate the reading comprehension of individual students, warns a recent study published in Educational Evaluation and Policy Analysis.
Reading comprehension is based on a wide and varied skill set that includes working-memory skills, syntactic awareness, phonological awareness and word reading. Additional diagnostic testing is needed to identify students’ specific problems, the researchers advise.
“From a policy perspective, the findings of the present study, consistent with previous research in this area, suggest that standards-based assessments are of limited use for diagnostic decision making at the level of the individual learner,” the researchers write.
For students, the stakes of statewide standards-based tests can be high when mandated assessments determine who will earn a high school diploma or promotion to the next grade, they note. To be ethically and politically defensible, statewide assessments must be used in conjunction with diagnostic measures for evaluations of individual students, the researchers say.
“Even though standards-based assessments are designed for accountability purposes, they are often used to provide feedback about individual students’ abilities with the assumption that meaningful instructional plans, at the classroom or school level, will be developed as a result,” the researchers write.
Diagnostic tests used in study:
• WRAT-3 Reading subtest
Diagnostic measures, administered either individually or in classrooms, are necessary to identify strengths and weaknesses, to develop effective remediation programs, or to determine whether a student is a viable candidate for high-school graduation, they say.
Compared to norm-referenced tests
For the study, researchers compared how a cohort of 1,111 4th-grade students performed in a battery of diagnostic tests for reading skills with how they fared in a standards-based reading assessment administered to all 4th-graders in British Columbia, called the Foundation Skills Assessment. Little research has been done on the relationship between standards-based measures and norm-referenced measures, according to the study.
The researchers conclude there is a weak relationship between the two types of testing. The broad proficiency categories masked significant variation among individual students, report researchers Andre Rupp and Nonie Lesaux.
There was a great deal of overlap in the range of minimum to maximum scores on the tests of reading comprehension component skills across the three broad proficiency categories of the general assessment (exceeds expectations, meets expectations, below expectations). For example, the ranges of student scores for simple spelling, phoneme identification, syllable identification and rhyme detection were the same across all three categories (0-6, 0-8, 0-8 and 0-10, respectively).
In the meets expectations category, researchers found a wide range of abilities: 40% were high achievers (scoring well on both the word-level skills tests and the working memory and language skills tests), 30% were low achievers (scoring low on both types of tests), and 30% were mixed (scoring well on one type of skills test but not the other).
An important subgroup of children in the below expectations category had reading difficulties that were not primarily related to component skills. As a result, remedial instruction in foundational word-level skills and related cognitive and linguistic skills would not be productive for this group, they say.
An economically viable approach is to administer diagnostic testing to those students who fail to meet standards to better pinpoint their difficulties.
Lessons Learned:
• Educators should exercise caution in using state assessment results to evaluate individual students
• Assessment results in reading comprehension are not diagnostic of specific reading difficulties
• Additional diagnostic testing is needed to plan interventions for individual students
• Proficiency categories of statewide assessments
“Given the reliance on standards-based assessments to guide educational decision making and the costs of implementing these assessment systems, it is unlikely that their use and overuse will be downplayed in the future,” the researchers conclude.
“However, there is a need to seriously consider whether the properties of these tests support any interpretation at the level of the individual and similarly whether there is any instructional information to be gleaned from the results.”
“Meeting Expectations? An Empirical Investigation of a Standards-Based Assessment of Reading Comprehension” by Andre Rupp and Nonie Lesaux. Educational Evaluation and Policy Analysis. Winter 2006 Volume 28 Number 4 pp. 315-333.
Published in ERN May/June 2007 Volume 20 Number 5