Caution needed in use of Observation Survey

One of the most widely used assessment tools for early reading intervention, especially for use with Reading Recovery, is the Observation Survey of Early Literacy Achievement (OS). How valid, reliable and useful is this assessment tool in identifying the need for intervention in young readers and monitoring their progress? How useful is it for evaluating the effectiveness of a reading intervention program?

The researchers concluded that while the OS tests are valid for screening of early reading development, educators should be cautious about using the tests for diagnostic assessment and progress monitoring.

A team of Texas researchers used the OS tests to evaluate 182 first-grade students from six urban schools. They published their results in the Reading Research Quarterly. Of the 182 students, 52 were typically developing readers and 130 had been identified with the Texas Primary Reading Inventory (TPRI) as requiring early reading intervention. The TPRI measures students’ knowledge of letter-sound correspondence, phonological awareness, and ability to read words in lists. The at-risk students had been randomly assigned to two reading intervention programs. Researchers administered five OS subtests (Word Identification, Writing Vocabulary, Text Reading (Running Record of Text Reading), Dictation and Letter Identification) at the beginning and end of first grade. They also administered subtests of the Woodcock-Johnson III Tests of Achievement, the Test of Word Reading Efficiency and the phonemic awareness subtests of the Comprehensive Test of Phonological Processing.

Marked ceiling, floor effects

“The marked ceiling and floor effects of the Text Reading, Letter Identification, Word Identification and Dictation subtests imply that these measures do not possess sufficient scale sensitivity to detect small increments of progress, although they may be validly used to track student attainment of benchmarks as they master early literacy skills,” the researchers say. The usefulness of the OS for screening might be enhanced by adding benchmarks denoting risk for reading difficulties, the article says, and the OS tests may be more useful for beginning readers than for more developed readers.

Use of the OS tests as a tool for planning a complete early reading instructional program also may be limited, the researchers say.

“Although the OS can inform a teacher about a student’s development of reading accuracy, and some key sub-skills related to reading, it is less useful in guiding specific instructional decisions related to phonemic awareness, vocabulary, and comprehension, aspects of reading addressed in Reading Recovery and most other literacy programs,” the article says. “Classroom teachers and reading interventionists must rely on their informal observations and judgments about these domains because they are not specifically assessed by the OS.”

Simple one- or two-minute assessments of oral reading fluency repeated over time may be a more useful aid for educators modifying instructional programs, the article notes, especially given that it takes 30-45 minutes per student to administer the OS tests.

The researchers also conclude that educators should be careful in aggregating results for groups of students in program evaluation. The most meaningful assessment would be a comparison of the number of children who meet or do not meet OS benchmarks, the researchers say.

Here are researchers’ other comments and observations about use of the OS tests for early readers:

“A combination of the Word Identification and Writing Vocabulary tasks was a strong predictor of year end attainment of average performance in basic reading skills. However, students who fail to meet the screening criteria of Word Identification should be administered other tests in order to reduce false positive errors and provide diagnostic information regarding students’ instructional strengths and needs,” the article says. The Text Reading measure informs the teacher about the strategies the student is using to identify unknown words, but it also tends to overidentify students as at risk, the Reading Research Quarterly article says.

The OS Letter Identification, Word Identification and Dictation subtests are less useful in evaluating end-of-year outcomes in first grade because they measure skills on which mastery is expected and have ceiling effects that reduce their sensitivity, the researchers say. In the study, 30% of the typically developing readers were within three points of the ceiling on the Dictation test at pretest. “Comparing the progress made by students who began with very low scores on this measure with their average achieving peers is clearly biased because the amount of progress possible in the normally developing group is quite small.” Educators should be cautious in comparing the gains of struggling readers who have low scores at the beginning of first grade with those of typically performing students who begin at higher levels, the study says.

“Validity, reliability, and utility of the Observation Survey of Early Literacy Achievement” Reading Research Quarterly Volume 41, Number 1, January/February/March 2006 pp 8-34.

Published in ERN February 2006 Volume 19 Number 2
