Responsiveness-to-Intervention Symposium

December 4-5, 2003 * Kansas City, Missouri

The National Research Center on Learning Disabilities sponsored this two-day symposium focusing on responsiveness-to-intervention (RTI) issues. The speakers, discussants, and participants assembled represented the wide diversity of individuals with a vested interest in LD determination issues. Advocates, instructional staff, researchers, and state-level education officials brought their collective and considerable expertise to the discussions.

Joseph Jenkins of the University of Washington presented this invited paper during the symposium. For links to other papers and materials, visit the main Symposium 2003 page.


Candidate Measures for Screening At-Risk Students

Highlights and Important Distinctions from Research on Screening

In this section, I highlight ideas, findings, and important distinctions that come out of research on screening.

Grade-Specific vs. Multi-Grade Screening Systems

Researchers have examined the validity of various reading and reading-related measures that hold potential for screening, usually focusing on a specific grade level (e.g., kindergarten). In contrast, two research groups have worked to identify measures that can be used to screen across grades K-2. Good, Simmons, and Kame'enui (2001) developed the screening and progress-monitoring system known as the Dynamic Indicators of Basic Early Literacy Skills (DIBELS). Foorman, Fletcher, Francis, Carlson, Chen, Mouzaki, et al. (1998) developed the Texas Primary Reading Inventory (TPRI), which consists of screens and skill profiles. Both systems have been scaled up, with multi-district or statewide implementations.

Satisfactory Reading Ability--How Is the Criterion Defined?

Inasmuch as the immediate goal of screening is identifying those at risk for unsatisfactory reading outcomes, screening hinges on the selection of criterion reading measures and performance levels on those measures. Two decisions go into establishing a criterion. The first is deciding on a suitable measure of reading (i.e., content standard); the second is deciding on the performance level (i.e., performance standard) that distinguishes between adequate and inadequate reading skill.

What Criterion Measure?

In the field of early intervention, the family of achievement tests developed by Richard Woodcock and associates comes closest to a "gold standard" criterion test of reading ability. These include the Woodcock-Johnson-Revised (WJ-R) and Woodcock Reading Mastery Test-Revised (WRMT-R) subtests (e.g., Letter-Word Identification, Word Identification, Word Attack, Passage Comprehension, or one or more cluster scales that combine various subtests). The various subtests and clusters assess different aspects of reading. Much of the screening research that follows uses one or more of the Woodcock tests as criterion measures, although the studies differ in the tests, subtests, and scales they use as their criterion measure of reading.

Choice of criterion measures is critical in evaluating a screen because students performing satisfactorily on one criterion may perform unsatisfactorily on a different criterion measure. For example, in Speece et al. (2003), no first-graders read below the 25th percentile on the WJ-R Letter-Word Identification subtest, even though several students performed very poorly (reading fewer than 10 words correct per minute) on a test of oral reading fluency (ORF). Moreover, the accuracy of a screening measure in predicting different criterion tests may differ. For example, when Speece et al. (2003) used low ORF as the criterion for unsatisfactory reading at the end of grade 1, two screens (Nonsense Word Fluency and Letter Naming Fluency) demonstrated strong sensitivity (86%) and specificity (81%-88%) in identifying at-risk kindergartners. However, the same screening measures were only 50% sensitive in identifying poor readers (i.e., they missed half) when the WJ-R Word Attack test was the criterion measure. Screens that are well linked to one criterion measure may not be well linked to another criterion. The moral: choose criterion measures carefully.
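To make the sensitivity and specificity figures above concrete, the following minimal Python sketch computes both indices for a single screen judged against two different criterion measures. The ten student records and the names "criterion A" and "criterion B" are invented for illustration only; they are not data from Speece et al. (2003) or any other study cited here.

```python
def sensitivity_specificity(flagged, poor_reader):
    """Return (sensitivity, specificity) for one screen and one criterion.

    flagged[i]     -- True if the screen marked student i as at risk
    poor_reader[i] -- True if student i later fell below the criterion
    """
    if len(flagged) != len(poor_reader):
        raise ValueError("both lists must describe the same students")
    pairs = list(zip(flagged, poor_reader))
    true_pos = sum(1 for f, p in pairs if f and p)           # flagged, poor outcome
    false_neg = sum(1 for f, p in pairs if not f and p)      # missed poor readers
    true_neg = sum(1 for f, p in pairs if not f and not p)   # not flagged, adequate
    false_pos = sum(1 for f, p in pairs if f and not p)      # flagged, adequate
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one poor and one adequate reader")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical screen results for ten kindergartners (made-up data).
flagged = [True, True, True, False, False, False, False, False, True, False]

# Hypothetical end-of-grade-1 outcomes on two different criterion measures.
poor_on_criterion_a = [True, True, True, False, False, False, False, False, False, False]
poor_on_criterion_b = [True, False, False, True, False, False, False, False, False, False]

for label, outcome in [("criterion A", poor_on_criterion_a),
                       ("criterion B", poor_on_criterion_b)]:
    sens, spec = sensitivity_specificity(flagged, outcome)
    print(label, "sensitivity:", sens, "specificity:", spec)
```

In this invented sample the same screen catches all three poor readers under criterion A but only one of the two poor readers under criterion B, which mirrors the general point that a screen's accuracy depends on the criterion it is judged against.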

Distal and Proximal Criterion Measures. Most states define reading proficiency according to a standards-based test adopted by that state. Typically, such tests are not given until third or fourth grade, too distant to serve as criterion measures for early (e.g., kindergarten and first grade) screens. Thus, researchers must use more proximal criterion measures to evaluate the accuracy of screens. For example, to validate the TPRI kindergarten and beginning grade 1 screens, researchers used ending grade 1 achievement on the WJ-R Basic and Broad Reading Scales. To validate the beginning grade 2 screen, TPRI researchers used ending grade 2 achievement on the WJ-R Broad Reading Scale. For the DIBELS, Good et al. (2001) used a different strategy in selecting proximal criteria to judge screening accuracy. Rather than linking DIBELS screens to external measures such as the WJ-R Scales, they linked them to subsequent DIBELS, CBM, and state tests. Specifically, Initial Sound Fluency (ISF) at mid-kindergarten is linked to Phoneme Segmentation Fluency (PSF) at end-of-kindergarten, which is linked to Nonsense Word Fluency (NWF) at mid grade 1, which is linked to Curriculum-Based Measurement-Oral Reading Fluency (CBM-ORF) at end-of-grade 1, and so on until CBM-ORF is linked to the Oregon Standards Assessment at end-of-grade 3.

At Risk for What--Unsatisfactory or Very Unsatisfactory Reading Outcomes?

Two types of performance standards have been used in research on screening, so that screens aim to identify students who will demonstrate either (1) unsatisfactory reading or (2) very unsatisfactory reading.

Unsatisfactory Reading as a Criterion. Most screens focus on predicting unsatisfactory reading outcomes, where unsatisfactory is defined as performing below a standard (e.g., performing more than one-half year below grade level, or performing below a "high standard" like those used by some states in a standards-based reading test). Both the DIBELS and TPRI define unsatisfactory reading based on criterion levels that result in a fairly large proportion of the population being so classified.

Using state-mandated, standards-based reading tests to define unsatisfactory reading is on the increase. States use different performance standards to define satisfactory reading ability. This means that satisfactory reading in one state is not necessarily satisfactory reading in another. For example, Washington State sets a fairly high standard for passing its reading criterion, such that one-third of fourth-graders perform unsatisfactorily (i.e., do not meet the standard). Other states set lower standards (e.g., Texas, where fewer than 2% fail the state competency test in grade 3).

Another common convention for defining unsatisfactory reading is norm-referenced performance falling at or below the 25th percentile. This criterion can also result in designating a substantial proportion of the population as "unsatisfactory readers," depending on how a school performs relative to the national norm group.

Very Unsatisfactory Reading as the Criterion. Only the very lowest readers--those considered to have a reading or learning disability--qualify for this designation. The screening efforts of O'Connor and Jenkins (1999) and Speece and Case (2001) focused on identifying this group (i.e., performance below the 10th percentile).

In comparing the results of different screening procedures, it is important to ascertain whether the screen seeks to predict unsatisfactory or very unsatisfactory reading skill. The proportion of students identified as at risk by a screen depends on this distinction. To illustrate, the TPRI, which focuses on identifying unsatisfactory readers (those ending first grade performing below the 23rd percentile on the WJ-R Basic Reading Scale), identified 56% of mid-kindergartners as at risk (Foorman et al., 1998). In contrast, O'Connor and Jenkins (1999), focusing on very unsatisfactory readers (those ending first grade below the 9th percentile on the WRMT-R), identified only 18% of mid-kindergartners as at risk. Depending on the performance standard used as the outcome (i.e., unsatisfactory or very unsatisfactory reading levels), the screening procedure will identify very different proportions of students as at risk.
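The effect of the performance standard on the size of the target group can be illustrated with a short Python sketch. It counts how many students in a hypothetical school fall under two cut-offs taken from the conventions described above (at or below the 25th percentile, and below the 10th percentile). The percentile ranks are invented, and the sketch describes only the outcome groups that a screen would be trying to predict, not the accuracy of any particular screen.

```python
def proportion_under_cutoff(percentile_ranks, cutoff, inclusive):
    """Share of students whose national percentile rank falls under a cut-off.

    inclusive=True counts ranks at or below the cut-off;
    inclusive=False counts only ranks strictly below it.
    """
    if not percentile_ranks:
        raise ValueError("need at least one student")
    if inclusive:
        count = sum(1 for rank in percentile_ranks if rank <= cutoff)
    else:
        count = sum(1 for rank in percentile_ranks if rank < cutoff)
    return count / len(percentile_ranks)


# Hypothetical end-of-grade-1 percentile ranks for 20 students (made-up data).
school_ranks = [3, 8, 9, 12, 15, 18, 22, 25, 31, 36,
                40, 47, 52, 58, 63, 70, 76, 81, 88, 95]

# "Unsatisfactory": at or below the 25th percentile.
unsatisfactory = proportion_under_cutoff(school_ranks, 25, inclusive=True)

# "Very unsatisfactory": below the 10th percentile.
very_unsatisfactory = proportion_under_cutoff(school_ranks, 10, inclusive=False)

print("unsatisfactory share:", unsatisfactory)
print("very unsatisfactory share:", very_unsatisfactory)
```

In this invented school, 8 of 20 students meet the broader definition while only 3 of 20 meet the stricter one, so a screen aimed at the first group would need to flag far more students than a screen aimed at the second.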

The symposium was made possible by the support of the U.S. Department of Education Office of Special Education Programs. Renee Bradley, Project Officer. Opinions expressed herein are those of the authors and do not necessarily represent the position of the U.S. Department of Education.