Single tests over multiple time points
The difficulties with discrepancy and low achievement definitions reflect in part the use of a single time point. Both discrepancy and low-achievement approaches fail because they attempt to infer a breakdown in one or more of the unobserved causes of achievement from simple observation of the achievement endpoint. In the language of latent variable models, such systems that attempt to infer a latent or unobserved cause from an observed or manifest effect are "underidentified".
The solution to these difficulties is to not base LD identification on an assessment at a single point in time (MacMillan & Siperstein, 2002). Indeed, in studies of the discrepancy model, it has been recommended that two or more assessments be given to enhance the reliability with which status relative to the cut point is assessed (Shepard, 1980). However, the costs and absurdity of multiple lengthy assessments to establish a child's "true score" within an achievement distribution should be obvious. To illustrate, in a situation where multiple assessments are considered essential, students are allowed to take the Texas high school exit exam up to 9 times because the odds of 9 consecutive failures for a student whose true ability lies above the cut point are negligible. Moreover, any attempt to simply improve the measurement of the discrepancy accepts, prima facie, that the true difference (∆) embodies the latent construct of LD, an assumption that is questionable at best given the lack of evidence that discrepancy models identify a unique group of underachievers on external variables not used to define the groups (Fletcher et al., 2002). Sadly, many students are placed in special education on the basis of a single assessment despite the measurement problems outlined in this paper.
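The arithmetic behind the exit exam illustration can be sketched as follows. This is a minimal sketch, assuming that attempts are independent and that the student has the same probability of passing on each attempt. Both assumptions, the function name `prob_all_fail`, and the pass probability used are hypothetical and are not taken from the Texas program.

```python
def prob_all_fail(p_pass: float, attempts: int) -> float:
    """Probability of failing every one of `attempts` tries.

    Assumes (hypothetically) independent attempts with a constant
    pass probability `p_pass` on each attempt.
    """
    if not 0.0 <= p_pass <= 1.0:
        raise ValueError("p_pass must lie in [0, 1]")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return (1.0 - p_pass) ** attempts


# A student with an even chance of passing on any one attempt
# (an illustrative value, not an estimate from real data).
single_try = prob_all_fail(0.5, 1)
nine_tries = prob_all_fail(0.5, 9)
```

Under these assumptions, a student who passes only half the time on any single attempt fails all nine attempts with probability 0.5^9, or roughly 2 in 1,000. Repeated testing therefore protects against false negatives, but only at the cost of many administrations.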
Another alternative involves the use of a series of measurements over time, often in association with specific attempts to intervene and address the child's learning difficulties. A recent report of the National Research Council addressed alternative approaches to eligibility (Donavon & Cross, 2002). The report suggested that LD be identified when a student does not respond to quality instruction and intervention in the reading and behavioral domains. In LD, implementation of this approach would require frequent monitoring of progress as the student received some type of intervention (Fuchs & Fuchs, 1998). For example, Speece and Case (2001) found that serial assessments of growth and level of performance on measures of reading fluency were better predictors of the development of reading problems in at-risk children than single assessments of fluency.
Are these approaches psychometrically stronger than traditional approaches to LD identification? The use of multiple measures over time has the potential to reduce the difficulties encountered with reliance on a single assessment at a single time point. This is especially so if that single assessment is used to form a discrepancy and classification is then based on the observed discrepancy, since the discrepancy will typically be a poorer (i.e., less reliable) measure of the true difference (∆) than the observed measures are of their respective underlying constructs. Focusing on successive measurements over time has the effect of moving the identification process from "ability-ability" comparisons (two different abilities compared at one point in time) to "ability change" models (same ability over time). Such approaches have the potential to ameliorate the difficulties associated with ability-ability discrepancies, whether univariate or bivariate, because they involve the use of more than two assessments. Generally, the more information that is brought to bear on the decision, the more reliable the decision, although it is certainly possible to create counterexamples by combining information from irrelevant or confounding sources. Such irrelevancies are not likely to be introduced by assessing the same skill over time, as in a model that incorporates response to intervention, when that skill was previously deemed relevant to assess at a single time point.
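The claim that an observed discrepancy is typically less reliable than the measures that form it can be illustrated with the standard classical test theory formula for the reliability of a difference score. The sketch below assumes that the two measures have equal observed variances and uncorrelated errors of measurement. The function name and the numerical values are hypothetical.

```python
def difference_reliability(rel_x: float, rel_y: float, corr_xy: float) -> float:
    """Reliability of the difference D = X - Y under classical test theory.

    Assumes X and Y have equal observed variances and uncorrelated
    errors of measurement.
    """
    if corr_xy >= 1.0:
        raise ValueError("corr_xy must be less than 1")
    return ((rel_x + rel_y) / 2.0 - corr_xy) / (1.0 - corr_xy)


# Illustrative values only: two tests, each with reliability .90,
# that correlate .60 with each other.
rel_d = difference_reliability(0.90, 0.90, 0.60)
```

With these illustrative values the difference score has a reliability of about .75, lower than the .90 of either component. The loss grows as the correlation between the two measures increases.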
More importantly, the introduction of serial assessments has an advantage beyond any statistical advantage it may confer for the estimation of an individual's true status. Specifically, the introduction of serial assessments brings learning and the measurement of learning to the forefront in conceptualizations of LD. The collection of serial assessments under specified conditions of effective instruction simultaneously focuses the definition of LD on a failure to learn, where learning can be measured more directly, and controls the circumstances of instruction, thereby providing a clearer basis for the expectation of learning and the unexpectedness of any failure to learn. Together these two factors provide a compelling conceptual basis for a classification model that incorporates RTI.
Conceptually, the study of change is made more feasible by the collection of multiple assessments because the precision with which change can be measured improves as the number of time points increases (Rogosa, 1995). Moreover, when more than two assessments are collected, the precision of estimated change can itself be estimated directly from the data, and this information can be used to provide improved estimates of growth parameters for individual students as well as for groups of students. This notion of borrowing precision lies at the heart of the multi-level modeling approach to individual growth curve analysis. For those who favor status models over learning models (i.e., models that emphasize the level of attainment over the rate or process of attainment), it remains possible to use the intercept term in the individual growth model as an estimate of status. This intercept will provide a more precise estimate of true status at any single point in time than would a single assessment at that point in time. The growth model also carries the advantage that true status can be estimated at any point along the time line, including extrapolations beyond the observed time line, although such extrapolations make assumptions about the validity of the form of the growth curve, which must be supported through data. Moreover, the precision of estimates of true status will decline as the projection is moved further and further out from the center of the observed data points.
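The two precision claims in this paragraph can be sketched with ordinary least squares results for a single student's straight-line growth model. This is a minimal sketch, assuming linear growth, independent errors with a known common standard deviation `sigma`, and hypothetical assessment schedules. It is not the multi-level estimator discussed above.

```python
import math


def _centered_ss(times):
    """Mean of the time points and their sum of squared deviations."""
    mean_t = sum(times) / len(times)
    return mean_t, sum((t - mean_t) ** 2 for t in times)


def slope_se(times, sigma):
    """Standard error of the OLS slope for one student's growth line."""
    _, ss = _centered_ss(times)
    return sigma / math.sqrt(ss)


def status_se(times, sigma, t0):
    """Standard error of the fitted status (point on the line) at time t0."""
    mean_t, ss = _centered_ss(times)
    return sigma * math.sqrt(1.0 / len(times) + (t0 - mean_t) ** 2 / ss)


sigma = 1.0                    # hypothetical measurement-error SD
three_waves = [0, 1, 2]        # hypothetical assessment schedules
five_waves = [0, 1, 2, 3, 4]
```

In this sketch, moving from three to five equally spaced waves reduces the standard error of the slope from about 0.71 to about 0.32, a gain that reflects both the additional observations and the longer span they cover. With five waves, the standard error of estimated status is about 0.45 at the center of the observed time points, smaller than the 1.0 of a single assessment, and it rises to about 1.34 for an extrapolation two units beyond the last wave.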
Multiple assessments over time make it possible to control for measurement difficulties resulting from error of measurement. Moreover, multiple assessments in response to intervention permit a more direct assessment of the child's learning over time. Different aspects of change can be estimated, including the slope (which may be linear or non-linear) as well as the intercept, which could be conceptualized as an overall measure of change that has the advantage of being estimated with multiple time points. Finally, focusing on multiple assessments in a response to intervention model has the advantage of clearly tying the identification process to the most important component of the construct of LD, which is unexpected underachievement. Response to intervention may identify a unique group of children that can be clearly differentiated from other low achievers in terms of cognitive correlates, prognosis, and even neurobiological factors. It is closely linked with learning, thus putting the concept of "learning" back into the term "learning disabilities." In this approach learning is actually assessed, in contrast to the types of static assessments that essentially represent shortcuts to measurement and do not appear capable of identifying a unique group of underachievers simply because of the difficulties with their underlying psychometric models.
Such approaches are not without difficulty. The introduction of serial assessments has not eliminated the necessity of indirect estimation of the parameters of interest. In the discrepancy model, we used D to estimate (∆). In a model incorporating RTI, we will use a complex function of the observed data for individual i as well as the data from many other individuals to estimate each of the πij, the j true learning parameters for individual i. Different approaches to this estimation problem have different strengths and weaknesses, but all will make assumptions about the arithmetic form of the model, the distribution of the learning parameters, as well as the distributions of the errors. The ramifications of these assumptions for inferences about individual learning parameters must be studied in the LD context.
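For concreteness, one common form of such a model is a two-level linear growth model. The linear form, the normality assumptions, and every symbol other than the πij are illustrative assumptions, not the specification used in any of the studies cited here.

```latex
% Level 1: observation at occasion t for individual i, taken at time a_{ti}
Y_{ti} = \pi_{i0} + \pi_{i1}\, a_{ti} + \varepsilon_{ti},
\qquad \varepsilon_{ti} \sim N(0, \sigma^{2})

% Level 2: the true learning parameters vary across individuals
\pi_{ij} = \beta_{j} + u_{ij}, \quad j = 0, 1,
\qquad (u_{i0}, u_{i1})^{\top} \sim N(\mathbf{0}, \mathbf{T})
```

Here the level-1 equation carries the assumption about the form of the model and the distribution of errors, and the level-2 equation carries the assumption about the distribution of the learning parameters.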
Response to intervention models also involve imperfect measures that include measurement error. However, this problem is reduced because of the use of multiple assessments and the borrowing of precision from the entire collection of data to provide a more precise estimate of the growth parameters of each individual. Thus, it becomes possible to estimate a child's "true" status more precisely as well as to estimate the rate of skill acquisition and to use these estimates as indicators of LD. In addition, the approach to estimation will make assumptions about the distribution of errors of measurement. In some cases, errors might be assumed to be uncorrelated. Again, this assumption must be examined in terms of its importance to inferences about individual status and rates of learning. In many cases, the inclusion of multiple assessments will allow this assumption to be relaxed, and the correlation among errors of measurement can be estimated and taken into account in forming inferences about individual status and rates of learning.
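The idea of borrowing precision can be illustrated with a precision-weighted (empirical Bayes) estimate of a single growth parameter. This is a minimal sketch under a normal-normal model in which the group mean, the between-student variance, and the sampling variance of the student's own estimate are treated as known. In practice these quantities are estimated from the full data set, and all names and values below are hypothetical.

```python
def shrink_estimate(ols_estimate, sampling_var, group_mean, between_var):
    """Precision-weighted estimate of one student's growth parameter.

    Returns (shrunken_estimate, weight, posterior_var), where `weight`
    is the share of the estimate that comes from the student's own data.
    Assumes a normal-normal model with known variances.
    """
    weight = between_var / (between_var + sampling_var)
    shrunken = weight * ols_estimate + (1.0 - weight) * group_mean
    posterior_var = weight * sampling_var
    return shrunken, weight, posterior_var


# Hypothetical values: the student's own slope estimate is 2.0 with
# sampling variance 1.0; slopes in the group average 1.0 with
# between-student variance 1.0.
estimate, weight, post_var = shrink_estimate(
    ols_estimate=2.0, sampling_var=1.0, group_mean=1.0, between_var=1.0
)
```

With these values the student's estimate is pulled halfway toward the group mean, to 1.5, and its variance falls from 1.0 to 0.5. The less precisely a student's own growth is measured, the more weight the group information receives.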
There is still a need to identify individual children as LD based on some criterion score. Thus, there are still issues with defining cut points, and response to intervention models in and of themselves do not solve the issue of the dimensional vs. categorical nature of LD. Determining cut points, benchmarks, etc. will still be an arbitrary process until there is some attempt to tie the cut point to a functional outcome (Cisek, 2001), an issue never really addressed in LD identification for any identification model. However, models that include RTI have the promise of incorporating functional outcomes because they are tied to intervention response. Moreover, the collection of multiple time points of data allows for the introduction of latent classes and the possibility of separating individuals into distinct classes of learners based on their estimated growth trajectories. Such general growth mixture modeling carries the promise of being able to empirically address the question of dimensional vs. categorical conceptions of LD (Muthen, Khoo, Francis, & Boscardin, 2003).
Figure 2: Conceptual Diagram for a Latent Growth Mixture Model of Early Reading

Figure 2 provides a conceptual overview of growth mixture models in the familiar diagrammatic structure of a latent variable growth model. The figure shows two related processes, phonemic awareness and decoding, measured at four time points in kindergarten and grade 1, respectively. In a typical latent variable growth model, the slope and intercept parameters for phonemic awareness (i.e., the circles labeled Ip, Sp) would be freely correlated with one another as well as with the slope and intercept parameters for decoding (i.e., the circles labeled Iw, Sw). However, in the present model, the correlations among these growth parameters are explained on the basis of a latent classification factor (i.e., the circle labeled C). This latent classification factor signifies the presence of a mixture distribution, such as the distribution of Figure 1, underlying the distribution of growth parameters. In the present hypothetical model, the latent classification is related to covariates such as gender, ethnicity, and SES, and the model also incorporates outcomes such as standardized, end-of-year reading and spelling outcomes. These models are a focus of intense research effort in the statistical and scientific communities and are estimable through widely available software, such as Mplus (Muthen & Muthen, 2002).
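The core idea of a latent classification factor can be sketched in a much simplified setting. The example below fits a two-class normal mixture to a single growth parameter (a slope) by the EM algorithm. It is an illustration only and not the model of Figure 2: it omits the second process, the covariates, the outcomes, and the joint estimation of growth and class membership that a full growth mixture model performs. The slopes are simulated, and the class means, class sizes, and function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_class_mixture(values, iterations=200):
    """Fit a two-class univariate normal mixture by EM.

    Returns (weights, means, sds). Class 0 starts at the smallest
    observed value and class 1 at the largest.
    """
    n = len(values)
    grand_mean = sum(values) / n
    grand_sd = math.sqrt(sum((v - grand_mean) ** 2 for v in values) / n)
    weights = [0.5, 0.5]
    means = [min(values), max(values)]
    sds = [grand_sd, grand_sd]
    for _ in range(iterations):
        # E-step: posterior probability of each class for each student.
        post = []
        for v in values:
            dens = [weights[k] * normal_pdf(v, means[k], sds[k]) for k in range(2)]
            total = sum(dens)
            post.append([d / total for d in dens])
        # M-step: update class proportions, means, and SDs.
        for k in range(2):
            nk = sum(p[k] for p in post)
            weights[k] = nk / n
            means[k] = sum(p[k] * v for p, v in zip(post, values)) / nk
            var = sum(p[k] * (v - means[k]) ** 2 for p, v in zip(post, values)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)  # guard against a zero SD
    return weights, means, sds


# Simulated slopes (hypothetical): 40 slow learners and 160 typical learners.
rng = random.Random(0)
slopes = [rng.gauss(0.2, 0.1) for _ in range(40)]
slopes += [rng.gauss(1.0, 0.1) for _ in range(160)]
weights, means, sds = fit_two_class_mixture(slopes)
```

When the classes are as well separated as in this simulation, the fitted class proportions and means recover the generating values closely. With real data the separation is rarely so clean, which is why the question of dimensional vs. categorical structure must be addressed empirically.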
Some will argue that we do not know enough about RTI models, with or without growth and growth mixture model add-ons, to implement them on a large scale. This claim is inaccurate in terms of the underlying psychometric models, because ability change models are sophisticated psychometric models that are well understood. Moreover, there are many examples of identification models in the literature that incorporate variations in either growth or intercept information and that in fact represent examples of the models described in this paper. Here we have simply elaborated on the underlying models from a strictly technical, psychometric view (see Fuchs & Fuchs, 1998; Speece & Case, 2001; Vaughn et al., in press; Vellutino et al., 2003). The specific nature of cut points and decision rules, and whether these types of models can be implemented in real school settings, are different problems from the issue of whether such models are psychometrically viable and potentially identify a unique group of children. The critical question is whether these kinds of models identify a unique group of children who vary in attributes not used to define them. Thus, systematic evaluations of children identified as LD by response to intervention models on measures of cognitive skills, prognosis, and neurobiological factors would seem critically important.
Finally, there simply aren't circumstances where the identification of a child with a disability should be based on a single assessment, even if it is repeated over time. As the LD Summit suggested, it is entirely reasonable to incorporate, for example, achievement test scores from norm-referenced tests as part of the identification process that also includes RTI (Bradley, Danielson, & Hallahan, 2002). Similarly, it seems important to also consider what have traditionally been described as exclusionary criteria, particularly if these criteria are related to differential intervention needs, as is the case in English language learners, or children with mental retardation or sensory difficulties. However, in the absence of this type of approach, ability discrepancy or low achievement models are simply not viable for identification purposes. Moreover, it is not apparent that they identify unique subgroups of underachievers. Continued use of these models threatens to jeopardize the very construct of LD. Alternatives that incorporate response to intervention are clearly viable, and represent an important departure at the level of the construct as well as the approach to classification, measurement, and identification.

