Is assessment holding back the Science of Reading?

I've wanted to write this post for a while, but haven't found the right impetus. I'll settle for a recent update from Carrie Townley-Flores and Jason Yeatman's team at Stanford University on the progress of ROAR (Rapid Online Assessment of Reading). I know David Stevenson at Reading Futures stands by that work, which means a lot to me. In their latest release, Stanford writes:

ROAR improves on traditional reading assessment in a number of ways. It’s easy to administer, and it has validated assessments for grades K–12. By offering a variety of test options, it provides more detailed and reliable results than other standardized assessments, and the validation studies underlying the assessments are published in open-access, peer-reviewed scientific journals. Additionally, it’s free, which allows more schools to access the program and encourages flexibility with testing. For example, it makes it easy to test a middle school or high school student for foundational skills, which are typically not assessed after third grade. Students also report it is fun to engage with -- which goes a long way.

Assessment is the measure of value in education. For some educators that means standardized assessment, for others it's more observational or anecdotal. I do think it's safe to say that at this point, the majority of district and school administrators' career prospects depend on demonstrating gains on standardized assessment.

For many grade levels and subjects, assessments are designed to measure state standards which are themselves designed by state departments of education (hopefully) adhering to the research base. Adieu Common Core?

I don't know how normative the research base is for some subjects. Is there a best way to learn US history? I do know that the #scienceofreading movement has popularized the notion that there is a strong evidence base for how to most effectively teach reading.

As more and more evidence emerges, innovators in curriculum are applying new insights to instruction with impressive results. The University of Florida Literacy Institute has been one of several leading the charge.

But shouldn't the earliest adopters of the research be assessment providers? If assessment is not using the latest and greatest understanding of how to teach reading, then curricula and programs that apply that understanding won't "get credit" on assessments and will be adopted more slowly by the district and school leaders whose jobs depend on getting the best assessment results.

In early reading, the University of Oregon's DIBELS has long been a standard, along with FastBridge (Illuminate Education), Pearson Aimsweb, and of course large assessment providers like Curriculum Associates, Renaissance Learning, and NWEA.

If these legacy assessment providers are keeping up with the latest innovations in the Science of Reading, great, since they set the standard for all that follows. ("Legacy" is sometimes used as a pejorative. I don't mean it that way here. These folks collectively have enormous experience measuring student learning.)

But if the legacy assessment providers aren't, as ROAR or another innovator like Sprout Labs might argue, then we have a pretty significant problem on our hands if we want to accelerate the adoption of evidence-based practices in reading.

To ground this discussion in particulars, I'll pick on DIBELS for a moment.

In Kindergarten, DIBELS tests (1) letter naming, (2) phonemic segmentation (if I say 'am' you would say /a/ /m/), (3) nonsense word fluency (/h/ /a/ /p/ is pronounced 'hap'), and (4) word reading fluency (which includes lots of sight words).

I'm not a psychometrician. My understanding, though, is that letter naming may be less important initially than pronouncing phonemes from letters, segmentation may be less important than blending, nonsense word fluency may be less important than blending real words, and sight words are not a key indicator of early word reading fluency.

By comparison, ROAR offers validated tests on letters, phonemes, words, and sentences. They are actively validating measures of written vocabulary, morphology, rapid visual processing, working memory, and picture vocabulary.

I want to be clear that I, personally, don't know if ROAR is more evidence-based than DIBELS or vice-versa. I welcome discussion in the comments. What does concern me is whether the leading assessment companies have an incentive to continuously adapt their reading assessments to follow the latest research. A priori, I wonder if they have the opposite incentive.

Suppose I run a large assessment company with a large customer base that relies on my assessments. If I make frequent changes to those assessments, faster than instruction changes in customer districts, my customers may see surprising decreases in performance and might not be as happy with my product. Therefore, even if research were to show that task X isn't the best way to measure a reading skill, I might continue to feature task X in my assessments, because that's what my customers expect.

I recently wrote an article questioning whether philanthropy should fund education content. I think there is a stronger case to be made that philanthropy should fund reading assessment that is designed, developed, and perhaps distributed by universities. Theoretically, university professors would have a stronger allegiance to the evidence base than to commercial interests and would have stronger incentives to innovate on assessment as the research develops. Additionally, although I am well aware of the unintended consequences of free things in education, I wonder if free reading assessment (what ROAR is proposing) really would be a win-win. Hi, Bill & Melinda Gates Foundation! Not only would universities have incentives to follow the research (with funders keeping watch), but school districts could then spend the money they've saved on assessment on better curriculum, training, and staff to provide the best instruction they can to help students improve on the assessments.

These are big, complex questions, and I hope I am not implying any more certainty than I feel. But this is personal for me. For the last two years, Once has been collaborating with Stanford's National Student Support Accelerator (funded by Accelerate) on a multi-year, multi-district randomized controlled trial that will likely take even more years to reach the right sample size (we're randomizing by school, not student -- a story for another day, but see the sketch below). The districts involved are using the benchmark assessments of their choice (no one is using ROAR). If those benchmark assessments haven't kept up with the research, there's a world where our results would be less reflective of our impact than I would like...
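For readers wondering why randomizing by school stretches a trial out so long: cluster randomization inflates the variance of the treatment estimate by the standard design effect, 1 + (m - 1) × ICC, where m is the average cluster size and ICC is the intraclass correlation. Here's a minimal back-of-the-envelope sketch; every number in it (school count, students per school, ICC) is hypothetical and illustrative, not from our actual study:

```python
# A back-of-the-envelope sketch of why randomizing by school (cluster
# randomization) requires a much larger sample than randomizing by student.
# All numbers below are hypothetical, for illustration only.

def design_effect(cluster_size: float, icc: float) -> float:
    """Standard design effect: variance inflation from cluster randomization."""
    return 1 + (cluster_size - 1) * icc

# Hypothetical trial parameters (not from the actual study):
n_schools = 40            # schools randomized to treatment or control
students_per_school = 60  # average cluster size
icc = 0.15                # assumed intraclass correlation of reading scores

n_students = n_schools * students_per_school
deff = design_effect(students_per_school, icc)
effective_n = n_students / deff

print(f"Nominal sample:   {n_students} students")
print(f"Design effect:    {deff:.2f}")
print(f"Effective sample: {effective_n:.0f} students")
# With these assumptions, 2,400 students carry roughly the statistical
# power of ~244 independently randomized students.
```

Under those assumed values, a school-randomized trial needs close to ten times the students of a student-randomized one to reach the same power, which is one reason these studies take years.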

Onwards!

Karim Kuperhause

Ex-classroom teacher, currently VP of Growth for Hoot Reading. On a mission to change children's lives through literacy.

7 months

It’s so important to ensure that assessments are aligned to instruction. Part of the role of assessments is not simply to evaluate students; it’s also to evaluate the efficacy of teachers and instruction.


I agree with your assessment, Matt Pasternack ;-) With content becoming available for free, assessment is what aligns the rest of the ecosystem. However, I'm not sure investments in content can stop. They seem integrated across all learning communities.

Dr. Rachel Schechter

Founder of Learning Experience Design (LXD) Research

7 months

Thanks for this piece, Matt Pasternack. Each of these tests has different strengths and challenges, in part because they were designed to compete with each other in the market. Kindergarten is particularly tricky, because kindergartners don't know much at first and have the potential to learn SO much that first year. A 5-year-old who reads multisyllable words may score similarly to their peers on MAP because they don't need to read on the assessment. I saw Jason Yeatman present last week in San Diego and am excited about ROAR's potential! I've also seen that it's very difficult to get schools to change their assessments, especially when they are focused on improving their reading practices, because they need to keep the measurement consistent for evaluation. Love this conversation. Adding Tyler Borek if he would like to chime in about Literably.

Matt Pasternack, I don't come around here much, but since you mentioned me ... DIBELS is solid. The measures are indicators, not learning objectives. Knowing letter names might just be a proxy for "went to preschool," and nonsense words aren't things to teach, but they're both predictive of reading outcomes. (It does take training and practice to administer reliably.) At Reading Futures, we work with our clients' screeners -- iReady, NWEA MAP, STAR, DIBELS, etc. We see growth on percentiles (DIBELS), grade level (iReady and STAR), and/or RIT scores (MAP). We like to work with assessments that are already in the building, and our measured growth does correspond to real outcomes. I'm a big fan of ROAR. Jason Yeatman is a serious scientist, and he's taken a first-principles approach to measurement, leveraging technology, neuroscience, cognitive science, and AI. I especially like what he's doing in grades 4-12, providing unique insights into the underlying aspects of reading difficulty. There are frontiers here -- understanding decoding needs in adolescence -- and his team is doing awesome work.
