A tale of two studies

For those of us interested in large-scale social science research, this summer was dizzying. The promise of this research is that we can learn how to improve people's lives in a scalable way. The footnote is that doing so is harder than it sounds--and it sounds hard. Intuitive, innovative ideas to improve people's lives at scale generate a lot of funding and fanfare early on, both of which dissipate as research gradually reaches messier conclusions.

In July, Sam Altman's OpenResearch published the largest ever randomized controlled trial on universal basic income. 1,000 people (the treatment group) received $1,000/mo. for three years. 2,000 people (the control group) received $50/mo. for three years. Remember when universal basic income had its moment in the sun in 2020, when Andrew Yang literally ran for president on the concept (the $1,000/mo. Freedom Dividend)? That's when Sam Altman pumped $20m in to study it. (I might have the chronology slightly backward.) Well, what happened?

Financial health:

Unconditional cash improves financial well-being and reduces vulnerability in the short term but does not appear to reduce long-term financial anxieties. Additionally, though the overall financial health score improves significantly for recipients in the first two years of the study, the effect fades in year three. There was a nearly 8% increase in financial well-being for recipients in year one, which reduced to 6% in year two, and then was near zero by the third year.

Health:

The cash led to large and significant improvements in mental health and food security in the first year of the program. These effects faded in subsequent years.

Entrepreneurship:

Recipients exhibited increased interest in entrepreneurship and were more likely to have an entrepreneurial mindset...Entrepreneurial mindset, or orientation, is measured as a participants’ willingness and preference toward taking financial risks. Though this interest and intent did not translate into a significant increase in entrepreneurial activity for the average recipient, there was a notable increase in entrepreneurial activity for underrepresented groups. Black and female recipients were more likely to start or help start a business.

These were probably not the effect sizes Sam Altman or Andrew Yang anticipated 5 years ago, but they significantly enhance our understanding of what we can expect from this type of intervention. Unfortunately, other than the readers of this post and listeners of Kevin Roose and Casey Newton's Hard Fork, where I heard about the results, I don't know how many people heard about this. I don't think TechCrunch ran an article on the results, despite 9 previous articles on universal basic income that predated them, including the 2017 post "Is monetizing federal land the way to pay for basic income?"

Not surprisingly, there was barely a blip of interest in July, when the results came out:


No significant spike in interest in universal basic income when the first large RCT on the topic comes out with modest (at best) results

Closer to home, for me, was Matthew A. Kraft, Danielle Sanderson Edwards, and Marisa Cannata's large study of nearly 7,000 students who received tutoring in Nashville, Tennessee from 2021 to 2023. They found:

evidence of a small to medium average positive effect on students’ reading test scores (0.04 to 0.09 standard deviations), but no average effects on math test scores or course grades in either subject.
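To get a feel for what 0.04 to 0.09 standard deviations means, here is a rough sketch that converts an effect size into a percentile shift. It assumes test scores are normally distributed and that the student starts at the median; both are simplifying assumptions of mine, not claims from the study, and `percentile_after_shift` is a hypothetical helper name.

```python
from statistics import NormalDist


def percentile_after_shift(effect_sd: float, start_percentile: float = 50.0) -> float:
    """Percentile reached after moving up by `effect_sd` standard deviations.

    Assumes normally distributed scores (an illustrative assumption,
    not something the Nashville study states).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    start_z = std_normal.inv_cdf(start_percentile / 100)
    return 100 * std_normal.cdf(start_z + effect_sd)


# The two ends of the reading effect range quoted above.
for effect in (0.04, 0.09):
    print(f"{effect:.2f} SD: 50th -> {percentile_after_shift(effect):.1f}th percentile")
```

Under those assumptions, a median student would move from the 50th percentile to somewhere around the 52nd to 54th, which is one way to read "small to medium."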

When the study was first published a couple of weeks ago, I didn't see much media interest, although yesterday two in-depth articles appeared: "Students aren't benefiting much from tutoring, one new study shows" by Jill Barshay at The Hechinger Report and "This district provided tutoring to thousands of students. The results were mixed." by Sarah Schwartz at Education Week. Does the recent spike in Google Trends indicate that this news is traveling faster than the universal basic income results did?

If so, perhaps it's because expectations were lower (sky high within education, but no one was running for president on this platform--right now the focus seems to be whether or not to maintain the U.S. Department of Education).

Jill Barshay writes about Matthew A. Kraft's takeaway from the Nashville study:

Going forward, Kraft said he and other researchers need to “recalibrate” or adjust expectations around the “eye-popping” or very large impacts that previous small-scale tutoring programs have achieved...“I worry,” he said, “that we may excuse ourselves from the hard work of iterative experimentation and continuous improvement by saying that we didn’t get the eye-popping results that we had hoped for right out of the gate, and therefore it’s not the solution that we should continue to invest in.”

She then adds her own take:

I wonder if customized instruction can be accomplished at scale at an affordable price. To really help students who are behind, tutors will need to diagnose each student’s learning gaps, and then develop a customized learning plan for each student. That’s pricey, and maybe impossible to do for millions of students all over the country.

I have a different conclusion, informed by the contrast I see between this study and the universal basic income study I opened with.

Compared to tutoring, universal basic income is a commodity. Sure, in a universal basic income program you can give different amounts of money, at different intervals, with different eligibility criteria, but at the end of the day, a dollar is a dollar. If AI replaces most jobs (I certainly hope it won't), OpenResearch's report does not give me a lot of confidence that a $1,000/mo. universal basic income is the solution.

In contrast, I think the education policy field has erred by treating tutoring like a commodity. I think customized instruction can be accomplished at scale at an affordable price. But I wouldn't bet on tutoring programs in general. I would bet on organizations that have unique insights into efficacy at scale and sustainable pricing.

Like Jill Barshay, I wouldn't bet on tutors diagnosing each student's learning gaps and developing a customized learning plan for each student (not without a lot more AI support than we have today). That's why at Once we focus on children in kindergarten, where we help every student build the right foundation from the beginning.

I would also look closely at the coaching that tutoring providers give their tutors. At Once every instructor receives over 25 hours of training and individual coaching in the #scienceofreading. Every instructor is continuously assessed against 55 different instructor competencies--and that's just to teach a single subject (reading), in a single grade (kindergarten). We record instructional sessions and use "game tape" of each instructor's instruction to make coaching as tactical and relevant as possible. This work is complex and hard, but we're building the technical underpinnings so that we have a foundation for scale.

And when it works, it really works. Last year three US schools delivered Once to every one of their kindergarten students. Across those three schools, at the beginning of the year, the students' median rank nationally (measured by i-Ready) was at the 38th percentile (i.e., they scored worse than 62% of the nation’s kindergarteners), but by the end of the year, their median rank was at the 67th percentile (i.e., better than all but 33% of the nation’s kindergarteners). They more than inverted their national rank.

The median student using Once in three schools that delivered Once to every kindergarten student grew from the 38th percentile nationally to the 67th percentile.

Now our challenge is to reproduce and improve those results at scale--starting with Nashville. The mayor doesn't think it's impossible and neither do we.

Michael Goldstein

Curious Questioner

6 months ago

Great post. Really enjoy your writing. I'm sure Matt K would agree "the education policy field has erred by treating tutoring like a commodity." Perhaps would expand to say field treats EVERYTHING like a commodity. Just as a small example, you mention Once's 25 hours of training. But you and I would probably bet AGAINST any typical organization providing 25, or 3, or 100 hours of training. Because we know most RCTs of training show no effect. It's working for Once because, as you say, of the million details, the trial and error, and underneath it, the obsessive drive of your team to actually generate gains for kids. Whereas most education leaders/managers precisely lack that drive - because they mistakenly think they're delivering a commodity!

More articles by Matt Pasternack

  • I don't think we need to blow up how we measure schools...yet

    Back in the spring I asked whether assessment is holding back the Science of Reading. I picked on DIBELS, writing: In…

    2 comments
  • How can districts respond to the cyber charter earthquake?

    If you read K-12 headlines, and I asked you what's the biggest earthquake hitting schools this year, you might say:…

    4 comments
  • Tutoring programs are not widgets

    Matthew A. Kraft, Beth Schueler, and Grace Falken just released an important meta-analysis of 282 randomized controlled…

    5 comments
  • Tech and Edtech

    When I was getting started in education reform, there was Linda Darling-Hammond saying "STOP". When I was getting…

    5 comments
  • In defense (praise?) of silver bullets

    I still remember my first day of my first job out of the classroom in 2007. After 3 years of teaching middle school ELA…

    6 comments
  • "What did we get for it?"

    If you haven't yet listened to Michael Horn, Diane Tavenner and Stacey Childress's most recent Class Disrupted podcast…

    11 comments
  • Our randomized controlled trial journey

    As companies and organizations grow and ossify, they tend to take fewer risks and make decisions more slowly. There's a…

    26 comments
  • Writing Well

    In business we generally preach concision, concision, concision. Ok, maybe just one concision :) Mostly it's because…

    1 comment
  • The jig is up

    It's counterintuitive that K12 education has numerous studies demonstrating programs and products with strong positive…

    12 comments
  • Is assessment holding back the Science of Reading?

    I've wanted to write this post for a while, but haven't found the right impetus. I'll settle for a recent update from…

    5 comments