Behind the metric: how we developed “Time Spent Learning Well”

by Rachel Salvador

At Duolingo, we’ve developed a lot of proprietary metrics that help us improve our product in very specific and unique ways. One of the most important metrics that we created is Time Spent Learning Well (TSLW).

This isn’t just any metric—it’s an essential quality metric that helps us optimize for learning alongside growth. TSLW has evolved over several years to become the version we use today—let’s look at how we got here, what it measures, and how we use it!

How we arrived at TSLW

Since Duolingo is a learning app, we had to find creative ways to approximate teaching efficacy. Our Efficacy Lab conducts rigorous research to measure the efficacy of Duolingo, but these studies aren’t helpful in conducting A/B tests on a daily basis. It’s extremely important that we have a proxy metric for learning, because that’s what we’re here to do! So how did we land on Time Spent Learning Well?

Total Sessions

We started with Total Sessions, which measured the total number of sessions people were completing (from a path lesson to a Match Madness round to a quick Story review). The idea was more sessions = better for learning. If you do more, you learn more, right? Not quite!

We found that Total Sessions was an imperfect metric because session length (time spent doing the activity) varied widely. We wanted learners to advance through their course and encounter harder content, and naturally, harder lessons might take longer. (Not to mention that we want learners to encounter more advanced content: yes, it’s harder, but it’s also new!)

So if a learner has 15 minutes a day to spend doing Duolingo, that might be 5 easier sessions (faster to complete) or 3 harder sessions (longer to complete). In this case, fewer sessions was not necessarily a bad thing for the learner! The metric wasn’t accurately measuring engagement or learning—we were biased towards people who were grinding on shorter, easier sessions.
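To make that concrete, here is a minimal sketch of the 15-minute example. The per-session lengths (3 and 5 minutes) are assumptions chosen only so the totals match the numbers above; they are not real Duolingo data.

```python
# Hypothetical session lengths in minutes, picked to match the 15-minute example.
easy_day = [3, 3, 3, 3, 3]  # 5 easier sessions
hard_day = [5, 5, 5]        # 3 harder sessions


def total_sessions(session_minutes):
    """Count sessions, ignoring how long each one took."""
    return len(session_minutes)


def time_spent(session_minutes):
    """Sum the minutes across all sessions."""
    return sum(session_minutes)


for label, day in [("easy", easy_day), ("hard", hard_day)]:
    print(label, total_sessions(day), "sessions,", time_spent(day), "minutes")
```

Counting sessions ranks the easy day higher (5 vs. 3), even though both learners put in the same 15 minutes.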

Time Spent Learning

Next, we evolved to Time Spent Learning (TSL). This focused on time learners spent in learning activities, which includes: regular lessons, timed challenges, review sessions, lessons focused on conversation or listening practice, stories, etc. The challenge: we had to figure out how to make it relatively resistant to outliers.

In the beginning, Total TSL ended up being skewed toward a small set of studious learners. And if we changed features that impacted competitiveness, like Leaderboards, Total TSL grew mostly due to these very active learners. That didn’t help us reach our goals because a) we were motivating a small group of learners who b) were already doing more than enough!

We decided to develop a “standard” for how much time each learner should spend on Duolingo. After conducting several formal studies, holding discussions with our internal learning and curriculum experts, and assessing current learner behavior, we decided to optimize for a higher percentage of learners spending at least 15 minutes/day on Duolingo. Learning a new language takes time, and the more time you spend practicing, the better you’ll get. If you only spend 5 minutes a day on Duolingo, it will take you a really long time to reach your goals.
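A small sketch can show why a threshold-based share is more resistant to outliers than a raw total. The learner data and function names below are made up for illustration; only the 15-minute target comes from the text above.

```python
DAILY_TARGET_MINUTES = 15  # the daily target described above


def total_tsl(minutes_by_learner):
    """Sum of learning minutes across all learners (sensitive to outliers)."""
    return sum(minutes_by_learner.values())


def share_meeting_target(minutes_by_learner, target=DAILY_TARGET_MINUTES):
    """Fraction of learners with at least `target` minutes that day."""
    if not minutes_by_learner:
        return 0.0
    meeting = sum(1 for m in minutes_by_learner.values() if m >= target)
    return meeting / len(minutes_by_learner)


# Hypothetical daily minutes for four learners.
baseline = {"a": 5, "b": 10, "c": 12, "d": 120}
# Scenario 1: only the already very active learner does more.
power_user_boost = {"a": 5, "b": 10, "c": 12, "d": 180}
# Scenario 2: two casual learners reach the 15-minute target.
broad_boost = {"a": 5, "b": 15, "c": 15, "d": 120}

for name, data in [("baseline", baseline),
                   ("power_user_boost", power_user_boost),
                   ("broad_boost", broad_boost)]:
    print(name, total_tsl(data), share_meeting_target(data))
```

In this toy data, the total jumps from 147 to 207 minutes when one heavy user does more, while the share of learners at 15+ minutes stays at 25%. When two casual learners reach the target, the total barely moves (155) but the share rises to 75%.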

Once we looked at how each learner was spending their 15 minutes/day, we realized not all learning time is equal—after all, lessons on the path are the ones that teach you new concepts, skills, and vocabulary!

  • A path lesson is any lesson node that moves you down the path, including personalized practice, Stories, and Unit Review.
  • Other lessons include Practice Hub exercises, level review, Legendary, Side Quests, Match Madness, and Ramp Up challenges.

Time Spent Learning Well

Once we realized that, we landed on Time Spent Learning Well. This metric favors certain sessions that have better impact on learning—particularly those that move you down the Duolingo path. It might sound obvious, but an independent study from researchers at Northern Arizona University and East Carolina University found that “the number of completed lessons was the strongest predictor of learning gains.” While all of the exercises and activities on Duolingo have value, we found that most of the time, for most learners, going down the path is the most helpful. This is how learners encounter new and harder content, as well as the review that we’ve sprinkled throughout lessons! So when determining the formula for TSLW, we knew we had to pay special attention to the value of those lessons.
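One way to picture a metric that “favors” path sessions is a weighted sum of minutes. The sketch below is purely illustrative and is not Duolingo’s formula: the weights, the `Session` type, and the function name are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical weights: the real values are not given in this article.
PATH_WEIGHT = 1.0
OTHER_WEIGHT = 0.5


@dataclass
class Session:
    kind: str       # "path" or "other" (assumed categories)
    minutes: float  # time spent in the session


def weighted_learning_minutes(sessions,
                              path_weight=PATH_WEIGHT,
                              other_weight=OTHER_WEIGHT):
    """Sum session minutes, counting path sessions more heavily."""
    total = 0.0
    for s in sessions:
        weight = path_weight if s.kind == "path" else other_weight
        total += weight * s.minutes
    return total


# Two learners who each spend 15 minutes, split differently.
mostly_path = [Session("path", 10), Session("other", 5)]
no_path = [Session("other", 15)]

print(weighted_learning_minutes(mostly_path))  # 12.5
print(weighted_learning_minutes(no_path))      # 7.5
```

Under these assumed weights, the same 15 minutes scores higher when most of it is spent moving down the path.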

The rough formula for measuring TSLW is:

How we influence TSLW

Since we wanted learners to move down the path, we had to find ways to incentivize them to do so! Here are a few features we found positively impact TSLW (and keep the app entertaining):

  • Daily Quests: These are optimized very carefully for TSLW. The order of the Quests matters! The first is supposed to “get you going” and is usually the “easiest” to complete. They get harder and harder, and typically value activities that move you forward, e.g., finish a unit or read the next Story on your path.
  • Monthly Challenges: To encourage good learning habits, we switched the Monthly Challenge to be Quest-based instead of XP-based. When learners were focused on gaining XP for a Monthly Challenge, they could spend the last few days of the month gaming the system to earn XP in bulk. We wanted to encourage people to return daily and progress through different activities.

  • Leaderboards: This is one of our most popular features in the app, but we wanted to make sure it incentivized the best learning behavior. This is another area where XP grinding can happen, and the competition can feel “unfair” for learners who are more focused on content than gaining thousands of XP per week. One change learners might notice is that we’ve increased XP along the path so that XP rewards are proportionate to effort and learning outcomes. With this change, lessons on the path (better for learning) help our learners climb the leaderboard! See the results of this experiment below!

Testing everything

At Duolingo, “test everything” is one of our operating principles, and we’re constantly iterating on different features to see how they impact our most important metrics! Here are a few things we learned about TSLW while testing:

Shorter isn’t always better…

We tried out making lessons shorter in the hopes that learners might be motivated to do more lessons, and TSLW would go up. Overall, this wasn’t the case, and ultimately the shorter sessions hurt TSLW.

…But longer seems to be!

We did find that adding new content (making the path longer) always had a positive impact on TSLW. We hypothesize that learners enjoy making progress and engaging with fresh content (rather than constantly reviewing older content).

Avoiding learner burnout

We know that not everything on Duolingo is going to positively affect Time Spent Learning Well… and that’s OK! Our philosophy is: “they can’t learn if they churn,” which means we all need a break every now and then. If all a learner can do on one day is extend their streak, that’s OK! That’s why delight is so important to our app, and why we make sure learners can choose between lots of different activities (a quick Ramp Up Challenge is better than missing a day!).

Keep learning!

The most important thing about TSLW is the “L”: above all, we want to make sure learners advance towards their goals! We’re still refining TSLW, but we’re confident that metrics like this help Duolingo move towards its mission of developing the best education in the world and making it universally available.
