People analytics and event history analysis
Ben Hanowell
Director of People Analytics Research, ADP Research. I study the decisions of employees and employers. My posts reflect my own thoughts.
Or: How I learned to stop measuring time to fill and love cumulative incidence functions
This article is an adaptation of a discussion I led with the Seattle People Analytics Forum hosted by the Seattle People Analytics Network at Nordstrom headquarters on 7 October 2019. The slides I used during that discussion are here, and the source code for the slides is here. Thanks to the organizers for the opportunity, and to the participants for their comments and questions.
Does your recruiting organization use the average time to fill a job requisition as a measure of recruiting performance? Does your talent management organization use the average time to promotion as a measure of upward mobility in the company? What about the average time to employee termination? I'm guessing that multiple organizations at your company use it to measure employee retention.
It turns out that many companies miscalculate these time-to-event metrics, which makes them more likely to make poor business decisions. In light of recent lawsuits stemming from bad metrics, maybe you aren't surprised anymore by the idea that businesses miscalculate. So let's play the controversy game at higher stakes.
I'll let Morpheus lay it out for you:
No matter how good your arithmetic is, and no matter how clean and well-documented your databases are, many of the average time-to-event metrics that your company uses are an attempt to estimate a quantity that is fundamentally unknowable. For that reason, the metrics are utterly meaningless.
But before we poop all over cherished people metrics...
We use these metrics for a reason. Let's respect what came before by trying to understand why we use them now.
Yet let's think more deeply about these claims.
Here's what we're gonna do
How companies miscalculate the average time until an event occurs
Let's use the example of the average time to fill job requisitions. The most common problem I see is when companies use only filled requisitions to calculate time to fill. To illustrate why this is a problem, I'll use a toy example.
Suppose it's the end of 2018, and there were three job requisitions open at any point during that year. Below is a table of the requisitions with their open dates, fill dates, and time to fill. Fill dates and time to fill are left blank if the requisition hasn't yet been filled, as is the case for Requisition ID #3.
Let's throw out the un-filled requisition, because we don't know the time to fill for that one, right? Here's the new dataset.
To calculate the average time to fill, we just average the time to fill across these two requisitions, in which case the average time to fill is just two months. Assuming that the rate at which requisitions were filled was constant during 2018 (the second of two problematic assumptions we're making), the monthly fill rate is the inverse of average time to fill. In our case, then, half of a job requisition is filled for every full month that a requisition is open. To get the annual fill rate, we multiply the monthly fill rate by 12 months, which tells us that we fill six job requisitions for every full year that a requisition is open.
According to this analysis, our recruiting performance is AWESOME.
Aren't we forgetting someone? Requisition ID #3, although not yet filled, has been open for 14 months, and 12 of those months were in 2018.
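We can use this transformed version of the original dataset to compute the actual monthly fill rate following the definition of a proper demographic rate, which in this case is the number of requisitions filled in 2018 divided by the total number of months that requisitions were open during 2018.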
The number of requisitions filled in 2018 is two, and the total number of months that requisitions were open during 2018 is 16, making the monthly fill rate 2/16 = 0.125. Again assuming that the fill rate was constant in 2018, the average time to fill requisitions is the inverse of the monthly fill rate, which is 1/0.125 = 8 months.
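To make the arithmetic concrete, here is a minimal Python sketch of both calculations. The per-requisition values are hypothetical: the text only pins down the totals (two fills averaging two months, and 16 requisition-months of exposure in 2018, 12 of them from Requisition ID #3), so the 1-month and 3-month split is an assumption, as is the idea that both filled requisitions were opened and filled within 2018.

```python
# Hypothetical per-requisition values, chosen only so the totals match the text:
# two filled requisitions averaging 2 months, plus Requisition ID #3 open for
# all 12 months of 2018 without being filled.
requisitions = [
    {"id": 1, "months_open_in_2018": 1, "filled_in_2018": True},
    {"id": 2, "months_open_in_2018": 3, "filled_in_2018": True},
    {"id": 3, "months_open_in_2018": 12, "filled_in_2018": False},
]


def naive_mean_time_to_fill(reqs):
    """Average time to fill using filled requisitions only (the flawed approach)."""
    filled = [r["months_open_in_2018"] for r in reqs if r["filled_in_2018"]]
    return sum(filled) / len(filled)


def occurrence_exposure_rate(reqs):
    """Fills per requisition-month: events divided by total exposure time."""
    events = sum(1 for r in reqs if r["filled_in_2018"])
    exposure = sum(r["months_open_in_2018"] for r in reqs)
    return events / exposure


naive_mean = naive_mean_time_to_fill(requisitions)      # filled requisitions only
monthly_rate = occurrence_exposure_rate(requisitions)   # all requisitions
implied_mean = 1 / monthly_rate                         # assumes a constant rate

print(naive_mean, monthly_rate, implied_mean)
```

The only difference between the two approaches is the denominator: the second one counts the months that the unfilled requisition spent open.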
Let's review:
In general, if you use only cases where an event occurred to estimate the average time until that event occurs, you will underestimate the average time to event, and you will overestimate the rate at which the event occurs. So don't do that.
Introducing event history analysis
At this point, analytically-savvy readers be like:
For the rest of you scratching your heads, survival analysis is a useful way to analyze time-to-event data where not all of the events have yet occurred, a phenomenon known as right-censoring. One benefit of survival analysis is that you aren't including only information about time to event based on cases where the event occurred. In addition, you can relax the assumption that the rate of event occurrence is constant over time.
To illustrate by example, suppose we have data on the date that employees were hired and, if applicable, their date of termination. Using this data, we can estimate the survival function, which gives the probability that an employee will remain at the company beyond a specified period of time. To make this concept more concrete, here's a picture of a survival function. Each point along the curve gives the fraction of employees remaining at a time point just beyond each month since hire.
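The article doesn't say which estimator produced the curve. One common nonparametric choice is the Kaplan-Meier estimator, sketched below in plain Python. The tenure data and the function name `kaplan_meier` are hypothetical, for illustration only.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of the survival function.

    durations: time from hire to termination or to the end of observation.
    observed:  True if termination was observed, False if right-censored.
    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, d in zip(durations, observed) if d})
    for t in event_times:
        # Everyone whose duration is at least t is still at risk at time t.
        at_risk = sum(1 for u in durations if u >= t)
        events = sum(1 for u, d in zip(durations, observed) if u == t and d)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical tenures in months; False means the employee is still employed.
tenure_months = [3, 5, 5, 8, 12, 12]
terminated = [True, True, False, True, False, False]

km_curve = kaplan_meier(tenure_months, terminated)
for month, s in km_curve:
    print(month, round(s, 4))
```

Censored employees never count as events, but they stay in the at-risk denominator for as long as they were observed. That is how the estimator uses information from cases where the event hasn't happened yet.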
It turns out that the average time until an employee is terminated is equal to the area under the survival curve, as illustrated below.
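As a rough sketch of that area calculation, the code below integrates a step-shaped survival curve up to a chosen horizon. The curve values are hypothetical. If the curve hasn't reached zero by the horizon, the result is a mean restricted to that horizon rather than the full average.

```python
def area_under_step_curve(curve, horizon):
    """Area under a right-continuous step survival curve from 0 to horizon.

    curve: list of (time, survival) pairs sorted by time; survival is 1.0
           before the first step.
    """
    area = 0.0
    prev_time, prev_survival = 0.0, 1.0
    for time, survival in curve:
        if time >= horizon:
            break
        area += prev_survival * (time - prev_time)
        prev_time, prev_survival = time, survival
    # Final rectangle from the last step before the horizon to the horizon.
    area += prev_survival * (horizon - prev_time)
    return area


# Hypothetical survival curve: fraction of employees remaining after each month.
survival_curve = [(3, 0.75), (6, 0.5), (12, 0.25)]

restricted_mean = area_under_step_curve(survival_curve, horizon=24)
print(restricted_mean)  # months of tenure per employee within the first 24 months
```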
Another quantity of interest in survival analysis is the cumulative incidence function, which gives the probability that an employee will be terminated by a specified period of time. Below is the cumulative incidence function associated with the survival curve we showed before.
In general, the cumulative incidence function gives the probability that an event will have occurred by the time a specified period has passed. For this reason, we can use it to address important questions in people analytics about the chance that something will happen by a target date. For example:
In other words, cumulative incidence functions provide information that people actually want to know. In all of these cases, the average time to event is a quantity that someone would use to guess at the answer to their real question. I know this because these are examples of questions people have asked just before they requested average time-to-event metrics from me.
The cumulative incidence function might sound familiar if your organization reports metrics like the percentage of requisitions that will be filled by a range of arbitrarily-defined time points (e.g., a week, two weeks, 30 days, 60 days, a year; I call these N-day metrics for their focus on an event occurring within N days). The strength of estimating the cumulative incidence function is that you can estimate the chance the event will occur by any time period, not just the ones you happened to calculate. In addition, unlike the N-day metrics, you wouldn't have to build a separate prediction model for each time horizon. Instead, you can build statistical models that estimate how the entire cumulative incidence function varies across person-time, calendar-time, space, and other predictors.
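To show how any N-day metric falls out of a single estimated curve, here is a small sketch that reads a step-shaped cumulative incidence function at arbitrary horizons. The curve values and the helper name `cif_at` are hypothetical.

```python
from bisect import bisect_right


def cif_at(curve, t):
    """Value of a right-continuous step cumulative incidence function at time t.

    curve: list of (time, cumulative_incidence) pairs sorted by time; the
           function is 0.0 before the first step.
    """
    times = [time for time, _ in curve]
    i = bisect_right(times, t)
    return 0.0 if i == 0 else curve[i - 1][1]


# Hypothetical cumulative incidence of a requisition being filled, by day.
fill_cif = [(7, 0.1), (14, 0.25), (30, 0.5), (60, 0.75), (90, 0.875)]

# Any horizon works, including ones nobody thought to report in advance.
n_day = {n: cif_at(fill_cif, n) for n in (7, 30, 45, 365)}
print(n_day)
```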
Comparing cumulative incidence functions to estimate impact
Suppose that in one business market, we took some action we'll call X to increase retention (thus decrease attrition, and the height of the cumulative incidence function). Below, we compare the cumulative incidence function in that business market to a similar business market where we did nothing. Given that the cumulative incidence function for doing X is lower than if we did nothing, our experiment was a success. But what does that mean for the business? What do you tell the board?
If we take the area between the two cumulative incidence functions, it tells us the number of months of labor per employee that we lose if we don't do X. In this case, we lose a year and two months of labor per employee by not doing X. You could aggregate this figure across projected new hires to estimate turnover costs.
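The area between two step-shaped cumulative incidence functions can be computed by summing rectangles between breakpoints, as in the sketch below. The two curves are hypothetical and are not meant to reproduce the figure of a year and two months quoted above. The horizon is also an assumption that you would choose to match your planning window.

```python
def step_value(curve, t):
    """Value at time t of a right-continuous step function that starts at 0.0."""
    value = 0.0
    for time, v in curve:
        if time <= t:
            value = v
        else:
            break
    return value


def area_between(upper, lower, horizon):
    """Integrate upper(t) - lower(t) from 0 to horizon for two step curves."""
    breakpoints = sorted(
        {0.0, float(horizon)}
        | {float(t) for t, _ in upper + lower if t < horizon}
    )
    area = 0.0
    for left, right in zip(breakpoints, breakpoints[1:]):
        # Both curves are constant on [left, right), so one rectangle suffices.
        gap = step_value(upper, left) - step_value(lower, left)
        area += gap * (right - left)
    return area


# Hypothetical cumulative incidence of termination by month since hire.
cif_did_nothing = [(6, 0.25), (12, 0.5), (24, 0.75)]
cif_did_x = [(6, 0.125), (12, 0.25), (24, 0.5)]

months_lost = area_between(cif_did_nothing, cif_did_x, horizon=36)
print(months_lost)  # months of labor per employee lost by not doing X
```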
The cumulative incidence function C(t) is easy to calculate as the complement of the survival function S(t):
C(t) = 1 - S(t)
That is, unless there are competing risks.
What are competing risks?
Suppose you not only want to know about termination in general, but you want to measure the risks of voluntary vs. involuntary termination, or regrettable vs. un-regrettable termination. If an employee terminates voluntarily, then (barring an analysis that extends to consider returning employees) that same employee cannot be terminated involuntarily, and the same for regrettable vs. un-regrettable termination.
For another example, whenever we do analysis of time to first promotion, we must always consider the competing risk of termination. In large companies where internal transfers are frequent and even encouraged, we need to also consider the competing risk of first internal transfer.
Time to event is usually unknowable when there are competing risks
To illustrate why, I'll use the example that compares the risk of first promotion to the competing risk of termination. If I get terminated before my first promotion, you will never know the time it would have taken me to get promoted. You could guess how long it would have taken by looking at how long other people took to get promoted assuming I'm just like them. Yet because I was terminated before I was promoted, I'm probably not like them.
In general, if the reasons that two competing events occur are not independent, then the average time to either event is unidentified, meaning it cannot be estimated from data. Competing risks are dependent when one event is less likely to occur for reasons that make the other event likely to occur, or when both events are likely to occur for similar reasons.
Because average time to event is unidentified under dependent competing risks, the metric is meaningless.
Okay. I know what you're thinking.
Cause-specific cumulative incidence functions to the rescue
Recall that cumulative incidence functions give the chance that an event will happen within a specific period of time, addressing the questions that often drive people to ask for the average time to that event. Cause-specific cumulative incidence functions are basically the same thing, but they condition the occurrence of the event at time t on having persisted through all competing events up to that time. By conditioning on persistence through all competing events, the cause-specific cumulative incidence function makes no assumptions about the independence of competing risks.
For example, the picture below shows that the cause-specific cumulative incidence of voluntary termination at 100 months is just under 20%. For involuntary termination, the cumulative incidence by the same time point is just under 30%. These figures can be directly compared.
Cause-specific cumulative incidence functions add up to the total cumulative incidence of any event. This makes it possible to compare the risks of competing events in a natural way. The picture below shows the cumulative incidence functions for voluntary and involuntary termination stacked atop one another. The area of each colored band represents the number of months of labor per employee that are lost to a given type of termination. Yes, I know I switched the colors of voluntary and involuntary termination from the last plot. So sue me.
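The article doesn't name the estimator behind these curves. One standard nonparametric option under competing risks is the Aalen-Johansen estimator, sketched below in plain Python. The data and the function name are hypothetical. The sketch also checks the adding-up property described above.

```python
def cumulative_incidence_by_cause(durations, causes):
    """Aalen-Johansen style cumulative incidence under competing risks.

    durations: time to the first event of any type, or to censoring.
    causes:    event label per record, or None if right-censored.
    Returns (curves, overall_survival), where curves maps each cause to a list
    of (time, cumulative_incidence) pairs.
    """
    labels = sorted({c for c in causes if c is not None})
    cif = {c: 0.0 for c in labels}
    curves = {c: [] for c in labels}
    overall_survival = 1.0  # chance of no event of any type before time t
    event_times = sorted({u for u, c in zip(durations, causes) if c is not None})
    for t in event_times:
        at_risk = sum(1 for u in durations if u >= t)
        total_events = 0
        for c in labels:
            d = sum(1 for u, k in zip(durations, causes) if u == t and k == c)
            # Must have persisted through all event types up to t, then
            # experience cause c at t.
            cif[c] += overall_survival * d / at_risk
            curves[c].append((t, cif[c]))
            total_events += d
        overall_survival *= 1 - total_events / at_risk
    return curves, overall_survival


# Hypothetical tenures in months and termination types (None = still employed).
tenure_months = [2, 4, 4, 6, 9, 9, 12, 12]
termination_type = ["voluntary", "involuntary", "voluntary", None,
                    "involuntary", None, "voluntary", None]

curves, overall_survival = cumulative_incidence_by_cause(tenure_months, termination_type)
final_voluntary = curves["voluntary"][-1][1]
final_involuntary = curves["involuntary"][-1][1]
print(final_voluntary, final_involuntary, 1 - overall_survival)
```

At the last event time, the two cause-specific values sum to one minus the overall survival probability. This is what lets the bands be stacked.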
Limitations of cause-specific cumulative incidence functions
Beware when formulating a competing risks analysis for the following reasons:
So here's my question
Given that the average time to fill is often miscalculated and even meaningless, how could you leverage cumulative incidence functions to address the business questions your organization has about how often events happen and how long it takes them to occur?
And here's another question
Have I piqued your interest? Would you be interested in learning more about how your organization could use event history analysis for people analytics? Let's talk. Or maybe... let's read. I'm thinking about writing a book about event history analysis for people analytics. Would you read it?