Predicting results, and working with Command Boards using Machine Learning

Predicting results, and working with Command Boards using Machine Learning

Analyzing the Employees TurnOver

Machine Learning, and the whole AI spectrum, give us a lot of resources to attack many common problems we have on a daily basis. The creation of the so-called ML 'models' give us powerful information: predictions.

Let's suppose we work for the HR team in a large company. The Board of the Company is worried about the relatively high turnover, and your team must look into ways to reduce the number of employees leaving the company. (This case is part of a DataCamp competition https://app.datacamp.com/workspace/w/fe01229f-0e67-47d8-8872-9eed8674da29)

The team needs to understand better the situation, which employees are more likely to leave, and why. Once it is clear what variables impact employee churn, you can present your findings along with your ideas on how to attack the concern.

The department has assembled data on almost 10,000 employees. The team used information from exit interviews, performance reviews, and employee records. The variables we will find in the dataset for each employee are:

  • Department - the department the employee belongs to.
  • Promotion - if the employee was promoted in the previous 24 months.
  • Review - the composite score the employee received in their last evaluation.
  • Projects - how many projects the employee is involved in.
  • Salary - for confidentiality reasons, salary comes in three tiers: low, medium, high.
  • Tenure - how many years the employee has been at the company.
  • Satisfaction - a measure of employee satisfaction from surveys.
  • Average Worked Hours per Month - the average hours the employee worked in a month.
  • Left - if the employee ended up leaving or not.


In the following report, we will analyze three main premises:

  1. Which department has the highest employee turnover? Which one has the lowest?
  2. Investigate which variables seem to be better predictors of employee departure.
  3. What recommendations would you make regarding ways to reduce employee turnover?

Before starting with the first question, we will analyze the dataset to see if we can find insights. First, we will do an Exploratory Analysis by studying statistical data and correlations between the variables (in pairs). After this, we will do a Graphical Analysis. Moreover, we will continue analyzing pairs of data in a deeper way. Once we finish the analysis, we will release the first draft of the insights found.

Then, we will move to the first question, analyzing it from two different angles:

  • Absolute ratio (the department's turnover vs the number of employees of the company)
  • Relative ratio (the department's turnover vs the number of employees within the department)

Regarding the second question, we will follow two different strategies, and then we will compare both of them.

  1. On one side, we will get the features following a specific Machine Learning Model, and we will generate the equation that allows us to forecast the outcome (if the employee leaves the company or not) using another ML Model.
  2. On the other hand, as a different approach, we will compare several ML algorithms, choosing the best three (following a specific scoring). Then, we will get the best features of each of these three models (and compare if they are the same ones like the ones we got from the previous point). After this, we will hyperparams tune each of the selected models to see if we can improve the scoring. Once this is finished, we will rank the final scores of the three algorithms again, and we will use the Voting model to see if the combination of them is better than using the best algorithm gotten. Then, we will try to get some predictions using some simulated data, and we will compare the results we got from both of the strategies followed.

Finally, in the last question, we will summarize all the studies done and get some general ideas to implement in the company. Also, in a more specific way, we will propose a strategy to follow based on a Command Board. Here, the Human Resources team will be able to monitor and predict if and when they have to apply any kind of incentive or specific tool to avoid an employee leaving the company. This command board will measure the urgency of the case, so no resources are wasted unnecessarily.

The full report is available in my GitHub repository https://github.com/vascoarizna/DC-EmployeeTurnOver Here you will find not only the full explanation with the graphics and the solution but also the Jupyter Notebook codes in case you want to take anything for you.

Go to Part 2: Descriptive Analysis


---------------------------------------------------------------------------------------------------------------

This case is part of a DataCamp competition:?Employees TurnOver - DataCamp

The full report is available in my GitHub repository?GitHub - vascoarizna

Here you will find not only the full explanation with the graphics and the solution but also the Jupyter Notebook codes in case you want to take anything for you.

Author:?Ignacio Ariznabarreta -?JIAF Consulting

要查看或添加评论,请登录

Jose Ignacio Ariznabarreta Fossati的更多文章

社区洞察