IBM HR Attrition Analysis

This is in response to the monthly analytics challenge, courtesy of Etuk Anietie.

1. Title: IBM HR Attrition Analysis

2. Introduction

IBM HR Attrition Analysis is a comprehensive examination of the factors influencing employee attrition within IBM, a global technology and consulting company. Attrition, the voluntary and involuntary departure of employees from an organization, can have profound effects on the company's performance, productivity, and overall workforce stability. As one of the leading players in the technology industry, IBM faces the challenge of understanding and mitigating attrition to sustain its competitive edge and ensure the long-term success of its business.

This analysis delves into the various aspects of HR attrition within IBM, aiming to provide valuable insights into the underlying causes, patterns, and trends associated with employee turnover. By examining factors such as job satisfaction, career development opportunities, work-life balance, compensation and benefits, and stock option opportunity, this analysis seeks to shed light on the reasons why employees choose to leave IBM and how the company can address these issues effectively.

Ultimately, the goal of the IBM HR Attrition Analysis is to provide actionable insights that enable IBM's human resources department and management to make informed decisions and implement effective retention strategies. By identifying the root causes of attrition and implementing targeted interventions, IBM can reduce turnover, enhance employee satisfaction, and cultivate a strong and resilient workforce capable of driving innovation and achieving organizational objectives.


Analytics process: The tool I will use for this case study is MS Excel 2019. The data analytics process will follow the PMAVD (Prepare, Model, Analyze, Visualize and Dashboard) process.

Data Source: Kaggle

PS: This is a fictional dataset created by IBM data scientists.


a. Preparation: This usually involves setting out objectives. For this task, the objective has already been set: to find the top five (5) reasons behind the attrition.

b. Measures: The dataset for this task has thirty-six columns, which contain age, employee number, marital status, monthly income, job satisfaction, etc.


3. Exploratory Data Analysis (EDA)

EDA is a set of steps used to explore and understand the data better before cleaning and transformation.

a. Cleaning and Transformation

- Convert the dataset into a table: Ctrl+A, Ctrl+T, “OK”.

- Export to Power Query: go to “Data”, select “From table”.

- Check the number of rows: go to “Transform”, select “Count Rows”.

Columns: 36    Rows: 1470
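For readers working outside Excel, the same row and column check can be sketched in Python. This is only an illustration under stated assumptions: the helper name `table_shape` and the three-row sample are hypothetical, and the sample is made up rather than taken from the dataset.

```python
import csv
import io


def table_shape(csv_text):
    """Return (data_rows, columns) for CSV text whose first row is the header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)               # first row holds the column names
    n_rows = sum(1 for _ in reader)     # remaining rows are the data rows
    return n_rows, len(header)


# Tiny made-up sample for illustration only; it is not the real dataset.
sample = "Employee Number,Age,Attrition\n1,41,Yes\n2,49,No\n4,37,Yes\n"

rows, cols = table_shape(sample)
print(f"Columns: {cols}    Rows: {rows}")
```

Run against the full export, a check like this should agree with the counts Power Query reports above.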


- Rearrange the columns and use the column “Employee Number” as the identifier. Change its datatype to text.

- The column “Education” contains numbers (1-5) only. According to the dataset source, they represent the following:

1 = Below College

2 = College

3 = Bachelor

4 = Master

5 = Doctor

I will replace the numbers with their true values. But first, change datatype to “Text”.

Create a conditional column and replace all numerical values with the appropriate educational values.

Click on “Add Column”, Select “Conditional Column”, type in the new column name (Educational Level) and define the new values for the new columns.

Drag the new column “Educational Level” to where the column for “Education” is and change datatype to “Text”. Delete the column for “Education”.
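The conditional-column step amounts to a lookup from code to label. Below is a minimal Python sketch of that idea, not the Power Query implementation itself. The function name `add_educational_level` and the "Unknown" fallback for unexpected codes are assumptions added for illustration; the labels come from the list above.

```python
# Labels as given by the dataset source; keys are text because the
# column is converted to text before the replacement.
EDUCATION_LEVELS = {
    "1": "Below College",
    "2": "College",
    "3": "Bachelor",
    "4": "Master",
    "5": "Doctor",
}


def add_educational_level(row):
    """Return a copy of the row with 'Education' replaced by 'Educational Level'."""
    new_row = {}
    for key, value in row.items():
        if key == "Education":
            # The new column takes the position of the old one, which
            # mirrors dragging it into place and deleting "Education".
            # "Unknown" is an assumed fallback, not part of the source.
            new_row["Educational Level"] = EDUCATION_LEVELS.get(str(value), "Unknown")
        else:
            new_row[key] = value
    return new_row


# Made-up example row for illustration.
row = {"Employee Number": "1", "Education": 2, "Gender": "Female"}
print(add_educational_level(row))
```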

I will base my analysis on the columns whose values are defined at the dataset source and on the columns needed to ascertain why workers left their jobs. The other columns will be deleted. Assuming all the workers worked in one company, I observed inconsistencies in the columns “Total Working Years”, “Training Times Last Year”, “Years at Company”, and “Years in Current Role”. They are therefore among the columns I will not consider in this analysis.


Create conditional columns for the following columns to replace their numerical values with the appropriate defined values:


Environment Satisfaction

1 = Low

2 = Medium

3 = High

4 = Very High


Job Involvement

1 = Low

2 = Medium

3 = High

4 = Very High


Job Satisfaction

1 = Low

2 = Medium

3 = High

4 = Very High


Relationship Satisfaction

1 = Low

2 = Medium

3 = High

4 = Very High


Performance Rating

1 = Low

2 = Good

3 = Excellent

4 = Outstanding


Work-Life Balance

1 = Bad

2 = Good

3 = Better

4 = Best

Follow the same steps used to create the “Educational Level” conditional column to create conditional columns for the columns listed above.

Figure: the new “Job Satisfaction” column.
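Because four of the six columns share the same scale, the replacements can be driven from one table of labels. The sketch below is an illustration in Python under stated assumptions: `VALUE_LABELS` and `label_row` are hypothetical names, and the sketch overwrites values in place instead of adding a new column and deleting the old one as the Power Query steps do.

```python
# Shared scale used by four of the columns listed above.
SATISFACTION = {1: "Low", 2: "Medium", 3: "High", 4: "Very High"}

VALUE_LABELS = {
    "Environment Satisfaction": SATISFACTION,
    "Job Involvement": SATISFACTION,
    "Job Satisfaction": SATISFACTION,
    "Relationship Satisfaction": SATISFACTION,
    "Performance Rating": {1: "Low", 2: "Good", 3: "Excellent", 4: "Outstanding"},
    "Work-Life Balance": {1: "Bad", 2: "Good", 3: "Better", 4: "Best"},
}


def label_row(row):
    """Return a copy of the row with coded columns replaced by their labels."""
    labelled = dict(row)
    for column, labels in VALUE_LABELS.items():
        if column in labelled:
            code = int(labelled[column])
            # Codes outside the defined range are left unchanged.
            labelled[column] = labels.get(code, labelled[column])
    return labelled


# Made-up example row for illustration.
example = {"Employee Number": "1", "Job Satisfaction": 4, "Work-Life Balance": "1"}
print(label_row(example))
```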


I will create a conditional column for “Monthly Income” to categorize the income into Low, Average and High. The income range is $1,000 - $20,000.

Assign appropriate data types to the newly formed and old columns.
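The income categorization can be sketched as a simple threshold function. The article does not state where the Low, Average and High boundaries fall, so the two cut-offs below are hypothetical placeholders chosen only to make the example run; any counts produced with them would not necessarily match the figures reported later.

```python
# Hypothetical cut-offs: the article does not give the actual boundaries.
LOW_MAX = 5_000       # incomes below this are treated as "Low" (assumption)
AVERAGE_MAX = 10_000  # incomes below this are treated as "Average" (assumption)


def income_category(monthly_income):
    """Bucket a monthly income (in dollars) into Low, Average or High."""
    if monthly_income < LOW_MAX:
        return "Low"
    if monthly_income < AVERAGE_MAX:
        return "Average"
    return "High"


# Made-up incomes within the stated $1,000 - $20,000 range.
for income in (1_000, 7_500, 20_000):
    print(income, income_category(income))
```

Keeping the cut-offs as named constants makes it easy to try different boundaries and see how sensitive the categories are to that choice.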

b. Model, Analyze, Visualize

On the Power Query user interface, go to the Home tab and click on “Close and Load” to go back to Excel.

Select a point on the table, go to “Insert”, Select “Pivot Chart”.

Total attrition = 237

Attrition by Gender

On the new interface (PivotChart Fields), drag “Attrition” to Filters, “Gender” to Axis and “Employee Number” to Values (as a count).

Click on the filter icon for attrition on the table and select “Yes”. From the analysis, more males than females left the job.


For subsequent analysis, drag “Attrition” and “Employee Number” to Filters and Values respectively on the PivotChart Fields and click on the filter icon for attrition on the table and select “Yes”.
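The pivot described above is a filtered count: keep the rows where attrition is “Yes”, then count employees per category. A minimal Python sketch of that logic follows. The function name `attrition_counts` and the four sample rows are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def attrition_counts(rows, field):
    """Count employees with Attrition == 'Yes' for each value of `field`."""
    return Counter(row[field] for row in rows if row["Attrition"] == "Yes")


# Made-up rows for illustration only; they are not the real data.
sample_rows = [
    {"Employee Number": "1", "Attrition": "Yes", "Gender": "Male"},
    {"Employee Number": "2", "Attrition": "No", "Gender": "Female"},
    {"Employee Number": "3", "Attrition": "Yes", "Gender": "Female"},
    {"Employee Number": "4", "Attrition": "Yes", "Gender": "Male"},
]

by_gender = attrition_counts(sample_rows, "Gender")
print(by_gender)                # leavers per gender
print(sum(by_gender.values()))  # total attrition in the sample
```

Swapping "Gender" for another column name reproduces the other pivots, provided the rows carry that column.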


For brevity, I will not document the analysis for all the variables. Instead, I will pinpoint the top five (5) reasons for attrition.

After my analysis, the top five (5) possible reasons I identified for employees leaving are:

- Monthly Income

- Business Travels

- Salary Hike

- Stock Options

- Daily Rate

Additionally, “Overtime” also played a role in the attrition of workers.


Attrition by Income

From the analysis, people in the low income category (190) make up the bulk of the total attrition.

Attrition by Business Travels

People who rarely had the opportunity to go on business trips left the job in large numbers.


Attrition by Salary Hike

Again, workers with the smallest salary increases were the ones who left IBM.

Attrition by Stock Option Level

Workers at the entry level left the job in greater numbers than workers at higher levels.

Attrition by Daily Rate

Workers on low and moderate daily rates left in greater numbers.


Again, for brevity, I will not delve into the steps for creating the dashboard. However, these can be made available on request.


From my analysis, all the possible reasons for attrition boil down to "income level". To prevent further occurrences, IBM should see to it that its workers are on a comfortable income level.



