Standardize and clean those phone numbers using the new CleanPhoneNumbers R package!
Samantha Bell
Veterinary Data Analysis | Dashboards & Reporting | LVT | E-commerce | Bioinformatics
Have some dirty phone numbers in your data? This package can help!
THE TASK
Many data analysts will encounter projects involving phone numbers at some point in their career. This might mean:
THE COMPLICATIONS
But what if your data collection had no standard for phone number entry? When phone numbers are collected in free text format, you might end up with a variety of issues:
THE SOLUTION
The new R package CleanPhoneNumbers provides a function, clean_numbers(), that takes care of all of this for you!
What does it do?
How is it used?
Simply supply your column or vector of dirty numbers and your preferred country code. clean_numbers() returns a vector of the same length containing the cleaned numbers; any entry that cannot be cleaned is set to NA.
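To make that input/output contract concrete, here is a rough base R sketch of the same idea. It is not the CleanPhoneNumbers implementation: the function name sketch_clean and its rules (keep exactly 10 digits, optionally preceded by the country code) are assumptions made purely for illustration.

```r
# Illustrative sketch only: NOT the CleanPhoneNumbers implementation.
# Assumed rules (hypothetical): a valid number has exactly 10 digits,
# optionally preceded by the country code.
sketch_clean <- function(phone, country = 1) {
  # Remove everything that is not a digit
  digits <- gsub("[^0-9]", "", as.character(phone))
  cc <- as.character(country)

  # Drop a leading country code when it is followed by exactly 10 digits
  has_cc <- !is.na(digits) &
    startsWith(digits, cc) &
    nchar(digits) == nchar(cc) + 10
  digits[has_cc] <- substring(digits[has_cc], nchar(cc) + 1)

  # Anything that is not exactly 10 digits becomes NA
  digits[is.na(digits) | nchar(digits) != 10] <- NA_character_
  digits
}

sketch_clean(c("(555) 123-4567", "1-555-123-4567", "12345", NA))
```

The result has one element per input element, with NA wherever the assumed rules could not produce a 10-digit number.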
Let's take a look!
First, we need to install the package from GitHub using the remotes package:
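# Install the remotes package if you do not already have it
install.packages("remotes")
library(remotes)
# Install CleanPhoneNumbers from the CleanPhoneNumbers subdirectory of the bell-samantha/Packages repository
remotes::install_github("bell-samantha/Packages/CleanPhoneNumbers")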
You will then need data in R which contains phone numbers. Let's try this example:
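As a stand-in, here is a minimal sketch of such a data frame. The object name myNum and the column name dirty_phone match the call used in this article; the id column and every value are invented for illustration.

```r
# Hypothetical example data: all values are made up for illustration
myNum <- data.frame(
  id = 1:6,
  dirty_phone = c(
    "(555) 123-4567",
    "555.123.4567",
    "1-555-123-4567",
    "5551234567 ext. 89",
    "call after 5pm",
    NA
  ),
  stringsAsFactors = FALSE
)

# Inspect the structure before cleaning
str(myNum)
```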
If this data frame is loaded into R as "myNum", we can run the "dirty_phone" column through the clean_numbers() function to standardize and filter our messy data:
CleanPhoneNumbers::clean_numbers(phone = myNum$dirty_phone, country = 1)
phone is the vector of phone numbers you want to clean
country is the country calling code that can appear at the start of numbers in your area (1 in this example)
Assigning the results to a new column is as easy as this:
myNum$clean_phone <- CleanPhoneNumbers::clean_numbers(phone = myNum$dirty_phone, country = 1)
THE RESULTS
Now we have a clean set of phone numbers!
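Because entries that could not be cleaned come back as NA, it can be worth checking how many rows were affected. The helper below is a small base R sketch with a hypothetical name, summarise_cleaning; it is not part of CleanPhoneNumbers, and the vectors it runs on are placeholder values rather than actual package output.

```r
# Hypothetical helper (not part of CleanPhoneNumbers): compares the input
# and output vectors and counts how many entries ended up as NA.
summarise_cleaning <- function(dirty, clean) {
  stopifnot(length(dirty) == length(clean))
  c(
    total = length(dirty),
    cleaned = sum(!is.na(clean)),
    set_to_na = sum(is.na(clean)),
    missing_in_input = sum(is.na(dirty))
  )
}

# Placeholder vectors for illustration; with real data you would pass
# myNum$dirty_phone and myNum$clean_phone instead.
dirty <- c("(555) 123-4567", "call after 5pm", NA)
clean <- c("5551234567", NA, NA)
summarise_cleaning(dirty, clean)
```

Comparing set_to_na with missing_in_input shows how many entries were present in the input but rejected during cleaning.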
Have fun using this simple method to clean and group your phone numbers :-)
HAPPY PROGRAMMING!