Google Analytics Capstone
I've FINALLY finished my capstone for my Google Data Analytics Certificate. This entire certification journey has been a challenge, though mostly because of my own distractibility. But that's another post entirely.
For this post, let's just break down what I did for my capstone project: an examination of fitness tracker data for Bellabeat.
The Ask
Bellabeat is a high-tech manufacturer of health-focused products for women. It is a successful small company, but has the potential to become a larger player in the global smart device market. In this scenario, the cofounder and Chief Creative Officer of Bellabeat, Urška Sršen, tasked my fictional analytical team with analyzing smart device fitness data to help unlock new growth opportunities for the company.
Our team was asked to focus on one of Bellabeat’s products and analyze smart device data to gain insight into how consumers are using their smart devices. Our insights would then help guide marketing strategy for the company.
Preparing our Data
For our analysis, we used the FitBit Fitness Tracker Data, a public dataset made available through Mobius on Kaggle.
I examined the data, which was divided into 18 different CSV files. Using a combination of Excel and Google Refine, as well as this data dictionary, I created a description of each of the tables:
While the data comes from a legitimate source, it was easy to recognize its shortcomings. The data came from only 33 participants, and their reporting was inconsistent. In addition, there was no demographic information that would help further break down any understanding of the data.
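To make the "inconsistent reporting" point concrete, here is a minimal sketch of how such a coverage check could look. It is written in Python purely as an illustration and is not the code from my analysis. The rows are made up, and the column names (`Id`, `ActivityDate`, `TotalSteps`) are assumptions about how a daily activity table might be laid out.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; the column names are assumed, not taken from the real files.
SAMPLE_CSV = """Id,ActivityDate,TotalSteps
1001,4/12/2016,13162
1001,4/13/2016,10735
1001,4/14/2016,10460
1002,4/12/2016,8163
1002,4/14/2016,0
1003,4/12/2016,5003
"""


def days_reported_per_user(csv_text):
    """Return {user id: number of distinct dates with at least one record}."""
    dates_by_user = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        dates_by_user[row["Id"]].add(row["ActivityDate"])
    return {user: len(dates) for user, dates in dates_by_user.items()}


coverage = days_reported_per_user(SAMPLE_CSV)
print("participants:", len(coverage))
for user, n_days in sorted(coverage.items()):
    print(user, "reported on", n_days, "day(s)")
```

If the day counts differ a lot from one person to the next, any per-user average built on the table should be read with caution.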
The Process
I opted to focus on four of the data sets for my analysis:
I also chose to conduct all my data cleaning, manipulation, analysis, and visualization in R, specifically RStudio Cloud (now Posit Cloud). While it undoubtedly would have been easier for me to complete this with a combination of Excel, SQL, and Tableau, I really wanted to challenge myself and learn more about R throughout this process.
My Analysis & Share
As I considered the ask from Bellabeat and looked at the dataset, I opted to focus less on how an active lifestyle correlated with calories lost or sleep gained, and more on what the data revealed about the types of people who use fitness trackers, as well as how and when they use them.
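One simple way to look at the "when" part of that question is to average activity by day of the week. The sketch below is only an illustration in Python, not the code from my analysis. The rows are invented, and the column names and the `%m/%d/%Y` date format are assumptions.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Invented sample rows; column names and date format are assumptions.
SAMPLE_CSV = """Id,ActivityDate,TotalSteps
1001,4/12/2016,10000
1002,4/12/2016,6000
1001,4/16/2016,12000
1002,4/17/2016,3000
1001,4/19/2016,8000
"""

# Fixed English names so the result does not depend on the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def mean_steps_by_weekday(csv_text):
    """Return {weekday name: mean TotalSteps across all records on that weekday}."""
    totals = defaultdict(lambda: [0, 0])  # weekday -> [sum of steps, record count]
    for row in csv.DictReader(io.StringIO(csv_text)):
        date = datetime.strptime(row["ActivityDate"], "%m/%d/%Y")
        day = WEEKDAYS[date.weekday()]  # weekday() returns 0 for Monday
        totals[day][0] += int(row["TotalSteps"])
        totals[day][1] += 1
    return {day: total / count for day, (total, count) in totals.items()}


by_weekday = mean_steps_by_weekday(SAMPLE_CSV)
for day in WEEKDAYS:
    if day in by_weekday:
        print(day, round(by_weekday[day]))
```

The same grouping idea works for hour of day when the table carries timestamps instead of dates.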
I have uploaded my raw code as an R file and my analysis as an RMD file on my GitHub page. In addition, you can easily view my process and analysis below:
Act: What would I recommend to Bellabeat?
As I mention in the RMD file above, there are limits to what can be gleaned from this analysis. However, I suggest the following to Bellabeat:
Conclusion and Final Thoughts
I used this project as a means to really delve into R as a programming language. After the previous course in the Google Data Analyst Certification Program (which focused on R), I still felt that there was a lot I didn't know. As a result, I was committed to using this capstone project as a hands-on crash-course experience with R.
I now feel like I have a stronger foundational understanding of the basics of data cleaning, manipulation, analysis, and visualization in R. However, there is still plenty to learn, and I could have spent another couple of weeks just learning and analyzing the data, so there is definitely more I could have done.
If there is anything specific you want me to look at, I would love to hear from you! Connect with me on LinkedIn, and send me a message telling me what you want me to examine.