Beyond the Numbers: Discovering Insights in Animal Shelter Data
Christy Ehlert-Mackie, MBA, MSBA
Data Analyst who ?? Excel | SQL | Tableau | I analyze and interpret data so companies have the information and insights they need to make sound business decisions.
According to Shelter Animals Count , an estimated 6.5 million cats and dogs in the United States went through animal shelters and rescues in 2023. There are over 14,000 animal-sheltering organizations across the country. The system has experienced an increase in animals the past couple years which has led to a capacity crisis for many shelters and rescues.
Louisville Metro Animal Services (LMAS ) is one of those shelters. LMAS serves the Louisville, Kentucky area with “a fundamental mission to protect public health and safety and ensure the humane treatment of domesticated animals”. Their services include animal sheltering, field services, community engagement, animal adoption, and licensing.
Business Problem and Questions
For this project, I acted as a data analyst for LMAS to help them understand the flow of animals through their system for 2019 - 2022. Understanding historical data will help them adequately plan and use their resources (such as financial and people) as well as identify if there are opportunities to better serve their geographic area.
Questions answered include:
Key Insights
Detailed analysis and recommendations are below.
Data Cleaning and Exploration
The dataset Louisville Metro KY – Animal Service Intake and Outcome was found through data.gov which had a link to the Louisville Metro Open Data website . A csv file of the dataset was downloaded 12/27/23. This file included 55,964 rows and 18 columns.
The data dictionary on the Louisville Metro Open Data website was helpful but, since it was just a long text list, wasn’t the easiest to read. I ended up creating my own data dictionary file in Excel, using the online data dictionary and the data exploration I did. I also found a few values that weren’t included in the online data dictionary.
The csv file was brought into Excel for data exploration. I created pivot tables to explore the values in each field and noted how many records had blanks or missing data and what data cleaning I would need to do. The LOST and FOUND intake types had a lot of missing data. According to the data dictionary, those records represented animals that a citizen reported as lost or found on the LMAS website. I decided to exclude these records as they don’t represent animals physically going through the LMAS system.
Data cleaning I performed included:
The final Excel dataset was 42,855 rows and 20 columns.
Data Analysis
I started by doing data analysis in Excel using pivot tables to answer each of the questions above. Since I am more experienced with Excel than with Tableau, doing this helped me tie out the data in the Tableau visualizations to the Excel pivot tables to make sure I was getting the results I wanted.
In Tableau, I connected to the final Excel dataset and verified that all data was loaded correctly. Since there wasn’t a full year of data for 2023, I filtered the data to 2019 – 2022 so any analysis comparing years wouldn’t be distorted. I then created various visualizations and dashboards to answer the questions posed above.
How many animals went through LMAS? For 2019-2022, there were a total of 34,150 intakes and 33,912 outcomes. Knowing that some animals went through the system more than once during that time period, I created a calculated field to get a count of distinct animal id’s. There were 26,860 distinct animals.
How many animals went through LMAS multiple times? There were 5,806 animals that had more than one intake. The majority of these went through 2 or 3 times and were mostly strays who went to foster (had 1 intake as a stray and 1 intake when came back from foster). There were 25 animals that each had more than 5 intakes. All but 1 were dogs and most were brought in as strays multiple times.
Where did the animals come from geographically? I created a map based on jurisdiction zip code (the zip code where the animal was picked up). I used a table calculation to calculate each zip code’s percentage of the total intakes. There were no zip codes that had more than 10%. Zip code 40218 was the highest at 9.4%.
What were the demographics of animals served? Cats and dogs were 50% and 48%, respectively, of total intakes and 51% and 47% of distinct animals. This indicates that dogs were more likely to have gone through the system more than once. Other animal types combined totaled 2% of intakes. Since cats and dogs were the vast majority, I filtered the rest of the visualizations to exclude the other animal types.
To look at the split of cat and dog intakes by year, I did a table calculation to get the percentage of total by animal type. The overall split for all years combined was equal but looking at it by year shows that cat intakes were significantly lower in 2019 and higher in 2021.
领英推荐
Cats were nearly equally split between male and female. Dogs had a significantly higher number of males. I used CASE/WHEN statements to create groups for female and male as the dataset also had spayed (female) and neutered (male) values. There were a small number of animals that were of unknown sex.
The treemap by breed shows that cats were primarily 3 breeds: domestic shorthair, domestic medium hair, and domestic longhair. Dogs were many different breeds with the top 3 being pit bull, pit bull/mix, and labrador retriever/mix.
What was the length of stay (LOS)? Cats had an average LOS of 6.2 days and median of 2.0 days. Dogs’ LOS was longer with an average of 10.8 days and median of 6.0 days. I created a calculated field taking the difference in days between the intake and outcome dates.
What were the most common intake types? For both cats and dogs, the most common intakes were strays, followed by foster returns and owner surrenders.
What were the most common outcome types? This varied but adoption was high for both cats and dogs. Cats had a higher percent of outcomes for SNR which were cats that were sterilized and returned to where they were found. Other cat outcomes included sent to foster and trap/neuter/return (TNR). Dog outcomes also included returned to owner (RTO), sent to foster, and transferred to another rescue or agency.
How did intakes and outcomes vary by day of the week? Intakes were lowest for both cats and dogs on Sundays. Outcomes were also lowest on Sundays for both, with cats highest on Wednesdays and dogs highest on Saturdays.
How did the flow of animals change over time? Cat intakes were low January – April and then increased dramatically starting in May and through the summer months, then dropping. This trend corresponds with kitten season. Dogs didn’t have as much variation with a peak in July, which may correspond to dogs who escape and are picked up as strays as a result of being scared of fireworks. Looking by year, intakes for both cats and dogs were the lowest in 2020 which corresponds to the Covid pandemic. Cat intakes quickly rebounded in 2021 and dogs rebounded in 2022.
The trends for outcomes by month and year have similar patterns to intakes for both cats and dogs.
Looking at LOS trends, cats had the highest average LOS in the summer months. Dogs had more fluctuation with highest average LOS in January, August, and December. Average LOS was lowest in 2020 and has been trending upwards since then.
Recommendations
Based on the analysis and insights, my recommendations are:
Limitations
Further analysis could be done on animals with multiple intakes. There were animals that had multiple intakes due to moving in and out of foster which could be skewing LOS calculations and other visualizations. An option would be to look at the LOS of the animals from intake to their “final” outcome, disregarding the foster activity.
In looking at animals that had the longest LOS, it appeared that there could be some data quality issues. An example is one dog that had a LOS of 1462 days. The dog had multiple records with an original intake as an owner surrender in 2019 with an outcome date in 2023. However, other records for this dog showed it going to foster, returning from foster, and being adopted by the foster all in February 2019. There were also a few animals with a LOS of 365 days so I am not sure if the outcome dates were accurate for them or if this was a default.
Age would be interesting to analyze but it was not included in the dataset. The same with health status.
There is a lot of additional slicing and dicing of the data that could be done.
Closing
Thank you for taking the time to learn about my project! Click here to check it out on Tableau Public. I would love to hear any feedback you have. I am also looking for opportunities as a data analyst. Leave a comment below or connect with me. You can also check out my data analysis project portfolio website here .
This project was personally interesting to me for many reasons. I am a lifelong animal lover with cats being my favorite. All of the cats except my first have been from a local animal shelter. I have been a volunteer at that same shelter since 2009 (15 years!) as well as a longtime financial donor.
Additional References
I found these to be helpful in doing this project:
Environmental Engineer | Analyst (R, Python, SQL, Tableau, Excel) | ???? Army Veteran
3 个月Nicely done, Christy!
Data Analyst who ?? Excel | SQL | Tableau | I analyze and interpret data so companies have the information and insights they need to make sound business decisions.
3 个月https://public.tableau.com/views/LouisvilleMetroAnimalServicesAnalysis2019-2022/Overview?:language=en-US&:sid=&:redirect=auth&:display_count=n&:origin=viz_share_link