Healthcare Data Analysis Project Using Intermediate SQL

Healthcare Data Analysis Project Using Intermediate SQL

Introduction:

Running a hospital efficiently requires a deep understanding of complex operations and the ability to coordinate with staff members across all departments to ensure patients receive the right treatments and care in a timely manner. It is essential that every hospital employee understands their role and how they can best contribute to serving patients efficiently, ensuring that the average patient stay does not exceed the recommended duration for recovery.

Managing complex operations like those in hospitals relies heavily on data. This data provides valuable insights into hospital performance related to diabetes care, enabling analysts to identify areas that require the most support for improving patient outcomes and operational efficiency.

This article focuses on analyzing hospital data using intermediate SQL concepts.

The dataset used in this analysis is publicly available through the University of California, Irvine’s Machine Learning Repository. If you'd like to explore it yourself, you can find it here:

?? Hospital SQL Dataset

Data Analysis and Findings:

Before I get started with my analysis, first let’s take a look at the data that we will be working with.

Demographics Table Results
Health Table Results

Data Analysis and Findings:

We want to determine the length of stay for patients in the hospital.

Query 1
Query 1 - Output

This query helps us understand the length of stay for patients in the hospital. As shown in the output, most patients stay for three days, which is clearly reflected in the distribution.

One interesting aspect of this query is how it uses SQL to create a histogram. This is a visualization method that SQL isn’t typically known for, making it a cool and unique way to view the data.

What medical specialties have the most average procedures?

Query 2
Query 2 - Output

This query shows us how many procedures were performed on average for patients admitted to the hospital, while the count represents the total number of procedures conducted.

As you can see, Surgery-Thoracic has the highest average number of procedures at 3.5. Cardiology, on the other hand, has the highest total number of procedures conducted, with 5,352 in total.

Query 3
Query 3 - Output

This query is similar to the one above, with the main difference being that we’ve added a filter to only include specialties where the average number of procedures is greater than 2.5.

This adjustment allows us to focus on a more concise view of the top 5 specialties with the highest average procedures being conducted.

Does race play a factor in the number of lab tests conducted?

Query 4
Query 4 - Output

The short answer is no. Race does not appear to play a significant role in the average number of lab tests conducted. The differences in distribution percentages are so minuscule that they show that race has nothing to do with the number of lab procedures performed.

From an SQL perspective, this query uses a JOIN to link the health table with the demographics table. The column highlighted in purple represents the key used to match both tables. In this case, it is the patient_nbr column.

Classifying Lab Procedures as Few, Average, or Many.

Query 5
Query 5 - Output

As shown in the output, there is a strong relationship between the average time spent in the hospital and the frequency of procedures. Patients who stay longer tend to undergo more procedures, indicating that longer hospital stays often involve more extensive medical care.

Patient Numbers of African Americans Prescribed Metformin with Status 'Up'

Query 6
Query 6 - Output

As you can see here, we have used the UNION function to retrieve patient numbers based on two criteria:?

  1. Being identified as African American in the demographics table
  2. Being prescribed metformin with the status "Up" in the health table

What’s interesting about the UNION function is that it operates differently from a JOIN. A JOIN combines tables horizontally by matching columns, while a UNION stacks the results of two queries vertically, combining rows from both tables.

Patient Summary Based on Number of Medications, Admittance Status, Demographics, and Lab Procedures

Query 7
Query 7 - Output

As shown in the query above, the highest number of medications taken by a patient was 81, and the highest number of lab procedures recorded was 98. Additionally, in terms of demographics, Caucasians appear most frequently on this summary list.

This query utilizes the CONCAT syntax, which combines data with fixed text (highlighted in blue) to generate readable statements in the report.

Conclusion:

I thoroughly enjoyed using SQL to analyze hospital data and discovered several valuable insights along the way. This analysis leveraged intermediate SQL functions such as HAVING, JOIN, CASE, and CONCAT.

Here are the key takeaways:

  • Patients admitted to the hospital typically stay in the 1–3 day range, which is a positive indicator that they are receiving proper care and resources, enabling them to return home and maintain their health.
  • The Cardiology department performed the highest average number of procedures among all specialties.
  • There is no significant bias in terms of race and the average number of lab procedures, as the differences in percentages are minimal.

This project was both fascinating and insightful, highlighting the importance of using data in the healthcare industry. The data used in this project has immense value, as it can improve hospital operations. Data-backed decisions not only enhance efficiency for hospital staff but also ensure that patients receive timely and effective treatments without unnecessary delays.

If you enjoyed this article and want to see more of my explanatory data analyses, please like, comment, and connect!


Peg Blaine, RHIT, CPT

Data Analytics | SQL | Excel | Python | R | Tableau | Storytelling | HIM

2 个月

Great analysis, visualizations and insights Moiz Noorali ??

回复
Isaac Oresanya

Data Analyst @ DCJ & Data Evangelist ?? Voice for New Analysts & Data Beginners ?? Helping businesses win with data ?? Teaching, Scraping & Analyzing to Help You Fall in Love with Data

3 个月

Your work is engaging and clear. Nicely done!

Stuart Walker

Fraud Prevention Analyst @ M&G PLC | Data Analyst | Data Scientist | Python | SQL | Machine Learning | Data Analytics | Excel | Tableau | Power BI | R

3 个月

Good Job Moiz ??????

Omhari Gurung

Data Analyst | SQL, Tableau, Excel, Python, R, Data Visualization | Business Intelligence & Insights | Open to Remote & H-1B Sponsorship

3 个月

Hi Moiz, Your ability to derive actionable insights from complex datasets is truly commendable. Keep up the great work! ?? ??

Very informative Article ??

要查看或添加评论,请登录

Moiz Noorali的更多文章

社区洞察

其他会员也浏览了