Improving Bolivia's Infrastructure: An Analysis of Bolivia's Financial Loans Using SQL
CONTEXT
Bolivia flies under the radar as one of the best countries in South America to visit.
It’s rich in indigenous culture. Geographically, it shares Lake Titicaca along the Peru border, which is a popular place for hiking, climbing and experiencing all kinds of wildlife. The capital, La Paz, sits at 3,650 meters above sea level and is the highest capital city in the world featuring a dramatic mountain setting (Wilson, 2024). Lastly, it’s an affordable and cost-friendly place to travel as food, drink and hospitality resources come at a cheaper rate. The only downside to Bolivia is traveling from town to town as the infrastructure and road conditions are underdeveloped. If renting a car, prepare for bumpy conditions and be sure to include proper insurance (Kilroy, n.d.).
Attaining sustainable and reliable infrastructure throughout Bolivia is one of the country’s many developmental challenges that has taken priority over the recent years. There are still many rural areas of Bolivia without paved roads. To help Bolivia achieve safe and reliable road conditions, The World Bank Treasury (IBRD ? IDA) has greenlit funding to help improve infrastructure for all major roads and rural areas of Bolivia.
For this project, I am playing the role of a data analyst for the IDA. A stakeholder has requested an update on Bolivia’s most recent projects and where the country stands financially. I will use SQL to conduct the analysis, explain the commands used and provide key learnings.
THE DATA AND TOOLS
The dataset was downloaded from IDA’s website and given the name “banking_table.”?
You can access the dataset HERE.
CSV Fiddle was used to write the SQL statements and pull the data needed for the analysis.
SQL COMMANDS USED FOR THIS PROJECT
PRE-ANALYSIS
To get a sense of how big the dataset is before playing in it, use the following query to identify how many rows.
QUERY
OUTPUT
The query above uses the COUNT aggregation function. This determines the total number of transactions or rows in the dataset.
Additionally, the AS clause renames the column to “total_rows.” The AS function is used consistently for most of the upcoming queries.
The output shows that this data set includes more than one million rows!
KEY QUESTIONS
My key questions before diving into the analysis include:
ANALYSIS #1: How much money in total has the IDA loaned to Bolivia?
QUERY
OUTPUT
The query above uses the SUM function to total up the amount of USD in the ‘Country’ column.?
The WHERE clause is applied to grab only the rows that have ‘Bolivia’ recorded.?
Since the values in the ‘Original Principal Amount’ column have been included as various characters (varchar) and not numerical values, they must be transitioned into a numerical value using the CAST function in order to sum successfully. The CAST function is converting the various characters in the ‘Original Principal Amount’ to a floating number (ie. a decimal number). This rounds the number to the nearest decimal point.?
The query shows that the IDA has loaned more than $358 billion dollars to Bolivia to date. *Whoa!*
ANALYSIS #2: How many projects has the IDA funded for Bolivia?
QUERY
OUTPUT
The query uses the COUNT(DISTINCT) function to count the number of unique project IDs WHERE Bolivia is recorded in the dataset.
The output showcases that there have been 95 projects assigned to Bolivia.
ANALYSIS #3: What is the average amount of money per project that the IDA has loaned to Bolivia?
QUERY
领英推荐
OUTPUT
Using the Analysis #1 query as a template, switch out the SUM for the AVG function.?
The output shows that the IDA has loaned roughly $21.3 million dollars, on average, to Bolivia.
ANALYSIS #4: What are the names of the top 3 most expensive projects that the IDA has funded for Bolivia?
QUERY
OUTPUT
The MAX function delivers the project with the highest value in the ‘Original Principal Amount’ column. In coordination with MAX, WHERE tells the data to only look at projects that pertain to Bolivia.?
?Since the values in the ‘Original Principal Amount’ column are various characters (varchar) and include commas, they must be transitioned into a numerical value sans commas using the CAST and REPLACE function. Without doing this step, the SUM of the values will be unsuccessful.?
The CAST function converts the varchar to a double data type, which is a data type that stores larger and more precise values.?
The REPLACE function strips each value of its commas.
The WHERE function now sums up all the double data types that belong to Bolivia.?
The GROUP BY function pulls the ‘Project Name’ column into the output to identify the names of the projects.?
The ORDER BY function signals the data to list the values in the 'Original Principal Amount' column in descending order (DESC) from highest to lowest while including the project name.
The output shows that the top three most costly projects for Bolivia:
ANALYSIS #5: Which project had the most money canceled and what was the amount?
QUERY
OUTPUT
Take the query used for Analysis #4 and use it as a template for this query.
The output shows that BO Access and Renewable Energy was the project with the highest amount canceled. There is no information on IDA’s website mentioning the reason for the canceled amount.
ANALYSIS #6: How many projects has Bolivia paid back?
QUERY
OUTPUT
The query above uses a similar format to Analysis #2’s query. The difference here is the addition of the AND clause to include two filters in the data – rows pertaining to Bolivia and a fully repaid credit status.?
The query shows that Bolivia has paid back 70 projects.
ANALYSIS #7: How much of the money has Bolivia paid back?
QUERY
OUTPUT
The query above is a familiar one from a previous question that uses the SUM function to total up the amount of USD in the “Repaid to IDA (US$)’ column.?
The output shows that the IDA has paid back more than $229 billion dollars to the IDA.?
KEY LEARNINGS:
FEEDBACK
As a data analyst in the social analytics space, I found that having a strong background in Excel knowledge helped me learn, understand and successfully write all the SQL statements needed for this analysis. I’m currently on a journey involving continuous learning and improvement of my data analysis skills, so thank you for making it this far and tuning in! I welcome constructive feedback, so don't hesitate to reach out with any insights or suggestions :)
SOURCES
Director at Huron Consulting Group
1 年I love bolivia!
Freelance Social Media Manager
1 年I do not understand it but I LOVE IT, GO LEO
Insights Analyst @ Quid | Market Intelligence & Research | AI in Public Relations | Communication M.S.
1 年Very insightful! Thanks for sharing
Fraud Prevention Analyst @ M&G PLC | Data Analyst | Data Scientist | Python | SQL | Machine Learning | Data Analytics | Excel | Tableau | Power BI | R
1 年Good job Leo ??????