World Bank SQL Data Analysis

World Bank SQL Data Analysis

In the world of data analysis you may get datasets that have a few hundred or maybe even a few thousand rows of data. In that case using analytical tools like Excel and Tableau are perfect to analyse datasets of that size.? Now on the flipside there are datasets that can have millions if not billions of rows of data (most large companies will have datasets of this size).? If you want to analyse a dataset of that size then that’s when SQL comes into its own.? In this project I will be analysing a dataset from The World Bank (more information below).

During this project I will be answering many questions including the following:

  • How many columns and rows does the dataset have?
  • How many countries/regions/organisations are represented in the dataset?
  • Which project received the highest individual loan?
  • Which countries/regions/organisations borrowed the most?
  • Which countries/regions/organisations borrowed the least?
  • From the Country that borrowed the most, how many projects borrowed/requested funds?
  • From the Country that borrowed the most, which project(s) borrowed the most?
  • From the Country that borrowed the most, what is the average amount borrowed by each project?
  • From the Country that borrowed the most, how many projects still have a balance to pay off?
  • From the Country that borrowed the most, what is the total amount still to be paid back to the IDA by all projects combined?
  • From the Country that borrowed the most, how much has been paid back to the IDA by all projects combined?

To analyse the data I will be using Bit.io which is a cloud based PostgreSQL service which is free to use online.? I will be writing several SQL queries using various SQL statements/syntax including the following:

  • LIMIT
  • ALIAS (AS)
  • WHERE
  • COUNT
  • GROUP BY
  • MIN / MAX / SUM / AVG
  • ORDER BY
  • AND / OR / NOT

DATA

As mentioned above the dataset is from The World Bank and shows information specifically related to the International Development Association (IDA) Statement Of Credits and Grants.? To give you more context The International Development Association (IDA) credits are public and publicly guaranteed debt extended by the World Bank Group. IDA provides development credits, grants and guarantees to its recipient member countries to help meet their development needs. Credits from IDA are at concessional rates. The data is in U.S. dollars calculated using historical rates. This dataset contains historical snapshots of the IDA Statement of Credits and Grants including the latest available snapshot. The dataset was updated on the 16th of December 2022 and can be found here.

ANALYSIS

The first thing I wanted to do was gain an overall outlook of the dataset, I wanted to know how many rows and columns the dataset has.? To do this I wrote the below queries to gather this information.?The first statement brings back the number of rows which is 1,119,992.? The second statement brings back the number of columns which is 30.? I can now see there are well over a million rows of data and that is where the power of SQL will come in.

No alt text provided for this image
No alt text provided for this image

I want to find out some more information before doing a deep dive into the data to answer specific questions.? I want to know how many countries/regions/organisations are represented in the dataset, the below query shows that there are 137 countries/regions/organisations represented.

No alt text provided for this image

I am curious to know that of all the projects that the IDA helped to fund, which project received the highest individual principal amount.? The below query shows the 10 projects that received the highest “Original Principal Amount” with the “BD: Padma Bridge” project from Bangladesh due to receive the highest at $1,200,000,000.00.

No alt text provided for this image

The above query shows the highest individual principal amount to a single project but that doesn’t mean that this is the only amount that the “BD: Padma Bridge” project was due to receive.? Therefore that means that another project could have received the most funding from the IDA via multiple loans or it could indeed be the “BD: Padma Bridge” project.? The below query shows which project was due to receive the highest principal amount all together.

No alt text provided for this image

I can see from the above query that the “BD: Padma Bridge” project was indeed due to receive the highest principal amount with a total of $168,000,000,000.00 but how many loans was the project due to receive to reach this figure?? The below query shows how many loans the “BD: Padma Bridge” project was due to receive in total.? I can see from the below query that there are a total of 140 separate “Original Principal Amounts” of $1,200,000,000.00 which gives the project a grand total of $168,000,000,000.00.

No alt text provided for this image

Having analysed the data above the numbers don’t quite sit right with me so I wanted to do some more digging.?I decided to write the below query to see how many of the 140 “Original Principal Amounts” were actually paid out to the “BD: Padma Bridge” project and to my surprise there were none.

No alt text provided for this image

This now shows that the “BD: Padma Bridge” did not receive the highest loan payment of any project.? This is further backed up by the information on the official project website here which shows that the project was due to receive $1,200,000,000.00 from the World Bank however they opted to use a different fund.? I will need to write another query to find out which project actually received the highest loan payment against their “Original Principal Amount”.? The below query now shows that the “Reengagement and Reform DPF” project in Sudan had the highest Individual “Original Principal Amount” of $1,160,000,000.00 of which $1,096,893,255.03 or 94.56% has actually been paid out to the project.

No alt text provided for this image

Seeing that I have looked at projects which borrowed the most I now want to look more at the countries. I will be doing a deep dive into a specific country and see what their finances/commitments are but before I pick which one, I want to know a little more about the countries within the dataset. I want to know which countries borrowed the most and which borrowed the least from the IDA.? The first query below shows the 10 countries which have borrowed the most from the IDA.? I can see from these ten countries that India has borrowed the most from the IDA with a total amount of $5,594,167,646,305.81 (which is 81.29%) of the “Original Principal Amount”. Seeing that the “Original Principal Amount” is different this indicates that India is either waiting for more funds to be paid out or there will be no more funds paid out to match the “Original Principal Amount”.? This is the same for all the other countries in the top 10 with the exception of China who has actually received $36,281,664,696.21 more than their “Original Principal Amount”.

No alt text provided for this image

The next query shows the 10 countries/regions/organisations that have borrowed the least to the IDA? I can see that the Middle East and North Africa region has borrowed the least from the IDA with $600,000.00 compared to their “Original Principal Amount” of $7,700,000.00.? However I can also see that the Country with the least amount of borrowing is St. Kitts and Nevis who received a total of $213,519,448.80 compared to their “Original Principal Amount” of $210,000,000.00.? This shows me that St. Kitts and Nevis actually received $3,519,448.80 over and above their “Original Principal Amount” but was still way below what the next country received which was Swaziland with $475,344,113.37.

No alt text provided for this image

I can see there is a huge difference between the lower end of the borrowing scale to the extreme high end ($5,593,954,126,857.01 to be exact between India and St. Kitts and Nevis).? Seeing that India does indeed have the highest borrowing I decided to choose this country to answer some more questions.

The first question I have is how many projects in India has received/requested borrowing from the IDA?? The below query shows that India has/had 454 projects that received/requested borrowing.?

No alt text provided for this image

Next I wanted to know which project(s) in India received the most funding.? The below query shows the top 10 projects in India that received the most funding/borrowing.? I can see the project with the most funding/borrowing is the “IN: Elementary Education (SSA II)” project with a total amount of funding/borrowing at $181,533,930,820.82 compared to their “Original Principal Amount” of $189,000,000,000.00

No alt text provided for this image

Now that I know which project has borrowed the most from the IDA, I now want to know what is the average amount borrowed by the 454 projects in India?? The below query shows that the average amount loaned to each project in India is $12,459,170,704.47

No alt text provided for this image

I have now got a good picture of the initial borrowing that India has received however I am now curious to know how much has been paid back and how much is still owed? The below query shows me how many projects still have an outstanding balance and how much that balance is.? I can see that of the 454 projects within India 448 are still paying off a balance and only 6 projects have no balance to pay.

No alt text provided for this image

I now wanted to look at India as a whole and find out how much is still due to be paid back from their initial borrowing.? I wrote the below query which provides me that figure.? I can now see that the total amount due to the IDA from all projects in India is $3,366,686,252,663.57

No alt text provided for this image

I wrote a further query to see how much has been paid back to the IDA from all projects located in India.? I can see below that the total amount paid back so far is $2,379,385,222,422.60.

No alt text provided for this image

MY INSIGHTS

  • There are 1,119,992 rows and 30 columns in the dataset.
  • There are 137 countries/regions/organisations represented in the dataset.
  • The “Reengagement and Reform DPF” project in Sudan had the highest Individual “Original Principal Amount” of $1,160,000,000.00 of which $1,096,893,255.03 or 94.56% has actually been paid out.
  • India received the most funding with $5,594,167,646,305.81 paid out in loans so far.
  • The Middle East and North Africa region borrowed the least from the IDA with $600,000.00.?However the Country with the least amount of borrowing is St. Kitts and Nevis who received a total of $213,519,448.80.
  • There are 454 projects in India that has received/requested funding from the IDA.
  • From the projects located in India the “IN: Elementary Education (SSA II)” project borrowed the most from the IDA with a total of $181,533,930,820.82.
  • The average amount borrowed per project in India is $12,459,170,704.47.
  • Of the 454 projects within India 448 are still paying off a balance, only 6 projects have no balance to pay.
  • The total amount still due to be paid back to the IDA from all projects in India is $3,366,686,252,663.57.
  • The total amount paid back to the IDA by all the projects in India is $2,379,385,222,422.60.

Using the power of SQL I was able to analyse this large dataset fairly quickly using some basic statements and was able to answer quite a few questions.?A lot more insights could be gained from this dataset but I could end up going down a data spiral.

Thank you for taking the time in reading my SQL Data Analysis project.?If you enjoyed reading my project and would like to connect on LinkedIn the please check out my profile here.?You will also find more data analysis projects on my featured section on LinkedIn.


Cynthia Clifford

Strategic Energy Management Data Analyst at CLEAResult -- Creative Problem Solver | Data-Driven Insights | Client-Centric Solutions Specialist

2 年

You've done a fabulous job on this article, Stuart Walker. You took a deep dive. I loved seeing the process of how you explored the Padma Bridge. I loved the way you "followed your nose."

Avery Smith

?? I help people land their first data job (even with no experience) ?? Join 10k+ other analysts & get my newsletter! ??? Host of The Data Career Podcast

2 年

I’m dizzy looking at this thing.

Karen Waggoner

Data Analyst | ETL | Excel | SQL | Tableau

2 年

So many questions, so many answers... Love it!

Zeina Eze

Data Analyst | Excel | Tableau | SQL | Educator

2 年

Very detailed, nice work Stuart Walker !

Sarah Rajani

'Data with Sarah' ? Data Analyst at Government of AB (Ministry of Justice) ? Sharing practical data tips, insights, and lessons learned

2 年

Great analysis, Stuart! ??

要查看或添加评论,请登录

Stuart Walker的更多文章

  • Fraud Detection using XGBoost: A Machine Learning Approach

    Fraud Detection using XGBoost: A Machine Learning Approach

    The core of the project lies in handling the significant class imbalance typical in fraud detection datasets and…

    10 条评论
  • R HR Attrition Data Analysis

    R HR Attrition Data Analysis

    Human Resources (HR) attrition, also known as employee turnover, is an enduring concern for organisations across…

    18 条评论
  • Python Engineering/Iron Ore Processing Data Analysis

    Python Engineering/Iron Ore Processing Data Analysis

    Iron ore processing is a crucial aspect of the metallurgical industry, primarily concerned with the extraction and…

    21 条评论
  • NBA 2021/2022 Player Stats Data Analysis

    NBA 2021/2022 Player Stats Data Analysis

    The role of analytics in sports is crucial, as it offers valuable insights that empower teams to make informed…

    8 条评论
  • Healthcare SQL Data Analysis

    Healthcare SQL Data Analysis

    In my previous SQL project I looked at financial data from the World Bank. I was able to analyse the data quickly using…

    6 条评论
  • Massachusetts School Data Analysis

    Massachusetts School Data Analysis

    Imagine being hired to analyse a wealth of education data for a specific state in USA, well in this Tableau project…

    8 条评论
  • DoorDash Data Analysis

    DoorDash Data Analysis

    In the world we now live in we have a multitude of options to stay at home and get hot delicious food and groceries…

    30 条评论

社区洞察

其他会员也浏览了