Available Datasets for Study – Big data, data science and research

Summary: This text details numerous publicly accessible datasets. Several websites are highlighted, including "Our World in Data" focusing on global issues, "Get Open Data" offering UK-centric data, and "Open Data Northern Ireland" providing government and public sector information. Specific datasets are listed, covering areas like health, economics, and crime. Additional resources are also included, such as links to repositories containing New York City taxi data and articles compiling various big data sources. The overall aim is to provide a comprehensive overview of readily available data for research and analysis.


· Our World in Data

· Get Open Data

· Data.gov.ie

· Open Data Northern Ireland

· Dept for Health NI Covid 19 Statistics (Microsoft Power BI)

· Yellow Cab Taxi Big Data: NYC Yellow Taxi Trip Data on Kaggle

· 20 Free Big Data Sources Everyone Should Know:

This article lists some of the best free big data sources available for various uses.

· 70 Open Data Sources for Big Data:

This article compiles 70 free data sources across various categories such as government, health, finance, and more.

· Big Data Sources - LibGuides at Washington State University:

This guide links to multi-disciplinary datasets, some of which are openly available, while others may require affiliation or contact with the data-collecting organization.

· Top 10 Open Data Resources Online:

This article provides a collection of open data resources that can be accessed online for research or analysis.

· Big Data: 33 Brilliant And Free Data Sources:

This Forbes article lists 33 free data sources that are useful for big data projects.

· What is a Data Source? Definitions and Examples:

This resource explains what a data source is and provides examples of different types of data sources.

· Sources of Big Data: Where does it come from?:

This blog post explores diverse sources of big data, including social media, IoT, and more.

· Data Sources and URLs - Oracle Documentation:

This documentation discusses connecting applications to databases using JDBC data sources and URLs.

· Social Media Data Sources - George Mason University:

This guide discusses how to acquire and use social media data and tools.

· 106 Free Data Sources For Any Project in 2025:

This article lists 106 trustworthy, free data sources for various types of projects.


In this section I go into more detail about some of the data sets referenced above, using AI summarising technology to summarise the available data.

Our World in Data https://ourworldindata.org/

The website "Our World in Data" provides a comprehensive collection of datasets and research on various global issues. Here is a summary of some of the key datasets available on the site:


1. CO₂ Emissions: Data on carbon dioxide emissions, a primary contributor to climate change, including historical trends and comparisons between countries.


2. Economic Inequality: Information on income and wealth disparities across different regions and populations.


3. Human Rights: Data related to human rights violations and progress in human rights around the world.


4. Poverty: Statistics on poverty rates, including the share of the population living in extreme poverty, defined as living on less than $2.15 per day.


5. Energy: Data on energy production and consumption, including renewable energy sources and fossil fuel usage.


6. Life Expectancy: Historical and current data on life expectancy at birth, showing improvements over time and regional disparities.


7. Causes of Death: Information on the leading causes of death globally and how they have changed over time.


8. Population Growth: Data on population trends, including growth rates and demographic changes.


9. COVID-19: Comprehensive data on the COVID-19 pandemic, including cases, deaths, and vaccination rates.


10. Agriculture: Data on agricultural land use, including the proportion of land used for livestock and crop production.


11. Maternal Mortality: Statistics on maternal deaths and how they vary across different countries and regions.


12. Education: Data on literacy rates and access to education, highlighting the importance of foundational skills like reading and writing.


13. Electricity Access: Information on the percentage of the population with access to basic electricity, with a focus on regions lacking this essential service.


14. Foreign Aid: Data on foreign aid contributions as a share of national income, highlighting trends over time and differences between donor countries.


15. Health: Data on various health indicators, including child mortality rates, undernourishment, and the prevalence of diseases.


These datasets are presented through interactive visualizations and are freely available for use, aiming to inform and empower those working towards solving global challenges.
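
As a rough illustration of how a figure such as the extreme poverty share in item 4 can be derived, here is a minimal Python sketch. It assumes a plain list of daily incomes per person; the sample values are invented for illustration and are not taken from Our World in Data, and the function name is hypothetical.

```python
# Threshold quoted in the poverty item above: dollars per person per day.
EXTREME_POVERTY_LINE = 2.15


def extreme_poverty_share(daily_incomes):
    """Return the percentage of people living on less than the poverty line."""
    below = sum(1 for income in daily_incomes if income < EXTREME_POVERTY_LINE)
    return 100 * below / len(daily_incomes)


# Invented sample incomes for illustration only; not real survey data.
sample = [1.50, 2.00, 2.15, 3.40, 10.00, 0.90, 5.25, 2.10]
share = extreme_poverty_share(sample)
print(f"Share in extreme poverty: {share:.1f}%")
```

Note that an income of exactly $2.15 is not counted here, because the definition quoted above is "less than $2.15 per day".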


GetOpenData https://GetOpenData.ai

The website "GetOpenData" provides access to a variety of open datasets across several categories, including health, policing, criminal justice, public opinion, education, and housing. Here is a summary of the datasets currently available and those planned for future inclusion:


### Currently Available Datasets:


1. A&E Attendances and Waits - NHS Statistics:

   - Data on the number of attendances at Accident and Emergency (A&E) departments and the number of patients waiting more than 4 hours.


2. Hospital Waiting List Sizes - NHS Statistics:

   - Information on waiting list sizes and average waiting times for each hospital trust, updated monthly.


3. Sentencing Outcomes England and Wales - Ministry of Justice:

   - Monthly data on sentencing outcomes issued at criminal courts, including the number of offenders sentenced, types of sentences, custody rates, and sentence lengths.


4. Stop and Search - police.data.uk:

   - Records of all stop and search incidents by police force and local geography.


5. UK Employment, Unemployment, and Economic Inactivity by Age Group - ONS Labour Force Survey:

   - Employment, unemployment, and economic inactivity levels and rates by age group, provided as rolling three-monthly figures, seasonally adjusted.


6. UK General Election Results 2024 - BBC News:

   - Results from every constituency in the 2024 UK General Election.
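

To show the kind of calculation the A&E dataset in item 1 supports, the following minimal Python sketch works out the percentage of attendances that waited more than 4 hours for each trust. The record layout, field names, and figures are all assumptions made for illustration; they are not the actual NHS Statistics format or real values.

```python
# Invented figures for illustration only; not real NHS statistics.
# The field names are hypothetical, not the published column names.
records = [
    {"trust": "Trust A", "attendances": 1000, "over_4_hours": 250},
    {"trust": "Trust B", "attendances": 800, "over_4_hours": 100},
]


def share_over_four_hours(rows):
    """Map each trust to the percentage of attendances waiting over 4 hours."""
    return {
        row["trust"]: 100 * row["over_4_hours"] / row["attendances"]
        for row in rows
    }


shares = share_over_four_hours(records)
for trust, percent in shares.items():
    print(f"{trust}: {percent:.1f}% waited more than 4 hours")
```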


### Upcoming Datasets:


- Annual GDP by Region (England and Wales) - ONS

- Commissioner Waiting List Sizes - NHS Statistics

- Court Effectiveness Data - MoJ Quarterly Court Statistics

- Number of UK Businesses by Local Area - ONS

- Personal Well-being in the UK - ONS

- Personal Well-being in the UK by Local Authority - ONS

- Polling for the UK General Election - UK Polling Report

- Quarterly GDP by Region (England and Wales) - ONS

- Real-Time Economic Activity - ONS

- Recorded Crime in England, Wales, and Northern Ireland - police.data.uk

- Retail Sales Data (various breakdowns) - ONS

- Sexual Orientation in the UK (by age, sex, and region) - ONS

- Subnational Population Projections for England - ONS

- Suicides in England and Wales - ONS

- UK Annual Incomes - ONS

- UK Spending Using Credit and Debit Cards - ONS

- UK Trade by Country and Commodity - ONS

- Weekly Deaths (by age, sex, health board, local authority, and region) - ONS


These datasets are designed to be accessible and actionable, providing insights through a simple chat interface without requiring coding or data skills. The platform is currently in beta and welcomes feedback for improvement and suggestions for additional datasets.


Open Data Northern Ireland


Transparency and innovation through the publication of government and public sector datasets.

Here are the extracted lists of dataset themes and publishing organisations:



· Economy, industry & employment

· Education

· Environment & agriculture

· Finance

· Health

· Population & society

· Property & land

· Tourism, leisure, culture & arts

· Transport



1. AccessNI

2. Agri-Food and Biosciences Institute - Fisheries and Aquatic Ecosystems Branch

3. Antrim and Newtownabbey Borough Council

4. Ards and North Down Borough Council

5. Armagh City, Banbridge and Craigavon Borough Council

6. Belfast City Council

7. Belfast Community Planning Partnership

8. Business Services Organisation

9. Causeway Coast and Glens Borough Council

10. Charity Commission for Northern Ireland

11. Commissioner for Older People for Northern Ireland

12. Department for Communities

13. Department for Communities - Debt Management

14. Department for Communities - Historic Environment Division

15. Department for Communities - Professional Services Unit

16. Department for Infrastructure

17. Department for Infrastructure - Parking Enforcement Unit

18. Department for Infrastructure - Planning

19. Department for Infrastructure - Rivers

20. Department for Infrastructure - Roads

21. Department for Infrastructure - Transport Regulation Unit

22. Department for Infrastructure - Walking and Cycling Branch

23. Department for the Economy

24. Department for the Economy - Minerals and Petroleum Branch

25. Department for the Economy - Statistics & Research Branch (Tertiary Education)


Dept for Health NI Covid 19 Statistics

Microsoft Power BI



Yellow Cab Taxi Big Data

Here are some relevant URLs related to the "Yellow Cab Taxi Big Data":

  1. GitHub - NYC-Yellow-Taxi-Big-Data-Analytics: This repository contains an analysis of New York City yellow taxi data using Hadoop MapReduce.
  2. 2019 Yellow Taxi Trip Data - JSON File: This dataset includes trip records for yellow taxis in 2019, provided in JSON format.
  3. 2022 Yellow Taxi Trip Data - CSV File: This dataset includes trip records for yellow taxis in 2022, provided in CSV format.
  4. NYC Yellow Taxi Trip Data on Kaggle: This Kaggle dataset is useful for practicing machine learning skills on time-series data related to NYC yellow taxi trips.
  5. GitHub - BigData-Ops-on-TLC-Yellow-Taxi: This repository contains an analysis of New York City's yellow taxi data using various big data tools such as Hadoop, HBase, Sqoop, MapReduce, and AWS Cloud Infrastructure.
  6. Big Yellow Taxi Data - White & Black: This article discusses the accidental disclosure of personal data about New York City cab rides and the challenges of anonymizing big data.
  7. TLC Trip Record Data: This page provides information about the trip record data for yellow and green taxis, including fields capturing pick-up and drop-off details, trip distances, fares, and more.
  8. NYC Taxi and Limousine Yellow Dataset - Azure Open Datasets: This dataset on Azure Open Datasets includes yellow taxi trip records with various details such as pick-up and drop-off times, locations, trip distances, fares, and passenger counts.
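
For readers who want to try the CSV trip records, here is a minimal Python sketch of a first pass over the fields mentioned above (pick-up and drop-off times, trip distance, fare). The column names and the three sample rows are assumptions made for illustration; check the header of the file you actually download, since the published names may differ.

```python
import csv
import io
from datetime import datetime

# Invented sample rows for illustration only; not real trip records.
# Column names are hypothetical stand-ins for the published ones.
SAMPLE_CSV = """pickup_datetime,dropoff_datetime,trip_distance,fare_amount
2022-01-01 00:10:00,2022-01-01 00:25:00,3.0,12.0
2022-01-01 01:00:00,2022-01-01 01:30:00,6.0,21.0
2022-01-01 02:00:00,2022-01-01 02:05:00,0.0,3.0
"""

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def summarize_trips(csv_text):
    """Return trip count, mean duration in minutes, and mean fare per mile."""
    durations = []
    fares_per_mile = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        pickup = datetime.strptime(row["pickup_datetime"], TIME_FORMAT)
        dropoff = datetime.strptime(row["dropoff_datetime"], TIME_FORMAT)
        durations.append((dropoff - pickup).total_seconds() / 60)
        distance = float(row["trip_distance"])
        if distance > 0:  # skip zero-distance trips to avoid dividing by zero
            fares_per_mile.append(float(row["fare_amount"]) / distance)
    return {
        "trips": len(durations),
        "mean_duration_min": sum(durations) / len(durations),
        "mean_fare_per_mile": sum(fares_per_mile) / len(fares_per_mile),
    }


summary = summarize_trips(SAMPLE_CSV)
print(summary)
```

The full yearly files are large, so reading them row by row as above (or in chunks with a data-frame library) is likely to be more practical than loading everything into memory at once.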


Ten Big Data URLs

Here are ten URLs that provide access to big data sources:

  1. 20 Free Big Data Sources Everyone Should Know: This article lists some of the best free big data sources available for various uses.
  2. 70 Open Data Sources for Big Data: This article compiles 70 free data sources across various categories such as government, health, finance, and more.
  3. Big Data Sources - LibGuides at Washington State University: This guide links to multi-disciplinary datasets, some of which are openly available, while others may require affiliation or contact with the data-collecting organization.
  4. Top 10 Open Data Resources Online: This article provides a collection of open data resources that can be accessed online for research or analysis.
  5. Big Data: 33 Brilliant And Free Data Sources: This Forbes article lists 33 free data sources that are useful for big data projects.
  6. What is a Data Source? Definitions and Examples: This resource explains what a data source is and provides examples of different types of data sources.
  7. Sources of Big Data: Where does it come from?: This blog post explores diverse sources of big data, including social media, IoT, and more.
  8. Data Sources and URLs - Oracle Documentation: This documentation discusses connecting applications to databases using JDBC data sources and URLs.
  9. Social Media Data Sources - George Mason University: This guide discusses how to acquire and use social media data and tools.
  10. 106 Free Data Sources For Any Project in 2025: This article lists 106 trustworthy, free data sources for various types of projects.




