NIR605: Critical Data Studies “A critical analysis of the Open Data movement”
Paddy Gorry
CRT Foundations in Data Science PhD Student | Intel Postgraduate Scholar 21/22
Open data is data that is open for use by anyone and everyone, with no restrictions on access or use. A core duty of the open data movement is to further knowledge of open data by educating stakeholders about the vast potential that open data holds for society at large. In theory, open data provides a means of efficient and transparent governance, with the potential for greater accountability where open government data is implemented. Open data can lead to greater public participation and social innovation. There is also huge economic potential in open data, as valuable datasets can be made widely available, promoting economic innovation and job creation. The aim of this essay is to provide a critical analysis of the open data movement by evaluating some of its positive and negative characteristics. The main aspects of the open data movement that will be assessed are government accountability, access to open data, and the funding of open data.
The open data movement has been growing steadily over the last number of decades. A pivotal moment in the open data movement was a conference hosted by Tim O’Reilly in Sebastopol, California in 2007. O’Reilly, along with 29 other open government advocates, met with the goal of developing a list of core principles for open government. What followed from this conference became a core foundation of the open data movement going forward. The eight core principles outlined there state that data should be complete, primary, timely, accessible, machine processable, non-discriminatory, non-proprietary, and license-free; these are now ubiquitous terms in the open data community. The goal of the group in developing these principles was to allow world governments to “become more effective, transparent, and relevant to our lives” (O'Reilly et al., 2007). In the years following this conference, many world governments began to take steps toward open government. In 2009 the US launched its own public data portal, ‘Data.gov’, as a means of providing public access to many high-value datasets produced by the government (US General Services Administration, n.d.). More recently the Irish Department of Public Expenditure and Reform published its Open Data Strategy for 2017-2022, with the core aim of becoming a world leader with regard to open data and to “create an environment where the economic, social, and democratic benefits of open data are recognised and realised” (Dept. of Public Expenditure and Reform, 2019). Open data has been adopted by many world governments who understand the potential benefits that open data and open government can bring. The potential of open data has been well documented but, at times, misunderstood. It is clear that open data can benefit many industries as well as public bodies, but at present there are several issues that have hindered the growth of the open data movement.
The open data movement is strongly rooted in the context of open government. The eight principles outlined by the Open Government Working Group in 2007 served as the core foundation for many open government initiatives going forward. One of the most commonly stated theoretical benefits of open government is the increased level of transparency and accountability, as well as efficiency, provided by open data. In their 2017 findings submitted to members of the G20, the OECD reported the strong potential for open data to “reinforce anti-corruption efforts by strengthening transparency, increasing trust in governments, and improving public sector integrity and accountability” (OECD, 2017). An example of this comes from Argentina. Between 2007 and 2015, the quality of national statistics in Argentina was in decline. In a nation with an unfortunate history of corruption, political pressure led the government to release unreliable data which misrepresented the true state of the Argentinian economy and society. This was done to project a more positive image of the country. A consequence of this misrepresentation is that data sources before mid-2016 are unreliable (OECD, 2017). As of 2019 some positive steps have been made to improve the state of data released by the government. The administration began to prioritise the release of good quality data to the public. Records of tax and income information for public servants were released as a means of regaining the public's trust in government data. The OECD report on Argentina’s open data outlined an issue that is commonplace in many countries hoping to adopt more open data practices: a large portion of stakeholders lack the motivation to move towards providing high quality open government data. It has been found that certain incentives, financial and meritorious, may be effective motivators in the short term, but could be detrimental overall in the long term (OECD, 2018). This example shows the importance of proper education and instruction with regard to open data principles.
The example of Argentina is indicative of a common trend in many open government and open data initiatives. Open data principles are often only partially enforced, or are implemented poorly, resulting in many of the potential benefits being lost. This is true of the reduction of corruption, which open data is often cited as helping to achieve. Brazil is a country with a long history of corruption scandals at all levels of government. Transparency International ranked Brazil as number 76 on their ‘Corruption Perceptions Index’ (Transparency International, 2015), indicating that the level of perceived corruption in Brazil is still very high. Despite this, Brazil has also been ranked in the top 20 countries globally for open data, taking into account policy, practice, and implementation (Iglesias, 2017). With regard to fighting corruption and improving transparency and accountability, however, some issues remain. Much of Brazil's open data strategy and policy is not centralised and is carried out across multiple departments (Iglesias, 2017). In addition, while there has been a great deal of work in developing open data infrastructure, much of the data is of poor quality, in many cases containing too much or unnecessary detail, and thus “details of illicit behaviour may be drowned out” (Hulstijn, et al., 2017). Although Transparency International's findings also rank Brazil as a top 20 country in terms of its implementation of open data, they make clear that the implementation of open government data still needs much work. Open data in Brazil often lacks information on licensing, is often inconsistent in terms of quality, and at present is limited to only a few sections of the public sector. Public finance data is up to date, whereas there is a lack of datasets relating to lobbying, land and company registers, among others (2017). While the open data movement has the potential to greatly reduce government corruption by improving transparency and accountability through robust open data policy, a great deal of work and research is still needed in order to do so effectively.
One of the most fundamental aspects of open data is access. Data cannot be open unless it is accessible to “the widest range of users for the widest range of purposes” (O'Reilly et al., 2007). Open access to data sources is crucial, and as such the development and maintenance of functioning data infrastructures is important for open data to succeed. A physical infrastructure, such as a network of roads, must follow strict and consistent practices in order to provide an effective and efficient means of travel. In the same way, an effective data infrastructure must provide clear and unrestricted access to high quality datasets. A successful data infrastructure must consist of clean, well maintained datasets that follow the core principles of open data and can be effectively used by any user that accesses them. These datasets require servers to host them and staff to monitor and maintain them, and to ensure that both the servers and the data themselves adhere to any necessary guidelines or standards set for them. An example of an effective data infrastructure is OpenStreetMap (OSM). OSM is a global geospatial infrastructure, operated much like Wikipedia, allowing ordinary users and commercial organisations alike to contribute data. This operating model has allowed OSM to grow at an immense rate, with the service having over seven million contributions to date (Open Street Map, 2021). This openness, however, can result in inconsistent or poor-quality data being entered into the system. In their assessment of French OSM data, Girres and Touya (2010) found that while OSM has a great deal of responsiveness and flexibility, the lack of standardisation with regard to metadata and attribute data can result in poor data quality. This can be seen in certain areas by querying OSM data and viewing the attribute values. The city of Tokyo is full of convenience stores known as “Konbini”, and geospatial data indicating the locations, names, etc. of these stores can be easily acquired from OSM. A quick glimpse at the data shows some of the inconsistencies, and often unnecessary data, that can be attributed to geospatial features. In the case of Konbini, the respective brand or store name is often spread among six or more attributes, rather than being held in a single name attribute in either English or Japanese. Some of the Konbini will include upwards of 130 attributes, with others having fewer than half of these. This lack of standardisation can result in a great deal more processing work being required to make proper use of OSM data. This relates to another key principle of open data, that data should be machine processable, i.e. “data is reasonably structured to allow automated processing” (O'Reilly et al., 2007). Though OSM data is readily available in a variety of non-proprietary formats (e.g. .geojson, .shp) which can be processed by many different software packages, this lack of standards for attribute data can essentially negate the benefits of using these non-proprietary formats.
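The extra processing work described above can be illustrated with a minimal Python sketch. It assumes each store has already been exported as a dictionary of attribute keys and values; the sample records, their values, and the priority order of name keys are invented for illustration and are not taken from real OSM data. The sketch picks one usable name per store from whichever naming attribute happens to be filled in, and counts how many stores carry each attribute.

```python
from collections import Counter

# Hypothetical sample records: each dict stands in for the attributes of one
# convenience-store feature. Keys follow common OSM tagging; values are invented.
SAMPLE_FEATURES = [
    {"shop": "convenience", "name": "セブン-イレブン", "name:en": "7-Eleven", "brand": "7-Eleven"},
    {"shop": "convenience", "brand:en": "FamilyMart", "brand:ja": "ファミリーマート"},
    {"shop": "convenience", "operator": "Lawson"},
    {"shop": "convenience"},
]

# Assumed priority order in which to look for a store name (an illustrative choice).
NAME_KEYS = ["name:en", "brand:en", "name", "brand", "name:ja", "brand:ja", "operator"]


def canonical_name(tags, keys=NAME_KEYS):
    """Return the first non-empty value found among the candidate keys, or None."""
    for key in keys:
        value = (tags.get(key) or "").strip()
        if value:
            return value
    return None


def key_coverage(features):
    """Count how many features carry each attribute key."""
    counts = Counter()
    for tags in features:
        counts.update(tags.keys())
    return counts


names = [canonical_name(tags) for tags in SAMPLE_FEATURES]
coverage = key_coverage(SAMPLE_FEATURES)

print(names)
print(coverage.most_common())
```

A coverage count of this kind makes the unevenness visible at a glance: keys present on every store are reliable, while a long tail of rarely used keys signals the missing standardisation that each data user then has to work around.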
One of the most significant factors relating to the open data movement is funding. The overall goal of the open data movement is to provide the end-user with free access to open data resources with little or no restrictions; however, the production and curation of this data can, in many cases, be very costly. Using the example of OpenStreetMap, a great deal of the infrastructure and content of the site has been contributed by willing users, while the maintenance of the site itself is overseen by the OpenStreetMap Foundation and funded by donations and membership fees (OpenStreetMap Foundation, 2020). OSM is a clear example of a successful open data infrastructure, and it is an established name with powerful stakeholders that have both a vested interest in sustaining the service and the means to do so. Major users of OSM data include Facebook, Microsoft, and Amazon (OpenStreet Map Foundation, 2020). The ubiquity of OSM in GIS also means that its potential base for donations or membership fees is much larger. This unfortunately cannot be said for all open data initiatives. A service like OSM has been in operation for 16 years and in that time has been built up with millions of contributions, both in geospatial data and in money. For a smaller company or public body, the development and maintenance of an open data infrastructure requires a large up-front investment. Further costs over the long term stem from maintaining and growing the infrastructure. Some of the most crucial elements include developing open data portals, establishing appropriate storage capacity, developing APIs, passing or amending policy to allow the data to be opened correctly, and establishing impact studies to measure the success of the infrastructure (Open Data Institute, 2014). Some sample figures are outlined in Donker’s (2018) assessment of open data funding.
Donker divides the costs into three main categories: adaptation costs, infrastructural costs, and maintenance and operation costs. Adaptation to an open data initiative can cost between €20,000 and €100,000 depending on the organisation and the data involved. Adaptation involves developing an “open data strategy” and processing the data to ensure it is of good quality and is in keeping with the core principles of open data (machine readable, non-proprietary format, etc.). Infrastructural costs can often be the largest barrier to entry for smaller companies or governments. An example given by Donker is the Danish Address Registry, which required a €5,000,00 one-off investment for the appropriate IT infrastructure to allow the register to function as required. The operational and maintenance costs are estimated at €10,000 to €200,000 per annum. It is important to note that a follow-up study estimated that the direct financial benefit from this system totalled €62 million (McMurren, et al., 2016). The cost of developing an open data initiative can be large, but as with the Danish Address Registry there is potential for great returns. For certain bodies, however, switching to an open data model can have severe consequences, particularly for those that previously operated on a fee or membership-based system. A total of 69% of the trading revenue of the UK's Ordnance Survey came directly from licensing and membership fees (Ordnance Survey, 2017). Ordnance Survey stated that while a switch to an open data model could be greatly beneficial to the body, the loss of license fees could “lead to a significant loss of commercial revenue and consequent pressure on Ordnance Survey costs and service levels”. In the long term, creating and funding open data initiatives and infrastructures would be very beneficial for many industries and government bodies.
The current issue, however, is that in many cases the short-term cost is very high, discouraging or preventing companies or bodies without proper funding from adopting open data initiatives properly. In the case of private companies, it is often unfeasible to develop open data initiatives without the aid of additional government funding.
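To give a rough sense of how these cost categories add up over time, the following Python sketch combines the adaptation and annual operating ranges quoted from Donker into a low and high estimate. The one-off infrastructure cost is left as an input because it varies so widely between organisations; the €1,000,000 figure and the five-year horizon used in the example call are purely hypothetical and do not come from any of the cited sources.

```python
# Ranges quoted in the essay from Donker (2018), in euro.
ADAPTATION_RANGE = (20_000, 100_000)        # one-off adaptation cost
OPERATION_RANGE_PER_YEAR = (10_000, 200_000)  # annual operation and maintenance


def total_cost_range(years, infrastructure_cost):
    """Return (low, high) total cost over `years`, given a one-off infrastructure cost."""
    low = ADAPTATION_RANGE[0] + infrastructure_cost + OPERATION_RANGE_PER_YEAR[0] * years
    high = ADAPTATION_RANGE[1] + infrastructure_cost + OPERATION_RANGE_PER_YEAR[1] * years
    return low, high


# Hypothetical scenario: a body with a 1,000,000 euro infrastructure build,
# costed over five years. Both inputs are assumptions for illustration only.
low, high = total_cost_range(years=5, infrastructure_cost=1_000_000)
print(f"Estimated total over 5 years: €{low:,} to €{high:,}")
```

Even in this simplified form, the spread between the low and high estimates shows why the up-front commitment is hard for smaller bodies to plan for without external funding.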
The open data movement stems from an idealistic notion that sees the potential of data sources being freely accessible to anyone and everyone for any purpose they see fit. Open data has the potential to greatly benefit society by supporting better governance, increasing public involvement, and providing greater transparency, accountability and efficiency in the public sector. In theory this certainly appears to be the way forward; in practice, however, this may not be the case for some time. The open data movement as it stands is fraught with issues: a lack of understanding of what open data is, a lack of proper adherence to the core principles of open data, a lack of funding, and in some cases situations where opening data seems to cause more issues than it solves. For the open data movement to progress to a more effective state, a great deal of research is needed into the best methods of illustrating the benefits of open data to stakeholders who are as yet unconvinced. More work again is needed to ensure that the basic principles of open data are implemented comprehensively, and the development and enforcement of open data standards will be needed to this end. A heavier emphasis on the potential of open data in global governments, and a push to increase funding for open data initiatives, would open the door for more public bodies, companies, and countries to open their data and push the movement forward.
References
Dept. of Public Expenditure and Reform, 2019. gov.ie - Open Data in Ireland. [Online] Available at: https://www.gov.ie/en/policy-information/8587b0-open-data/ [Accessed 14 April 2021].
Donker, F. W., 2018. Funding Open Data. In: Open Data Exposed. The Hague: T.M.C. Asser Press, pp. 55-78.
Girres, J.-F. & Touya, G., 2010. Quality assessment of the French OpenStreetMap dataset. Transactions in GIS, 14(4), pp. 435-459.
Hulstijn, J., Darusalam, D. & Janssen, M., 2017. Open data for accountability in the fight against corruption. Computational Accountability and Responsibility in Multiagent Systems, pp. 52-66.
Iglesias, D., 2017. Open Data and the Fight against corruption in Brazil. [Online] Available at: https://images.transparencycdn.org/images/2017_OpenDataBrazil_EN.pdf [Accessed 20 April 2021].
McMurren, J., Verhulst, S. & Young, A., 2016. Denmark's Open Address Data Set. [Online] Available at: https://odimpact.org/files/case-study-denmark.pdf [Accessed 27 April 2021].
OECD, 2017. Argentina: Multi-dimensional Economic Survey. [Online] Available at: https://doi.org/10.1787/eco_surveys-arg-2017-en [Accessed 20 April 2021].
OECD, 2017. G20/OECD Compendium of good practices on the use of open data for Anti-corruption. [Online] Available at: https://www.oecd.org/corruption/g20-oecd-compendium-open-data-anti-corruption.htm [Accessed 20 April 2021].
OECD, 2018. Open Government Data Report: Enhancing Policy Maturity for Sustainable Impact, Paris: OECD Publishing.
Open Data Institute, 2014. How to plan and budget an open data initiative - The ODI. [Online] Available at: https://theodi.org/article/how-to-plan-and-budget-an-open-data-initiative/ [Accessed 22 April 2021].
Open Street Map, 2021. Contribute Map Data. [Online] Available at: https://wiki.openstreetmap.org/wiki/Contribute_map_data [Accessed 20 April 2021].
OpenStreet Map Foundation, 2020. Who Uses OpenStreetMap?. [Online] Available at: https://welcome.openstreetmap.org/about-osm-community/consumers/ [Accessed 22 April 2021].
OpenStreetMap Foundation, 2020. Finances - OpenStreetMap Foundation. [Online] Available at: https://wiki.osmfoundation.org/wiki/Finances [Accessed 22 April 2021].
Ordnance Survey, 2017. Ordnance Survey Limited Annual Report & Accounts 2016-17. [Online] Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/636813/ordnance-survey-annual-report-2016-2017-web.pdf [Accessed 22 April 2021].
O'Reilly, T. et al., 2007. 8 Principles of Open Government Data. [Online] Available at: https://public.resource.org/8_principles.html
Transparency International, 2015. Corruption Perceptions Index. [Online] Available at: https://www.transparency.org/en/cpi/2015/index/ [Accessed 20 April 2021].
US General Services Administration, n.d. Homeland Security Digital Library. [Online] Available at: https://www.hsdl.org/?abstract&did=36424 [Accessed 14 April 2021].