AI-Readiness to Address the Missed Promise of Open Data

On May 9, 2013, the White House published the Executive Order "Making Open and Machine Readable the New Default for Government Information", which promised that "making information resources easy to find, accessible, and usable can fuel entrepreneurship, innovation, and scientific discovery that improves Americans' lives and contributes significantly to job creation."

Eleven years later, I would argue we are still a long way from the advertised promise of open data, speaking as both an open data producer and an open data consumer.

I am sure some open data pundits may disagree with me while others may have lots of theories about what went wrong.

If I had to zero in on the root cause, I would blame the original phrasing:

"To promote continued job growth, Government efficiency, and the social good that can be gained from opening Government data to the public, the default state of new and modernized Government information resources shall be open and machine readable."

Instead of "machine readable", "machine understandable" should have been used.

You can download CSV files, but you need humans to extract meaning from them or to stitch them together into something coherent. Machines can read them; machines cannot understand them. Even worse, with the advent of large language models (LLMs), machines often misunderstand them.
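
To make the distinction concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the CSV excerpt, the column names, the values, and the COLUMN_METADATA sidecar. It is not an actual government dataset or an established metadata standard, just one way to picture the gap between "readable" and "understandable".

```python
import csv
import io

# Hypothetical CSV excerpt; column names and values are placeholders.
RAW_CSV = """geo,yr,val
01001,2020,12345
01003,2020,67890
"""

# Hypothetical sidecar metadata: what each column means, its type, its unit.
COLUMN_METADATA = {
    "geo": {"label": "County FIPS code", "type": str, "unit": None},
    "yr": {"label": "Census year", "type": int, "unit": "year"},
    "val": {"label": "Resident population", "type": int, "unit": "persons"},
}


def read_rows(text):
    """Machine readable: parsing succeeds, but every value is an opaque string."""
    return list(csv.DictReader(io.StringIO(text)))


def annotate(rows, metadata):
    """Attach meaning: cast each value and pair it with its label and unit."""
    annotated = []
    for row in rows:
        annotated.append({
            metadata[name]["label"]: {
                "value": metadata[name]["type"](value),
                "unit": metadata[name]["unit"],
            }
            for name, value in row.items()
        })
    return annotated


rows = read_rows(RAW_CSV)
annotated = annotate(rows, COLUMN_METADATA)

print(rows[0])       # keys are "geo", "yr", "val"; all values are strings
print(annotated[0])  # keys are descriptive labels; values are typed and carry units
```

Without the sidecar, nothing tells a program (or an LLM) that "val" is a population count rather than a dollar amount, or that "geo" is an identifier whose leading zero must be preserved rather than a number to be summed.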

The good news is that this issue is being tackled as we speak: last week, the US Department of Commerce released a "Request for Information: AI-Ready Open Government Data Assets".

"AI-ready" is a better moniker than "machine understandable" as it does not distinguish between humans and machines.

My hope is to soon see statements like:

"the default state of new and modernized Government information resources shall be open and AI-ready"

And I think that ontologies and knowledge graphs have a key role to play in making data AI-ready. The other good news is that the National Science Foundation is looking into it (NSF 23-571).


PS: I mentioned two initiatives from the US government. That is only because I do not know what is happening in other regions. I am sure other countries are doing the same, and some might even be further along. Comments are welcome.



Arnaud Sahuguet

invent, architect, build and ship products that leverage technology to solve meaningful problems and have a large social impact. Currently working on GenAI applied to financial services (hedge fund).

7 months ago

From the Department of Commerce presentation


Intriguing insights on the intersection of AI and open data policy—looking forward to seeing how these initiatives continue to evolve and impact data accessibility.
