Preparing Open Data for the Age of AI

In the past year, the explosion of generative Artificial Intelligence (GenAI) in the public consciousness has created both excitement and trepidation. GenAI has the potential to change how we work, learn, play, and consume information. In response to this technology – and leaning into the momentum driven by the Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence – the Commerce Data Governance Board launched the AI and Open Government Data Assets Working Group. This working group is tasked with developing guidelines for publishing Commerce data that can be consumed by emerging AI technologies such as GenAI.

The working group is particularly interested in how GenAI can advance the Department of Commerce’s strategic goal to “expand opportunity and discovery through data.” Innovations such as Google’s Data Commons and OpenAI’s ChatGPT have shown that GenAI can empower users to discover and quickly derive insights from public data without specialized expertise or knowledge. By simply engaging in a plain-language conversation with an AI model, users can obtain illuminating statistics, graphs, charts, and maps on a wide range of topics, including demographics, economics, and the climate. We are hopeful that simpler interfaces will allow more equitable access to our data for a wide variety of users.

At the same time, the working group is concerned with the risks that come when AI systems return incorrect or fabricated results to users. For example, when a user asks Google’s Bard or ChatGPT about the demographics of Suitland, Maryland, the responses often draw inaccurate figures from non-government sources or tell the user to visit the Census website without linking to the page where the information can be found.

To provide these models with the data needed to realize the benefits and mitigate the risks of AI, the working group will modernize Commerce’s public data to be AI-ready. AI-ready data means data that is not just machine-readable, but machine-understandable; data that is enriched with contextual metadata and organized in interpretable standard formats. AI models can then better interpret Commerce data, link them to similar data, and return accurate results from authoritative sources.
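To make "machine-understandable" more concrete, here is a minimal sketch of what contextual metadata attached to a published table could look like. It assumes the schema.org Dataset vocabulary serialized as JSON-LD, which is one widely used option and not necessarily the standard the working group will recommend. The function name, dataset title, description, and URL are hypothetical placeholders, not real Commerce assets.

```python
import json


def build_dataset_metadata(name, description, publisher, landing_page, variables):
    """Return a schema.org Dataset description as a JSON-LD dictionary.

    Assumption: schema.org is used here only as an illustrative vocabulary.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "publisher": {"@type": "Organization", "name": publisher},
        "url": landing_page,
        # Each variable carries its own description and unit, so a model
        # does not have to guess what a column header means.
        "variableMeasured": [
            {
                "@type": "PropertyValue",
                "name": v["name"],
                "description": v["description"],
                "unitText": v["unit"],
            }
            for v in variables
        ],
    }


# Hypothetical placeholder values, for illustration only.
record = build_dataset_metadata(
    name="Example population table (placeholder)",
    description="Illustrative record showing contextual metadata fields.",
    publisher="U.S. Department of Commerce",
    landing_page="https://example.org/datasets/placeholder",
    variables=[
        {
            "name": "total_population",
            "description": "Count of residents in the geography",
            "unit": "persons",
        }
    ],
)

json_ld = json.dumps(record, indent=2)
print(json_ld)
```

The point of the sketch is that the publisher, the landing page, and the meaning and unit of each variable travel with the data, which is the kind of context an AI model needs to cite an authoritative source and link to it.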

Adopting AI-ready standards will also improve the search functionality of Commerce data. Once Commerce data is published in AI-ready form, applications like Data Commons, Bard, or ChatGPT can report accurate information when asked about the demographics of places like Suitland, Maryland.

The working group will draft technical guidelines for publishing AI-ready open data and will engage industry, academia, and other partners across the public data ecosystem. The group is chaired by Census’ Chief Scientist, Sallie Ann Keller, and is made up of data management and AI experts from across Commerce’s thirteen bureaus. The working group hopes to publish the guidelines by the end of 2024.

The AI and Open Government Data Assets Working Group and the Department of Commerce are committed to innovating on data dissemination practices to meet the needs of our users and are optimistic about the potential of generative AI to democratize access to data, expanding opportunities for discovery and insights for all Americans.

Oliver Wise, Sallie Ann Keller, and Victoria Houed

© 2024 Open Data News Wire. Use Our Intel. All Rights Reserved. Washington, D.C.