This data set took six years to create. Worth every moment.
Hudson Hollister
Helping the energy industry manage and understand regulatory information.
(Cross-posted: https://www.datacoalition.org/this-data-set-took-six-years-to-create-worth-every-moment/)
Today, for the first time in history, the U.S. federal government's spending information is one single, unified data set.
Under a deadline set by the DATA Act of 2014, today every federal agency must begin reporting spending to the Treasury Department using a common data format. And Treasury has published it all online, in one piece, offering a single electronic view of the world's largest organization.
Until today, different types of federal spending information were all tracked in different ways and reported to different places. Agencies reported their account balances to Treasury, budget actions to the White House, contracts to GSA, and grants to the Awards Data System.
But today, these agencies are reporting all of this information to a new database at Treasury, and Treasury is reporting it to you.
Until today, if you wanted to view the federal government's account balances, you would have to file a Freedom of Information Act request with every agency. Even if you did that, you wouldn't be able to figure out which grants and contracts were paid from which accounts.
But today, every agency is linking its accounts, budget actions, grants, and contracts together, showing which grants and contracts are paid from where. Here's an interactive picture of it all. And here's the data set, ready to download. Try it!
Why does this matter?
In 1802, President Jefferson wrote to his Treasury secretary, Albert Gallatin, that the government's finances had become too complex for Congress to understand - allowing spending and debt to rise out of control.
Jefferson hoped that the scattered "scraps & fragments" of Treasury accounts could be brought into "one consolidated mass," easier to understand, so that Congress and the people could "comprehend them ... investigate abuses, and consequently ... control them."
Jefferson's goal was not fully realized, not until today.
Congress and the White House continued to track spending by appropriation and budget, while federal agencies developed their own complex accounting methods. In 1990, federal agencies began publishing regular financial statements, summarizing all their accounts, but not providing detail. In 2006, then-Senator Barack Obama and Senator Tom Coburn passed a law to publish, online, a summary of every federal grant and contract.
Even after the reforms of 1990 and 2006, these records of accounts, budgets, grants, and contracts all remained segregated from one another, and could not be connected into "one consolidated mass" - not until today.
Today's data set brings all that information together in one piece, and links it. We can see how budget actions, account balances, and grant and contract awards all relate to each other.
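For readers who think in code, here is a minimal sketch of what "linking" means. It assumes each award record carries the identifier of the account that pays for it. Every record and field name below (`account_id`, `award_id`, `amount`, and so on) is hypothetical, invented for illustration; these are not the data set's real element names.

```python
# Hypothetical sample records. The field names are invented for this
# illustration and are NOT the real element names in the federal data set.
accounts = [
    {"account_id": "A-001", "agency": "Agency X", "balance": 5_000_000},
    {"account_id": "A-002", "agency": "Agency X", "balance": 2_000_000},
]
awards = [
    {"award_id": "G-100", "type": "grant", "account_id": "A-001", "amount": 750_000},
    {"award_id": "C-200", "type": "contract", "account_id": "A-001", "amount": 1_250_000},
    {"award_id": "G-101", "type": "grant", "account_id": "A-002", "amount": 400_000},
]


def awards_by_account(accounts, awards):
    """Group awards under the account that funds them and total the amounts."""
    linked = {
        acct["account_id"]: {"account": acct, "awards": [], "awarded_total": 0}
        for acct in accounts
    }
    for award in awards:
        entry = linked.get(award["account_id"])
        if entry is None:
            # The award points at an account we don't know about; skip it.
            continue
        entry["awards"].append(award)
        entry["awarded_total"] += award["amount"]
    return linked


linked = awards_by_account(accounts, awards)
for account_id, entry in linked.items():
    award_ids = [award["award_id"] for award in entry["awards"]]
    print(account_id, entry["awarded_total"], award_ids)
```

The whole trick is the shared key. Before standardization, no common identifier connected an award to the account behind it, so this kind of join couldn't be written at all.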
Starting today, we can finally run data analytics across the whole government, all agencies, to illuminate waste and fraud. (In Washington, federal leaders got a first taste of this at the first-ever DATA Act hackathon, two weeks ago.)
Starting today, we can track the economic impact of Congress' spending decisions, because we can finally match laws Congress passes to the grants and contracts that are awarded under those laws.
Starting today, the federal government can operate as one enterprise, the way private-sector companies do, because its dozens of agencies' thousands of financial systems are all speaking the same language.
Last month, former Microsoft CEO Steve Ballmer announced that he had invested $10 million and years of effort into USAFacts.org, a new attempt to create one picture of government spending. Ballmer's team had to combine - manually - budget information from the White House, financial statements from the Federal Reserve, and state and local sources. USAFacts.org didn't even try to integrate grant and contract details; there was no way to link them.
If Ballmer's team had just waited a month, they would have found much of their work - at least the federal part - already done, in the new data set.
The data set isn't perfect (much more on that later), but it really is "one consolidated mass."
How did this happen?
Six years of legislating, lobbying, courage, coding, and cajoling - that's how.
First came the legislating. In June 2011, Congressman Darrell Issa and Senator Mark Warner introduced the DATA Act. Their goal? "Standardizing the way this information is reported, and then centralizing the way it’s publicly disclosed," said Warner.
Issa and Warner were right: data standards were, and are, the key to transforming the chaos of federal spending into "one consolidated mass." If federal agencies all used the same data format to report their different kinds of spending information, then it could all be brought into one picture.
But the data format didn't exist. Issa and Warner proposed to require the executive branch to create one.
The DATA Act earned early support in the House, where Issa chaired the Oversight Committee, but went nowhere in the Senate. Data standardization was not the first issue on most Senators' minds.
Then came the lobbying. In 2012, I resigned from Rep. Issa's Oversight Committee staff to start what was then called the Data Transparency Coalition, the first, and still only, open data trade association. Our first mission: rally tech companies to support the DATA Act.
Tech companies have plenty of self-interested reasons to support reforms like the DATA Act. As the government starts publishing its information in standardized formats, analytics software gets a lot more valuable.
Still, the Coalition didn't grow very fast. The payoff for our efforts - a unified data set covering all federal spending - was years in the future (today!), and so were most of the business opportunities. Our member companies were signing up to support a long-term vision, which isn't a natural use for marketing budgets.
We hosted our first DATA Act Demo Day, then our second. Sarah Joy Hays came on board and pulled off a spectacular first-ever open data trade show, Data Transparency 2013, with credentials and keynotes and exhibit booths and everything - then four more.
Thanks to Warner's persistence, support from the Sunlight Foundation and civil society, and our new tech-industry push, things began to happen in the Senate. Sen. Rob Portman signed on as a cosponsor and the crucial Homeland Security and Governmental Affairs Committee started to get interested in data standardization.
But courage would be required, especially Warner's.
Behind the scenes, the Obama White House did its best to sink the bill. This was surprising. President Obama was a strong public supporter of open data in government. His Open Data Policy directed all federal agencies to standardize and publish all their information as open data.
But his White House Office of Management and Budget wasn't on board. OMB didn't want the challenge of standardizing all spending information, nor did OMB want anyone else to do the job. OMB recommended changes to the DATA Act that used nice words but would have gutted its mandate.
But Warner stood up to the White House. He rejected the proposed changes and kept the bill strong.
A few months later, both chambers of Congress unanimously passed the DATA Act. And on May 9, 2014, three years ago today, President Obama signed it into law, very quietly.
With the law on the books, a coding countdown began. The Treasury Department had one year to come up with a common data format for government spending information - the chaotic, fractured financial, grant, and contract details spread across thousands of systems that had never before been coordinated.
Treasury also had to figure out how, exactly, agencies would deliver their data using that common format. Nobody had ever before created a system like what was needed.
Most government management laws die like this: Congress passes a law and issues some celebratory press releases. The White House, or GSA, or Treasury sets up committees and procedures to do the work. But the work turns out to be hard and complicated, and nobody in the administration really wants to do it - they're acting because Congress told them to. As soon as Congress' attention moves on to other topics, the bureaucrats write reports pretending the work has been done. Or, better yet, the project is combined with another one, it changes ownership several times, and the law's original goals are gradually forgotten.
The DATA Act avoided this fate - largely because of one person.
At Treasury, Deputy Assistant Secretary Christina Ho had already been trying to standardize spending data. (Christina was the first to find the Jefferson letter I quoted earlier, in fact.)
Once the DATA Act became law, she was put in charge of implementing it, and she made up her mind that this time would be different.
Christina assembled a team that shared her ambition and understood why we needed a unified data set covering all spending. They got to work.
Christina's team created the data format: the DATA Act Information Model Schema, or DAIMS, which defines the common data fields of federal spending and shows how they relate to one another.
They did this work in the open, in public, using the GitHub coding platform to take suggestions from the whole world and show their choices. Nothing like this had been done in government before.
They announced the DAIMS on May 8, 2015, one day before the deadline. That triggered a second countdown: agencies would have to report by today.
And to help agencies deliver their information, Christina recruited the 18F technology development center at the General Services Administration. 18F built the DATA Act Broker, a piece of open-source software that collects and validates spending data from every agency. They built it using Agile methodology, with constant testing and revision.
Here is the code of the DATA Act Broker; download it if you want.
Nothing like this had been done in government before either.
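To give a flavor of what "collects and validates" means, here is a minimal sketch of checking submitted records against a common format. This is not the Broker's actual code. The schema, the field names, and the functions `validate_record` and `validate_submission` are all hypothetical, made up for illustration, and a real validator would check far more than presence and type.

```python
# Hypothetical schema: field names and rules are invented for illustration.
# They are NOT taken from the DAIMS or from the DATA Act Broker.
REQUIRED_FIELDS = {
    "agency_code": str,
    "account_id": str,
    "award_id": str,
    "amount": (int, float),
}


def validate_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems


def validate_submission(records):
    """Split a submission into accepted records and (index, problems) rejects."""
    accepted, rejected = [], []
    for index, record in enumerate(records):
        problems = validate_record(record)
        if problems:
            rejected.append((index, problems))
        else:
            accepted.append(record)
    return accepted, rejected


submission = [
    {"agency_code": "XYZ", "account_id": "A-001", "award_id": "G-100", "amount": 750_000},
    {"agency_code": "XYZ", "account_id": "A-001", "amount": "lots"},
]
accepted, rejected = validate_submission(submission)
print(len(accepted), "accepted;", rejected)
```

The design point is that every agency's file gets checked against the same rules, so a record that passes means the same thing no matter which of the thousands of source systems it came from.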
But coding wasn't enough. The DATA Act's supporters outside the government, and Christina's team inside, had to do a great deal of cajoling.
Even with the DAIMS providing a standard structure for all government spending information, and a DATA Act Broker easing the process, the law didn't really have teeth.
There were no penalties for agencies that didn't report standardized spending data. And OMB made it clear that the Obama administration didn't really care whether they did or didn't.
OMB couldn't, or wouldn't, create a list of the agencies required to comply. OMB tried to claim that most of the DAIMS wasn't really required by the law - in order to shut it down later. OMB insisted on a weaker DAIMS than Treasury wanted, in which financial information comes right from source systems, but grant and contract information doesn't.
With a lack of leadership from the White House, we had to push agencies toward compliance in other ways.
First, a few agencies started to realize that standardizing their spending information would make their own work easier, and so we celebrated them at our events.
The Small Business Administration was the first, and best. Chief Financial Officer Tim Gribben used the DAIMS to visualize which SBA grants were being paid from which of its accounts, and plot them on a map. This would have required a bunch of data calls before the DATA Act. Now, it was automatic.
In 2015, over 600 people participated in our DATA Act Summit and saw demonstrations of what leaders like Tim were doing. Ditto in 2016.
Second, Congressional committees stayed involved, instead of moving on. The House Oversight Committee held four hearings focusing on the DATA Act. Behind the scenes, we stayed in touch with committee staff and Members, delivering intelligence and describing the law's long-term vision.
Every year, we brought tech companies to Capitol Hill to remind Congress why the DATA Act was important.
Members of Congress publicly rebuked OMB for slow-walking the DATA Act, and told the agencies they'd celebrate compliance.
Rep. Mark Meadows even did his own DATA Act software demonstration - on our stage. Members of Congress don't usually do demos.
Third, we worked to spread the word about the DATA Act's benefits to the people who'd have to do the work - especially federal financial management professionals, who'd have to report the data, and inspectors general, who'd have to audit it.
In 2016 we founded the Data Foundation, a new nonprofit research organization. Its first piece of research, The DATA Act: Vision & Value, which we co-published with MorganFranklin Consulting, told federal agencies why the DATA Act mattered.
The cajoling worked. Not every agency is going to make today's deadline, but almost all of them will - and even the worst ones are submitting partial reports.
And we'll keep cajoling until all reports are in.
What comes next?
The data set is live. Now, it sure had better get some use! If the data set is used for antifraud analytics, internal management, and public transparency, especially by the federal agencies themselves, its quality will get better and better.
At next month's fourth annual DATA Act Summit, we'll highlight the agencies, tech companies, and coders who are doing the most amazing things with this new resource. We'll celebrate the winners of last month's DATA Act hackathon too.
We're not out of the woods yet.
Last week, the Data Foundation's new report with Deloitte, DATA Act 2022, described the six main challenges to the DATA Act's success. We need to spend the next five years dealing with those.
What are the challenges? The most serious is that DATA Act reporting is running alongside old-fashioned, non-standardized reporting. Agencies still have to report the same information using documents and non-standardized legacy databases like the FPDS, even as they comply with the new DATA Act mandate.
As long as that happens, there's a danger that agencies will see the legacy databases as the main system, and the DATA Act as an add-on.
Congress needs to kick the stool out from under this duplication, and direct the government to make the DATA Act the main, and eventually the only, way that spending is reported. DATA Act 2022 explains how.
The second-most-serious is that the government continues to use the DUNS Number to identify grantees and contractors. The DUNS Number is owned by Dun & Bradstreet. Dun & Bradstreet has a monopoly, protected and profitable, on spending data. Until that monopoly is broken, the private sector won't be able to take full advantage of the data set.
Passing the DATA Act and getting agencies' spending data took six years. Fully realizing its vision will take many years more.
But every moment has been worth it. Every moment will be worth it. A unified federal spending data set makes our democracy better, in so many ways.
Today, we thank the Data Coalition's members and Data Foundation's supporters, without whom none of our work would have been possible.
And today, we celebrate Darrell Issa, Mark Warner, Christina Ho, Tim Gribben, and all the other leaders who caught Jefferson's dream of a single, unified federal spending data set, and didn't let go.
Sr. Project Manager: Owners Representation of Energy Projects
7 years ago: Good Data requires good data visualization. Putting up a chart that shows the Social Security Insurance payouts in the same manner as Transportation spending is wildly misleading and inaccurate. SS is an annuity insurance payout, not an appropriated government expenditure like the other data shown. Please update the chart so you compare Apples to Apples. Thank you.
Network Planning, Design, Project Management, Technical Operations Support Specialize in A/V/Broadcast/Media
7 years ago: Congratulations! A most welcome, long-overdue effort. I wonder if it might be possible to adapt the model for state, local and municipal use. Thanks for all the hard work. --Fred Huffman
Co-Founder & Co-CEO at Collective Manufacturing Group
7 years ago: Amazing work! Thank you for your efforts and for sharing all the effort it took to get this done.
Wild Card - draw me for a winning hand | Creative Problem Solver in Many Roles | Manual Software QA | Project Management | Business Analysis | Auditing | Accounting |
7 years ago: So, does the DATA Act require disclosing which cronies get enriched? I doubt it.