The Query is the Data

A big revelation hit me the day a law enforcement investigator explained the yellow stickies plastered around his desktop computer screen.

He said, “Every week, or at least once a month, I search for the people on these stickies.”

These subjects of interest included wanted criminals, missing kids and so on.

With this process, he would periodically search system A for the name, date of birth and other identifying attributes on sticky number one. Then he would rekey the same search information into systems B, then C and so on. Once he searched all the systems, he moved on to sticky number two, then three and so on.

I thought: You gotta be kidding me! This organization could’ve at least implemented stored queries.

NOTE: With stored queries, the information on the stickies is entered into a list, like a spreadsheet. Each row contains the name and identifiers from one sticky, so instead of 42 stickies there is a 42-row list (aka stored queries). Then every so often (daily, weekly, etc.) the stored queries are automatically run against systems A, B, C, etc. This approach is not great, but it is still much faster than searching each system by hand!
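To make the stored-query idea concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the `StoredQuery` fields, the `run_stored_queries` helper, the sample people, and the shape of systems A, B and C (plain in-memory lists) are hypothetical, and matching is a naive exact comparison on name and date of birth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredQuery:
    """One row of the list: what used to be written on a single sticky."""
    name: str
    date_of_birth: str


def run_stored_queries(queries, systems):
    """Run every stored query against every system and collect the hits."""
    hits = []
    for query in queries:
        for system_name, records in systems.items():
            for record in records:
                same_name = record["name"].lower() == query.name.lower()
                same_dob = record["date_of_birth"] == query.date_of_birth
                if same_name and same_dob:
                    hits.append((query.name, system_name, record))
    return hits


# Hypothetical sample data: two stickies turned into rows, three systems.
stored_queries = [
    StoredQuery("Pat Doe", "1990-01-01"),
    StoredQuery("Sam Roe", "2010-05-05"),
]
systems = {
    "A": [{"name": "pat doe", "date_of_birth": "1990-01-01"}],
    "B": [],
    "C": [{"name": "Sam Roe", "date_of_birth": "2010-05-05"}],
}

# A scheduler (daily, weekly, etc.) would call this instead of a person rekeying.
hits = run_stored_queries(stored_queries, systems)
for query_name, system_name, _record in hits:
    print(f"{query_name}: found in system {system_name}")
```

Note that every run re-scans every system for every row, and nothing happens between runs, which is part of why this approach is "not great."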

But no. This investigator’s system didn’t have stored queries. So I just stood there noodling on his stickies:

This sticky needing to find system data.

This sticky info needing to find system data info.

This info needing to find that info.

It’s all info!

Where the heck in the course of systems design did we start thinking about queries and data so differently? It’s all data.

Idea: Why not store the queries in the same place we store the data? Like this:

The benefits are pretty clear:

· Real-time notification: If the data shows up after the query, the data finds the query. For example, the investigator would receive an instant notification that reads, “New information has arrived related to your missing kid!” No more need for stickies.

· Queries find queries: That’s right, if two people ask similar questions, they find each other, even if there was no “data.”

· No performance consequences: The file could contain 100% data, 100% queries or any ratio in between. In any case, the performance is the same.
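The three benefits above can be sketched with a single store that holds queries and data as the same kind of record. This is a toy illustration, not the design of any real entity resolution system: the `Record` and `SharedStore` names, the owners, and the sample person are all hypothetical, and the exact-match key on name and date of birth stands in for the far fuzzier matching a real system would need.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    kind: str            # "query" or "data"; the store treats both the same
    owner: str           # who asked, or which system supplied the data
    name: str
    date_of_birth: str


class SharedStore:
    """One place for queries and data, indexed by a simple match key."""

    def __init__(self):
        self._by_key = defaultdict(list)

    @staticmethod
    def _key(record):
        # Assumption: exact match on normalized name plus date of birth.
        return (record.name.strip().lower(), record.date_of_birth)

    def add(self, record):
        """Store the record and return the earlier records it matches."""
        bucket = self._by_key[self._key(record)]
        matches = list(bucket)
        bucket.append(record)
        return matches


store = SharedStore()
q1 = Record("query", "investigator_1", "Sam Roe", "2010-05-05")
q2 = Record("query", "investigator_2", "sam roe", "2010-05-05")
d1 = Record("data", "system_C", "Sam Roe", "2010-05-05")

first = store.add(q1)    # nothing to find yet
second = store.add(q2)   # a query finds an earlier query
third = store.add(d1)    # late-arriving data finds both queries

# Owners of matching queries are the people to notify right away.
to_notify = [match.owner for match in third if match.kind == "query"]
print(to_notify)
```

Because `add` does one keyed lookup no matter whether the bucket holds queries, data, or both, the cost of an insert in this sketch does not depend on the ratio between the two.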

Pure magic. Hence why we have been building entity resolution systems that treat data and queries as equals … for decades.

Armed with this little thought for the day, my blog post ‘Data Finds Data’ may mean a bit more to you than it did before.

 

Jeff, this is elegantly simple. Approach reminds me of some of the organizational thinking left behind by Charles Sanders Peirce.

Manny San Pedro

Criminal Intelligence Analyst

6 years ago

Hit the nail, Jeff. Letting the machine do the heavy lifting is the secret nobody is attempting to uncover (at least it seems that way). “Data finds data and the relevance finds you.” Thanks for sharing, Jeff. #TwentyTwo48 #Number3

Keith Keller

Ph.D., M.Eng., PMP, CISSP | IT Program Manager at NASA

6 years ago

A common problem and a logical solution, thanks for sharing!

David E.

Principal at Legacy Software, Ltd.

6 years ago

Try getting data people to wrap their head around the analogy that systems are the machine tools that produce the data they deal with.

John Janek

Chief Technologist | ex-diplomat | complex problem-solver | systems thinker

6 years ago

All data is observation. I’ve got an upcoming post similar to this about process data. We need to stop thinking about these things as different and get back to basics.
