
The Query is the Data

Jeff Jonas

Founder & CEO

Published: March 11, 2019

A big revelation hit me the day a law enforcement investigator explained the yellow stickies plastered around his desktop computer screen.

He said, “Every week, or at least once a month, I search for the people on these stickies.”

These subjects of interest included wanted criminals, missing kids and so on.

With this process, he would periodically search system A for the name, date of birth and other identifying attributes on sticky number one. Then he would rekey the same search information into systems B, then C and so on. Once he searched all the systems, he moved on to sticky number two, then three and so on.

I thought: You gotta be kidding me! This organization could’ve at least implemented stored queries.

NOTE: With stored queries, the information on the stickies is entered into a list, like a spreadsheet. Each row contains the name and identifiers from one sticky, so instead of 42 stickies there is a 42-row list (aka stored queries). Then every so often (daily, weekly, etc.) the stored queries are automatically run against systems A, B, C, etc. This approach is not great, but it is still much faster than searching each system by hand!
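The stored-query approach in the note can be sketched in a few lines. This is only an illustration, assuming each system is an in-memory list of records and each sticky is a dict of identifying attributes. The names `matches` and `run_stored_queries` and all sample records are hypothetical, and exact matching stands in for whatever search each real system offers.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def matches(query: Record, record: Record) -> bool:
    """A record matches when it agrees with every attribute in the query."""
    return all(record.get(key) == value for key, value in query.items())


def run_stored_queries(
    stored_queries: List[Record],
    systems: Dict[str, List[Record]],
) -> List[Tuple[int, str, Record]]:
    """Run every stored query against every system.

    Returns (query_index, system_name, record) for each hit.
    """
    hits = []
    for index, query in enumerate(stored_queries):
        for system_name, records in systems.items():
            for record in records:
                if matches(query, record):
                    hits.append((index, system_name, record))
    return hits


# Hypothetical data: one row per sticky.
stored_queries = [
    {"name": "Pat Doe", "dob": "1990-01-01"},
    {"name": "Sam Roe", "dob": "1985-06-15"},
]

# Hypothetical systems A, B and C.
systems = {
    "A": [{"name": "Pat Doe", "dob": "1990-01-01", "city": "Reno"}],
    "B": [{"name": "Lee Poe", "dob": "1970-03-03", "city": "Ames"}],
    "C": [],
}

hits = run_stored_queries(stored_queries, systems)
```

Note that the whole list has to be rerun on a schedule, so a new record is only noticed at the next run. That delay is why the note calls this approach "not great."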

But no. This investigator’s system didn’t have stored queries. So I just stood there noodling on his stickies:

This sticky needing to find system data.

This sticky info needing to find system data info.

This info needing to find that info.

It’s all info!

Where the heck in the course of systems design did we start thinking about queries and data so differently? It’s all data.

Idea: Why not store the queries in the same place we store the data? Like this:

The benefits are pretty clear:

· Real-time notification: If the data shows up after the query, the data finds the query. For example, the investigator would receive an instant notification that reads: “New information has arrived related to your missing kid!” (no more need for stickies).

· Queries find queries: That’s right, if two people ask similar questions, they find each other, even if there is no “data.”

· No performance consequences: The file could contain 100% data, 100% queries or any ratio in between. In any case, the performance is the same.
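To make the three benefits concrete, here is a toy sketch, not the design of any real entity resolution product. It assumes queries and data are the same kind of record kept in one store, and that two records relate when they share a name and date of birth. The `Record` and `UnifiedStore` names, the matching key, and the notification strings are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Record:
    kind: str   # "data" or "query"; both live in the same store
    owner: str  # source system, or the person asking
    name: str
    dob: str


class UnifiedStore:
    """Keeps data and queries side by side, indexed by the same key."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, str], List[Record]] = defaultdict(list)

    def add(self, record: Record) -> List[str]:
        """Store a record and return notices about earlier related records."""
        key = (record.name.lower(), record.dob)  # assumed matching rule
        earlier = list(self._by_key[key])
        self._by_key[key].append(record)

        notices = []
        for other in earlier:
            if other.kind == "query":
                # New data (or a new query) finds a waiting query.
                notices.append(
                    f"{other.owner}: new {record.kind} from "
                    f"{record.owner} matches your query"
                )
            elif record.kind == "query":
                # A new query finds data that was already there.
                notices.append(
                    f"{record.owner}: existing data from "
                    f"{other.owner} matches your query"
                )
        return notices


store = UnifiedStore()
first = store.add(Record("query", "investigator_1", "Pat Doe", "1990-01-01"))
second = store.add(Record("query", "investigator_2", "pat doe", "1990-01-01"))
third = store.add(Record("data", "system_B", "Pat Doe", "1990-01-01"))
```

In this sketch the second query finds the first before any data exists, and the later data record notifies both investigators as it arrives. The lookup is the same dictionary access whether a record is data or a query, which is the toy version of the performance point.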

Pure magic. That is why we have been building entity resolution systems that treat data and queries as equals … for decades.

Armed with this little thought for the day, my blog post ‘Data Finds Data’ may mean a bit more to you than it did before.

William (Bill) Reh

5 years ago

Jeff, this is elegantly simple. Approach reminds me of some of the organizational thinking left behind by Charles Sanders Peirce.


Manny San Pedro

Criminal Intelligence Analyst

6 years ago

Hit the nail, Jeff. Letting the machine do the heavy lifting is the secret nobody is attempting to uncover (at least it seems that way). “Data finds data and the relevance finds you.” Thanks for sharing, Jeff. #TwentyTwo48 #Number3


Keith Keller

Ph.D., M.Eng., PMP, CISSP | IT Program Manager at NASA

6 years ago

A common problem and a logical solution, thanks for sharing!


David E.

Principal at Legacy Software, Ltd.

6 years ago

Try getting data people to wrap their head around the analogy that systems are the machine tools that produce the data they deal with.


John Janek

Chief Technologist | ex-diplomat | complex problem-solver | systems thinker

6 years ago

All data is observation. I’ve got an upcoming post similar to this about process data. We need to stop thinking about these things as different and get back to basics.

