Data Annotators: The Unsung Heroes Of Artificial Intelligence Development
Bertalan Meskó, MD, PhD
Director of The Medical Futurist Institute (Keynote Speaker, Researcher, Author & Futurist)
The wonders of AI in healthcare are undeniable, revolutionising diagnostics and prevention with astonishing advancements. From detecting diabetic retinopathy to identifying skin cancer and predicting cardiovascular risks, AI's achievements are reshaping patient care.
But underpinning these headline-grabbing breakthroughs is an invisible army whose meticulous work powers the algorithms that save lives. They're the unsung heroes of the AI healthcare revolution, often underpaid and underappreciated. So, who are these hidden figures? What exactly do they do?
Have you ever wondered how to create a smart algorithm? Where and how do you get the data for it? What makes a pattern-recognizing program work well, and what are the challenges? Nowadays, everyone seems to be building artificial intelligence-based software, in healthcare as much as anywhere else. Still, hardly anyone talks about one of the most important aspects of the work: data annotation, and the people who carry out this time-consuming, rather monotonous task without any of the flair that usually surrounds AI.
Without their dedicated work, it is impossible to develop algorithms, so we all need to know and talk about the superheroes of algorithm development: data annotators.
How to make algorithms dream of cats?
The method for creating and teaching an algorithm depends on the question it aims to solve. Let’s say you want the algorithm to spot lung tumors in chest X-rays. For that, you will need tools for pattern recognition – the question doesn’t differ much from spotting cats on Instagram.
At first, it sounds easy. Until you start thinking about how to explain to a computer what a cat is. Our usual human clues - fur, ears, eyes, whiskers, four legs, cuteness, and grace - mean nothing to an algorithm that only sees pixels.
“You will need millions of cat photos, appropriately labeled as having a cat. That way, a neural network or a multilayered deep neural network can be trained using supervised learning to recognize pictures with cats in them”, David Albert, M.D., Co-Founder and President of AliveCor, the company that has been developing a medical-grade, pocket-sized device to measure an EKG anywhere in less than 30 seconds, explained to The Medical Futurist.
So, you won’t tell the algorithm what a cat is; rather, you show it millions of examples and let it figure it out by itself. That’s why data and data annotation are critical for building smart algorithms.
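For readers curious what "showing millions of labeled examples" looks like in practice, here is a minimal sketch of supervised image classification in Python with PyTorch. The folder layout, file names and training settings are illustrative assumptions, not details from AliveCor's work or any real dataset.

```python
# A minimal supervised-learning sketch: train a small convolutional network on
# labeled photos. Assumes a hypothetical folder "photos/" with one sub-directory
# per label, e.g. photos/cat and photos/no_cat -- that is where the annotations live.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

dataset = datasets.ImageFolder("photos", transform=transform)  # labels come from folder names
loader = DataLoader(dataset, batch_size=32, shuffle=True)

model = models.resnet18(weights=None)            # an untrained convolutional network
model.fc = nn.Linear(model.fc.in_features, 2)    # two classes: cat / no cat

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):
    for images, labels in loader:                # the human-made labels drive the learning
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```

The model never receives a definition of "cat"; it only ever sees pixel values paired with the labels annotators attached to them, which is exactly why the quality of those labels matters so much.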
What is data annotation?
Annotating data is time-consuming and tedious, without any of the flair promised by sci-fi visions of artificial intelligence with thinking, talking computers or robots. In healthcare, building algorithms mostly means working with existing databases of imaging files, CT or MR scans, pathology samples and the like. Data annotation, in turn, means drawing lines around tumors, pinpointing cells or labeling ECG rhythm strips. Thousands, tens of thousands of them. No magic, no self-aware computers.
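To make "annotation" concrete, this is roughly what a single annotated data point might look like once stored: a label plus the hand-drawn outline. The field names and values below are invented for illustration and do not follow any particular annotation tool or standard.

```python
# A hypothetical annotation record: one hand-drawn outline around a suspected
# lesion on one image, made by one annotator.
tumor_annotation = {
    "image": "chest_ct_0042.dcm",        # the scan being annotated (made-up file name)
    "annotator": "radiologist_07",       # who drew the outline
    "label": "lung_tumor",               # what the outline is claimed to contain
    # The outline itself, as (x, y) pixel coordinates of a polygon.
    "polygon": [(120, 88), (131, 84), (140, 95), (133, 108), (121, 102)],
}
```

Multiply this by tens of thousands of images, each needing careful clicks from a trained expert, and the scale of the work becomes clear.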
That’s what Dr. Albert has been doing. “You need accurately labeled and annotated data to develop these deep neural diagnostic solutions. But it's an awful lot of work.
For example, I may annotate or diagnose ten thousand ECGs over several weeks, then another expert goes through the same ten thousand, and then we see where we disagree. After that, a third person, the adjudicator, comes in and says: okay, for these five hundred where you disagree, this is what I think the answer is.
So it takes at least three people and weeks of work to give you a reasonably confident answer. For deep neural networks to perform correctly and take advantage of big data, they require a tremendous amount of annotation work”.
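The workflow Dr. Albert describes, two independent annotators plus an adjudicator for the disagreements, can be sketched in a few lines of Python. The ECG identifiers and labels below are made up; only the shape of the process is taken from his account.

```python
# A simplified sketch of the two-annotators-plus-adjudicator workflow.
def merge_annotations(annotator_a, annotator_b, adjudicate):
    """annotator_a / annotator_b: dicts mapping ECG id -> label.
    adjudicate: callable that resolves a single disagreement."""
    final_labels = {}
    disagreements = []
    for ecg_id, label_a in annotator_a.items():
        label_b = annotator_b[ecg_id]
        if label_a == label_b:
            final_labels[ecg_id] = label_a      # consensus: keep the shared label
        else:
            disagreements.append(ecg_id)        # send to the third expert
    for ecg_id in disagreements:
        final_labels[ecg_id] = adjudicate(ecg_id)
    return final_labels, disagreements

# Toy example with invented labels:
a = {"ecg_001": "afib", "ecg_002": "normal"}
b = {"ecg_001": "afib", "ecg_002": "afib"}
labels, reviewed = merge_annotations(a, b, adjudicate=lambda ecg_id: "normal")
```

Even in this toy form it is obvious why the process is expensive: every data point is handled by at least two experts, and the contested ones by three.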
Counting cells and drawing precise lines around tumors
Katharina von Loga is a consultant pathologist at The Royal Marsden NHS Foundation Trust. A while ago, she explained how she uses software-based image analysis to monitor how immune cells within cancerous tumors change during therapy. The computer helps her count the cells after she carefully designates the set of cells she’s looking for.
“I have an image of a stain in front of me, where I can click on a specific cell and annotate that it's a tumor cell. Then I click on another cell and say that’s a subtype of an immune cell. It needs a minimum number of each of the types I specified; only after that can I apply it to the whole image. Then I look at the output to see if I agree with the ones that I didn't annotate but the computer classified. It’s a process you can repeat indefinitely,” she explained.
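Her annotate-classify-review loop is, in spirit, a small supervised learning cycle. Below is a rough Python sketch using scikit-learn; the per-cell feature vectors, class names and the choice of a random forest are stand-ins for whatever the actual image-analysis software does internally.

```python
# Sketch of the annotate -> classify -> review loop: a few hand-labeled cells
# train a classifier, which then proposes labels for every other cell on the slide.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# A handful of manually annotated cells (invented feature vectors + labels).
annotated_features = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.7], [0.2, 0.9]])
annotated_labels = ["tumor_cell", "tumor_cell", "immune_cell", "immune_cell"]

# All remaining, unannotated cells detected on the slide.
unannotated_features = np.array([[0.85, 0.15], [0.15, 0.8], [0.5, 0.5]])

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(annotated_features, annotated_labels)

# The software proposes a class for every cell the pathologist did not annotate;
# she reviews the proposals, corrects mistakes, and the corrections feed the next round.
proposed = clf.predict(unannotated_features)
```

Each pass through the loop adds corrected examples, which is why the process "can be repeated indefinitely": the classifier keeps improving as the expert keeps reviewing.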
The hardships of data annotation
Although it sounds perfect in theory that you can train an algorithm to support medical work in pathology, the practice is much more complicated. As medical data archives were (obviously) not created with mathematical algorithms in mind, it’s gargantuan work to standardize existing sampling processes or to gather enough “algorithm-ready” samples.
It matters how the sample was processed from the moment the specimen is taken from the patient until it’s under the microscope. The staining method, the age of the sample, the department where the sample was produced – these are all factors to consider when selecting samples for successful algorithmic training.
Beyond the massive variability in the samples, there is another issue: the lack of experts for data annotation, as well as the difficulty of finding databases at scale. Usually, the precision of an algorithm depends on the size of the dataset – the bigger, the better. However, hospitals or medical centers, even very well-resourced ones, don’t have enough data or enough annotations. It takes companies like Google, Amazon, or Tencent, with vast financial resources and a global footprint, to reach the kind of scale needed to develop accurate AI.
What is more, the human resources problem is getting worse. There are only 30-35,000 cardiologists in the United States, all very busy. They don't have time to mark up ECGs. Similarly, there are only about 50,000 radiologists, and they don't have time to read more chest X-rays. So we have to do something.
From medical students through online annotators to AI building its own AI
Experts often mention the option of employing medical or pre-med students for simpler annotation tasks – at least to ease the human resources problem. David Albert has toyed with the idea of building online courses to train prospective annotators, who would then receive financial incentives for annotating millions of data points. Medical facilities could essentially crowdsource data annotation through platforms such as Amazon Mechanical Turk, harnessing the “wisdom of the crowds”.
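In practice, the "wisdom of the crowds" usually boils down to collecting several independent labels per item and taking a majority vote, with weak majorities escalated to an expert. Here is a minimal sketch; the votes and item names are invented for illustration.

```python
# Aggregate crowd-sourced labels by majority vote.
from collections import Counter

def majority_label(votes):
    """Return the most common label and how many annotators agreed on it."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count

crowd_votes = {
    "xray_001": ["tumor", "tumor", "no_tumor"],
    "xray_002": ["no_tumor", "no_tumor", "no_tumor"],
}
consensus = {item: majority_label(v) for item, v in crowd_votes.items()}
# Items with only a narrow majority could be routed to a medical expert for review.
```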
Another option would be to employ algorithms for the annotation tasks as well – basically building AI to teach other smart software. We've seen deep learning-based tools that can do a fully automatic first pass of annotation, so the user only has to correct the cases where the automatic process did not work well.
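Such model-assisted annotation can be captured in a tiny loop: a previously trained model proposes a label and a human only accepts or corrects it. The `pretrained_model` and `human_review` callables below are placeholders for illustration, not any specific tool's API.

```python
# Sketch of model-assisted annotation: automatic proposal, human correction.
def pre_annotate(samples, pretrained_model, human_review):
    final = {}
    for sample_id, sample in samples.items():
        suggestion = pretrained_model(sample)                    # automatic first-pass label
        final[sample_id] = human_review(sample_id, suggestion)   # expert accepts or corrects
    return final

# Toy usage with stand-in callables:
samples = {"slide_1": "pixel data...", "slide_2": "pixel data..."}
labels = pre_annotate(
    samples,
    pretrained_model=lambda s: "tumor",            # the model's guess (placeholder)
    human_review=lambda sid, guess: guess,         # here the expert simply accepts
)
```

The expert's time then goes into reviewing edge cases rather than labeling every item from scratch, which is where the real savings come from.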
Katharina von Loga also mentioned that international and national committees are working on the standardization of the various sampling processes, which could really ease annotation work and accelerate the building of algorithms. All of this could lead to bigger and better datasets, more streamlined data annotation and more efficient AI in every medical subfield.
What will the future bring for smart algorithms in healthcare?
We'll see the widespread appearance of smart algorithms in the next five to ten years, and much more sophisticated artificial intelligence for healthcare. It will augment doctors and allow them to return to being physicians, not just documenters. We all know how much administrative tasks add to physician burnout, so such AI solutions are much needed.
Artificial intelligence will not replace physicians; combining its work with that of medical professionals should be the direction to take for the future. However, we also see that doctors who don’t use algorithms might get replaced by the ones who do.
While there will be (and should be) countless debates about the ways of cooperation between artificial intelligence and physicians, one thing is certainly clear. We will never have smart algorithms in healthcare without data annotators.
That’s why we felt the need to talk about and appreciate the experts who right now might be sitting in dark hospital rooms in front of computers, annotating radiology or ophthalmology images so that someone else, somewhere else, can create a potentially lifesaving medical application from them. Without the data annotation heroes, we'll never have artificial intelligence in healthcare.
Kudos to all data annotators out there!