The Secret Life of Data Labelers
The business of supplying labeled data for building AI systems is a global industry. But the people who do the labeling face challenges that impinge on the quality of both their work and their lives.
What’s new: The Verge interviewed more than two dozen data annotators, revealing a difficult, precarious gig economy. Workers often find themselves jaded by low pay, uncertain schedules, escalating complexity, and deep secrecy about what they’re doing and why.
How it works: Companies that provide labeling services, including Centaur Labs, Surge AI, and Remotasks (a division of data supplier Scale AI), use automated systems to manage gig workers worldwide. Workers undergo qualification exams, training, and performance monitoring to perform tasks like drawing bounding boxes, classifying the sentiment of social media posts, evaluating video clips for sexual content, sorting credit-card transactions, rating chatbot responses, and uploading selfies of various facial expressions.
What they’re saying: “AI doesn’t replace work. But it does change how work is organized,” said Erik Duhaime, CEO of Centaur Labs.
Behind the news: Stanford computer scientist Fei-Fei Li pioneered crowdsourced data annotation. In 2007, she led a team at Princeton to scale the number of images used to train an image recognizer from tens of thousands to millions. To get the work done, the team hired thousands of workers via Amazon’s Mechanical Turk platform. The result was ImageNet, a key computer vision dataset.
Why it matters: Developing high-performance AI systems depends on accurately annotated data. Yet the harsh economics of annotating at scale encourages service providers to automate the work and workers to either cut corners or drop out. Notwithstanding recent improvements (for instance, Google raised its base wage for contractors who evaluate search results and ads to $15 per hour), everyone would benefit from treating data annotation less like gig work and more like a profession.