AI and Its Human Annotators

Part 2 of our miniseries on the role of humans in creating AI.


Pluralism in AI

Unlike most traditional AI, where an annotator might be asked to identify whether something is a stop sign, much of GenAI human feedback depends on subjective perspectives. The task is not to label the bicycle in an image, which has a clear right answer, but, for example, to judge which of two outputs is more engaging. Whether an output is engaging, or more engaging than an alternative, is subjective. What one person finds engaging, another could find annoying, presumptuous, too talkative, or too casual.
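To make the shape of this feedback concrete, a preference-annotation task often reduces to a record like the hypothetical one sketched below. The field names and values are illustrative assumptions, not any particular vendor's actual schema.

```python
# Hypothetical pairwise preference record an annotator might submit.
# All field names and values are illustrative, not a real vendor's schema.
preference_example = {
    "prompt": "Write a short thank-you note to a coworker.",
    "response_a": "Thanks a million for covering my shift -- you're a lifesaver!",
    "response_b": "Thank you for covering my shift. I appreciate it.",
    "annotator_choice": "response_a",  # the output this annotator found more engaging
    "rationale": "Warmer, more personal tone.",
}

# Another annotator could defensibly pick response_b as more professional,
# which is exactly the subjectivity described above.
```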

One approach to addressing the inconsistency that can arise when outsourcing human feedback to workers who may hold vastly different opinions is constitutional AI. This framework includes a supervised stage in which a model critiques and revises its own outputs against a written "constitution," and the model is then fine-tuned on those revisions. A later stage replaces human preference labels with AI-generated ones, a form of reinforcement learning from AI feedback (RLAIF). Anthropic's Claude is the best-known product of this technique. But even RLAIF doesn't solve everything.
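As a rough illustration of the supervised self-critique stage, the sketch below drafts a response, critiques it against each constitutional principle, and keeps the revised output as fine-tuning data. The generate function here is a placeholder standing in for a language model call; this is a minimal sketch of the idea, not Anthropic's actual implementation.

```python
# Minimal sketch of the supervised self-critique stage of constitutional AI.
# `generate` is a placeholder for a language model call; everything here is
# illustrative, not Anthropic's implementation.

CONSTITUTION = [
    "Choose the response that is least likely to be harmful or offensive.",
    "Choose the response that is most helpful and honest.",
]

def generate(prompt: str) -> str:
    # Placeholder: in practice this would call a language model API.
    return f"<model output for: {prompt[:40]!r}...>"

def self_critique_pairs(prompts: list[str]) -> list[tuple[str, str]]:
    """Produce (prompt, revised_response) pairs for supervised fine-tuning."""
    pairs = []
    for prompt in prompts:
        draft = generate(prompt)
        for principle in CONSTITUTION:
            critique = generate(
                f"Critique this response against the principle '{principle}':\n{draft}"
            )
            draft = generate(
                f"Rewrite the response to address this critique:\n{critique}\n\nOriginal:\n{draft}"
            )
        pairs.append((prompt, draft))  # fine-tune the base model on these revisions
    return pairs
```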

The developers of GenAI – organizations with access to at least tens of millions of dollars in funding, engineers holding master's degrees or more, and some of the most advanced hardware in the world – are not going to be able to provide annotators with guidelines that reflect the pluralistic views of likely users from around the world. Most people in the world are not from Western countries, highly educated, from industrialized nations, wealthy, or from high-functioning democratic societies.

This collection of characteristics was captured in the acronym "WEIRD" (Western, Educated, Industrialized, Rich, and Democratic), coined by Harvard's Joseph Henrich in his exploration of cultural and psychological evolution. The same acronym applies to the creators of GenAI, who decide what the handful of foundation models that constitute the vast majority of GenAI use globally will say in response to any given prompt.


Indeed, many users of AI systems have complained about the apparent bias in how models answer controversial questions, or about their refusal to answer them at all. This has spurred some companies to intentionally train and market their models as favorable to certain ideologies. Especially committed users with some technical ability can train their own models on curated data, apply fine-tuning methods, or conduct personal RLHF to imprint their beliefs into AI models. This can limit exposure to diverse ideas and further divide populations who increasingly prefer models that lock them into echo chambers.

The Annotators

Nature of the Work

Data labeling work is often tedious. Image annotation, for instance, may require repeating the same task on the same types of images from slightly different viewpoints for hours on end, day after day. Being a labeler also means parsing instructions from employers that can stretch on for dozens of pages. This is because computers require precision; they are not well-equipped to generalize. Whereas a human asked to label turtles in images will intuitively know whether a given turtle is real or a stuffed animal, a model must be told when a turtle is a live animal versus a cartoon versus a stuffed animal, and it must see at least dozens of variations before it can reliably identify them going forward.

Annotators often have to sign highly restrictive non-disclosure agreements that prevent them from telling others what they are working on, what the criteria are, or who they are working for. In fact, they often don't know who is paying their employer for the annotations. The work covered by these agreements could just as easily serve an immoral company building tools for warfare or for oppressing minorities as one trying to protect people.

Annotators also often don't have work colleagues, a company culture, or a routine work schedule to provide a stable life. Rather, they are treated as powerless cogs in the machine by design. They are siloed because it is in the tech companies' best interest: isolation makes it less likely that an annotator will leak valuable trade secrets.

Because the goals of AI projects are constantly changing, so are the expectations employers set for their workers, which puts annotators at an extreme disadvantage and requires them to continually adapt. Some experts believe this type of work is a new form of data colonialism, commonly defined as the exploitation of human beings through data. Workers in the Global South annotate images and text that will primarily benefit people in the Global North, and it is not clear that a temporary job and low pay are a fair or acceptable trade.

Compensation

Most annotators live in the Global South, in countries such as Venezuela, India, the Philippines, Kenya, and Lebanon, and are paid wages that can amount to less than $2 USD per hour. Additionally, because they are so thoroughly siloed, they have no bargaining power to negotiate even living wages and basic benefits like health insurance.

Payment is often based on the task at hand. Shorter, simpler tasks pay less, while longer or more complex tasks pay more. Workers often don't know what tasks will be offered when they log in, or whether they will resemble previous tasks (e.g., identifying road signs or clothing), so they have no reliable way to budget their income or time. This can become a very pressing problem, as in the case of Oskarina Fuentes Anaya, a Venezuelan worker who has had to resort to AI labeling due to economic instability in her home country. Anaya has said that if she does normal things, like going out or walking her dog, she might miss a high-paying task. Because these tasks are highly infrequent, Anaya does not get the basic privilege of stability in her life. In a move that might seem extreme to others but is a necessity for her, Anaya has used a browser extension that wakes her up if a task arrives in the middle of the night.

If there are few or no tasks for a day or days at a time, the workers won't earn any money. Adding to the frustration, workers often must complete unpaid training, such as reading the labeling criteria and doing a calibration practice run. A study from late 2022 estimated that the average earnings of Amazon's Mechanical Turk workers were about $1.77 per task. A TIME investigation pegged pay at less than $2 per hour for Kenyan workers doing annotation tasks for GenAI companies.

In some places, annotators face outside adversity, such as power and Wi-Fi blackouts, which further limits their opportunities to earn money through annotating. When they can work, annotators may spend upwards of 16 to 18 hours a day glued to their computers, waiting for tasks that can arrive at any time.

In contrast, the handful of large vendors (like Scale AI and Sama) overseeing the projects on behalf of even larger companies (like Meta and Microsoft) are often valued in the billions of dollars. Notably, people working directly for the vendors, such as employees at a California headquarters, suffer none of the unpredictability faced by the workers they task in Asia or Africa. Vendor employees receive steady salaries and benefits, like employees at almost any other tech company.

Moreover, the ease with which annotation vendors can relocate their operations from one region to another, chasing lower labor costs, echoes historical patterns of exploitation by colonial powers, notably the British Empire, which systematically extracted resources and labor at minimal cost to fuel its industrial expansion. This modern-day practice of cheap labor sourcing resembles the mercantilist strategies of the past, in which economic gain for the empire was often placed above the well-being of the local population. In the British Empire's extraction of labor from its colonies, we see a reflection of today's AI companies, which may prioritize cheap annotation and profit margins over the stability and dignity of labor markets in developing countries. This neo-colonialist approach disregards the socio-economic impact on the regions left behind, often without sustainable development or alternative employment opportunities. As AI continues to shape the global economy, ethical labor practices shouldn't be an afterthought.

Beyond the immediate concern for equitable compensation, there is a pressing need to address the stability and professional growth opportunities for those contributing to the RLHF process. The gig economy model for AI training often leaves workers without predictable workloads, benefits, or opportunities for career advancement. This impacts their financial stability and their ability to develop professionally within a rapidly advancing field. Companies utilizing human-annotated data for training AI should consider implementing structured career paths that recognize and build upon the valuable skills these workers gain over time. Such pathways could include certification programs, skill development courses, and the potential for transition to more permanent roles. By investing in the workforce in this manner, companies can improve the quality of annotations and contribute to a more skilled and stable labor market, fostering a sense of job security and professional worth among their workers.



Content Moderation

Humans are the reason we can use sites like Facebook or TikTok and not be awash in a sea of porn and gore. As researchers at the Distributed AI Research (DAIR) Institute put it:

“Every murder, suicide, sexual assault or child abuse video that does not make it onto a platform has been viewed and flagged by a content moderator or an automated system trained by data most likely supplied by a content moderator. Employees performing these tasks suffer from anxiety, depression, and post-traumatic stress disorder due to constant exposure to this horrific content.”

TIME Magazine investigated the reality of Kenyan content moderators employed by the data-labeling company Sama. In a call-center-like building on the outskirts of Nairobi, men and women work day and night as first responders for Facebook's social media platforms. For as little as $1.50 per hour, these Sama employees view and remove illegal, harmful, and banned content before it reaches the average user's screen. And they do all this brutal work while experiencing "intimidation and alleged suppression of the right to unionize."

There is plenty of conversation surrounding the impact of social media on users' mental health, and rightfully so. But what of the people who handle the "bad" content users see and the far worse content we don't? As one person puts it, "Who protects the protectors?"

This particular supply chain can be especially fraught when companies demand strict confidentiality, so content moderators cannot tell anyone what they are going through at their jobs. There are records of moderators becoming depressed after prolonged exposure to disturbing content, leading them to withdraw from loved ones. Post-traumatic stress disorder (PTSD), anxiety, and panic attacks are also documented mental health consequences for certain types of moderators. Despite these known harms, many remain in their roles because there are no good alternatives. Some might call it a soft form of coercion, where one party holds all the power in the relationship and extracts virtually all the benefit from the exchange.


This article was originally posted on https://intersectingai.substack.com/p/ai-and-its-human-annotators

The following students from the University of Texas at Austin contributed to the editing and writing of the content of LEAI: Carter E. Moxley, Brian Villamar, Ananya Venkataramaiah, Parth Mehta, Lou Kahn, Vishal Rachpaudi, Chibudom Okereke, Isaac Lerma, Colton Clements, Catalina Mollai, Thaddeus Kvietok, Maria Carmona, Mikayla Francisco, Aaliyah Mcfarlin
