Putting Responsible AI into Practice
Welcome to Continual Learnings
A weekly newsletter for practitioners building ML-powered products.
What we're reading this week
As model training becomes more frequent and more automated, it introduces new risks, like the risk that an adversary poisons your dataset. This paper shows that it is inexpensive (under $100) to poison web-scraped datasets in a way that degrades model performance.
This high-profile academic collaboration (including Hazy Research at Stanford and SkyLab at Berkeley) promises reasonable latency (as little as 1 sec / token) for large models like OPT-175B on a single 16GB GPU.
This new active learning method achieves competitive performance on regression tasks, while only assuming black-box access to the underlying model.
You know by now that reinforcement learning from human feedback is critical to developing state-of-the-art language systems like ChatGPT. But how critical is the “reinforcement learning” part? This paper explores alternative objectives for training with human feedback, and finds that a simpler baseline actually works better than RL. Simply prepend the human feedback as a token to the sequence before training on it. At test time, generate text by conditioning on that initial positive feedback token.
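To make that conditioning idea concrete, here is a minimal sketch using a Hugging Face-style causal LM. The control tokens, dataset fields, and model choice are our own illustrative assumptions, not the paper's actual setup.

```python
# A minimal sketch of feedback-conditioned training, assuming the transformers
# library; token names and data fields are hypothetical, not from the paper.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Register control tokens that encode the human feedback label.
tokenizer.add_special_tokens({"additional_special_tokens": ["<|good|>", "<|bad|>"]})
model.resize_token_embeddings(len(tokenizer))

def to_training_text(example):
    # Prepend the feedback label so that, after fine-tuning on these texts,
    # the model associates the tag with output quality.
    tag = "<|good|>" if example["human_rating"] == "positive" else "<|bad|>"
    return tag + example["text"]

# At test time, condition on the positive tag to steer generation
# (meaningful only once the model has been fine-tuned on the tagged texts).
prompt = "<|good|>" + "Write a short, friendly product description."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The appeal of this baseline is that it reuses the ordinary language-modeling objective: the feedback signal becomes just another token in the training sequence.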
Last week we highlighted Toolformer, which showed that language models can teach themselves to use “tools” like calculators. This paper surveys similar approaches.
This paper explores tuning prompts for different subtasks and combining them at test time. The advantage is personalization — you can tune models online to match the preferences of individual users. As a result, the authors claim state-of-the-art on some continual learning benchmarks.
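As a toy illustration of composing tuned prompts at test time, here is a sketch that mixes separately learned soft prompts with a weighted average; the shapes, prompt names, and mixing rule are our own assumptions rather than the paper's method.

```python
# A toy sketch of soft-prompt composition, assuming prompts were tuned
# separately per subtask; all names and shapes here are illustrative.
import torch

d_model, prompt_len = 768, 10

# Pretend these soft prompts were learned with prompt tuning on different subtasks.
task_prompts = {
    "summarize": torch.randn(prompt_len, d_model),
    "polite_tone": torch.randn(prompt_len, d_model),
}

def compose_prompts(weights):
    # Weighted average of soft prompts; the weights could encode a user's preferences.
    mixed = sum(w * task_prompts[name] for name, w in weights.items())
    return mixed / sum(weights.values())

user_prompt = compose_prompts({"summarize": 0.7, "polite_tone": 0.3})

# The composed prompt is prepended to the input embeddings fed to a frozen LM.
input_embeds = torch.randn(1, 20, d_model)  # embeddings of the user's query
model_input = torch.cat([user_prompt.unsqueeze(0), input_embeds], dim=1)
print(model_input.shape)  # torch.Size([1, 30, 768])
```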
This Stanford lecture looks like a good intro to prompting, instruction tuning, and reinforcement learning from human feedback.
Production ML papers to know
In this series, we cover important papers to know if you build ML-powered products.
Putting Responsible AI into Practice
Microsoft’s challenges rolling out Bing Chat put responsible use of AI in the news last week.
As AI systems get more capable, ML practitioners will be asked to help mitigate the risks of putting them in front of people.
A recent paper could help: it lays out reasons that ML can inadvertently harm the world, and suggests ways to address them.
The challenge
To mitigate and prevent the unwanted consequences of ML on the world, we need a good understanding of how harm might be introduced by ML.
This paper frames harm as biases that appear at different stages of the ML process.
Datasets we use as ML practitioners may have historical, representation, or measurement biases due to their creation processes. Turning data into model outputs can introduce aggregation, learning, and evaluation biases. Finally, productionizing the model could create deployment biases.
So what are these biases, how do we identify them, and how can we deal with them?
The Biases
Historical bias is present in collected data if it reflects real-world biases that are then encoded in an algorithm. For example, word embeddings can reflect harmful stereotypes when trained on datasets from a particular decade. To mitigate, you can try over- or under-sampling features in order to generate a dataset that removes the bias.
Representation bias occurs when the data used for modeling under-represents part of the population, resulting in a model that will not generalize well for that subpopulation. Issues with image recognition algorithms are often caused by this bias. A possible mitigation is to adjust the sampling approach so that groups are more appropriately represented.
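For the sampling mitigations mentioned in the two items above, here is a minimal sketch of one way to rebalance a dataset by group using pandas; the column names and the "match the largest group" target are assumptions for illustration, not a recommendation from the paper.

```python
# A minimal sketch of group-aware oversampling; column names are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "feature": range(10),
    "group":   ["a"] * 8 + ["b"] * 2,   # group "b" is under-represented
})

# Oversample each group (with replacement) up to the size of the largest group.
target_size = df["group"].value_counts().max()
balanced = (
    df.groupby("group", group_keys=False)
      .apply(lambda g: g.sample(n=target_size, replace=True, random_state=0))
      .reset_index(drop=True)
)
print(balanced["group"].value_counts())  # both groups now have 8 rows
```

Whether resampling is appropriate depends on the downstream use; reweighting losses or collecting more data from under-represented groups are alternatives.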
Measurement bias occurs during feature and target selection. Features and labels can be poor proxies for the outcomes you target, especially if they oversimplify what they are supposed to measure (e.g. GPA as a proxy for success), or they represent human decisions (e.g. human-assigned ratings).
Aggregation bias happens when a “one-size-fits-all model is used for data in which there are underlying groups or types of examples that should be considered differently.” This can lead to a model that is not optimal for any group, or a model that is fit to the dominant population.
Learning bias appears when modeling choices compound performance disparities across subpopulations. For example, the choice of objective function might skew performance, or the choice of a smaller model might amplify poor performance on under-represented data, as the model has limited capacity and preserves information from frequent features only.
Evaluation bias arises from the need to compare models against each other using established benchmark data. This data may be unrepresentative of the population the model will be used on, and might also suffer from historical, representation, or measurement bias. Mitigations could include evaluating a broader range of metrics on more granular subsets of the data, or developing new benchmarks.
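As a sketch of what more granular evaluation can look like in practice, here is a small example that reports metrics per data slice using pandas and scikit-learn; the columns, labels, and groups are made up for illustration.

```python
# A small sketch of sliced evaluation; the dataframe contents are invented.
import pandas as pd
from sklearn.metrics import accuracy_score, f1_score

eval_df = pd.DataFrame({
    "y_true": [1, 0, 1, 1, 0, 1, 0, 0],
    "y_pred": [1, 0, 0, 1, 0, 1, 1, 0],
    "group":  ["a", "a", "a", "a", "b", "b", "b", "b"],
})

# Report metrics per slice instead of a single aggregate number.
for group, rows in eval_df.groupby("group"):
    acc = accuracy_score(rows["y_true"], rows["y_pred"])
    f1 = f1_score(rows["y_true"], rows["y_pred"])
    print(f"group={group}  accuracy={acc:.2f}  f1={f1:.2f}")
```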
Deployment bias occurs when a model is used in production in a different way than was intended during training. For example, algorithms intended to predict a person’s likelihood of committing a future crime can be used “off-label” to determine the length of a sentence. Mitigations can be challenging, but include contextualizing the model output with other sources of information and human judgment.
The upshot
As AI becomes more broadly adopted, we as ML practitioners need to understand the broader impact of what we are developing and the unintended harm it may cause.
This paper is a good starting point for thinking about the consequences more systematically. You can find the paper here.
Thanks for reading!
Feel free to get in touch if you have any questions: you can message us on socials or simply reply to this email.
You can also find previous issues on our blog, on Twitter, and on LinkedIn.
The Gantry team