What could go wrong if A.I. locks in today’s moral values?

There are many things to worry about in today’s world, and you and I are neither mentally nor emotionally equipped to worry about them all. What we want to know is: are enough people worrying about the important issues on our behalf?

The upheaval triggered by ChatGPT's release has caused many people to worry publicly and voluminously (and on LinkedIn) about AI and its growing effects on employment, productivity, originality and intellectual property, our ability to tell real from unreal, and society itself.

Professional worrier (technical term: moral philosopher) William MacAskill’s intriguing book What We Owe The Future points out that while we don’t yet have enough people worrying about AI (and other very important issues), we at least have enough people now worrying that we need more people worrying about the issue.

Worrying about Values Lock-in

One such issue MacAskill raises that requires respectable rumination — and that I’ve not seen posted about on LinkedIn — is Values Lock-in: human belief systems of all sorts have an almost universal tendency to try to lock in their own values, to the exclusion and elimination of other sets of values. Pluralism and moral evolution often aren’t embraced.

Why is this a problem?

Because while at every point in time we believe our values to be quite good, as we learn and evolve as a society we continuously realise in retrospect that some of our values were actually rather abhorrent after all.

Values improve over time — in an inquisitive and humane society

You're skeptical?

Consider our values today against just a few examples from the past 150 years:

  • In 1886, the President of the British Medical Association, William Withers Moore, warned that over-educating women could see them develop a disorder he dubbed “anorexia scholastica”, which made women immoral, insane and asexual.
  • 50 years ago, New Zealand law deemed homosexual behaviour an illegal act deserving of punishment — including imprisonment. 70 years ago, British societal values and laws caused the death of Alan Turing.
  • Our treatment of animals in domestic, research, and agricultural settings has improved significantly over the decades. This is an area evolving contemporaneously: we can personally witness values improving in the treatment of animals.

Improving values are ABSOLUTELY NOT guaranteed

While values have in many cases changed to improve the wellbeing of many more in society, values can also regress. A reversion to the rhetoric and ideologies common among disaffected populations in the early 20th century is being seen today, for example. To illustrate how completely not-inevitable (“contingent”) the improvement of values is, MacAskill delves deeply into the question of whether it was inevitable that human slavery was outlawed. This is an absolutely fascinating exploration of a moral issue, and to the author’s own shocked surprise — and mine! — he reached the conclusion that the abolition of slavery was absolutely not inevitable: it required an enormous shift in society’s moral values.

We are so accustomed to today’s paradigm regarding the evils of slavery that most would find it unimaginable that society might not have evolved to this point (notwithstanding the fact that slavery continues to exist and be tolerated around the world today).

But wait — aren’t we experiencing Moral Decline?

Isn’t that why we’re trying to make society great again, get it back on track, take our country back etc?

It can be hard to swallow that moral values might be improving over time. In fact, research shows that virtually all of us believe society is undergoing moral decline (n=12,492,983). What’s most amazing about this belief is the following:

  1. We all believe that morality is declining and society used to be more moral
  2. We all believe this decline to have started in the years soon after our birth
  3. We attribute this to decreasing morality of people as they age, and to decreasing morality of successive generations
  4. We all believe this moral decline does not apply to our own group and peers, who we believe have instead been improving over time.

While making perfect fodder for populist political sloganeering, our universal belief in moral decline appears instead to be rooted in two psychological phenomena: biased exposure to information and biased memory for information. Read this fascinating research in full here — it's a treat: The Illusion of Moral Decline

How then to train AI?

Narrow Artificial Intelligence models excel at adopting negative values or ideology based on the training we humans provide them. Why? AI models display bias because the data we feed into them is biased, and this is a difficult problem to fix. AI models are statistical prediction models — biased inputs give biased outputs.

For example, AI models in HR software in North America have been found to reflect gender and racial bias, because they were trained on a corpus in which a majority of the candidates chosen in the past had certain racial, gender and other characteristics — the societal biases of the time. Feeding in this historical data taught the HR AI model to pick candidates who best matched past choices, not objectively best choices.

Consider another example: an AI model selecting musicians for orchestras would disproportionately pick male musicians if fed data from before orchestras began conducting blind auditions behind a screen in order to reduce bias.
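To make the mechanism concrete, here is a minimal, hypothetical sketch in Python (using numpy and scikit-learn). All of the data, feature names and numbers are invented for illustration — they are not drawn from any real HR or audition system — but the sketch shows how a model trained on biased historical decisions simply reproduces that bias as if it were a genuine signal.

```python
# Illustrative sketch only: synthetic data, invented feature names.
# A statistical model trained on biased past decisions learns to
# reproduce that bias in its own predictions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

experience = rng.normal(5, 2, n)   # years of experience (the "merit" signal)
is_male = rng.integers(0, 2, n)    # protected attribute: 1 = male, 0 = not

# Historical "hired" labels: driven partly by merit, partly by bias.
score = 0.8 * experience + 1.5 * is_male + rng.normal(0, 1, n)
hired = (score > np.median(score)).astype(int)

model = LogisticRegression().fit(np.column_stack([experience, is_male]), hired)

# Two candidates with identical experience, differing only in the protected attribute.
candidates = np.array([[5.0, 1], [5.0, 0]])
print(model.predict_proba(candidates)[:, 1])
# The first (male) candidate gets a markedly higher predicted "hire" probability,
# purely because the model has learned the historical bias as a predictor.
```

Nothing in the model is malicious; it is simply a faithful statistical summary of the decisions it was shown.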

Thus, Values Lock-in is a major problem for narrow AI models (all the AI we’re using today).

Read more about the problems of data bias in two incredible books:

  1. Invisible Women, by Caroline Criado Perez — winner of the Royal Society Insight Investment Science Book Prize and the Financial Times and McKinsey Business Book of the Year Award in 2019.
  2. 12 Bytes, by Jeanette Winterson — a book that forces you to think in new ways while also being humorous and entertaining to read, starting with the first essay, Love(lace) Actually. More on this below.

Lock-in a danger for narrow AI — but perhaps not for Artificial General Intelligence?

Jeanette Winterson — in her thought-provoking collection of essays, “12 Bytes” — raises the possibility that if AGI recursively creates ever more intelligent AGI, Artificial General Intelligence will far exceed human intelligence, and an intelligence of that level has every chance of seeing and avoiding the stupid failures of human bias.

This is intriguing…as are the other essays in this book, including on embodied vs. disembodied life. The idea that an AGI far more intelligent than us will avoid our failings rather than locking in our biased values is as appealing as the idea that tech will dig us out of our over-consumption problem. A shade too much Tech Optimism? Perhaps we’ll just have to wait and see whether AGI locks in contemporary values or improves over time.

Values Lock-in — What to Do?

MacAskill suggests two things: first, a “Great Reflection”, in which humans spend time worrying about our values; and second, deliberately working to ensure that systems are designed to improve their values over time rather than embodying static, point-in-time values. That is, instead of a model being grounded in a dataset that embodies bias and having no way to account for it, we deliberately design A.I. to factor in the biases of its human-generated training data and alleviate them over time.
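MacAskill doesn’t prescribe a mechanism, but the fairness literature offers simple starting points. Purely as a hedged illustration — the data and names below are invented — the sketch shows one such technique: reweighting training examples so that each combination of protected group and historical outcome contributes equally, which reduces how much a model leans on a protected attribute it would otherwise inherit from biased data.

```python
# Hypothetical sketch of sample reweighting (cf. "reweighing" in the fairness
# literature): balance each (group, outcome) cell so a model cannot simply
# inherit the historical selection bias. Data and names are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

def balanced_sample_weights(group, label):
    """Weight each example inversely to the size of its (group, label) cell."""
    group, label = np.asarray(group), np.asarray(label)
    n_cells = len(np.unique(group)) * len(np.unique(label))
    weights = np.empty(len(label), dtype=float)
    for g in np.unique(group):
        for y in np.unique(label):
            mask = (group == g) & (label == y)
            if mask.any():
                weights[mask] = len(label) / (n_cells * mask.sum())
    return weights

# Invented example: biased historical hiring decisions, as in the earlier sketch.
rng = np.random.default_rng(1)
experience = rng.normal(5, 2, 2000)
group = rng.integers(0, 2, 2000)        # protected attribute
hired = ((0.8 * experience + 1.5 * group + rng.normal(0, 1, 2000)) > 5).astype(int)

X = np.column_stack([experience, group])
naive = LogisticRegression().fit(X, hired)
reweighted = LogisticRegression().fit(
    X, hired, sample_weight=balanced_sample_weights(group, hired)
)

# The coefficient on the protected attribute typically shrinks toward zero
# once the (group, outcome) cells are balanced.
print("naive group coefficient:     ", naive.coef_[0][1])
print("reweighted group coefficient:", reweighted.coef_[0][1])
```

This is only a first step — it corrects for one known bias in one dataset, not the harder problem of values that keep improving over time — but it illustrates the difference between passively mirroring historical data and deliberately correcting for it.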

But on what basis? The examples of evolving moral values discussed above all required some sort of implicit or explicit moral foundation — and ideologies can have wildly different moral codes even regarding those specific examples. If A.I. applies a moral code that persecutes on the grounds above, we make the very mistake MacAskill worries about and introduce yet more persecution and bias.

Perhaps we could start with a simple, scientific basis such as the one proposed (for morality) by neuroscientist Sam Harris — “the well-being of conscious creatures” — which could be implemented algorithmically. Does it work for all the examples of evolving values above? If so, A.I. models weighted to iteratively improve the well-being of conscious creatures, over and above the biased dataset they’re trained on, may help.
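Purely as a thought experiment — there is no established method for this, and every function and number below is invented — one way to picture Harris’s suggestion algorithmically is as an extra term in a model’s training objective: the usual fit-the-data loss, minus a reward for decisions that some (hypothetical) well-being estimator scores highly. Defining that estimator is, of course, the genuinely hard part.

```python
# Thought-experiment sketch: a composite objective trading off fit to
# (possibly biased) historical labels against a hypothetical well-being
# score for the decisions the model would make. Everything here is invented.
import numpy as np

def fit_loss(params, X, y):
    """Ordinary logistic loss against the historical (biased) labels."""
    p = 1 / (1 + np.exp(-(X @ params)))
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

def wellbeing_of(decisions, X):
    """Placeholder well-being estimator: the unsolved part of the proposal."""
    return float(np.mean(decisions))  # stand-in, not a real measure of well-being

def objective(params, X, y, lam=0.5):
    decisions = (1 / (1 + np.exp(-(X @ params))) > 0.5).astype(float)
    # Lower is better: subtracting well-being means improving it is rewarded.
    return fit_loss(params, X, y) - lam * wellbeing_of(decisions, X)

# Tiny invented usage, just to show the objective evaluates.
rng = np.random.default_rng(2)
X, y = rng.normal(size=(100, 3)), rng.integers(0, 2, 100)
print(objective(np.zeros(3), X, y))
```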

So while there may not be enough folk worrying about such things — with some instead distracting others with talk of existential threat — at least more folk are considering them. It also makes a case for funding Moral Philosophy and the Arts at university, as well as STEM.


FAQs to ponder:

  1. How might we practically implement a “Great Reflection” on our values, given the vast differences in moral beliefs and ideologies across human societies right now? For example, mechanisms and platforms that promote inclusive dialogue between different perspectives, spanning in-person and digital interaction.
  2. If we agree that we should train AI to improve its values over time, what are the potential challenges and risks of adopting any specific moral foundation, such as Sam Harris’s “the well-being of conscious creatures”? Selecting a specific moral basis first requires confidence in its universal applicability — i.e. will it work? Implementing it algorithmically would then be an interesting challenge in itself: potential biases in existing datasets and models, unintended consequences, ethical principles and societal implications. It might also challenge our status quo: a model with improving values would see the flaws in our short-termist economic and social policy.
  3. Do we have any existing examples or case studies where AI models have successfully improved their values over time? If so, how is this practically being implemented and achieved? This, I don’t know — though I imagine the people who are worrying about AI may be studying value enhancement, transparency, accountability, and ethics in AI models.
