Humanizing Machine Language
Marnie Hughes-Warrington AO
Standing Acting Vice Chancellor, Deputy Vice Chancellor Research and Enterprise, and Bradley Distinguished Professor at the University of South Australia
If you could train a machine to speak, write or listen, what would you most want it to be able to do? In this month's Techist blog, I look at advances in natural language processing research and identify key gaps, including the humanising that would see more of us engaged in AI research. Read the blog here:
https://techist.tumblr.com/post/187796198997/humanizing-machine-language
or read the full text below:
----
I like to think of the textbook I use for teaching as having been made by a bot, and that the students and I are training it to write better history. It is not bad as writing goes, but the students now know that the breathless spooling of names and dates is not an argument, and that language is about making sense of the world with one another.
I mean no disrespect to the authors, and I hope that someone, somewhere, is doing the same to my writings. For we all stumble and grasp for argument in a sea of information, whether we are students or senior academics. We have so much still to learn about how to write our way to a better world.
My aim is simply to humanise the language learning of machines in ways that escape the routine treatment of computer scientists as blameworthy creators. After all, the clumsiness, oversight, and even hate of machines are our human clumsiness, our human oversight, and our human hate.
Take even the quickest look at the latest research on natural language use by machines, and you don’t get the sense of computer scientists having it all under control. The daily updates in this area of AI research on arXiv are daunting, even bewildering. Today’s batch includes papers on auctioneer bidding language, mortality prediction in intensive-care units, translation of document labels across languages, and the detection of adversarial attacks, as well as a host of advice on optimising algorithms, neural network design, and statistical handling. Tomorrow will mean more variety again, and no clear sense of the wild constellation of the world’s languages having been tamed.
Energy and activity do not make an argument, but they can make a world via accident and unintended consequences. Machines have been trained to generate speeches by members of the UN General Assembly, and the academy is still coming to grips with the use of ‘sybils’ or hatebots that inflate the follower count of influencers, troll social media users, or nudge people towards particular political beliefs.
This accidental world is also nearer to the academy than we might acknowledge. The current discussion of plagiarism and contract cheating, for example, already feels somewhat dated by the realisation that each of us has a text corpus that might in time be used to generate text, with our consent. Indeed, if there were a line to cross, we probably leapt over it with glee when spellcheckers, predictive SMS, and Gmail Smart Compose were launched.
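To see how low that line was, here is a minimal sketch, assuming only the Python standard library and a hypothetical scrap of personal text, of the kind of next-word prediction that sits behind predictive SMS: a bigram model that simply suggests whichever words have followed the current one in your corpus.

    import random
    from collections import defaultdict

    # A toy "personal corpus". In practice this could be years of a
    # person's emails, essays, or blog posts (hypothetical sample text).
    corpus = (
        "language is about making sense of the world with one another "
        "and making sense of the world takes patient work with one another"
    )

    # Build a bigram table: which words have followed which.
    following = defaultdict(list)
    words = corpus.split()
    for current_word, next_word in zip(words, words[1:]):
        following[current_word].append(next_word)

    def suggest(word):
        """Offer a next word, the way predictive text does."""
        options = following.get(word)
        return random.choice(options) if options else None

    def generate(seed, length=10):
        """Spool out text from the corpus, one predicted word at a time."""
        out = [seed]
        for _ in range(length):
            nxt = suggest(out[-1])
            if nxt is None:
                break
            out.append(nxt)
        return " ".join(out)

    print(generate("making"))

Accept or reject each suggestion as it is put to you, and you are already doing the ‘training’ described below.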
Yet natural language processing is in many ways still like the analogue textbook training I am undertaking with students. Our work is with a limited and rather homogeneous corpus: one textbook. That corpus is in a major language: English. We humans are ‘training’ that text together by explaining how the arguments presented can be optimised.
The hard work of that ‘training’ in one language with a limited corpus is the gap in natural language processing. Google Translate works with just over 100 of an estimated 6,500 world languages. Content may be trawled to support machine futures trading, but the crawlers are susceptible to clumsiness, oversight, and hate. They still need supervision. Amazon employees listen in on Alexa interactions to improve responsiveness. But argument, let alone metaphor, humour, wonder and the uncanny, might still pass them by.
And while we train, language keeps moving. This is not just a matter of fashion, but also one of identity and acknowledgement. Ask any Indigenous or First Nations group working to show that Native Title turns on more than a preserved corpus of words. New words and senses of meaning are invented, and words are borrowed or conjured to address the loss of language that often followed from loss of country.
The world of languages is a babel box that defies axiomatic confinement. But that does not mean that natural language processing cannot get better, or that it cannot be just, good, or fair.
Humanising AI helps us to see this. This is not just about seeing machines as human, but about acknowledging that they are what more of us can make them. And that making—training—is the patient to and fro of history making. Selecting. Rejecting. Selecting again. Asking if, then what if. Weighing up alternatives and making a tentative call or no call at all. Knowing that causes might not be tied tightly to effects. Figuring out that information is missing, or unfair, or even hateful.
I may be doing this in a classroom with students, but you do it every time you reject or accept the predictive text that is put to you. Recognise your history making, and do something good with it.