Malta, Multilingualism, and ML: EACL 2024
Slightly lost walking along the Dingli cliffs, Malta

Made it home from a great time at the Conference of the European Chapter of the Association for Computational Linguistics (#EACL2024) in Malta. Another full day in the air—ugh—but it was worth it! Such a well-organised conference and an intellectually stimulating experience. It was a pleasure to present our paper about Aboriginal and Torres Strait Islander languages and meet many smart people doing good work in natural language processing (NLP). Below are a couple of papers I enjoyed from each of the conference’s three days, with a focus on multilingualism and ethics in NLP:

  • "Centering the Speech Community" by Steven Bird and Dean Yibarbuk was a real highlight, and I was thrilled to present our paper right after Steven in a session on multilingualism. It has a narrative style that shows the evolving relationship between the authors (and the local community) over five years as they navigated differing understandings of language. In the discussion, they note that the NLP community often resists approaches that don't work across many languages. However, this paper challenges such priorities by demonstrating the benefits of a different focus – how language technologies can enhance local agency and knowledge sharing when developed with the community's needs at the forefront.

  • "Code-Switched Language Identification is Harder Than You Think" by Laurie Burchell and others tackles the challenge of language identification (LID) for code-switched text—when multiple languages are mixed within the same utterance. They reformulate LID as a multi-label task (meaning each text can have multiple language labels) and compare models on code-switched datasets spanning up to eight languages, finding that even the best models struggle to recognise all languages present. I found this paper interesting for highlighting the gaps between LID performance in constrained settings vs. realistic code-switched data.

  • "AnthroScore: A Computational Linguistic Measure of Anthropomorphism" by Myra Cheng and others introduces an automatic metric to quantify how much language anthropomorphises an entity by comparing the probability of human vs. non-human pronoun references based on context. By applying their metric to research papers, they found a steady increase in anthropomorphic language over time, especially for language and multimodal models. They also found higher anthropomorphism in news than in research papers, but I checked with Myra after the presentation, and they didn’t find any change in anthropomorphic language in news about AI over time. I’ve been thinking about what their findings mean for the increasingly public debate about AI, and our responsibilities around language as researchers.

  • "Leveraging Implicit Feedback from Deployment Data in Dialogue", presented by Jason Weston, explored improving conversational AI models by learning from human-bot conversations, without explicit human annotations. The key idea is to extract implicit signals of conversation quality from user behaviours, such as response length or sentiment. I have been looking into external feedback signals (e.g., thumbs up/down, open-text comments), whereas this paper taps into implicit signals in the interaction data itself to refine dialogue models. (A toy sketch of such implicit signals also follows after this list.)

  • In "Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models", Natalie Shapira and others conducted an extensive evaluation of 6 tasks to investigate the extent of large language models' (LLMs) theory of mind (ToM) abilities. While LLMs exhibited certain ToM capabilities, the authors found that this behaviour was far from robust, with LLMs struggling on adversarial examples, indicating reliance on "shallow heuristics" rather than genuine ToM abilities. This paper also reinforced that we should be cautious about using human psychological tests to evaluate AI, as the consequences don't straightforwardly transfer.

There were plenty more interesting papers, a couple of keynotes, and a fantastic two-day workshop on computational methods for endangered languages to finish the week. But the real star of the week was Malta itself—crystal blue waters, historic forts, friendly people, and mad hikes. A big thank you to Ben Hutchinson and Google Research for making my attendance possible.

Susannah Soon

Lead - Healthy Connections by Curtin, Pilbara Health Challenge | Senior Lecturer in Computing | Academic Lead, Innovation Central Perth

6 months

Ned Cooper we found your paper "It's how you do things that matters..." incredibly insightful and will make sure our focus is on many of the approaches you suggest. My team is looking forward to talking with you and Ben Hutchinson on our joint interests. Tristan Carlisle Alastair Kho Prasanna Asokan

Dr Natalie Sheard

AI + Discrimination Researcher | Lawyer | Consultant

6 months

Thanks for sharing these papers!

Linda Przhedetsky

Technology, AI, and Policy Researcher | PhD Candidate | Casual Academic

6 months

Such important work!

Congratulations Ned! Sounds like you were in a really supportive, beautiful, and interesting place!

Jay L. Cunningham, Ph.D.

Responsible AI & Society Researcher | Former UW Board Member | Ex-Google, Apple, Microsoft, Meta | Innovator, Consultant, Pilot, Investor

6 months

It sounds like you had an amazing time! I really wanted to attend in Malta, but things didn't line up as expected. Thanks for sharing highlights of your time there. We should catch up soon!
