NAACL 2022 Panel: "The Place of Linguistics and Symbolic Structures"
After hearing various observations and laments from faculty friends that NLP people these days are just applied-math people who don't really know language, here is the wonderful #NAACL2022 panel "The Place of Linguistics and Symbolic Structures" that just took place, with panelists Chitta Baral, Emily Bender, Dilek Hakkani-Tur, and Christopher Manning, moderated by Dan Roth.
Here are my quick notes from the discussion (LLM = large language model; any mistakes/inaccuracies are mine):
Dan: everyone seems to agree linguistics is important; there is an observation that the field now focuses exclusively on LLMs; is that true?
Emily: funding; we need to change the funding
Chitta: LLMs are impressive, but people can see the failure modes
Chris: pushes back; LLMs are starting to show the emergence of more sophisticated linguistic structures; funding goes to what works; doesn't feel linguistics is overlooked at conferences (e.g., semantics)
Dilek: dialog interactions don't just need LLMs; they need a lot of components
Dan: we're not trying to reach the moon, just building a car; focus on applications, e.g., educational tools; what would you get rid of from the NLP curriculum (HMMs, parsing, ...)?
Emily: don't want to chase trends; teach long-lasting things; we need to teach the fundamentals (applause)
Chitta: still teaching the fundamentals, but more and more on LLMs
Chris: time is short, so things need to be dropped; no longer teaches HMMs, CRFs, CYK, etc.; teach perspectives instead
Dilek: students need to learn the latest (in "ML for NLP"); it's important for other classes to fill in the fundamentals
Salim Roukos: it's not just about statistics/ML; we need to think about reasoning and about incorporating knowledge bases (KBs) for inference
Chris: LLMs solving math problems is surprising, but not the ultimate answer
Emily: LLMs can produce seemingly coherent output, but this will eclipse the real effort of doing reasoning
Audience: how much symbolic structure is already captured by LLMs?
Chris: quite a lot: POS, constituency structure, etc.
Emily: language is a symbol system; meaning is use; use is not just occurrences of forms; this is missing from LLMs
Chris: disagrees with Emily; LLMs trained purely on form (disembodied) can learn true meanings; excited about research on grounded LLMs
Audience: simple methods can be as good as LLMs and are more practical/cost-effective; please continue to teach classical ML; teach basic linguistic knowledge (good for engineering basic features)
Audience: sometimes the bigger the model, the worse the performance; humans are flapping wings, models are jet engines; is it a problem that machines work very differently from how humans do?
Emily: they will be different; the problem is the lack of study; one pitfall is using models to learn how humans work
Audience: how do we research language acquisition (LA) using linguistics and other non-ML sources?
Emily: yes, we need to consider those too
Chris: LA is important, humans are more efficient language learners
Dan: different ways of supervision; kids learn multi-modally
Audience: machines need to be “interested in” learning things (motivation)
Chitta: kids also go to school to receive explicit instruction; we need to consider that too
Emily: why do we want to create something human-like, and to what end, rather than creating something that simply helps us?
Chris: it's an exciting time to do research on social language learning; things are the way they are because building LLMs on large datasets is easy
Dan: back to Salim's question (dealing w/ a spouse?): do we need a different paradigm for reasoning?
Emily: there is a difference between the structures of knowledge and the structures of the world; encourages researchers not to take a large corpus as the default; we need better training data, not just content scraped from the web
Chris: the achievement of LLMs is remarkable; we don't know what will emerge
Audience: a contrast: language is symbolic but the brain is not; why can we learn languages?
Emily: unsolved.
Chris: believes human brains can't have structures specific to language; the same structures have to support other, non-linguistic learning; but brains must have inductive biases for learning recursive structures
Audience: linguists think of language as continuous (speech), but NLP people use discrete symbols; comment on the disconnect?
Emily: we need to define what "symbolic" means; most linguists are also just using discrete symbols
Chris: speech is continuous but categorical structures emerge for easier communication
#NLGPU #Conference #Linguistics #KnowledgeRepresentation #Reasoning
IBM Distinguished Engineer and Chief Architect, IBM NLP
2y: I am so glad to see this topic discussed with such distinguished members of the NLP community! Those interested in continuing the discussion on complementing deep learning with symbolic and linguistic approaches (when, why, and how?): submit a paper or attend our workshop co-located with COLING 2022, Pattern-based Approaches to NLP in the Age of Deep Learning: https://pan-dl.github.io/2022/about
Sr. Applied Scientist at Microsoft
2y: Indeed, my favorite panel discussion! Wish the session had more time for more people to share in this forum!
Implement New Study | Senator | ISCA Fellow | Landeslehrpreis 2021 BW | Head Digitalization Committee
2y: Linguistic ideas are central to the future of AI; glad you are emphasizing this more. It's one thing to assemble a bunch of tiny ops and call yourself a data scientist. It's another thing to really understand your data.
Professor at Waipapa Taumata Rau (The University of Auckland); Strong AI Lab; AI4Good Foundation
2y: A fantastic and very well-balanced panel, both reflecting the extrema of this discussion and moderating them with considered intermediate positions.