Leonie Weissweiler

/ˈleːoni ˈvaɪ̯svaɪ̯lɐ/

I am a postdoc at UT Austin Linguistics, working with Kyle Mahowald, funded by the Walter Benjamin Fellowship of the German Research Foundation (DFG).

I completed my PhD in 2024 at the Center for Information and Language Processing at LMU Munich, supervised by Hinrich Schütze, with a thesis on Computational Approaches to Construction Grammar and Morphology. Previously, I completed my B.Sc. and M.Sc. degrees in Computational Linguistics and Computer Science at LMU, with scholarships from the German Academic Scholarship Foundation and the Max Weber Program. My M.Sc. thesis, also supervised by Hinrich Schütze, was on the application of Complementary Learning Systems Theory to NLP. I spent the final year of my bachelor's degree as a visiting student at Homerton College, University of Cambridge, where I wrote my B.Sc. thesis on Character-Level RNNs under the supervision of Anna Korhonen.

Current Research Interests

Construction Grammar and NLP
Emergent Structure in Language
Interactions between Cognitive Linguistics and NLP
Computational Typology and Morphosyntax

News

Invited talks at Stanford University and the University of Nevada at Reno on the topic of "Constructions all the way down: rethinking linguistic generalisation in LLMs".
Our paper Models Can and Should Embrace the Communicative Nature of Human-Generated Math was accepted at the NeurIPS 2024 workshop on Mathematical Reasoning and AI!
Our paper SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists was accepted to Findings of EMNLP 2024.
I started my postdoc at UT Austin Linguistics with Kyle Mahowald!
I gave a keynote at KONVENS 2024 on "Constructions all the way down: Rethinking compositionality in LLMs". The slides are available here.
On July 3rd, I successfully defended my PhD thesis "Computational Approaches to Construction Grammar and Morphology" and graduated summa cum laude with a Dr.Phil. in Computational Linguistics! A recording of the 15-minute defence talk is available here.
Invited talks at NYU, Boston University, MIT, UPenn, CMU, and Cornell on the topic of "Finding the Limits of LLMs with Constructions".
I will be visiting Adele Goldberg at Princeton University for 3 months, starting in March 2024, to work on Construction Grammar and LLMs!
Three papers accepted to LREC-COLING 2024! UCxn: Typologically Informed Annotation of Constructions Atop Universal Dependencies, Constructions Are So Difficult That Even Large Language Models Get Them Right for the Wrong Reasons, and Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs.
Invited talk at SAIL workshop on Fundamental Limits of Large Language Models at the University of Bielefeld. Check out the video here!
Two papers accepted to EMNLP 2023! Counting the Bugs in ChatGPT's Wugs: A Multilingual Investigation into the Morphological Capabilities of a Large Language Model and Crosslingual Transfer Learning for Low-Resource Languages Based on Multilingual Colexification Graphs.
Invited talk at the "Poetik, Praxis und Hermeneutik Künstlicher Intelligenz" workshop at the Fritz Thyssen Foundation in Cologne, on LLMs as the newest battlefield of the Linguistics Wars, featuring opinions by Noam Chomsky, Adele Goldberg, Steve Piantadosi and myself. Slides (in German) are available here.
Invited talk at Bar Ilan University NLP.
Two papers accepted to ACL 2023! A Crosslingual Investigation of Conceptualization in 1335 Languages and How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives.
I presented The Better Your Syntax, the Better Your Semantics? Probing Pretrained Language Models for the English Comparative Correlative at the 19th Workshop on Multiword Expressions at EACL. Check out the slides and video!
Invited talk at Ben Roth's Lab at the University of Vienna.
Invited talk at Ryan Cotterell's Lab at ETH Zürich on construction grammar and how we can use it to formulate new goals for syntactic and semantic probing.
Invited talk at MunichNLP on the joint history of NLP and Linguistics and how we can learn from each other going forward.
Our paper Construction Grammar Provides Unique Insight into Neural Language Models was accepted to the CxGs+NLP Workshop at the Georgetown University Round Table 2023! Check it out here.
Our paper The Better Your Syntax, the Better Your Semantics? Probing Pretrained Language Models for the English Comparative Correlative was accepted to EMNLP 2022! Check it out here.
I will be visiting David Mortensen and Lori Levin's Linguistics Lab at the Language Technologies Institute at Carnegie Mellon University for 3 months, starting in mid-July 2022, to work on Construction Grammar and NLP!
Invited talk at the 8th International ScaDS Summer School 2022 in Leipzig on the joint history of NLP and Linguistics and how we can learn from each other going forward! Here are the abstract and slides.
I presented our paper CaMEL: Case Marker Extraction without Labels 🐫 at ACL 2022 in Dublin! Click here for the slides.
Our paper CaMEL: Case Marker Extraction without Labels 🐫 was accepted to ACL 2022! Check it out here.

Selected Publications

For a full list of publications, please see my Google Scholar profile.

Leonie Weissweiler, Nina Böbel, Kirian Guiller, Santiago Herrera, Wesley Scivetti, Arthur Lorenzi, Nurit Melnik, Archna Bhatia, Hinrich Schütze, Lori Levin, Amir Zeldes, Joakim Nivre, William Croft, Nathan Schneider (2024). UCxn: Typologically Informed Annotation of Constructions Atop Universal Dependencies. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation. (LREC-COLING)

The Universal Dependencies (UD) project has created an invaluable collection of treebanks with contributions in over 140 languages. However, the UD annotations do not tell the full story. Grammatical constructions that convey meaning through a particular combination of several morphosyntactic elements -- for example, interrogative sentences with special markers and/or word orders -- are not labeled holistically. We argue for (i) augmenting UD annotations with a 'UCxn' annotation layer for such meaning-bearing grammatical constructions, and (ii) approaching this in a typologically informed way so that morphosyntactic strategies can be compared across languages. As a case study, we consider five construction families in ten languages, identifying instances of each construction in UD treebanks through the use of morphosyntactic patterns. In addition to findings regarding these particular constructions, our study yields important insights on methodology for describing and identifying constructions in language-general and language-particular ways, and lays the foundation for future constructional enrichment of UD treebanks.

Shijia Zhou, Leonie Weissweiler, Taiqi He, Hinrich Schütze, David Mortensen, Lori Levin (2024). Constructions Are So Difficult That Even Large Language Models Get Them Right for the Wrong Reasons. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation. (LREC-COLING)

In this paper, we make a contribution that can be understood from two perspectives: from an NLP perspective, we introduce a small challenge dataset for NLI with large lexical overlap, which minimises the possibility of models discerning entailment solely based on token distinctions, and show that GPT-4 and Llama 2 fail it with strong bias. We then create further challenging sub-tasks in an effort to explain this failure. From a Computational Linguistics perspective, we identify a group of constructions with three classes of adjectives which cannot be distinguished by surface features. This enables us to probe for LLMs' understanding of these constructions in various ways, and we find that they fail in a variety of ways to distinguish between them, suggesting that they don't adequately represent their meaning or capture the lexical properties of phrasal heads.

Leonie Weissweiler, Abdullatif Köksal, Hinrich Schütze (2024). Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena. arXiv preprint. (arXiv)

Argument Structure Constructions (ASCs) are one of the most well-studied construction groups, providing a unique opportunity to demonstrate the usefulness of Construction Grammar (CxG). For example, the caused-motion construction (CMC, "She sneezed the foam off her cappuccino") demonstrates that constructions must carry meaning, otherwise the fact that "sneeze" in this context causes movement cannot be explained. We form the hypothesis that this remains challenging even for state-of-the-art Large Language Models (LLMs), for which we devise a test based on substituting the verb with a prototypical motion verb. To be able to perform this test at statistically significant scale, in the absence of adequate CxG corpora, we develop a novel pipeline of NLP-assisted collection of linguistically annotated text. We show how dependency parsing and GPT-3.5 can be used to significantly reduce annotation cost and thus enable the annotation of rare phenomena at scale. We then evaluate GPT, Gemini, Llama2 and Mistral models for their understanding of the CMC using the newly collected corpus. We find that all models struggle with understanding the motion component that the CMC adds to a sentence.

Leonie Weissweiler*, Valentin Hofmann*, Anjali Kantharuban, Anna Cai, Ritam Dutt, Amey Hengle, Anubha Kabra, Atharva Kulkarni, Abhishek Vijayakumar, Haofei Yu, Hinrich Schütze, Kemal Oflazer, David Mortensen (2023). Counting the Bugs in ChatGPT's Wugs: A Multilingual Investigation into the Morphological Capabilities of a Large Language Model. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. (EMNLP)

Large language models (LLMs) have recently reached an impressive level of linguistic capability, prompting comparisons with human language skills. However, there have been relatively few systematic inquiries into the linguistic capabilities of the latest generation of LLMs, and those studies that do exist (i) ignore the remarkable ability of humans to generalize, (ii) focus only on English, and (iii) investigate syntax or semantics and overlook other capabilities that lie at the heart of human language, like morphology. Here, we close these gaps by conducting the first rigorous analysis of the morphological capabilities of ChatGPT in four typologically varied languages (specifically, English, German, Tamil, and Turkish). We apply a version of Berko's (1958) wug test to ChatGPT, using novel, uncontaminated datasets for the four examined languages. We find that ChatGPT massively underperforms purpose-built systems, particularly in English. Overall, our results -- through the lens of morphology -- cast a new light on the linguistic capabilities of ChatGPT, suggesting that claims of human-like language skills are premature and misleading.

Leonie Weissweiler, Taiqi He, Naoki Otani, David R. Mortensen, Lori Levin, Hinrich Schütze (2023). Construction Grammar Provides Unique Insight into Neural Language Models. Proceedings of the First International Workshop on Construction Grammars and NLP (CxGs+NLP, GURT/SyntaxFest 2023), pages 85–95, Washington, D.C. Association for Computational Linguistics.

Construction Grammar (CxG) has recently been used as the basis for probing studies that have investigated the performance of large pretrained language models (PLMs) with respect to the structure and meaning of constructions. In this position paper, we make suggestions for the continuation and augmentation of this line of research. We look at probing methodology that was not designed with CxG in mind, as well as probing methodology that was designed for specific constructions. We analyse selected previous work in detail, and provide our view of the most important challenges and research questions that this promising new field faces.

Leonie Weissweiler, Valentin Hofmann, Abdullatif Köksal, Hinrich Schütze (2022). The Better Your Syntax, the Better Your Semantics? Probing Pretrained Language Models for the English Comparative Correlative. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 10859–10882, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics. (EMNLP)
Source code is available on GitHub.

Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasising the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of language, i.e., linguistic units of different granularity that combine syntax and semantics. As a first step towards assessing the compatibility of CxG with the syntactic and semantic knowledge demonstrated by state-of-the-art pretrained language models (PLMs), we present an investigation of their capability to classify and understand one of the most commonly studied constructions, the English comparative correlative (CC). We conduct experiments examining the classification accuracy of a syntactic probe on the one hand and the models' behaviour in a semantic application task on the other, with BERT, RoBERTa, and DeBERTa as the example PLMs. Our results show that all three investigated PLMs are able to recognise the structure of the CC but fail to use its meaning. While human-like performance of PLMs on many NLP tasks has been alleged, this indicates that PLMs still suffer from substantial shortcomings in central domains of linguistic knowledge.

Talks

Computational Approaches to Construction Grammar, dissertation defence talk at LMU Munich, 03.07.2024.

Finding the Limits of LLMs with Constructions, talk given at NYU, Boston University, MIT, UPenn, CMU, and Cornell.

Testing the Limits of LLMs with Construction Grammar, talk given at Bielefeld University.

Everything is a Construction: New Goals for Syntactic and Semantic Probing, talk given at ETH Zürich, University of Vienna, and Bar Ilan University.

Construction Grammar Provides Unique Insight into Neural Language Models, talk given at the CxGs+NLP Workshop at GURT 2023.

The Better Your Syntax, the Better Your Semantics? Probing Pretrained Language Models for the English Comparative Correlative, video recorded for EMNLP 2022.

The Past, Present, and Future of NLP from a Linguistic Perspective, talk given at the 8th International ScaDS Summer School 2022 and MunichNLP.

CaMEL: Case Marker Extraction without Labels, talk given at ACL 2022.
