A collection of academic works in natural language processing, computational linguistics, and computational cognitive science that study the ability of models/agents to perform tasks involving novel concepts/words.
Maintainers: Kanishka Misra and Najoung Kim
The acquisition and integration of words (and their representations) that a model/agent has never seen have long been a challenge in natural language processing and cognitive science. In many cases, novel words have even offered a way to add experimental controls to model evaluation. Here, we gather academic publications and preprints whose methods and analyses diagnose, evaluate, or incorporate novel words and their integration. We hope that this list helps folks like us who are interested in this topic!
This is a growing resource, and we hope to start organizing it in a meaningful way soon. Suggestions and contributions are welcome (through PRs)!
NOTE: The papers below are currently unsorted -- we will organize them properly (into categories or chronologically, perhaps) once we reach sufficient mass.
Papers that exclusively focus on how novel words can be/are integrated and used by computational models
- Testing for Grammatical Category Abstraction in Neural Language Models. Najoung Kim and Paul Smolensky. SCiL 2021.
- NYTWIT: A Dataset of Novel Words in the New York Times. Yuval Pinter, Cassandra L. Jacobs, and Max Bittker. COLING 2020.
- Investigating Novel Verb Learning in BERT: Selectional Preference Classes and Alternation-Based Syntactic Generalization. Tristan Thrush, Ethan Wilcox, and Roger Levy. BlackboxNLP 2020.
- Learning semantic representations for novel words: Leveraging both form and context. Timo Schick and Hinrich Schütze. AAAI 2019.
- A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors. Mikhail Khodak, Nikunj Saunshi, Yingyu Liang, Tengyu Ma, Brandon Stewart, and Sanjeev Arora. ACL 2018.
- One-shot and few-shot learning of word embeddings. Andrew Lampinen and James McClelland. arXiv, 2018.
- High-risk learning: acquiring new word vectors from tiny data. Aurelie Herbelot and Marco Baroni. EMNLP 2017.
- Multimodal word meaning induction from minimal exposure to natural text. Angeliki Lazaridou, Marco Marelli, and Marco Baroni. Cognitive Science, 2017.
- A Computational Cognitive Model of Novel Word Generalization. Aida Nematzadeh, Erin Grant, and Suzanne Stevenson. EMNLP 2015.
- WinoDict: Probing language models for in-context word acquisition. Julian Martin Eisenschlos, Jeremy R. Cole, Fangyu Liu, and William W. Cohen. arXiv, 2022.
- COMPS: Conceptual Minimal Pair Sentences for testing Robust Property Knowledge and its Inheritance in Pre-trained Language Models. Kanishka Misra, Julia Taylor Rayz, and Allyson Ettinger. EACL 2023.
- This is a BERT. Now there are several of them. Can they generalize to novel words? Coleman Haley. BlackboxNLP 2020.
- Counting the Bugs in ChatGPT's Wugs: A Multilingual Investigation into the Morphological Capabilities of a Large Language Model. Leonie Weissweiler, Valentin Hofmann, Anjali Kantharuban, Anna Cai, Ritam Dutt, Amey Hengle, Anubha Kabra, Atharva Kulkarni, Abhishek Vijayakumar, Haofei Yu, Hinrich Schütze, Kemal Oflazer, and David R. Mortensen. EMNLP 2023.
- COGS: A Compositional Generalization Challenge Based on Semantic Interpretation. Najoung Kim and Tal Linzen. EMNLP 2020.
- Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks. Brenden Lake and Marco Baroni. ICML 2018.
- Uncontrolled Lexical Exposure Leads to Overestimation of Compositional Generalization in Pretrained Models. Najoung Kim, Tal Linzen, and Paul Smolensky. 2022.
- Do Syntactic Probes Probe Syntax? Experiments with Jabberwocky Probing. Rowan Hall Maudslay and Ryan Cotterell. NAACL 2021.
- Jabberwocky Parsing: Dependency Parsing with Lexical Noise. Jungo Kasai and Robert Frank. SCiL 2019.
- Mutual exclusivity as a challenge for deep neural networks. Kanishk Gandhi and Brenden Lake. NeurIPS 2020.
- A Property Induction Framework for Neural Language Models. Kanishka Misra, Julia Taylor Rayz, and Allyson Ettinger. CogSci 2022.
- Do language models learn typicality judgments from text? Kanishka Misra, Allyson Ettinger, and Julia Taylor Rayz. CogSci 2021.
- Inductive reasoning about chimeric creatures. Charles Kemp. NeurIPS 2011.
- When More Data Hurts: A Troubling Quirk in Developing Broad-Coverage Natural Language Understanding Systems. Elias Stengel-Eskin, Emmanouil Antonios Platanios, Adam Pauls, Sam Thomson, Hao Fang, Benjamin Van Durme, Jason Eisner, and Yu Su. arXiv, 2022.
- The driving forces of polarity-sensitivity: Experiments with multilingual pre-trained neural language models. Lisa Bylinina and Alexey Tikhonov. CogSci 2022.
- Old BERT, New Tricks: Artificial Language Learning for Pre-Trained Language Models. Lisa Bylinina, Alexey Tikhonov, and Ekaterina Garmash. arXiv, 2021.
- Do Language Models Learn Position-Role Mappings? Jackson Petty, Michael Wilson, and Robert Frank. BUCLD 46, 2022.
- Deep daxes: Mutual exclusivity arises through both learning biases and pragmatic strategies in neural networks. Kristina Gulordava, Thomas Brochhagen, and Gemma Boleda. CogSci 2020.
- Towards Understanding How Machines Can Learn Causal Overhypotheses. Eliza Kosoy, David M. Chan, Adrian Liu, Jasmine Collins, Bryanna Kaufmann, Sandy Han Huang, Jessica B. Hamrick, John Canny, Nan Rosemary Ke, and Alison Gopnik. CogSci 2023 (forthcoming).
- Investigating grammatical abstraction in language models using few-shot learning of novel noun gender. Priyanka Sukumaran, Conor Houghton, and Nina Kazanina. Findings of EACL 2024.