ACL 2023 Tutorial:
Retrieval-based Language Models and Applications

1University of Washington, 2Princeton University

Sunday July 9 14:00 - 17:30 (EDT) @ Metropolitan West
Visit this link for the Zoom recording of the tutorial

About this tutorial

Language models (LMs) such as GPT-3 and PaLM have shown impressive abilities in a range of natural language processing (NLP) tasks. However, relying solely on their parameters to encode a wealth of world knowledge requires a prohibitively large number of parameters and hence massive compute, and they often struggle to learn long-tail knowledge. Moreover, these parametric LMs are fundamentally incapable of adapting over time, often hallucinate, and may leak private data from the training corpus. To overcome these limitations, there has been growing interest in retrieval-based LMs, which incorporate a non-parametric datastore (e.g., text chunks from an external corpus) alongside their parametric components. Retrieval-based LMs can outperform LMs without retrieval by a large margin with far fewer parameters, can update their knowledge by replacing their retrieval corpora, and can provide citations so that users can easily verify and evaluate their predictions.

In this tutorial, we aim to provide a comprehensive and coherent overview of recent advances in retrieval-based LMs. We will begin with preliminaries covering the foundations of LMs and retrieval systems, and then focus on recent progress in architectures, learning approaches, and applications of retrieval-based LMs.
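To make the core idea concrete, the following is a minimal, self-contained sketch of the retrieve-then-read pattern underlying many retrieval-based LMs. The datastore contents, the bag-of-words scorer (a stand-in for a learned dense or sparse retriever), and the prompt format are illustrative assumptions, not the API of any specific system covered in the tutorial.

```python
# Minimal retrieve-then-read sketch: score datastore chunks against the
# query, then prepend the top chunk(s) to the LM input (in-context
# retrieval augmentation). All names here are hypothetical.
from collections import Counter
import math

DATASTORE = [
    "The capital of France is Paris.",
    "GPT-3 is a large language model developed by OpenAI.",
    "Retrieval-based LMs query an external corpus at inference time.",
]

def embed(text):
    """Bag-of-words term counts, standing in for a trained encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    """Return the k datastore chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(DATASTORE, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, k=1):
    """Concatenate retrieved chunks with the query for a parametric LM."""
    context = "\n".join(retrieve(query, k))
    return f"{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("What is the capital of France?"))
```

In real systems the scorer is a learned retriever (e.g., a dense dual encoder) over millions of chunks, and retrieved text can be injected at the input, at intermediate layers, or at the output distribution; the tutorial's architecture section surveys these choices.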


Our tutorial will be held on July 9 (all times are in EDT, Toronto local time). Slides may be subject to updates.

Time Section Presenter
14:00—14:15 Section 1: Introduction [Slides] Danqi
14:15—14:25 Section 2: Definition & Preliminaries [Slides] Sewon
14:25—15:00 Section 3: Retrieval-based LMs: Architecture [Slides] Sewon
15:00—15:25 Section 4: Retrieval-based LMs: Training [Slides] Zexuan
15:25—15:30 Q & A Session I
15:30—16:00 Coffee break
16:00—16:25 Section 4 (Cont’d): Retrieval-based LMs: Training [Slides] Zexuan
16:25—17:00 Section 5: Retrieval-based LMs: Applications [Slides] Akari
17:00—17:10 Section 6: Extension: Multilingual & Multimodal [Slides] Akari
17:10—17:20 Section 7: Challenges & Opportunities [Slides] [References] Danqi
17:20—17:30 Q & A Session II

Reading List

Bold papers are discussed in detail during our tutorial.

Section 3: Architecture

Section 4: Training

Section 5: Application

Section 6: Extension

Section 7: Challenges & Opportunities


@article{retrieval-lm-tutorial,
  author  = {Asai, Akari and Min, Sewon and Zhong, Zexuan and Chen, Danqi},
  title   = {ACL 2023 Tutorial: Retrieval-based Language Models and Applications},
  journal = {ACL 2023},
  year    = {2023},
}