Professor of Natural Language Processing

The Hebrew University of Jerusalem

Schwartz lab

Roy Schwartz's lab at the School of Computer Science and Engineering at The Hebrew University of Jerusalem studies Natural Language Processing (NLP). Our research aims to make text understanding technology widely accessible: to doctors, to teachers, to researchers, or even to curious teenagers. To be broadly adopted, NLP technology needs to be not only accurate but also reliable; models should provide explanations for their outputs, and the methods we use to evaluate them need to be convincing.
Our lab also studies methods to make NLP technology more efficient and green, both to decrease the environmental impact of the field and to lower the cost of AI research, thereby broadening participation in it.

Lab News

TWIST paper accepted to NeurIPS 2023!

Congrats to Michael!

WHOOPS! paper accepted to ICCV!

Congrats to Yonatan!

Congrats to Yonatan on submitting his PhD thesis and to Netta on successfully defending her Master’s thesis!

An awesome lab event in Nahal Halilim!

Thank you to all attendees of the Efficient NLP workshop in Dagstuhl!

See workshop report for more details!

Our efficient NLP policy has been officially adopted by the ACL exec!

See document link for more details.


Biases in Datasets

We analyze the datasets on which NLP models are trained. Looking carefully into these datasets, we uncover limitations and biases in the data collection process as well as the evaluation process. Our findings indicate that the recent success of neural models on many NLP tasks has been overestimated, and pave the way for the development of more reliable methods of evaluation.

Green NLP

The computations required for deep learning research have been doubling every few months. These computations have a surprisingly large carbon footprint. Moreover, the financial cost of the computations can make it difficult for academics, students, and researchers, in particular those from emerging economies, to engage in deep learning research. Our lab studies tools to make NLP technology more efficient, and to enhance the reporting of computational budgets.


Humans learn about the world using input from multiple modalities. Machines can also leverage other modalities to improve their textual understanding. Our lab studies methods for combining textual information with data from images, sounds, videos, and other modalities, with the goal of making NLP models more robust and allowing them to generalize better.

Understanding NLP

In recent years, deep learning has become the leading machine learning technology in NLP. Despite this wide adoption, the theory of deep learning lags behind its empirical success, and many engineered systems are in commercial use without a solid scientific basis for their operation. Our research aims to bridge the gap between theory and practice. We devise mathematical theories that link deep neural models to classical NLP models, such as weighted finite-state automata.

Recent Publications


Textually Pretrained Speech Language Models

Speech language models (SpeechLMs) process and generate acoustic data only, without textual supervision. In this work, we propose …

Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images

Weird, unusual, and uncanny images pique the curiosity of observers because they challenge commonsense. For example, an image released …

Read, Look or Listen? What’s Needed for Solving a Multimodal Dataset

The prevalence of large-scale multimodal datasets presents unique challenges in assessing dataset quality. We propose a two-step method …

Surveying (Dis)Parities and Concerns of Compute Hungry NLP Research

Many recent improvements in NLP stem from the development and use of large pre-trained language models (PLMs) with billions of …

Finding the SWEET Spot: Analysis and Improvement of Adaptive Inference in Low Resource Settings

Adaptive inference is a simple method for reducing inference costs. The method works by maintaining multiple classifiers of different …


  • School of Computer Science and Engineering, Edmond Safra Campus, Givat Ram, The Hebrew University, Jerusalem, 9190401
  • Rothberg Building C, Room C503