EMNLP 2025 Tutorial
NLP+Code: Code Intelligence in Language Models

1Monash University, 2CSIRO's Data61, 3ByteDance
4Meta, 5NVIDIA, 6Alibaba Group, 7Hugging Face
code-lm@googlegroups.com

Saturday, Nov 8, 9:00 - 12:30 @ Suzhou International Expo Centre
Visit this link for the Zoom recording of the tutorial

About this tutorial

Language models (LMs) such as GPT and Claude have shown impressive abilities across a range of natural language processing (NLP) tasks. Among these, code understanding and generation have quickly become one of the most popular applications of LMs, given that code provides executable logic forms. However, there is still limited practical understanding of how programming knowledge can be combined with natural language to automate software development. Moreover, while recent studies empirically demonstrate that code can be a better medium for complex reasoning and agentic task automation, they do not fully explain why these gains arise.

In this tutorial, we refer to these superior capabilities brought by code modeling as Code Intelligence, and aim to provide a coherent overview of recent advances in this topic. We will begin with preliminaries on training foundation models on code and common practices in doing so. We will then focus on downstream tasks in the code domain and their evaluation. Finally, we will cover how code can contribute to advances on general tasks, and discuss opportunities for future research on Code Intelligence.

Schedule

Our tutorial will be held on Saturday, Nov 8 (all times are in Beijing Time, UTC+8). The schedule may be subject to updates.

Time         Section
9:00–9:20    Introduction
9:20–9:50    Training Code LMs
9:50–10:20   Synthetic Data for Code
10:20–10:35  Q & A Session & Coffee Break
10:35–11:15  Evaluating Code LMs
11:15–11:45  Bridging between Code and Natural Language
11:45–12:15  Conclusion and Future Trends
12:15–12:30  Q & A Session

Reading List

Papers will be added here later.


Training Code LMs


Synthetic Data for Code


Evaluating Code LMs


Bridging between Code and Natural Language


Conclusion and Future Trends

BibTeX

@inproceedings{code-lm-tutorial,
  author    = {Zhuo, Terry Yue and Liu, Qian and Wang, Zijian and Ahmad, Wasi Uddin and Hui, Binyuan and Ben Allal, Loubna},
  title     = {NLP+Code: Code Intelligence in Language Models},
  booktitle = {EMNLP 2025},
  year      = {2025},
}