Language models (LMs) such as GPT and Claude have shown impressive abilities across a range of natural language processing (NLP) tasks. Among these, code understanding and generation have quickly become some of the most popular applications of LMs, given that code takes the form of executable logic. However, there is still a lack of practical understanding of how programming knowledge can be combined with natural language to automate software development. Moreover, recent studies empirically demonstrate that code can be a better medium for complex reasoning and agentic task automation, but they rarely explain why code confers this advantage.
In this tutorial, we refer to the superior capabilities brought by code modeling as Code Intelligence, and aim to provide a coherent overview of recent advances on this topic. We will begin with preliminaries on training foundation models on code and common practices for doing so. We will then focus on downstream tasks in the code domain and their evaluation. Finally, we will cover how code can contribute to advances on general tasks, and discuss opportunities for future research on Code Intelligence.
Our tutorial will be held on Saturday, Nov 8 (all times are in Beijing Time, UTC+8). The schedule is subject to updates.
Time | Section | Presenter |
---|---|---|
9:00—9:20 | Introduction | |
9:20—9:50 | Training Code LMs | |
9:50—10:20 | Synthetic Data for Code | |
10:20—10:35 | Q & A Session & Coffee Break | |
10:35—11:15 | Evaluating Code LMs | |
11:15—11:45 | Bridging between Code and Natural Language | |
11:45—12:15 | Conclusion and Future Trends | |
12:15—12:30 | Q & A Session | |
Papers will be added here later.
@article{code-lm-tutorial,
  author  = {Zhuo, Terry Yue and Liu, Qian and Wang, Zijian and Ahmad, Wasi Uddin and Hui, Binyuan and Ben Allal, Loubna},
  title   = {NLP+Code: Code Intelligence in Language Models},
  journal = {EMNLP 2025},
  year    = {2025},
}