[Event/Seminar] Seminar Announcement: The Search for Meaning in Large Language Models (Mar. 14, 2024, 10:00)
- Department of Artificial Intelligence Convergence (Graduate School)
- 2024-03-11
Hello, this is the office of the Department of Artificial Intelligence Convergence (Graduate School).
A seminar will be held as detailed below; we encourage all interested students to attend.
Title: The Search for Meaning in Large Language Models: An Approach Using the Formal Semantics of Code
Speaker: Charles Jin @ MIT
Time: 10:00 ~ 11:00, March 14, 2024
Location: Online
https://us02web.zoom.us/j/81818522150?pwd=ZDhVcml4T2ViMjA1Zy9jb0tDbmllUT09
Language: English (talk and slides)
Abstract:
As large language models exhibit increasingly sophisticated behaviors, it remains hotly debated whether current (or near-future) language models capture any sort of meaning. The answer to this question has significant ramifications for the development of current and future applications built on the capabilities of LLMs; a positive resolution would also settle an open philosophical debate on the acquisition of meaning from form, a question of central importance in the grand challenge of machine intelligence.
In this talk, we lay the groundwork for using programming languages to investigate the ability of LLMs to acquire, represent, and reason about the semantics of the languages with which they work. Programming languages are highly expressive; yet unlike natural languages, they have precise, formally defined semantics. Empirically, we present evidence that language models (LMs) of code can learn to represent the formal semantics of programs, despite being trained only to perform next-token prediction. Specifically, we train a Transformer model on a synthetic corpus of programs written in a domain-specific language for navigating 2D grid world environments. Each program in the corpus is preceded by a (partial) specification in the form of several input-output examples. Despite providing no further inductive biases, we find that a probing classifier is able to extract increasingly accurate representations of program state from the LM hidden states over the course of training, suggesting the LM acquires an emergent ability to interpret programs in the formal sense. We conclude by presenting a novel structural causal framework for interpreting what LLMs are capable of learning from their training data, which unifies our explorations of code LLMs and formal semantics with more general notions of learning and understanding.
Based on joint work with Martin Rinard.
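For readers unfamiliar with the probing methodology described in the abstract, the sketch below illustrates the general idea in PyTorch. It is not the speaker's code: the toy DSL, the model, and all names (TinyLM, interpret, probe_loss) are hypothetical assumptions, and the grid world is reduced to a single facing direction. It only shows the shape of the experiment: a causal LM over DSL tokens is frozen, and a linear probe is fit from its hidden states to the interpreter's formal program state.

```python
# Minimal, hypothetical sketch of a semantic probe (not the speaker's code).
import torch
import torch.nn as nn

VOCAB = ["<pad>", "left", "right", "move"]   # toy grid-world DSL tokens
TOK = {t: i for i, t in enumerate(VOCAB)}
DIRS = 4                                     # facing directions: N/E/S/W

def interpret(program):
    """Formal semantics of the toy DSL: the agent's facing after each token."""
    facing, states = 0, []
    for op in program:
        if op == "left":
            facing = (facing - 1) % DIRS
        elif op == "right":
            facing = (facing + 1) % DIRS
        states.append(facing)                # "move" leaves facing unchanged
    return states

class TinyLM(nn.Module):
    """Causal Transformer trained only for next-token prediction."""
    def __init__(self, d=64):
        super().__init__()
        self.emb = nn.Embedding(len(VOCAB), d)
        layer = nn.TransformerEncoderLayer(d, nhead=4, dim_feedforward=128,
                                           batch_first=True)
        self.body = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, len(VOCAB))

    def forward(self, ids):
        h = self.emb(ids)
        mask = nn.Transformer.generate_square_subsequent_mask(ids.size(1))
        h = self.body(h, mask=mask)          # per-token hidden states to probe
        return self.head(h), h

probe = nn.Linear(64, DIRS)                  # linear probe: hidden -> state

def probe_loss(lm, programs):
    """Fit the probe on *frozen* LM states; rising probe accuracy over the
    course of LM training suggests the LM encodes program semantics."""
    ids = torch.tensor([[TOK[t] for t in p] for p in programs])
    with torch.no_grad():                    # only the probe gets gradients
        _, hidden = lm(ids)
    target = torch.tensor([interpret(p) for p in programs])
    return nn.functional.cross_entropy(probe(hidden).reshape(-1, DIRS),
                                       target.reshape(-1))

# Toy usage (equal-length programs so they batch into one tensor):
lm = TinyLM()
loss = probe_loss(lm, [["left", "move", "right"], ["right", "right", "move"]])
```

If the probe's accuracy rises as LM training progresses, as the abstract reports, that is evidence the hidden states encode the programs' formal semantics rather than surface token statistics alone.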
Bio:
Charles Jin is a fifth-year PhD student at MIT, advised by Martin Rinard. His primary research interests lie at the intersection of programming languages (compilers and program synthesis) and machine learning (data efficiency, robustness, and large language models). Most recently, his work has explored the extent to which large language models are capable of reasoning about and understanding code.
Thank you.
Office of the Department of Artificial Intelligence Convergence, Graduate School