[Event/Seminar] Seminar Announcement: The Search for Meaning in Large Language Models (Mar. 14, 2024, 10:00)
- Department of Artificial Intelligence Convergence (Graduate School)
- 2024-03-11
Hello, this is the office of the Department of Artificial Intelligence Convergence (Graduate School).
A seminar will be held as detailed below; we encourage all interested students to attend.
Title: The Search for Meaning in Large Language Models: An Approach Using the Formal Semantics of Code
Speaker: Charles Jin @ MIT
Time: 10:00 ~ 11:00, March 14, 2024
Location: Online
https://us02web.zoom.us/j/81818522150?pwd=ZDhVcml4T2ViMjA1Zy9jb0tDbmllUT09
Language: English (talk and slides)
Abstract:
As large language models exhibit increasingly sophisticated behaviors, it remains hotly debated whether current (or near-future) language models capture any sort of meaning. The answer to this question has significant ramifications for the development of current and future applications built on the capabilities of LLMs; a positive resolution would also settle an open philosophical debate on the acquisition of meaning from form, a question of central importance in the grand challenge of machine intelligence.
In this talk, we lay the groundwork for using programming languages to investigate the ability of LLMs to acquire, represent, and reason about the semantics of the languages with which they work. Programming languages are highly expressive; yet unlike natural languages, they have precise, formally defined semantics. Empirically, we present evidence that language models (LMs) of code can learn to represent the formal semantics of programs, despite being trained only to perform next-token prediction. Specifically, we train a Transformer model on a synthetic corpus of programs written in a domain-specific language for navigating 2D grid world environments. Each program in the corpus is preceded by a (partial) specification in the form of several input-output examples. Despite providing no further inductive biases, we find that a probing classifier is able to extract increasingly accurate representations of program state from the LM hidden states over the course of training, suggesting the LM acquires an emergent ability to interpret programs in the formal sense. We conclude by presenting a novel structural causal framework for interpreting what LLMs are capable of learning from their training data, which unifies our explorations of code LLMs and formal semantics with more general notions of learning and understanding.
Based on joint work with Martin Rinard.
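For readers unfamiliar with the probing methodology described in the abstract, the sketch below illustrates the general idea in PyTorch. It is not the speaker's code: the toy DSL, the model, and all names (TinyLM, interpret, probe_loss) are hypothetical assumptions, and the grid world is reduced to a single facing direction. It only shows the shape of the experiment: a causal LM over DSL tokens is frozen, and a linear probe is fit from its hidden states to the interpreter's formal program state.

```python
# Minimal, hypothetical sketch of a semantic probe (not the speaker's code).
import torch
import torch.nn as nn

VOCAB = ["<pad>", "left", "right", "move"]   # toy grid-world DSL tokens
TOK = {t: i for i, t in enumerate(VOCAB)}
DIRS = 4                                     # facing directions: N/E/S/W

def interpret(program):
    """Formal semantics of the toy DSL: the agent's facing after each token."""
    facing, states = 0, []
    for op in program:
        if op == "left":
            facing = (facing - 1) % DIRS
        elif op == "right":
            facing = (facing + 1) % DIRS
        states.append(facing)                # "move" leaves facing unchanged
    return states

class TinyLM(nn.Module):
    """Causal Transformer trained only for next-token prediction."""
    def __init__(self, d=64):
        super().__init__()
        self.emb = nn.Embedding(len(VOCAB), d)
        layer = nn.TransformerEncoderLayer(d, nhead=4, dim_feedforward=128,
                                           batch_first=True)
        self.body = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, len(VOCAB))

    def forward(self, ids):
        h = self.emb(ids)
        mask = nn.Transformer.generate_square_subsequent_mask(ids.size(1))
        h = self.body(h, mask=mask)          # per-token hidden states to probe
        return self.head(h), h

probe = nn.Linear(64, DIRS)                  # linear probe: hidden -> state

def probe_loss(lm, programs):
    """Fit the probe on *frozen* LM states; rising probe accuracy over the
    course of LM training suggests the LM encodes program semantics."""
    ids = torch.tensor([[TOK[t] for t in p] for p in programs])
    with torch.no_grad():                    # only the probe gets gradients
        _, hidden = lm(ids)
    target = torch.tensor([interpret(p) for p in programs])
    return nn.functional.cross_entropy(probe(hidden).reshape(-1, DIRS),
                                       target.reshape(-1))

# Toy usage (equal-length programs so they batch into one tensor):
lm = TinyLM()
loss = probe_loss(lm, [["left", "move", "right"], ["right", "right", "move"]])
```

If the probe's accuracy rises as LM training progresses, as the abstract reports, that is evidence the hidden states encode the programs' formal semantics rather than surface token statistics alone.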
Bio:
Charles Jin is a fifth-year PhD student at MIT, advised by Martin Rinard. His primary research interests lie at the intersection of programming languages (compilers and program synthesis) and machine learning (data efficiency, robustness, and large language models). Most recently, his work has explored the extent to which large language models are capable of reasoning about and understanding code.
Thank you.
Office of the Department of Artificial Intelligence Convergence, Graduate School