Talk - Pragmatic Code Autocomplete - Gabriel Poesia

Friday, September 18, 2020 - 14:00
Gabriel Poesia, PhD student at Stanford, working with Noah Goodman
Date/Time: Friday, September 18th, 2pm (Brasília Time)


Human language is ambiguous, with intended meanings recovered via pragmatic reasoning in context. Such reliance on context is essential for the efficiency of human communication. Programming languages, in stark contrast, are defined by unambiguous grammars. In this work, we aim to make programming languages more concise by allowing programmers to use a controlled level of ambiguity. Specifically, we allow single-character abbreviations for common keywords and identifiers. Our system first proposes a set of strings that can be abbreviated by the user. Using only 100 abbreviations, we observe that a large dataset of Python code can be compressed by 15%, a number that can be improved even further by specializing the abbreviations to a particular code base. We then use a contextualized sequence-to-sequence model to rank potential expansions of inputs that include abbreviations. In an offline reconstruction task, our model achieves accuracies ranging from 92% to 99%, depending on the programming language and user settings. The model is small enough to run on a commodity CPU in real time, and can be fine-tuned to a new code base in a few CPU hours. We evaluate the usability of our system in a user study, integrating it into Microsoft VSCode, a popular code editor. We observe that our system performs well and is complementary to traditional autocomplete features.
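
To make the idea in the abstract concrete, here is a minimal Python sketch of abbreviating frequent tokens to their first character and then ranking candidate expansions in context. It is not the system presented in the talk: a toy bigram count model stands in for the contextualized sequence-to-sequence model, tokens are split on whitespace, and the names CORPUS, START, build_table, build_bigrams, abbreviate and rank_expansions are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import product

START = "<s>"  # hypothetical start-of-line marker, not part of the original work

# Tiny stand-in corpus; a real system would learn from a large code dataset.
CORPUS = [
    "for i in range ( n ) :",
    "for x in items :",
    "return x",
    "if x in items :",
    "import os",
]

def build_table(lines, max_entries=100):
    """Map a first character to the frequent multi-character tokens it may abbreviate."""
    counts = Counter(tok for line in lines for tok in line.split())
    table = {}
    for token, _ in counts.most_common(max_entries):
        if len(token) > 1:
            table.setdefault(token[0], []).append(token)
    return table

def build_bigrams(lines):
    """Count adjacent token pairs; this toy model replaces the seq2seq ranker."""
    bigrams = Counter()
    for line in lines:
        tokens = [START] + line.split()
        bigrams.update(zip(tokens, tokens[1:]))
    return bigrams

def abbreviate(line, table):
    """Replace every abbreviable token with its first character."""
    abbreviable = {tok for toks in table.values() for tok in toks}
    return " ".join(tok[0] if tok in abbreviable else tok for tok in line.split())

def rank_expansions(abbreviated, table, bigrams, top_k=3):
    """Enumerate all expansions and sort them by smoothed bigram score."""
    options = [
        [tok] + (table.get(tok, []) if len(tok) == 1 else [])
        for tok in abbreviated.split()
    ]
    scored = []
    for combo in product(*options):
        seq = (START,) + combo
        score = 1.0
        for pair in zip(seq, seq[1:]):
            score *= bigrams[pair] + 0.1  # additive smoothing for unseen pairs
        scored.append((score, " ".join(combo)))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [text for _, text in scored[:top_k]]

table = build_table(CORPUS)
bigrams = build_bigrams(CORPUS)
short = abbreviate("for x in items :", table)
best = rank_expansions(short, table, bigrams)[0]
```

In this sketch the character "i" is ambiguous between "in", "items", "if", "import" and the literal variable "i"; the surrounding tokens are what allow the ranker to choose, which is the role context plays in the abstract. Brute-force enumeration is used only because the example is tiny.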

Short bio:
Gabriel Poesia is a Computer Science PhD student at Stanford University. Prior to that, Gabriel co-founded a programming education startup, Loopye, in Brazil, after finishing his M.Sc. and B.Sc. in Computer Science at UFMG.
Professor involved:
Fernando Magno Quintão Pereira