About

My name is Simeng Sun. I am currently a research scientist at NVIDIA.
I was advised by Mohit Iyyer during my Ph.D. studies at UMass Amherst and by Ani Nenkova during my master's studies at UPenn.
(CV) (Google Scholar) (Semantic Scholar) (X) (LinkedIn) (GitHub)

Research

Publications

How much do contextualized representations encode long-range context?
Simeng Sun, Cheng-Ping Hsieh
NAACL Findings, long, 2025

nGPT: Normalized Transformer with Representation Learning on the Hypersphere
Ilya Loshchilov, Cheng-Ping Hsieh, Simeng Sun, Boris Ginsburg
ICLR 2025

RULER: What's the real context size of your long-context language models?
Cheng-Ping Hsieh*, Simeng Sun*, Samuel Kriman, Shantanu Acharya, Dima Rekesh, Fei Jia, Yang Zhang, Boris Ginsburg
COLM, 2024

TopicGPT: A Prompt-based Topic Modeling Framework
Chau Minh Pham, Alexander Hoyle, Simeng Sun, Philip Resnik, Mohit Iyyer
NAACL, long, 2024

PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents
Simeng Sun, Yang Liu, Shuohang Wang, Dan Iter, Chenguang Zhu, Mohit Iyyer
Conference of the European Chapter of the Association for Computational Linguistics (EACL), long, 2024

How Does In-Context Learning Help Prompt Tuning?
Simeng Sun, Yang Liu, Dan Iter, Chenguang Zhu, Mohit Iyyer
Conference of the European Chapter of the Association for Computational Linguistics (EACL) findings, short, 2024

Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages
Simeng Sun, Maha Elbayad, Anna Sun, James Cross
Conference of the European Chapter of the Association for Computational Linguistics (EACL), long, 2023

ChapterBreak: A Challenge Dataset for Long-Range Language Models
Simeng Sun, Katherine Thai, Mohit Iyyer
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), short, 2022

How Much Do Modifications to Transformer Language Models Affect Their Ability to Learn Linguistic Knowledge?
Simeng Sun, Brian Dillon, Mohit Iyyer
Workshop on Insights from Negative Results in NLP @ ACL 2022

Alternative Input Signals Ease Transfer in Multilingual Machine Translation
Simeng Sun, Angela Fan, James Cross, Vishrav Chaudhary, Chau Tran, Philipp Koehn, Francisco Guzman
Annual Meeting of the Association for Computational Linguistics (ACL), long, 2022

Do Long-Range Language Models Actually Use Long-Range Context?
Simeng Sun, Kalpesh Krishna, Andrew Mattarella-Micke, Mohit Iyyer
Empirical Methods in Natural Language Processing (EMNLP), long, 2021

IGA: An Intent-Guided Authoring Assistant
Simeng Sun, Wenlong Zhao, Varun Manjunatha, Rajiv Jain, Vlad Morariu, Franck Dernoncourt, Balaji Vasan Srinivasan, Mohit Iyyer
Empirical Methods in Natural Language Processing (EMNLP), long, 2021

Energy-Based Reranking: Improving Neural Machine Translation Using Energy-Based Models
Sumanta Bhattacharyya, Pedram Rooshenas, Subhajit Naskar, Simeng Sun, Mohit Iyyer, Andrew McCallum
Annual Meeting of the Association for Computational Linguistics (ACL), long, 2021

Revisiting Simple Neural Probabilistic Language Models
Simeng Sun, Mohit Iyyer
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), short, 2021

Hard-Coded Gaussian Attention for Neural Machine Translation
Weiqiu You*, Simeng Sun*, Mohit Iyyer
Annual Meeting of the Association for Computational Linguistics (ACL), long, 2020

The Feasibility of Embedding Based Automatic Evaluation for Single Document Summarization
Simeng Sun, Ani Nenkova
Empirical Methods in Natural Language Processing (EMNLP), short, 2019

How to Compare Summarizers without Target Length? Pitfalls, Solutions and Re-Examination of the Neural Summarization Literature
Simeng Sun, Ori Shapira, Ido Dagan, Ani Nenkova
North American Chapter of the Association for Computational Linguistics (NAACL-HLT), NeuralGen Workshop, 2019

Name Disambiguation for Chinese Scientific Authors with Multi-Level Clustering
Simeng Sun, Hui Zhang, Ning Li, Yong Chen
IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC), 2017

Manuscripts

Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF
Simeng Sun, Dhawal Gupta, Mohit Iyyer
2023 Sep

Invited Talks

Towards Effective Modeling of Long-range Context
@ University of Pittsburgh. 2023 Nov.
Hosted by Xiang Lorraine Li.

Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages
@ University of Toronto. 2023 Aug.
Hosted by Annie Lee.

Do Long-Range Language Models Actually Use Long-Range Context?
@ Google Research N2Formal Reading Group. 2022 Aug.
Hosted by Markus Rabe.

Education

Ph.D. student in Computer Science, UMass Amherst, Aug 2019 - Jan 2024
M.S.E. in Computer and Information Sciences, UPenn, Aug 2017 - May 2019
B.E. in Computer Science and Technology, Beihang University, Sep 2013 - Jun 2017
Exchange student, Trinity College Dublin, Sep 2015 - Jan 2016

Work Experience

Research Scientist @ NVIDIA, Feb 2024 - present
Student Researcher @ Microsoft Cognitive Service Research, Dec 2022 - Aug 2023
Research Scientist Intern @ Meta AI, May - Aug 2022
Research Scientist Intern @ Facebook AI Research, May - Aug 2021
Research Intern @ Adobe Research, May - Aug 2020
NLP Intern @ Educational Testing Service, Jun - Aug 2018

Contact

firstname + lastname initial @ nvidia.com