Tengyang Xie
I am an Assistant Professor of Computer Science at the University of Wisconsin-Madison. Before that, I was a Postdoctoral Researcher at Microsoft Research New England (and New York City). I received my Ph.D. in Computer Science from the University of Illinois Urbana-Champaign, where I was fortunate to work with Nan Jiang. I obtained my bachelor's degree in Physics from the University of Science and Technology of China. I have also spent time at the Simons Institute, Amazon AI, Microsoft Research, and Google Research.
Research Interests: I work on Reinforcement Learning / Machine Learning / Artificial Intelligence. The primary goal of my research is to explore the mathematical principles of, and design efficient algorithms for, artificial general intelligence (AGI). My current interests include: 1) emerging interactive learning paradigms with/for large language models (LLMs), 2) the mathematical principles of reinforcement learning (RL) and decision-making, and 3) the algorithmic and systems challenges of scaling up new modalities.
Publications
-
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization.
[PDF, arXiv]
Audrey Huang, Wenhao Zhan, Tengyang Xie, Jason D. Lee, Wen Sun, Akshay Krishnamurthy, Dylan J. Foster
International Conference on Learning Representations (ICLR) 2025
-
Offline Reinforcement Learning in Large State Spaces: Algorithms and Guarantees.
[PDF]
Nan Jiang, Tengyang Xie
(STS, invited submission under review)
-
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts.
[PDF, arXiv, Model, Blog]
Haoxiang Wang*, Wei Xiong*, Tengyang Xie, Han Zhao, Tong Zhang
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2024, Findings
-
Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models.
[PDF, arXiv]
Xiang Ji, Sanjeev Kulkarni, Mengdi Wang, Tengyang Xie
-
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF.
[PDF, arXiv, XPO Trainer in TRL]
Tengyang Xie*, Dylan J. Foster*, Akshay Krishnamurthy, Corby Rosset, Ahmed Awadallah, Alexander Rakhlin
International Conference on Learning Representations (ICLR) 2025
-
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences.
[PDF, arXiv]
Corby Rosset, Ching-An Cheng, Arindam Mitra, Michael Santacroce, Ahmed Awadallah, Tengyang Xie
Technical Report 2024
-
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data.
[PDF, arXiv, Website]
Fahim Tajwar*, Anikait Singh*, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar
International Conference on Machine Learning (ICML) 2024
-
CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples.
[PDF, arXiv, Website]
Jianrui Zhang*, Mu Cai*, Tengyang Xie, Yong Jae Lee
Annual Meeting of the Association for Computational Linguistics (ACL) 2024, Findings
-
Harnessing Density Ratios for Online Reinforcement Learning.
[PDF, arXiv]
Philip Amortila*, Dylan J. Foster*, Nan Jiang*, Ayush Sekhari*, Tengyang Xie*
International Conference on Learning Representations (ICLR) 2024 (Spotlight, top 5%)
-
Towards Principled Representation Learning from Videos for Reinforcement Learning.
[PDF, arXiv]
Dipendra Misra*, Akanksha Saran*, Tengyang Xie, Alex Lamb, John Langford
International Conference on Learning Representations (ICLR) 2024 (Spotlight, top 5%)
-
Adversarial Model for Offline Reinforcement Learning. [PDF, arXiv]
Mohak Bhardwaj*, Tengyang Xie*, Byron Boots, Nan Jiang, Ching-An Cheng
Conference on Neural Information Processing Systems (NeurIPS) 2023
-
The Role of Coverage in Online Reinforcement Learning. [PDF, arXiv]
Tengyang Xie*, Dylan J. Foster*, Yu Bai, Nan Jiang, Sham M. Kakade
International Conference on Learning Representations (ICLR) 2023 (Notable-top-5% / Oral, top 1.8%)
-
ARMOR: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data. [PDF, arXiv]
Tengyang Xie, Mohak Bhardwaj, Nan Jiang, Ching-An Cheng
Offline RL Workshop at NeurIPS 2022
-
Interaction-Grounded Learning with Action-Inclusive Feedback. [PDF, arXiv]
Tengyang Xie*, Akanksha Saran*, Dylan J. Foster, Lekan Molu, Ida Momennejad, Nan Jiang, Paul Mineiro, John Langford
Conference on Neural Information Processing Systems (NeurIPS) 2022
Complex Feedback in Online Learning Workshop at ICML 2022
-
Adversarially Trained Actor Critic for Offline Reinforcement Learning. [PDF, arXiv, code, MSR blog]
Ching-An Cheng*, Tengyang Xie*, Nan Jiang, Alekh Agarwal
International Conference on Machine Learning (ICML) 2022 (Outstanding Paper Runner-up Award, top 0.3%)
-
Bellman-consistent Pessimism for Offline Reinforcement Learning. [PDF, arXiv, slides]
Tengyang Xie, Ching-An Cheng, Nan Jiang, Paul Mineiro, Alekh Agarwal
Conference on Neural Information Processing Systems (NeurIPS) 2021 (Oral Presentation, top 0.6%)
-
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning. [PDF, arXiv]
Tengyang Xie, Nan Jiang, Huan Wang, Caiming Xiong, Yu Bai
Conference on Neural Information Processing Systems (NeurIPS) 2021
-
Interaction-Grounded Learning. [PDF, arXiv, add'l supplement]
Tengyang Xie, John Langford, Paul Mineiro, Ida Momennejad
International Conference on Machine Learning (ICML) 2021
-
Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency. [PDF, arXiv]
Masatoshi Uehara, Masaaki Imaizumi, Nan Jiang, Nathan Kallus, Wen Sun, Tengyang Xie
Submitted, 2021.
-
A Variant of the Wang-Foster-Kakade Lower Bound for the Discounted Setting. [PDF, arXiv]
Philip Amortila*, Nan Jiang*, Tengyang Xie*
-
Batch Value-function Approximation with Only Realizability. [PDF, arXiv]
Tengyang Xie, Nan Jiang
International Conference on Machine Learning (ICML) 2021
-
Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison. [PDF, arXiv]
Tengyang Xie, Nan Jiang
Conference on Uncertainty in Artificial Intelligence (UAI) 2020
-
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling. [PDF, Poster, arXiv]
Tengyang Xie, Yifei Ma, Yu-Xiang Wang
Conference on Neural Information Processing Systems (NeurIPS) 2019
Spotlight presentation at the NeurIPS 2018 Workshop on Causal Learning.
-
Provably Efficient Q-Learning with Low Switching Cost. [PDF, Poster, arXiv]
Yu Bai, Tengyang Xie, Nan Jiang, Yu-Xiang Wang
Conference on Neural Information Processing Systems (NeurIPS) 2019
-
A Block Coordinate Ascent Algorithm for Mean-Variance Optimization. [PDF, Poster, Link]
Tengyang Xie*, Bo Liu*, Yangyang Xu, Mohammad Ghavamzadeh, Yinlam Chow, Daoming Lyu, Daesub Yoon
Conference on Neural Information Processing Systems (NeurIPS) 2018
(* indicates equal contribution or alphabetical ordering.)
Teaching
- CS760 Machine Learning: Fall 2024, UW-Madison.
Service
- Area Chair: ICLR 2025, ICML 2025, RLC 2025
- Conference Reviewer/Program Committee: ICML Workshop Proposals, NeurIPS, ICML, AISTATS, AAAI, EWRL
- Journal Reviewer: Journal of the American Statistical Association (JASA), Journal of Machine Learning Research (JMLR), IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Annals of Statistics, IEEE Transactions on Information Theory, Springer Machine Learning Journal.
- Workshop Organizer: Interactive Learning with Implicit Human Feedback @ ICML 2023.
- Workshop Program Committee: Optimization Foundations of RL @ NeurIPS 2019, Theoretical Foundations of RL @ ICML 2020 & 2021, Offline RL @ NeurIPS 2020-2022, RL for Real Life @ ICML 2021 & NeurIPS 2022.
© 2018-2025 Tengyang Xie