Tengyang Xie
I am an Assistant Professor of Computer Science at the University of Wisconsin-Madison. Before that, I was a Postdoctoral Researcher at Microsoft Research New England (and New York City). I received my Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign, where I was fortunate to work with Nan Jiang. I obtained my bachelor's degree in Physics from the University of Science and Technology of China. I have also spent time at the Simons Institute, Amazon AI, Microsoft Research, and Google Research.
Research Interests: I work on Reinforcement Learning / Machine Learning / Artificial Intelligence. The primary goal of my research is to explore the mathematical principles and design efficient algorithms relevant to artificial general intelligence (AGI). My current interests include: 1) emerging interactive learning paradigms with/for large language models (LLMs), 2) the mathematical principles of reinforcement learning (RL) and decision-making, 3) the algorithmic and systems challenges of scaling up new modalities.
Publications
-
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization.
[PDF, arXiv]
Audrey Huang, Wenhao Zhan, Tengyang Xie, Jason D. Lee, Wen Sun, Akshay Krishnamurthy, Dylan J. Foster
-
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts.
[PDF, arXiv, Model, Blog]
Haoxiang Wang*, Wei Xiong*, Tengyang Xie, Han Zhao, Tong Zhang
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2024, Findings
-
Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models.
[PDF, arXiv]
Xiang Ji, Sanjeev Kulkarni, Mengdi Wang, Tengyang Xie
-
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF.
[PDF, arXiv, XPO Trainer in TRL]
Tengyang Xie*, Dylan J. Foster*, Akshay Krishnamurthy, Corby Rosset, Ahmed Awadallah, Alexander Rakhlin
-
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences.
[PDF, arXiv]
Corby Rosset, Ching-An Cheng, Arindam Mitra, Michael Santacroce, Ahmed Awadallah, Tengyang Xie
-
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data.
[PDF, arXiv, Website]
Fahim Tajwar*, Anikait Singh*, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar
International Conference on Machine Learning (ICML) 2024
-
CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples.
[PDF, arXiv, Website]
Jianrui Zhang*, Mu Cai*, Tengyang Xie, Yong Jae Lee
Annual Meeting of the Association for Computational Linguistics (ACL) 2024, Findings
-
Harnessing Density Ratios for Online Reinforcement Learning.
[PDF, arXiv]
Philip Amortila*, Dylan J. Foster*, Nan Jiang*, Ayush Sekhari*, Tengyang Xie*
International Conference on Learning Representations (ICLR) 2024 (Spotlight, top 5%)
-
Towards Principled Representation Learning from Videos for Reinforcement Learning.
[PDF, arXiv]
Dipendra Misra*, Akanksha Saran*, Tengyang Xie, Alex Lamb, John Langford
International Conference on Learning Representations (ICLR) 2024 (Spotlight, top 5%)
-
Adversarial Model for Offline Reinforcement Learning. [PDF, arXiv]
Mohak Bhardwaj*, Tengyang Xie*, Byron Boots, Nan Jiang, Ching-An Cheng
Conference on Neural Information Processing Systems (NeurIPS) 2023
-
The Role of Coverage in Online Reinforcement Learning. [PDF, arXiv]
Tengyang Xie*, Dylan J. Foster*, Yu Bai, Nan Jiang, Sham M. Kakade
International Conference on Learning Representations (ICLR) 2023 (Notable-top-5% / Oral, top 1.8%)
-
ARMOR: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data. [PDF, arXiv]
Tengyang Xie, Mohak Bhardwaj, Nan Jiang, Ching-An Cheng
Offline RL Workshop at NeurIPS 2022
-
Interaction-Grounded Learning with Action-Inclusive Feedback. [PDF, arXiv]
Tengyang Xie*, Akanksha Saran*, Dylan J. Foster, Lekan Molu, Ida Momennejad, Nan Jiang, Paul Mineiro, John Langford
Conference on Neural Information Processing Systems (NeurIPS) 2022
Complex Feedback in Online Learning Workshop at ICML 2022
-
Adversarially Trained Actor Critic for Offline Reinforcement Learning. [PDF, arXiv, code, MSR blog]
Ching-An Cheng*, Tengyang Xie*, Nan Jiang, Alekh Agarwal
International Conference on Machine Learning (ICML) 2022 (Outstanding Paper Runner-up Award, top 0.3%)
-
Bellman-consistent Pessimism for Offline Reinforcement Learning. [PDF, arXiv, slides]
Tengyang Xie, Ching-An Cheng, Nan Jiang, Paul Mineiro, Alekh Agarwal
Conference on Neural Information Processing Systems (NeurIPS) 2021 (Oral Presentation, top 0.6%)
-
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning. [PDF, arXiv]
Tengyang Xie, Nan Jiang, Huan Wang, Caiming Xiong, Yu Bai
Conference on Neural Information Processing Systems (NeurIPS) 2021
-
Interaction-Grounded Learning. [PDF, arXiv, add'l supplement]
Tengyang Xie, John Langford, Paul Mineiro, Ida Momennejad
International Conference on Machine Learning (ICML) 2021
-
Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency. [PDF, arXiv]
Masatoshi Uehara, Masaaki Imaizumi, Nan Jiang, Nathan Kallus, Wen Sun, Tengyang Xie
Submitted, 2021.
-
A Variant of the Wang-Foster-Kakade Lower Bound for the Discounted Setting. [PDF, arXiv]
Philip Amortila*, Nan Jiang*, Tengyang Xie*
-
Batch Value-function Approximation with Only Realizability. [PDF, arXiv]
Tengyang Xie, Nan Jiang
International Conference on Machine Learning (ICML) 2021
-
Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison. [PDF, arXiv]
Tengyang Xie, Nan Jiang
Conference on Uncertainty in Artificial Intelligence (UAI) 2020
-
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling. [PDF, Poster, arXiv]
Tengyang Xie, Yifei Ma, Yu-Xiang Wang
Conference on Neural Information Processing Systems (NeurIPS) 2019
Spotlight presentation at the NeurIPS 2018 Workshop on Causal Learning.
-
Provably Efficient Q-Learning with Low Switching Cost. [PDF, Poster, arXiv]
Yu Bai, Tengyang Xie, Nan Jiang, Yu-Xiang Wang
Conference on Neural Information Processing Systems (NeurIPS) 2019
-
A Block Coordinate Ascent Algorithm for Mean-Variance Optimization. [PDF, Poster, Link]
Tengyang Xie*, Bo Liu*, Yangyang Xu, Mohammad Ghavamzadeh, Yinlam Chow, Daoming Lyu, Daesub Yoon
Conference on Neural Information Processing Systems (NeurIPS) 2018
(* indicates equal contribution or alphabetic ordering.)
Teaching
- CS760 Machine Learning: Fall 2024, UW-Madison.
Service
- Conference Reviewer/Program Committee: NeurIPS, ICML, AISTATS, AAAI, EWRL
- Journal Reviewer: IEEE Transactions on Pattern Analysis and Machine Intelligence, Annals of Statistics, IEEE Transactions on Information Theory, Springer Machine Learning Journal.
- Workshop Organizer: Interactive Learning with Implicit Human Feedback @ ICML 2023.
- Workshop Program Committee: Optimization Foundations of RL @ NeurIPS 2019, Theoretical Foundations of RL @ ICML 2020 & 2021, Offline RL @ NeurIPS 2020-2022, RL for Real Life @ ICML 2021 & NeurIPS 2022.
© 2018-2024 Tengyang Xie