Publications
(*) indicates equal contribution
2024
- Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective. arXiv preprint, 2024.
2023
- VOCE: Variational Optimization with Conservative Estimation for Offline Safe Reinforcement Learning. Advances in Neural Information Processing Systems, 2023.
- Augmented proximal policy optimization for safe reinforcement learning. Proceedings of the AAAI Conference on Artificial Intelligence, 2023.