Jiaming Ji

Hi there! I am a first-year PhD student at the Institute for AI, Peking University, advised by Prof. Yaodong Yang (both a good teacher and a helpful friend in my life).

My research interests cover Safety Alignment and AI Safety. I hope you enjoy my work!


Feb, 2024 We released Aligner, a new efficient alignment paradigm that bypasses the whole RLHF process.
Jan, 2024 Two papers were accepted to ICLR 2024!
Safe RLHF (Spotlight), SafeDreamer.
Dec, 2023 One paper was accepted to JMLR 2023!
Heterogeneous-Agent Reinforcement Learning.
Nov, 2023 One paper was accepted to TPAMI 2023!
Bi-DexHands: Towards Human-Level Bimanual Dexterous Manipulation.
Nov, 2023 Big News! We released AI Alignment: A Comprehensive Survey.
Oct, 2023 We released Safe RLHF: Safe Reinforcement Learning from Human Feedback.
Oct, 2023 We released SafeDreamer: a novel algorithm for low-dimensional and vision-only safety tasks.
Sep, 2023 I contributed to the RLHF fine-tuning of Baichuan’s models as a core member; the project earned 10,000+ ⭐.
Baichuan-7B, Baichuan-13B, Baichuan2.
Sep, 2023 Three papers were accepted to NeurIPS 2023!
May, 2023 We released Safe-RLHF: Constrained Value Alignment for LLMs.
Synced (机器之心) report: China's first reproducible RLHF benchmark, PKU-Beaver, open-sourced by a Peking University team.

Selected Publications

(*) indicates equal contribution


  1. arXiv
    Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction
    Jiaming Ji*, Boyuan Chen*, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Juntao Dai, and Yaodong Yang
    Preprint, 2024
  2. arXiv
    Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective
    Tianyi Qiu*, Fanzhi Zeng*, Jiaming Ji*, Dong Yan*, Kaile Wang, Jiayi Zhou, Yang Han, Josef Dai, Xuehai Pan, and Yaodong Yang
    Preprint, 2024
  3. ICLR Spotlight
    Safe RLHF: Safe Reinforcement Learning from Human Feedback
    Josef Dai*, Xuehai Pan*, Ruiyang Sun*, Jiaming Ji*, Xinbo Xu, Mickel Liu, Yizhou Wang, and Yaodong Yang
    In International Conference on Learning Representations, 2024
  4. ICLR
    SafeDreamer: Safe Reinforcement Learning with World Models
    Weidong Huang*, Jiaming Ji*, Borong Zhang, Chunhe Xia, and Yaodong Yang
    In International Conference on Learning Representations, 2024


  1. arXiv
    AI Alignment: A Comprehensive Survey
    Jiaming Ji*, Tianyi Qiu*, Boyuan Chen*, Borong Zhang*, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Jiayi Zhou, Zhaowei Zhang, Fanzhi Zeng, Kwan Yee Ng, Juntao Dai, Xuehai Pan, Aidan O’Gara, Yingshan Lei, Hua Xu, Brian Tse, Jie Fu, Stephen McAleer, Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, and Wen Gao
    Preprint, 2023
  2. arXiv
    Baichuan 2: Open Large-scale Language Models
    Jiaming Ji and other authors (alphabetical order)
    Preprint, 2023
  3. arXiv
    OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research
    Jiaming Ji*, Jiayi Zhou*, Borong Zhang*, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, and Yaodong Yang
    Preprint, 2023
  4. JMLR
    Heterogeneous-Agent Reinforcement Learning
    Yifan Zhong, Jakub Grudzien Kuba, Siyi Hu, Jiaming Ji, and Yaodong Yang
    In The Journal of Machine Learning Research (JMLR), 2023
  5. NeurIPS
    Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
    Jiaming Ji*, Borong Zhang*, Jiayi Zhou*, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Juntao Dai, and Yaodong Yang
    Advances in Neural Information Processing Systems, 2023
  6. NeurIPS
    BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset
    Jiaming Ji*, Mickel Liu*, Juntao Dai*, Xuehai Pan, Chi Zhang, Ce Bian, Chi Zhang, Ruiyang Sun, Yizhou Wang, and Yaodong Yang
    Advances in Neural Information Processing Systems, 2023
  7. NeurIPS
    VOCE: Variational Optimization with Conservative Estimation for Offline Safe Reinforcement Learning
    Jiayi Guan, Guang Chen, Jiaming Ji, and others
    Advances in Neural Information Processing Systems, 2023
  8. AAAI
    Augmented proximal policy optimization for safe reinforcement learning
    Juntao Dai*, Jiaming Ji*, Long Yang, Qian Zheng, and Gang Pan
    Proceedings of the AAAI Conference on Artificial Intelligence, 2023


  1. NeurIPS
    Constrained update projection approach to safe policy optimization
    Long Yang*, Jiaming Ji*, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, and Gang Pan
    Advances in Neural Information Processing Systems, 2022