News

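Feb, 2024 We released Aligner: a new, efficient alignment paradigm that bypasses the whole RLHF process.
Media coverage: "Significantly improving GPT-4/Llama2 performance without RLHF: Peking University team proposes Aligner, a new alignment paradigm"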
Jan, 2024 Two papers were accepted to ICLR 2024!
Safe RLHF (Spotlight), SafeDreamer.
Dec, 2023 One paper was accepted to JMLR 2023!
Heterogeneous-Agent Reinforcement Learning.
Nov, 2023 One paper was accepted to TPAMI 2023!
Bi-DexHands: Towards Human-Level Bimanual Dexterous Manipulation.
Nov, 2023 Big News! We released AI Alignment: A Comprehensive Survey.
Oct, 2023 We released Safe RLHF: Safe Reinforcement Learning from Human Feedback.
AK's Daily Papers
Oct, 2023 We released SafeDreamer: a novel algorithm for low-dimensional and vision-only safety tasks.
Sep, 2023 I contributed to the RLHF fine-tuning of Baichuan's models as a core member; the repositories have earned 10,000+ ⭐.
Baichuan-7B, Baichuan-13B, Baichuan2
Sep, 2023 Three papers were accepted to NeurIPS 2023!
May, 2023 We released Safe-RLHF: Constrained Value Alignment for LLMs.
Jiqizhixin (机器之心) report: "China's first reproducible RLHF benchmark: Peking University team open-sources PKU-Beaver"
Jan, 2023 We released Safety-Gymnasium: a highly scalable SafeRL environment library.
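Nov, 2022 First place in the NeurIPS 2022 Challenge Track, MyoChallenge. [Page]
Related coverage: [Center on Frontiers of Computing Studies, Peking University] [Institute for Artificial Intelligence, Peking University] [Peking University] [China Youth Daily]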
Oct, 2022 We released OmniSafe: An Infrastructure for Accelerating SafeRL Research.