news
Feb, 2024 |
We released Aligner, a new efficient alignment paradigm that bypasses the whole RLHF process.
Significantly boosting GPT-4/Llama2 performance without RLHF: PKU team proposes Aligner, a new alignment paradigm |
---|---|
Jan, 2024 |
Two papers were accepted to ICLR 2024!
Safe RLHF (Spotlight), SafeDreamer. |
Dec, 2023 |
One paper was accepted to JMLR 2023!
Heterogeneous-Agent Reinforcement Learning. |
Nov, 2023 |
One paper was accepted to TPAMI 2023!
Bi-DexHands: Towards Human-Level Bimanual Dexterous Manipulation. |
Nov, 2023 | Big News! We released AI Alignment: A Comprehensive Survey. |
Oct, 2023 |
We released Safe RLHF: Safe Reinforcement Learning from Human Feedback.
AK's Daily Papers |
Oct, 2023 | We released SafeDreamer: a novel algorithm for low-dimensional and vision-only safety tasks. |
Sep, 2023 |
As a core member, I contributed to the RLHF fine-tuning of Baichuan’s models, which have earned 10,000+ ⭐.
Baichuan-7B Baichuan-13B Baichuan2 |
Sep, 2023 | Three papers were accepted to NeurIPS 2023! |
May, 2023 |
We released Safe-RLHF: Constrained Value Alignment for LLMs.
Jiqizhixin (机器之心) report: PKU team open-sources PKU-Beaver, China's first reproducible RLHF benchmark |
Jan, 2023 |
We released Safety-Gymnasium: a highly scalable SafeRL environment library.
|
Nov, 2022 |
First Place in NeurIPS 2022 Challenge Track, MyoChallenge Page
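Related coverage: [Center on Frontiers of Computing Studies, Peking University] [Institute for Artificial Intelligence, Peking University] [Peking University] [China Youth Daily] |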
Oct, 2022 |
We released OmniSafe: An Infrastructure for Accelerating SafeRL Research.
|