ML//model//GPT//InstructGPT
- OpenAI (2022). GPT-3 aligned to follow instructions via RLHF
The 3-step recipe: supervised fine-tuning (SFT) on human demonstrations → train a reward model (RM) on human comparisons of model outputs → optimize the policy against the RM with PPO
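A minimal sketch of the two numeric pieces behind steps 2 and 3, in plain Python on scalar values. The pairwise loss is the standard comparison objective, -log σ(r_chosen − r_rejected). The KL-shaped reward reflects the penalty against the SFT model used during the PPO step. The function names and the `beta` default are illustrative assumptions, not taken from any library or from this note, and a real setup would compute these over batches of model outputs.

```python
import math


def pairwise_rm_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Step 2: loss for one human comparison, -log(sigmoid(r_chosen - r_rejected)).

    Low when the RM scores the human-preferred output higher than the rejected one.
    """
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(d)) == log(1 + exp(-d)); branch to avoid overflow for large |d|.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def kl_penalized_reward(
    rm_score: float,
    logp_policy: float,
    logp_sft: float,
    beta: float = 0.02,  # illustrative value (assumption)
) -> float:
    """Step 3: reward signal for PPO, the RM score minus a KL-style penalty.

    The penalty grows as the policy assigns higher log-probability to its
    output than the SFT model does, discouraging drift away from SFT.
    """
    return rm_score - beta * (logp_policy - logp_sft)


if __name__ == "__main__":
    # RM agrees with the human label: small loss.
    print(pairwise_rm_loss(2.0, 0.0))
    # RM disagrees with the human label: larger loss.
    print(pairwise_rm_loss(0.0, 2.0))
    # Reward after subtracting the penalty for drifting from the SFT model.
    print(kl_penalized_reward(1.0, logp_policy=-1.0, logp_sft=-1.5, beta=0.1))
```

When the two rewards are equal the loss is log 2, so the RM only reduces its loss by ranking the preferred output higher.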
Bridge from GPT-3 to ChatGPT: showed that alignment can make models both safer and more useful.
Outputs from the 1.3B InstructGPT were preferred by human raters over those from the 175B GPT-3, despite roughly 100x fewer parameters. For user satisfaction, alignment can matter more than scale.