ML//model//GPT//InstructGPT

- OpenAI (2022). GPT-3 aligned to follow instructions via RLHF


The 3-step recipe: supervised fine-tuning (SFT) on human demonstrations → train a reward model (RM) on human comparisons of model outputs → optimize the policy against the RM with PPO
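A minimal sketch of the two quantities that tie steps 2 and 3 together, assuming scalar rewards and sequence-level log-probabilities for simplicity. The function names and the default `beta` are illustrative placeholders, not taken from the paper's code: the RM is trained with a pairwise loss that pushes the preferred output's score above the rejected one, and the PPO stage maximizes the RM score minus a KL-style penalty that keeps the policy close to the SFT model.

```python
import math


def pairwise_rm_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise comparison loss: -log(sigmoid(r_chosen - r_rejected)).

    Small when the reward model scores the human-preferred output higher.
    """
    diff = r_chosen - r_rejected
    # -log(sigmoid(d)) == log(1 + exp(-d)); branch to avoid overflow in exp.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def kl_penalized_reward(
    rm_score: float,
    logp_policy: float,
    logp_sft: float,
    beta: float = 0.02,  # illustrative placeholder coefficient
) -> float:
    """Reward used in the RL step: RM score minus a penalty for drifting from SFT.

    (logp_policy - logp_sft) is a single-sample estimate of the KL term.
    """
    return rm_score - beta * (logp_policy - logp_sft)


if __name__ == "__main__":
    # RM agrees with the human label: low loss.
    print(pairwise_rm_loss(2.0, 0.0))
    # RM disagrees with the human label: high loss.
    print(pairwise_rm_loss(0.0, 2.0))
    # Policy assigns more probability than SFT did: reward is reduced.
    print(kl_penalized_reward(1.0, logp_policy=-0.5, logp_sft=-1.0, beta=0.1))
```

In a real setup the rewards come from a neural RM and the penalty is typically applied per token. The sketch only shows the shape of the objective.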

Bridge from GPT-3 to ChatGPT. Showed that alignment can make a model both more useful and safer: the paper reports gains in truthfulness and less toxic output.

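Human raters preferred outputs from the 1.3B InstructGPT over those from the 175B GPT-3, despite roughly 100x fewer parameters. In that evaluation, alignment mattered more than scale for human preference.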