ML//model//GPT//InstructGPT

2026-02-27

- OpenAI (2022). GPT-3 aligned to follow instructions via RLHF

The 3-step recipe: supervised fine-tuning (SFT) on human demonstrations → train a reward model (RM) on human comparisons of model outputs → optimize the SFT policy against the RM with PPO
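A minimal sketch of the two quantities behind steps 2 and 3, using only scalar inputs: the pairwise comparison loss used to train the RM, and an RM score with a KL-style penalty that keeps the PPO policy close to the SFT model. The function names, the scalar interface, and the `beta` value are assumptions for illustration, not the paper's code.

```python
import math


def pairwise_rm_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Comparison loss for one labeled pair: -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the RM scores the human-preferred output higher.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch so exp() never overflows.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def kl_shaped_reward(rm_score: float, logp_policy: float, logp_sft: float,
                     beta: float = 0.02) -> float:
    """RM score minus beta * log(pi_policy / pi_sft) for one sampled output.

    beta is an illustrative coefficient (assumption); the log-probabilities
    are those of the sampled output under the current policy and the SFT model.
    """
    return rm_score - beta * (logp_policy - logp_sft)


if __name__ == "__main__":
    # RM agrees with the human label: low loss. RM disagrees: high loss.
    print(pairwise_rm_loss(2.0, 0.0), pairwise_rm_loss(0.0, 2.0))
    # Policy has drifted from the SFT model, so the reward is reduced.
    print(kl_shaped_reward(1.0, logp_policy=-0.5, logp_sft=-1.0, beta=0.1))
```

In practice both quantities would be computed over batches of tensors in a deep-learning framework; the scalar form only shows the shape of the objective.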

The bridge from GPT-3 to ChatGPT: evidence that alignment can make a model both safer and more useful.

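Human raters preferred outputs from the 1.3B-parameter InstructGPT over outputs from the 175B-parameter GPT-3, a model with over 100x more parameters. In this comparison, alignment mattered more than scale for user satisfaction.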