ML//chain of thought

2023-01-05

"Let's think step by step." Prompt the model to show reasoning before the answer.

"Let's think step by step." Prompt the model to show reasoning before the answer.

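Wei et al. (2022): few-shot CoT exemplars substantially improve arithmetic, commonsense, and symbolic reasoning, mainly in large models. Kojima et al. (2022): the zero-shot "Let's think step by step" trigger works too.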

Works because intermediate tokens change the context before the answer: the model conditions its final answer on its own written-out steps, which shifts the output distribution toward answers consistent with that reasoning.
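
One hedged way to write the conditioning argument down. The symbols are assumptions introduced here, not from the original note: q is the question, r a sequence of reasoning tokens, a the answer. Direct answering must produce a from q in one step; CoT first samples r given q, then samples a given both.

```latex
p(a \mid q) \;=\; \sum_{r} p(r \mid q)\, p(a \mid q, r)
```

If p(a | q, r) is sharply peaked on the right answer for good chains r, the model only has to get each small step of r right, not the whole jump from q to a.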

Tree-of-thought generalizes it to branching exploration of multiple reasoning paths.
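
A rough sketch of the branching idea as breadth-first search with a beam. In real tree-of-thought the proposer and evaluator are LLM calls; here `propose` and `score` are toy stand-ins (hypothetical names) on a number puzzle so the block runs on its own.

```python
from typing import List, Optional, Tuple

# A state is (current value, steps taken so far).
State = Tuple[int, Tuple[str, ...]]


def propose(state: State) -> List[State]:
    """Toy stand-in for an LLM proposing candidate next thoughts."""
    value, steps = state
    return [
        (value + 3, steps + ("+3",)),
        (value * 2, steps + ("*2",)),
    ]


def score(state: State, target: int) -> float:
    """Toy stand-in for an LLM evaluator: closer to target is better."""
    return -abs(target - state[0])


def tree_of_thought(start: int, target: int,
                    beam_width: int = 2, max_depth: int = 5) -> Optional[List[str]]:
    """Expand every kept state, return on success, else keep the best few."""
    frontier: List[State] = [(start, ())]
    for _ in range(max_depth):
        candidates = [child for s in frontier for child in propose(s)]
        for value, steps in candidates:
            if value == target:
                return list(steps)
        # Prune: keep only the beam_width most promising partial paths.
        candidates.sort(key=lambda s: score(s, target), reverse=True)
        frontier = candidates[:beam_width]
    return None


path = tree_of_thought(start=1, target=11)
print(path)
```

Plain CoT corresponds to `beam_width=1` with a single proposal per step: one path, no backtracking or pruning.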

Extended thinking is the trained, scaled version: same principle (more reasoning tokens tend to give better answers), but optimized with RL and dedicated inference compute instead of a prompt hack.

Reasoning models (o1, o3, R1) are strong evidence that CoT scales: RL-trained thinking chains outperform prompted ones by wide margins on hard math and coding problems.