ML//chain of thought

"Let's think step by step." — prompt the model to show reasoning before the answer.

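Wei et al. (2022) introduced few-shot CoT (worked exemplars in the prompt); Kojima et al. (2022) showed the zero-shot phrase above also works. Large gains on math, logic, and multi-step reasoning, mostly in sufficiently large models.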
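A minimal sketch of the two prompt styles, assuming a plain "Q: / A:" text format. The function names and the bookshelf exemplar are made up for illustration and are not taken from either paper; no model is called, the code only builds the prompt strings.

```python
TRIGGER = "Let's think step by step."


def zero_shot_cot(question: str) -> str:
    """Append the trigger phrase so the model writes its steps before the answer."""
    return f"Q: {question}\nA: {TRIGGER}"


def few_shot_cot(question: str, exemplars: list[tuple[str, str, str]]) -> str:
    """Prefix worked (question, reasoning, answer) exemplars before the new question."""
    blocks = [
        f"Q: {q}\nA: {reasoning} The answer is {answer}."
        for q, reasoning, answer in exemplars
    ]
    # Leave the final answer slot open so the model imitates the exemplar format.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Hypothetical exemplar, written for this note.
EXEMPLARS = [
    (
        "A shelf holds 3 boxes of 4 books. 2 books are removed. How many remain?",
        "3 boxes of 4 books is 12 books. 12 - 2 = 10.",
        "10",
    )
]

if __name__ == "__main__":
    print(zero_shot_cot("What is 17 * 6?"))
    print()
    print(few_shot_cot("What is 17 * 6?", EXEMPLARS))
```

The zero-shot version costs nothing to write; the few-shot version also pins down the output format, which makes the final answer easier to parse.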

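Works because the answer is conditioned on the intermediate tokens: each written step becomes context for the next, shifting the next-token distribution toward answers consistent with that reasoning, and giving the model more serial computation before it commits.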
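One hedged way to write this down is to treat the reasoning chain as a latent variable. The symbols are assumptions for this note: q is the question, r a reasoning chain, a the answer.

```latex
p(a \mid q) \;=\; \sum_{r} p(r \mid q)\, p(a \mid q, r)
```

Direct prompting asks the model for a from q alone. CoT instead samples one r from p(r | q) and then a from p(a | q, r), so a single greedy chain is a one-sample approximation of the sum.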

Tree-of-thought generalizes it to branching exploration of multiple reasoning paths.
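A toy sketch of the branching idea, assuming a beam search over partial reasoning paths. `propose` and `score` are hypothetical stand-ins for model calls (one to generate next thoughts, one to rate them); here they are fixed arithmetic rules so the example runs without a model.

```python
from typing import Optional

# (current value, reasoning steps taken so far)
State = tuple[int, tuple[str, ...]]


def propose(state: State) -> list[State]:
    """Stand-in for the model proposing next thoughts: two fixed arithmetic moves."""
    value, steps = state
    return [
        (value + 3, steps + (f"{value} + 3 = {value + 3}",)),
        (value * 2, steps + (f"{value} * 2 = {value * 2}",)),
    ]


def score(state: State, target: int) -> float:
    """Stand-in for the model rating a partial path: closer to the target is better."""
    return -abs(target - state[0])


def tree_of_thought(
    start: int, target: int, beam_width: int = 2, max_depth: int = 4
) -> Optional[tuple[str, ...]]:
    """Expand every path in the frontier, keep the best `beam_width`, repeat."""
    frontier: list[State] = [(start, ())]
    for _ in range(max_depth):
        candidates = [child for state in frontier for child in propose(state)]
        for value, steps in candidates:
            if value == target:
                return steps
        candidates.sort(key=lambda s: score(s, target), reverse=True)
        frontier = candidates[:beam_width]
    return None


if __name__ == "__main__":
    print(tree_of_thought(1, 14, beam_width=2))  # keeps two paths alive
    print(tree_of_thought(1, 14, beam_width=1))  # a single greedy chain
```

With `beam_width=1` the search degenerates to one linear chain and commits to a locally better step that cannot reach 14 within four moves. With `beam_width=2` the runner-up path survives long enough to reach the target.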

Extended thinking is the trained, scaled version: same principle (more reasoning tokens = better answers), but optimized with RL and dedicated compute instead of a prompt hack.

Reasoning models (o1, o3, R1) suggest CoT scales: RL-trained reasoning chains substantially outperform prompted ones on hard problems.