ML//Training//SFT

You give the AI a script: input X -> output A — we call that an instruction tuning dataset.


This creates a rigid ceiling: A acts as a massive gravity attractor (singular attractor learning: cross-entropy loss + SGD pull all probability mass toward the one reference output), so the AI stops exploring.

Valid, beautiful synonyms (Bx, By, Bz) are sacrificed at the altar of literal imitation.
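
Toy sketch of the attractor effect, not a real training loop. Assumptions: the "model" is just four logits over the candidate outputs A, Bx, By, Bz, and `sft_step` is a made-up helper doing plain gradient descent on cross-entropy with A as the only target. All four start equally likely; the B's end up starved even though nothing ever said they were wrong.

```python
import math

def softmax(logits):
    """Convert logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sft_step(logits, target_index, lr=1.0):
    """One gradient-descent step on cross-entropy with a one-hot target."""
    probs = softmax(logits)
    # d(CE)/d(logit_i) = p_i - 1[i == target]
    grads = [p - (1.0 if i == target_index else 0.0) for i, p in enumerate(probs)]
    return [x - lr * g for x, g in zip(logits, grads)]

labels = ["A", "Bx", "By", "Bz"]
logits = [0.0, 0.0, 0.0, 0.0]  # all four outputs start equally likely

for _ in range(50):
    logits = sft_step(logits, target_index=0)  # the script only ever says "A"

final_probs = dict(zip(labels, softmax(logits)))
print(final_probs)
```

The gradient for every non-target output is its own probability, so each B is pushed down on every step purely for not being A.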

It becomes a "helpful companion" — obedient but uncreative.

Exposure bias: during SFT the model always sees perfect human-written context, but at inference it sees its own outputs — errors compound because the model never trained on its own mistakes as context.
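
Back-of-envelope sketch of the compounding. Big assumption: every generated token goes wrong independently with the same probability, which is a simplification (real errors are correlated, and one bad token likely makes the next one worse). `clean_prefix_probability` is a hypothetical helper name.

```python
def clean_prefix_probability(per_token_error, length):
    """Chance that a free-running generation of `length` tokens contains no
    error, assuming each token errs independently with the same probability."""
    return (1.0 - per_token_error) ** length

# Same 1% per-token error rate, increasingly long generations.
for length in (10, 100, 1000):
    print(length, clean_prefix_probability(0.01, length))
```

Even under this optimistic assumption, the chance of staying on the "perfect context" path falls geometrically with length, and SFT never showed the model what to do once it is off that path.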

Iterative SFT

Iterative SFT (rejection sampling / RFT): generate many answers, pick the best, pretend it was ground truth, train on it, repeat.
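
Minimal sketch of the loop above. Everything here is hypothetical scaffolding: `generate`, `score`, and `train` are placeholder callables standing in for the model's sampler, a reward/grader, and a fine-tuning run. No real library API is implied.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (prompt, chosen answer)

def rejection_sampling_round(
    prompts: List[str],
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    num_samples: int = 8,
) -> List[Example]:
    """Sample several answers per prompt and keep the highest-scoring one."""
    dataset = []
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(num_samples)]
        best = max(candidates, key=lambda answer: score(prompt, answer))
        dataset.append((prompt, best))  # treated as if it were ground truth
    return dataset

def iterative_sft(prompts, generate, score, train, rounds=3, num_samples=8):
    """Repeat: sample, select the best, fine-tune on the selections."""
    for _ in range(rounds):
        dataset = rejection_sampling_round(prompts, generate, score, num_samples)
        generate = train(generate, dataset)  # assumed to return the updated sampler
    return generate
```

Note that the scorer is the only thing steering the loop: whatever it rewards, including its blind spots, is what gets written back into the training set.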

Downside of iterative SFT: the echo chamber amplifies biases and slop until the AI loses edge and variety. It is also "average in, average out": if the best of 100 samples is a 7/10, you're training on mediocrity.
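
Toy simulation of the variety loss, under loud assumptions: each output is reduced to a single number drawn from a Gaussian, "pick the best" means keeping the top `keep` draws, and "train on it" means refitting the Gaussian to the kept draws. `simulate_collapse` is a hypothetical name, and real models are nothing this simple.

```python
import random
import statistics

def simulate_collapse(rounds=5, samples=100, keep=10, seed=0):
    """Track the spread of a 1-D output value across best-of-N rounds."""
    rng = random.Random(seed)
    mean, std = 0.0, 1.0
    history = [std]
    for _ in range(rounds):
        draws = [rng.gauss(mean, std) for _ in range(samples)]
        kept = sorted(draws, reverse=True)[:keep]  # selection: keep the top scorers
        mean = statistics.mean(kept)               # "retrain": refit the sampler
        std = statistics.stdev(kept)               # to the kept outputs only
        history.append(std)
    return history

spread_per_round = simulate_collapse()
print(spread_per_round)
```

The spread after the last round is a small fraction of where it started: each round samples only from what the previous round kept, so the pool the "best" is picked from keeps narrowing.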

Synthetic SFT umbrella: RFT, Constitutional AI outputs for SFT, distillation (big model writes, small model learns), self-instruct (AI generates the questions too)