LLMs Can Improve at Code by Training on Their Own Wrong Answers
Simple Self-Distillation (SSD) lets LLMs improve at code generation by training on their own unverified outputs; no correctness labels or execution environment are needed. Qwen3-30B gains 12.9 points on LiveCodeBench v6.
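The core idea can be sketched as a data-construction step: sample completions from the model itself and keep all of them as fine-tuning targets, with no filtering by tests or execution. A minimal sketch (function names and the toy generator are illustrative, not the paper's actual code):

```python
import random

def self_distillation_dataset(model_generate, prompts, k=4, seed=0):
    """Build an SSD-style fine-tuning set: pair each prompt with the
    model's own sampled completions, keeping ALL of them -- no unit
    tests, no execution, no correctness labels."""
    rng = random.Random(seed)
    dataset = []
    for prompt in prompts:
        for _ in range(k):  # k samples per prompt
            completion = model_generate(prompt, rng)
            dataset.append({"prompt": prompt, "completion": completion})
    return dataset

# Toy stand-in for sampling from an LLM (hypothetical).
def toy_generate(prompt, rng):
    return f"# attempt {rng.randint(0, 9)} for: {prompt}"

pairs = self_distillation_dataset(
    toy_generate, ["sum two ints", "reverse a list"], k=2
)
# pairs now holds 4 unverified (prompt, completion) examples,
# ready for ordinary supervised fine-tuning.
```

The point of the sketch is what is absent: there is no verifier or sandbox between generation and training, which is what distinguishes this setup from rejection-sampling or execution-filtered self-training pipelines.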