Two LoRA cold-start SFT experiments teaching structured think/answer reasoning to Nanbeige4-3B-Base using distilled traces from frontier models