TIDE: Token-Informed Depth Execution for Per-Token Early Exit in LLM Inference • Paper • 2603.21365 • Published Mar 22
Decoding as Optimisation on the Probability Simplex: From Top-K to Top-P (Nucleus) to Best-of-K Samplers • Paper • 2602.18292 • Published Feb 20