Thank you!
Hey Team,
Just wanted to drop a quick note here thanking you for your excellent work in this cutting-edge field. The results have been phenomenal in all my tests.
Thank you!
They are still updating the model; it's not fully trained yet.
Oh? Did I mess something up then?
8K raw /v1/completions results, each with an exact 8192-token prompt:
Case            Metric   No DFlash      DFlash
8192 -> 256     TTFT     32.304s        38.336s
                E2E      48.873s        42.579s
                Gen      15.45 tok/s    60.38 tok/s
                Finish   length         stop
8192 -> 2048    TTFT     32.290s        38.348s
                E2E      165.223s       72.972s
                Gen      15.41 tok/s    59.16 tok/s
                Finish   length         stop
8192 -> 10000   TTFT     32.295s        38.346s
                E2E      691.417s       228.615s
                Gen      15.17 tok/s    52.56 tok/s
                Finish   length         stop
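For anyone who wants to sanity-check these numbers, here is a rough Python sketch that computes the end-to-end speedup per case from the figures above. The `estimated_e2e` helper is my own illustration, not part of any benchmark tool: it assumes E2E is roughly TTFT plus output tokens divided by the generation rate, and that each run produced the full requested output length, which may not hold exactly for the runs that finished with `stop`.

```python
# Figures copied from the results above, keyed by requested output length.
# Each entry is (TTFT in seconds, E2E in seconds, generation rate in tok/s).
results = {
    256:   {"no_dflash": (32.304, 48.873, 15.45),  "dflash": (38.336, 42.579, 60.38)},
    2048:  {"no_dflash": (32.290, 165.223, 15.41), "dflash": (38.348, 72.972, 59.16)},
    10000: {"no_dflash": (32.295, 691.417, 15.17), "dflash": (38.346, 228.615, 52.56)},
}


def e2e_speedup(row):
    """Ratio of baseline E2E time to DFlash E2E time (above 1 means DFlash is faster)."""
    return row["no_dflash"][1] / row["dflash"][1]


def estimated_e2e(ttft, gen_rate, out_tokens):
    """Assumed model: E2E ~= TTFT + out_tokens / gen_rate (hypothetical helper)."""
    return ttft + out_tokens / gen_rate


speedups = {n: e2e_speedup(row) for n, row in results.items()}

for n, row in results.items():
    ttft, e2e, rate = row["dflash"]
    print(
        f"8192 -> {n}: E2E speedup {speedups[n]:.2f}x, "
        f"DFlash E2E measured {e2e:.1f}s vs estimated {estimated_e2e(ttft, rate, n):.1f}s"
    )
```

The takeaway is that the higher TTFT with DFlash is a fixed cost, so the end-to-end gain grows with output length as the faster generation rate dominates.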
thank you