Why do you set `use_cache=False`? Removing it will speed up generation
#8
by borzunov - opened
Hi,
I wonder why you set use_cache=False in config.json.
As far as I understand, use_cache=False gives results identical to use_cache=True for autoregressive models, but it runs an O(n^3) generation algorithm instead of an O(n^2) one, because it re-runs the whole prefix to generate every new token. I think you can significantly speed up generation for this model by removing this line from the config.
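To make the O(n^3) vs. O(n^2) estimate concrete, here is a rough sketch that counts only the query-key pairs scored by causal self-attention (per layer and head) during generation. The function names and the cost model are my own simplification, not anything taken from the model's code; the per-token feed-forward cost is ignored.

```python
def attention_pairs_without_cache(n_new_tokens: int, prompt_len: int = 1) -> int:
    """Count query-key pairs when the whole prefix is re-run for every new token."""
    total = 0
    for step in range(n_new_tokens):
        seq_len = prompt_len + step
        # Causal attention over the full prefix: position i attends to i + 1 keys,
        # so one forward pass scores seq_len * (seq_len + 1) / 2 pairs.
        total += seq_len * (seq_len + 1) // 2
    return total


def attention_pairs_with_cache(n_new_tokens: int, prompt_len: int = 1) -> int:
    """Count query-key pairs when keys/values of earlier positions are cached."""
    # The prompt is processed once in full to fill the cache.
    total = prompt_len * (prompt_len + 1) // 2
    for step in range(1, n_new_tokens):
        seq_len = prompt_len + step
        # Only the newest position issues a query; it attends to seq_len keys.
        total += seq_len
    return total


if __name__ == "__main__":
    n = 1000
    slow = attention_pairs_without_cache(n)
    fast = attention_pairs_with_cache(n)
    print(f"without cache: {slow}, with cache: {fast}, ratio: {slow / fast:.0f}x")
```

Under this simplified model the gap grows linearly with the number of generated tokens, which is why the difference matters most for long generations.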
borzunov changed discussion title from Why do you set `use_cache=False`? to Why do you set `use_cache=False`? Removing it will speed up generation
It is needed during the training process. In my opinion, you can change it to True for inference.
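For inference, the override could look roughly like the sketch below, assuming the model loads through the `transformers` auto classes. The helper name `generate_with_cache` and its `model_id` parameter are placeholders for illustration; substitute the actual repository name.

```python
def generate_with_cache(model_id: str, prompt: str, max_new_tokens: int = 20) -> str:
    """Load a causal LM and generate with the key/value cache enabled."""
    # Imported lazily so the sketch can be imported without the library installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Override the value shipped in config.json for this process only.
    model.config.use_cache = True

    inputs = tokenizer(prompt, return_tensors="pt")
    # Passing use_cache=True here also overrides the config for this call.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, use_cache=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

This leaves config.json untouched, so training setups that rely on use_cache=False keep working.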