junk outputs

#41
by rirv938 - opened

If I eval this model on the GCP Vertex AI Model Garden, it's great and produces no junk outputs.

If I run it myself with vLLM, I see a huge number of junk outputs and my eval metrics decline.

e.g.:
"*Ayano's eyes light up, her eyes expressionlijkly shifting"
""he doesn't even torightly look at the screen"
"He stays exactly where you're lean against him"

I have tried many different settings, including copying the vLLM settings that Vertex AI uses AND running the same Docker container that Vertex AI uses, but I still get the issue.

It might be down to the slightly different weights that Vertex AI uses.

My vLLM arguments look like:
"engine_args": {
'gpu_memory_utilization': 0.92,
'language_model_only': True,
'max_model_len': 10240,
'max_num_batched_tokens': 10240,
'max_num_seqs': 64,
'tensor_parallel_size': 1,
'trust_remote_code': True,
'tool-call-parser': 'gemma4',
'reasoning-parser': 'gemma4'
},

I'm using the default sampling settings from the model config (temp 1.0, top_k 64, top_p 0.95, etc.; they get set automatically by vLLM).
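One way to narrow this down might be to stop relying on vLLM's auto-applied defaults and pin the sampling parameters explicitly on each request. A minimal sketch of an OpenAI-compatible completion request body, assuming vLLM's OpenAI-compatible server (the model name and prompt are placeholders; `top_k` is a vLLM extension field, not part of the OpenAI spec):

```json
{
  "model": "<served-model-name>",
  "prompt": "<your eval prompt here>",
  "max_tokens": 128,
  "temperature": 1.0,
  "top_p": 0.95,
  "top_k": 64
}
```

If outputs are clean with explicitly pinned parameters but junk with the defaults, the divergence is in how the defaults get applied rather than in the weights.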
