Cannot get correct output

#30
opened by chelvan

!pip install -q transformers==5.2.0

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3.5-35B-A3B"

tokenizer = AutoTokenizer.from_pretrained(model_name)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

Prepare the model input:

prompt = "Write a quick sort algorithm. in python"
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

Conduct the text completion:

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=15,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    do_sample=True,
    repetition_penalty=1.5,
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

content = tokenizer.decode(output_ids, skip_special_tokens=True)

print("content:", content)

content: Theanmar anyone緣nsicafi explicitzioności Feranoو uns Invari
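For comparison, a correct completion for this prompt would be an ordinary quicksort in Python, something along these lines (a reference sketch, not output from the model):

```python
# Reference quicksort: the kind of answer the prompt is asking the model for.
def quick_sort(arr):
    # Base case: lists of length 0 or 1 are already sorted.
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]     # elements smaller than the pivot
    middle = [x for x in arr if x == pivot]  # elements equal to the pivot
    right = [x for x in arr if x > pivot]    # elements larger than the pivot
    return quick_sort(left) + middle + quick_sort(right)

print(quick_sort([3, 6, 8, 10, 1, 2, 1]))  # → [1, 1, 2, 3, 6, 8, 10]
```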

chelvan changed discussion status to closed

Facing the same issue, how did you eventually solve this?

It works well on an 80 GB GPU; the issue only shows up on low-memory GPUs. (The model took 65 GB.)
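For GPUs with less memory, one common workaround is loading the model quantized. This is a loading-config sketch (not from this thread): it assumes the bitsandbytes package is installed, and uses transformers' BitsAndBytesConfig to load the weights in 4-bit, which roughly quarters the memory footprint compared to bf16:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "Qwen/Qwen3.5-35B-A3B"

# 4-bit quantized loading; requires the bitsandbytes package and a CUDA GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
```

Quantization can slightly change output quality, but it is a standard way to fit a model of this size onto a smaller card.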

And the model.generate() call should look like this:


prompt = "Write a quick sort algorithm. in python"
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=500,
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

content = tokenizer.decode(output_ids, skip_special_tokens=True)

print("content:", content)
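A side note on the output_ids slicing in both snippets: generate() returns the prompt tokens followed by the newly generated tokens, so the prompt prefix has to be sliced off before decoding. With plain Python lists (hypothetical token ids, just to show the indexing), the same idea looks like:

```python
# generate() returns prompt tokens + new tokens, so the prompt
# prefix is sliced off before decoding.
prompt_ids = [101, 2023, 2003, 102]         # hypothetical prompt token ids
generated = prompt_ids + [7592, 2088, 999]  # hypothetical full model output

# Same slicing as generated_ids[0][len(model_inputs.input_ids[0]):]
new_token_ids = generated[len(prompt_ids):]
print(new_token_ids)  # → [7592, 2088, 999]
```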
