metadata
library_name: transformers
tags: []

Nemotron-Diffusion-Exp-Ministral-3B

Developed by the DLER team @ NVR and actively updated. Contact Yonggan Fu and Pavlo Molchanov with any questions.

Environment

Docker path: /lustre/fsw/portfolios/nvr/users/yongganf/docker/megatron_py25_dllm_ministral.sqsh on CW-DFW. Request an interactive node with the following command:

srun -A {account} --partition interactive --time 4:00:00 --gpus 8 --container-image /lustre/fsw/portfolios/nvr/users/yongganf/docker/megatron_py25_dllm_ministral.sqsh --container-mounts=$HOME:/home,/lustre:/lustre  --pty bash

Chat with Our Model

from transformers import AutoModel, AutoTokenizer
import torch

repo_name = "nvidia/Nemotron-Diffusion-Exp-Ministral-3B"

# trust_remote_code=True loads the custom model code shipped in the repo.
tokenizer = AutoTokenizer.from_pretrained(repo_name, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_name, trust_remote_code=True)
model = model.cuda().to(torch.bfloat16)

user_input = input("User: ").strip()

prompt_ids = tokenizer(user_input, return_tensors="pt").input_ids.to(device="cuda")
# generate() returns the full sequence (prompt + completion) and the NFE count.
out_ids, nfe = model.generate(
    prompt_ids,
    max_new_tokens=128,
    steps=128,
    block_length=32,
    shift_logits=False,
    causal_context=True,
    threshold=0.9,
)

# Slice off the prompt tokens so only the newly generated text is decoded.
response = tokenizer.batch_decode(out_ids[:, prompt_ids.shape[1]:], skip_special_tokens=True)[0]
print(f"Model: {response}")
print(f"[Num Function Eval (NFE)={nfe}]")
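
The script above prints the NFE next to the response. One way to read that number is as an average of generated tokens per function evaluation. The helper below is a minimal sketch of that arithmetic. It assumes NFE counts model forward passes and that it arrives as a plain integer. `tokens_per_nfe` is a hypothetical name, not part of the model's API.

```python
def tokens_per_nfe(num_new_tokens: int, nfe: int) -> float:
    """Average number of generated tokens per function evaluation.

    Assumption: `nfe` counts model forward passes, so a value above 1.0
    means more than one token was committed per pass on average.
    """
    if nfe <= 0:
        raise ValueError("nfe must be a positive integer")
    if num_new_tokens < 0:
        raise ValueError("num_new_tokens must be non-negative")
    return num_new_tokens / nfe


# Hypothetical numbers for illustration only, not measured results.
print(tokens_per_nfe(128, 64))  # 2.0
```

With the variables from the script above, the call would be `tokens_per_nfe(out_ids.shape[1] - prompt_ids.shape[1], nfe)`. If generation stops early and the output is padded up to max_new_tokens, this count may overstate the number of useful tokens.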