---
library_name: transformers
tags: []
---

# Nemotron-Diffusion-Exp-Ministral-8B

Developed by the [DLER team](https://nv-dler.github.io/) @ NVR. The model is under active development and will be updated regularly. Contact Yonggan Fu or Pavlo Molchanov with any questions.

## Environment

Docker image path on CW-DFW: `/lustre/fsw/portfolios/nvr/users/yongganf/docker/megatron_py25_dllm_ministral.sqsh`

Request an interactive node with the following command, replacing `{account}` with your Slurm account:

```
srun -A {account} --partition interactive --time 4:00:00 --gpus 8 \
  --container-image /lustre/fsw/portfolios/nvr/users/yongganf/docker/megatron_py25_dllm_ministral.sqsh \
  --container-mounts=$HOME:/home,/lustre:/lustre \
  --pty bash
```

## Chat with Our Model

```
from transformers import AutoModel, AutoTokenizer
import torch

repo_name = "nvidia/Nemotron-Diffusion-Exp-Ministral-8B"

# trust_remote_code=True is required because the model ships its own modeling code.
tokenizer = AutoTokenizer.from_pretrained(repo_name, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_name, trust_remote_code=True)
model = model.cuda().to(torch.bfloat16)

user_input = input("User: ").strip()
prompt_ids = tokenizer(user_input, return_tensors="pt").input_ids.to(device="cuda")

# generate() returns the full token sequence (prompt + completion) and the
# number of function evaluations (NFE), i.e. model forward passes used.
out_ids, nfe = model.generate(
    prompt_ids,
    max_new_tokens=128,
    steps=128,
    block_length=32,
    shift_logits=False,
    causal_context=True,
    threshold=0.9,
)

# Drop the prompt tokens and decode only the newly generated ones.
response = tokenizer.batch_decode(out_ids[:, prompt_ids.shape[1]:], skip_special_tokens=True)[0]
print(f"Model: {response}")
print(f"[Num Function Eval (NFE)={nfe}]")
```
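
The `generate` call above sets `max_new_tokens=128`, `steps=128`, and `block_length=32`. The sketch below shows how those three values would relate under one common convention for block-wise diffusion decoding, in which the output is split into equal blocks and the step budget is divided evenly among them. This convention is an assumption and is not confirmed by this model's remote code. `block_schedule` and `tokens_per_nfe` are hypothetical helper names introduced only for illustration.

```python
def block_schedule(max_new_tokens: int, steps: int, block_length: int) -> dict:
    """Hypothetical helper: split a generation budget into equal blocks.

    ASSUMPTION: the decoder fills the output block by block and divides
    `steps` evenly across blocks. The model's own generate() may differ.
    """
    if max_new_tokens % block_length != 0:
        raise ValueError("max_new_tokens must be a multiple of block_length")
    num_blocks = max_new_tokens // block_length
    if steps % num_blocks != 0:
        raise ValueError("steps must be a multiple of the number of blocks")
    return {"num_blocks": num_blocks, "steps_per_block": steps // num_blocks}


def tokens_per_nfe(num_new_tokens: int, nfe: int) -> float:
    """Average number of tokens produced per model forward pass."""
    if nfe <= 0:
        raise ValueError("nfe must be positive")
    return num_new_tokens / nfe


# Values taken from the generate() call in the chat example.
print(block_schedule(max_new_tokens=128, steps=128, block_length=32))
# {'num_blocks': 4, 'steps_per_block': 32}
```

After a run, `tokens_per_nfe(128, nfe)` gives a rough measure of how many tokens each forward pass produced on average. A value above 1 would indicate that several tokens were accepted per step, which is plausibly what the `threshold` argument controls, though that interpretation is also an assumption.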