Needs Restart

#4
by benny-lnrz - opened

The model seems to have crashed with the following error:

Exit code: 1. Reason: ▎ | 13.7G/16.5G [00:50<00:02, 1.02GB/s]

model.safetensors:  91%|█████████ | 15.0G/16.5G [00:51<00:01, 1.09GB/s]

model.safetensors:  98%|█████████▊| 16.2G/16.5G [00:52<00:00, 1.11GB/s]
model.safetensors: 100%|██████████| 16.5G/16.5G [00:52<00:00, 313MB/s]
Traceback (most recent call last):
  File "/app/app.py", line 293, in <module>
    model = AutoModel.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 372, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4076, in from_pretrained
    model, missing_keys, unexpected_keys, mismatched_keys, offload_index, error_msgs = cls._load_pretrained_model(
                                                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4185, in _load_pretrained_model
    caching_allocator_warmup(model, expanded_device_map, hf_quantizer)
  File "/usr/local/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4719, in caching_allocator_warmup
    device_memory = torch_accelerator_module.mem_get_info(index)[0]
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/torch/cuda/memory.py", line 838, in mem_get_info
    return torch.cuda.cudart().cudaMemGetInfo(device)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/torch/cuda/__init__.py", line 487, in cudart
    _lazy_init()
  File "/usr/local/lib/python3.12/site-packages/torch/cuda/__init__.py", line 410, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
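
For anyone who lands on the same traceback: the weights download completes, and the failure happens later, when `caching_allocator_warmup` asks CUDA for free memory and torch cannot find an NVIDIA driver. That suggests the container came up without a visible GPU, which fits a restart being the fix. A minimal sketch of a guard that reports the situation before loading, assuming the app could tolerate a CPU fallback (`pick_device` is a hypothetical helper, not something from `/app/app.py`, and a 16.5 GB checkpoint may be impractically slow on CPU):

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" only if PyTorch is installed and can see a usable GPU."""
    # Hypothetical helper for illustration; not part of the Space's app.py.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    # is_available() returns False instead of raising when no driver is found.
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Loading model on: {device}")
# The real from_pretrained arguments are not shown in the log, so the call
# below is an assumption and is left commented out:
# model = AutoModel.from_pretrained(model_id, device_map=device)
```

Logging the chosen device at startup makes a missing GPU show up as one readable line rather than a traceback deep inside model loading.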
NVIDIA org

Done, thank you for the ping!

SreyanG-NVIDIA changed discussion status to closed
