Spaces:
Running on Zero
Needs Restart
#4
by benny-lnrz - opened
The model seems to have crashed with the following error:
Exit code: 1. Reason: | 13.7G/16.5G [00:50<00:02, 1.02GB/s]
model.safetensors: 91%|█████████ | 15.0G/16.5G [00:51<00:01, 1.09GB/s]
model.safetensors: 98%|█████████▊| 16.2G/16.5G [00:52<00:00, 1.11GB/s]
model.safetensors: 100%|██████████| 16.5G/16.5G [00:52<00:00, 313MB/s]
Traceback (most recent call last):
  File "/app/app.py", line 293, in <module>
    model = AutoModel.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 372, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4076, in from_pretrained
    model, missing_keys, unexpected_keys, mismatched_keys, offload_index, error_msgs = cls._load_pretrained_model(
                                                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4185, in _load_pretrained_model
    caching_allocator_warmup(model, expanded_device_map, hf_quantizer)
  File "/usr/local/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4719, in caching_allocator_warmup
    device_memory = torch_accelerator_module.mem_get_info(index)[0]
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/torch/cuda/memory.py", line 838, in mem_get_info
    return torch.cuda.cudart().cudaMemGetInfo(device)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/torch/cuda/__init__.py", line 487, in cudart
    _lazy_init()
  File "/usr/local/lib/python3.12/site-packages/torch/cuda/__init__.py", line 410, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
Done, thank you for the ping!
SreyanG-NVIDIA changed discussion status to closed