Corrupted upload
Thank you, everyone, for reporting the broken file issues, and sorry for the inconvenience.
What's happening is this: even after reuploading, the FP16 files came out broken. I don't know why they keep getting corrupted.
The GGUFs were verified to be fine ✅
I'll reupload everything else, verify it by hand, and report back.
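One way to verify shards by hand (a minimal sketch, not part of the original workflow): hash each local file with SHA-256 and compare against the digest the Hub shows on each LFS file's page; a mismatch pinpoints the corrupted upload. `MODEL_DIR` below is a placeholder.

```python
import glob
import hashlib
import os

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 in 1 MiB chunks (shards are too big to read whole)."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# MODEL_DIR is a hypothetical local checkout of the repo
for f in sorted(glob.glob(os.path.join("MODEL_DIR", "*.safetensors"))):
    print(f"{os.path.basename(f)}  {sha256_of(f)}")
```

Hugging Face stores `.safetensors` shards via Git LFS, which is keyed by SHA-256, so the digest shown on the file page can be compared directly against the local one.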
SOME FILES MISSING, WILL UPDATE
I know this is annoying, it annoys me too :\
UPDATE:
Tested the FP16 and quantized it to make sure it works; it does ✅
Uploading it now.
Thanks for re-uploading. Are you aware that some model files are missing?
Oh god, yes, I see it now. For fuck's sake.
I used HF and it still skipped a few. This is getting ridiculous.
I'll add them later today; on the road right now.
looks good now.
i've no idea what this insane streak of seemingly bad luck with the uploads was, but thankfully everything was reuploaded and seems fine now.
i need a vacation.
Can confirm this latest batch of 14 shards quantizes properly, while the 50-shard version has NaN corruption.
--- Scanning MERGED_MODEL ---
Checking model-00028-of-00030.safetensors: 83%|█████████████████████████████████ | 20/24 [00:06<00:01, 3.02it/s]
[!] NaN DETECTED: model.layers.77.mlp.up_proj.weight in model-00028-of-00030.safetensors
Max Value: nan
Checking model-00029-of-00030.safetensors: 50%|████████████████████ | 13/26 [00:03<00:03, 4.12it/s]
[!] NaN DETECTED: model.layers.79.mlp.gate_proj.weight in model-00029-of-00030.safetensors
Max Value: nan
Result: MERGED_MODEL has 2 corrupted tensors.
--- Scanning ASSISTANT_PEPE ---
Checking model-00048-of-00050.safetensors: 6%|██████████ | 1/16 [00:02<00:41, 2.79s/it]
[!] NaN DETECTED: model.layers.77.mlp.up_proj.weight in model-00048-of-00050.safetensors
Max Value: nan
Checking model-00049-of-00050.safetensors: 8%|█████████████ | 1/12 [00:03<00:33, 3.06s/it]
[!] NaN DETECTED: model.layers.79.mlp.gate_proj.weight in model-00049-of-00050.safetensors
Max Value: nan
Result: ASSISTANT_PEPE has 2 corrupted tensors.
--- Scanning ASSISTANT_PEPE ---
Result: ASSISTANT_PEPE is CLEAN.
I ran into this problem while merging and created a "sanity scanner". Remerging with a clean Pepe fixes the issues.
import torch
from safetensors import safe_open
import os
import glob
import re
from tqdm import tqdm

# --- CONFIGURATION ---
models_to_scan = {
    "MERGED_MODEL": r"B:\70B\v1_della",
    "ASSISTANT_PEPE": r"B:\70B\SicariusSicariiStuff--Assistant_Pepe_70B",
}
# ---------------------

def scan_model(name, path):
    print(f"\n--- Scanning {name} ---")
    if not os.path.exists(path):
        print(f"Skipping: Path not found: {path}")
        return
    files = glob.glob(os.path.join(path, "*.safetensors"))
    if not files:
        print(f"No safetensors found in {path}")
        return

    issues_found = 0
    for f in files:
        with safe_open(f, framework="pt", device="cpu") as st:
            for key in tqdm(st.keys(), desc=f"Checking {os.path.basename(f)}", leave=False):
                # --- LAYER 60+ FILTER ---
                # Matches 'layers.N' or 'blk.N'; skip early layers to save time
                layer_match = re.search(r'\.(?:layers|blk)\.(\d+)\.', key)
                if layer_match:
                    layer_num = int(layer_match.group(1))
                    if layer_num < 60:
                        continue
                # ------------------------
                tensor = st.get_tensor(key)
                has_nan = torch.isnan(tensor).any()
                has_inf = torch.isinf(tensor).any()
                if has_nan or has_inf:
                    problem = "NaN" if has_nan else "INF"
                    print(f"\n[!] {problem} DETECTED: {key} in {os.path.basename(f)}")
                    # Check max value to see if it's a blowout
                    print(f"    Max Value: {tensor.abs().max().item()}")
                    issues_found += 1

    if issues_found == 0:
        print(f"Result: {name} is CLEAN.")
    else:
        print(f"Result: {name} has {issues_found} corrupted tensors.")

if __name__ == "__main__":
    for name, path in models_to_scan.items():
        scan_model(name, path)
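For context on why just two bad tensors wreck the whole quantization: NaN is "sticky" in floating-point arithmetic, so any dot product or reduction that touches a poisoned weight comes out NaN as well. A toy illustration in plain Python (values are made up):

```python
import math

# NaN propagates through arithmetic: one corrupted weight
# poisons every dot product that touches it.
weights = [0.5, -1.25, float("nan"), 2.0]
activations = [1.0, 1.0, 1.0, 1.0]
dot = sum(w * a for w, a in zip(weights, activations))
print(math.isnan(dot))  # True: the single NaN contaminates the sum
```

This is why the scanner only needs `torch.isnan(tensor).any()` per tensor rather than anything fancier: one bad element is already fatal downstream.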