Corrupted upload

Oh god, yes, I see it now. For fuck's sake.
I used HF and it still skipped a few. This is getting ridiculous.
I'll add them later today; I'm on the road right now.

SicariusSicariiStuff

Owner 18 days ago

Looks good now.

I have no idea what caused this insane streak of what seems like bad luck with the upload, but thankfully everything has been reuploaded and seems fine now.

I need a vacation.

SicariusSicariiStuff changed discussion status to closed 18 days ago

Naphula

15 days ago

Can confirm this latest batch of 14 shards quantizes properly, while the 50-shard version has NaN corruption.

--- Scanning MERGED_MODEL ---
Checking model-00028-of-00030.safetensors:  83%|████████████████████████████████▌      | 20/24 [00:06<00:01,  3.02it/s]
[!] NaN DETECTED: model.layers.77.mlp.up_proj.weight in model-00028-of-00030.safetensors
    Max Value: nan
Checking model-00029-of-00030.safetensors:  50%|███████████████████▌                   | 13/26 [00:03<00:03,  4.12it/s]
[!] NaN DETECTED: model.layers.79.mlp.gate_proj.weight in model-00029-of-00030.safetensors
    Max Value: nan
Result: MERGED_MODEL has 2 corrupted tensors.

--- Scanning ASSISTANT_PEPE ---
Checking model-00048-of-00050.safetensors:   6%|█████████▊                                                                                                                                                   | 1/16 [00:02<00:41,  2.79s/it]
[!] NaN DETECTED: model.layers.77.mlp.up_proj.weight in model-00048-of-00050.safetensors
    Max Value: nan
Checking model-00049-of-00050.safetensors:   8%|█████████████                                                                                                                                                | 1/12 [00:03<00:33,  3.06s/it]
[!] NaN DETECTED: model.layers.79.mlp.gate_proj.weight in model-00049-of-00050.safetensors
    Max Value: nan
Result: ASSISTANT_PEPE has 2 corrupted tensors.

--- Scanning ASSISTANT_PEPE ---
Result: ASSISTANT_PEPE is CLEAN.

I ran into this problem while merging and wrote a "sanity scanner" to track it down. Remerging with the clean Assistant_Pepe fixes the issue.

import torch
from safetensors import safe_open
import os
import glob
import re
from tqdm import tqdm

# --- CONFIGURATION ---
models_to_scan = {
    "MERGED_MODEL": r"B:\70B\v1_della",
    "ASSISTANT_PEPE": r"B:\70B\SicariusSicariiStuff--Assistant_Pepe_70B",
}
# ---------------------

def scan_model(name, path):
    print(f"\n--- Scanning {name} ---")
    if not os.path.exists(path):
        print(f"Skipping: Path not found: {path}")
        return

    # Sort so shards are scanned in order (glob returns files in arbitrary order)
    files = sorted(glob.glob(os.path.join(path, "*.safetensors")))
    if not files:
        print(f"No safetensors found in {path}")
        return

    issues_found = 0
    for f in files:
        with safe_open(f, framework="pt", device="cpu") as st:
            for key in tqdm(st.keys(), desc=f"Checking {os.path.basename(f)}", leave=False):
                # --- LAYER 60+ FILTER ---
                # Matches '.layers.N.' or '.blk.N.' inside the tensor name;
                # tensors with no layer index (embeddings, lm_head) are always scanned
                layer_match = re.search(r'\.(?:layers|blk)\.(\d+)\.', key)
                if layer_match:
                    layer_num = int(layer_match.group(1))
                    if layer_num < 60:
                        continue # Skip early layers to save time
                # ------------------------
                
                tensor = st.get_tensor(key)
                
                has_nan = torch.isnan(tensor).any()
                has_inf = torch.isinf(tensor).any()
                
                if has_nan or has_inf:
                    problem = "NaN" if has_nan else "INF"
                    print(f"\n[!] {problem} DETECTED: {key} in {os.path.basename(f)}")
                    # Check max value to see if it's a blowout
                    print(f"    Max Value: {tensor.abs().max().item()}")
                    issues_found += 1
    
    if issues_found == 0:
        print(f"Result: {name} is CLEAN.")
    else:
        print(f"Result: {name} has {issues_found} corrupted tensors.")

if __name__ == "__main__":
    for name, path in models_to_scan.items():
        scan_model(name, path)
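
For anyone reusing the scanner, here is a minimal sketch of how its layer filter decides which tensors get loaded. It reuses the regex and the threshold of 60 from the script above; the helper name `should_scan` and the `MIN_LAYER` constant are made up for illustration and are not part of the original script.

```python
import re

# Same pattern as in the scanner above; MIN_LAYER mirrors its hard-coded 60.
LAYER_RE = re.compile(r'\.(?:layers|blk)\.(\d+)\.')
MIN_LAYER = 60  # hypothetical name for the threshold used in the script


def should_scan(key, min_layer=MIN_LAYER):
    """Return True if the scanner above would load and check this tensor."""
    match = LAYER_RE.search(key)
    if match is None:
        # No layer index found in the name (e.g. embeddings, lm_head):
        # the scanner does not skip these.
        return True
    return int(match.group(1)) >= min_layer


if __name__ == "__main__":
    for key in [
        "model.embed_tokens.weight",
        "model.layers.12.mlp.up_proj.weight",
        "model.layers.77.mlp.up_proj.weight",
        "lm_head.weight",
    ]:
        print(f"{key}: {'scan' if should_scan(key) else 'skip'}")
```

Note that the pattern requires a dot before `layers` or `blk`, so a name that starts with `blk.` directly would not match and would be scanned regardless of its layer number. To check every layer, setting the threshold to 0 should be enough.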
