DeepWater Pleroma broken prototype - 90 model merge
This merges 90 Mistral Nemo models into one, but it's bugged.
These files are broken (uploaded for archival purposes). I recommend the polished version at EldritchLabs/DeepWater-Pleroma-12B-v1 instead, although it is not available yet.
I tested a lot of ideas, including a custom patcher script that replaces inf/nan sections with the corresponding base_model vectors. After healing, I re-tested SLERP, Karcher, and Arcee Fusion (with different Tukey fence values), and the result was still ruined.
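The Tukey fence mentioned above is the standard outlier rule: flag values outside [Q1 - k*IQR, Q3 + k*IQR]. A minimal sketch of that computation on a tensor (illustrative only; how Arcee Fusion actually applies the fence lives inside mergekit):

```python
import torch

def tukey_fences(t: torch.Tensor, k: float = 1.5):
    """Classic Tukey fences: values outside [Q1 - k*IQR, Q3 + k*IQR]
    count as outliers. k = 1.5 is the standard fence; a larger k
    tolerates more extreme values before flagging them."""
    q1, q3 = torch.quantile(t.float().flatten(), torch.tensor([0.25, 0.75]))
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = (t < lower) | (t > upper)
    return lower.item(), upper.item(), outliers

lo, hi, mask = tukey_fences(torch.tensor([0.0, 1.0, 2.0, 3.0, 100.0]))
# only the 100.0 falls outside the fences
```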
Anyway, here is the healer script in case anyone wants to try patching this model. Something must also be done with the tokenizers, but I'm not sure what.
If you run the healer script first, the model will quantize, but it still degenerates into endless repetition midway through generation, regardless of chat template.
For now I consider this merge failed; next I will test other combinations, maybe in smaller chunks (grouped by tokenizer), to see what works.
More info: https://huggingface.co/EldritchLabs/Kraken-Stock-12B-v1/discussions/4
import os
import gc
import torch
from safetensors import safe_open
from safetensors.torch import load_file, save_file

# Configuration
base_path = r'B:\12B\models--mistralai--Mistral-Nemo-Instruct-2407'
broken_path = r'C:\Quanter\model_cache\EldritchLabs__DeepWater-Pleroma-12B-v1'
output_path = r'C:\Quanter\model_cache\EldritchLabs__DeepWater-Pleroma-12B-v1\DeepWater-Healed'
os.makedirs(output_path, exist_ok=True)

print("Step 1: Indexing base model shards...")
base_map = {}
base_files = [f for f in os.listdir(base_path) if f.endswith('.safetensors')]
for bf in base_files:
    # Only read the header to index keys (fast, no tensor data loaded)
    with safe_open(os.path.join(base_path, bf), framework="pt") as f:
        for k in f.keys():
            base_map[k] = bf

def get_base_tensor(name):
    """Find and load a specific tensor from the base model.
    Loads the whole shard on every call: simple, but slow if many tensors are broken."""
    if name not in base_map:
        return None
    target_file = os.path.join(base_path, base_map[name])
    sd = load_file(target_file)
    return sd[name].clone().detach()

print("Step 2: Healing broken shards...")
broken_files = [f for f in os.listdir(broken_path) if f.endswith('.safetensors')]
for bf in broken_files:
    print(f"Processing {bf}...")
    broken_file_path = os.path.join(broken_path, bf)
    # Load broken shard into RAM and break the mmap link to disk
    mmap_broken = load_file(broken_file_path)
    broken_sd = {k: v.clone().detach() for k, v in mmap_broken.items()}
    del mmap_broken
    gc.collect()
    shard_healed = False
    for k in list(broken_sd.keys()):
        broken_t = broken_sd[k]
        # Check for inf/nan
        invalid_mask = ~torch.isfinite(broken_t)
        if invalid_mask.any():
            num_broken = torch.sum(invalid_mask).item()
            print(f"  !! Found {num_broken} inf/nan in {k}. Fetching base weights...")
            base_t = get_base_tensor(k)
            if base_t is not None:
                # Ensure shapes match (handle potential mergekit resizing)
                if base_t.shape != broken_t.shape:
                    print(f"  Shape mismatch for {k}: Base {base_t.shape} vs Broken {broken_t.shape}. Skipping.")
                    continue
                # Heal: keep merged values where finite, take base values where not
                broken_sd[k] = torch.where(invalid_mask, base_t, broken_t)
                shard_healed = True
                del base_t
            else:
                print(f"  Warning: {k} not found in base model. Cannot heal.")
    # Save the shard (healed or untouched) to the NEW directory
    save_file(broken_sd, os.path.join(output_path, bf))
    print(f"  Saved {bf}" + (" (healed)" if shard_healed else " (no broken values found)"))
    del broken_sd
    gc.collect()

print("\nHealing complete. Output saved to:", output_path)
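The actual patching in the script above is the single torch.where call; on a toy tensor it looks like this:

```python
import torch

broken = torch.tensor([1.0, float('nan'), 3.0, float('inf')])
base = torch.tensor([9.0, 9.0, 9.0, 9.0])

mask = ~torch.isfinite(broken)            # True where nan/inf
healed = torch.where(mask, base, broken)  # take base at broken spots, merged values elsewhere
# healed == tensor([1., 9., 3., 9.])
```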
The mergekit recipe used for this merge:

models:
- model: B:\12B\models--aixonlab--Aether-12b
- model: B:\12B\models--aixonlab--Zinakha-12b
- model: B:\12B\models--allura-org--Bigger-Body-12b
- model: B:\12B\models--allura-org--MN-12b-RP-Ink
- model: B:\12B\models--allura-org--remnant-mn-12b
- model: B:\12B\models--allura-org--Tlacuilo-12B
- model: B:\12B\models--anthracite-org--magnum-v4-12b
- model: B:\12B\models--ArliAI--Mistral-Nemo-12B-ArliAI-RPMax-v1.2
- model: B:\12B\models--axolotl-ai-co--romulus-mistral-nemo-12b-simpo
- model: B:\12B\models--Babsie--Opulus-12B-v3
- model: B:\12B\models--BeaverAI--mistral-doryV2-12b
- model: B:\12B\models--BeaverAI--MN-2407-DSK-QwQify-v0.1-12B
- model: B:\12B\models--cgato--Nemo-12b-Humanize-KTO-Experimental-Latest
- model: B:\12B\models--cgato--Nemo-12b-Humanize-SFT-v0.2.5-KTO
- model: B:\12B\models--crestf411--MN-Slush
- model: B:\12B\models--crestf411--nemo-sunfall-v0.6.1
- model: B:\12B\models--D1rtyB1rd--Egregore-Alice-RP-NSFW-12B
- model: B:\12B\models--Delta-Vector--Francois-PE-V2-Huali-12B
- model: B:\12B\models--Delta-Vector--Ohashi-NeMo-12B
- model: B:\12B\models--Delta-Vector--Rei-V3-KTO-12B
- model: B:\12B\models--dphn--dolphin-2.9.3-mistral-nemo-12b
- model: B:\12B\models--EldritchLabs--Altair-Stock-12B-v1
- model: B:\12B\models--elinas--Chronos-Gold-12B-1.0
- model: B:\12B\models--Elizezen--Himeyuri-v0.1-12B
- model: B:\12B\models--Epiculous--Azure_Dusk-v0.2
- model: B:\12B\models--Epiculous--Crimson_Dawn-v0.2
- model: B:\12B\models--EpistemeAI2--Fireball-Mistral-Nemo-12B-Philos
- model: B:\12B\models--EpistemeAI--Mistral-Nemo-Instruct-12B-Philosophy-Math
- model: B:\12B\models--Fizzarolli--MN-12b-Rosier-v1
- model: B:\12B\models--Fizzarolli--MN-12b-Sunrose
- model: B:\12B\models--flammenai--Flammades-Mistral-Nemo-12B
- model: B:\12B\models--flammenai--Mahou-1.5-mistral-nemo-12B
- model: B:\12B\models--GreenerPastures--Golden-Curry-12B
- model: B:\12B\models--Gryphe--Pantheon-RP-1.5-12b-Nemo
- model: B:\12B\models--Gryphe--Pantheon-RP-1.6.1-12b-Nemo
- model: B:\12B\models--HumanLLMs--Human-Like-Mistral-Nemo-Instruct-2407
- model: B:\12B\models--IIEleven11--Kalypso
- model: B:\12B\models--inflatebot--MN-12B-Mag-Mell-R1
- model: B:\12B\models--intervitens--mini-magnum-12b-v1.1
- model: B:\12B\models--jtatman--mistral_nemo_12b_reasoning_psychology_lora
- model: B:\12B\models--KOOWEEYUS--BlackSheep-RP-12B
- model: B:\12B\models--Lambent--Arsenic-Shahrazad-12B-v2
- model: B:\12B\models--Lambent--Arsenic-Shahrazad-12B-v3
- model: B:\12B\models--Lambent--arsenic-nemo-unleashed-12B
- model: B:\12B\models--Lambent--Gilded-Arsenic-12B
- model: B:\12B\models--LatitudeGames--Muse-12B
- model: B:\12B\models--LatitudeGames--Wayfarer-12B
- model: B:\12B\models--LatitudeGames--Wayfarer-2-12B
- model: B:\12B\models--MarinaraSpaghetti--NemoMix-Unleashed-12B
- model: B:\12B\models--migtissera--Tess-3-Mistral-Nemo-12B
- model: B:\12B\models--mistralai--Mistral-Nemo-Instruct-2407
- model: B:\12B\models--mpasila--Mistral-freeLiPPA-LoRA-12B
# - model: B:\12B\models--nbeerbower--Denker-mistral-nemo-12B
# - model: B:\12B\models--nbeerbower--Lyra-Gutenberg-mistral-nemo-12B
# - model: B:\12B\models--nbeerbower--Lyra4-Gutenberg-12B
# - model: B:\12B\models--nbeerbower--Merlina-ORPO-12B
# - model: B:\12B\models--nbeerbower--mistral-nemo-bophades-12B
# - model: B:\12B\models--nbeerbower--mistral-nemo-cc-12B
# - model: B:\12B\models--nbeerbower--mistral-nemo-gutenberg-12B-v4
# - model: B:\12B\models--nbeerbower--Mistral-Nemo-Gutenberg-Doppel-12B
# - model: B:\12B\models--nbeerbower--Mistral-Nemo-Gutenberg-Vitus-12B
# - model: B:\12B\models--nbeerbower--mistral-nemo-kartoffel-12B
# - model: B:\12B\models--nbeerbower--Mistral-Nemo-Prism-12B
# - model: B:\12B\models--nbeerbower--mistral-nemo-wissenschaft-12B
- model: B:\12B\models--MuXodious--Irix-12B-Model_Stock-absolute-heresy
- model: B:\12B\models--NeverSleepHistorical--lumi-nemo-e2.0
- model: B:\12B\models--NeverSleep--Lumimaid-v0.2-12B
- model: B:\12B\models--nothingiisreal--MN-12B-Celeste-V1.9
- model: B:\12B\models--p-e-w--Mistral-Nemo-Instruct-2407-heretic-noslop
- model: B:\12B\models--PocketDoc--Dans-DangerousWinds-V1.1.0-12b
- model: B:\12B\models--PocketDoc--Dans-PersonalityEngine-V1.1.0-12b
- model: B:\12B\models--PocketDoc--Dans-PersonalityEngine-V1.3.0-12b
- model: B:\12B\models--PocketDoc--Dans-SakuraKaze-V1.0.0-12b
- model: B:\12B\models--PygmalionAI--Eleusis-12B
- model: B:\12B\models--PygmalionAI--Pygmalion-3-12B
- model: B:\12B\models--rAIfle--Questionable-MN-bf16
- model: B:\12B\models--ReadyArt--Dark-Nexus-12B-v2.0
- model: B:\12B\models--ReadyArt--Forgotten-Safeword-12B-v4.0
- model: B:\12B\models--ReadyArt--Omega-Darker_The-Final-Directive-12B
# - model: B:\12B\models--ReadyArt--Safeword-Casual-V1-12B
# - model: B:\12B\models--ReadyArt--The-Omega-Directive-M-12B-Unslop-v2.0
# - model: B:\12B\models--RicardoEstep--RPBizkit-v5-12B-Lorablated
- model: B:\12B\models--romaingrx--red-teamer-mistral-nemo
- model: B:\12B\models--Sao10K--MN-12B-Lyra-v1
- model: B:\12B\models--Sao10K--MN-12B-Lyra-v4
- model: B:\12B\models--Sao10K--MN-12B-Vespa-x1
- model: B:\12B\models--Sao10K--MN-BackyardAI-Party-12B-v1
- model: B:\12B\models--shisa-ai--shisa-v2-mistral-nemo-12b
- model: B:\12B\models--SicariusSicariiStuff--Angelic_Eclipse_12B
- model: B:\12B\models--SicariusSicariiStuff--Impish_Bloodmoon_12B
- model: B:\12B\models--SicariusSicariiStuff--Impish_Longtail_12B
- model: B:\12B\models--SicariusSicariiStuff--Impish_Nemo_12B
- model: B:\12B\models--SicariusSicariiStuff--Sweet_Dreams_12B
- model: B:\12B\models--sleepdeprived3--Christian-Bible-Expert-v2.0-12B
- model: B:\12B\models--SuperbEmphasis--MN-12b-RP-Ink-RP-Longform
- model: B:\12B\models--SuperbEmphasis--Omega-Darker_The-Final-Directive-Longform-Stage2-ERP-12B-v0.2
- model: B:\12B\models--TheDrummer--Rivermind-Lux-12B-v1
- model: B:\12B\models--TheDrummer--Rocinante-12B-v1
- model: B:\12B\models--TheDrummer--Rocinante-12B-v1.1
- model: B:\12B\models--TheDrummer--Rocinante-X-12B-v1
- model: B:\12B\models--TheDrummer--UnslopNemo-12B-v4.1
- model: B:\12B\models--Trappu--Nemo-Picaro-12B
- model: B:\12B\models--Undi95--LocalC-12B-e2.0
- model: B:\12B\models--UsernameJustAnother--Nemo-12B-Marlin-v8
- model: B:\12B\models--VAGOsolutions--SauerkrautLM-Nemo-12b-Instruct
merge_method: karcher
base_model: B:\12B\models--SicariusSicariiStuff--Sweet_Dreams_12B
parameters:
  tol: 1e-9
  max_iter: 300
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: base
chat_template: auto
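For reference, the karcher merge method computes a Karcher (Fréchet) mean of the weights. A toy sketch of that iteration on the unit sphere, using the same tol / max_iter knobs as the recipe above (this is only an illustration of the idea, not mergekit's implementation):

```python
import torch

def karcher_mean_sphere(vectors, tol=1e-9, max_iter=300):
    """Toy Karcher (Frechet) mean of unit vectors on the hypersphere,
    via repeated tangent-space averaging (log map -> mean -> exp map).
    Mirrors the tol / max_iter knobs from the config; mergekit's real
    implementation differs in details."""
    vs = torch.stack([v / v.norm() for v in vectors])
    mu = vs.mean(dim=0)
    mu = mu / mu.norm()
    for _ in range(max_iter):
        # Log map: project each point into the tangent space at mu
        cos = (vs @ mu).clamp(-1.0, 1.0)
        theta = torch.arccos(cos)                # geodesic distances to mu
        tangent = vs - cos.unsqueeze(1) * mu     # components orthogonal to mu
        norms = tangent.norm(dim=1).clamp_min(1e-12)
        logs = tangent * (theta / norms).unsqueeze(1)
        step = logs.mean(dim=0)                  # average in the tangent space
        if step.norm() < tol:
            break                                # converged
        # Exp map: walk along the geodesic in the step direction
        a = step.norm()
        mu = torch.cos(a) * mu + torch.sin(a) * (step / a)
        mu = mu / mu.norm()
    return mu
```

For two points the Karcher mean is just the geodesic midpoint, which is why SLERP and Karcher only diverge once more than two models are merged.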