---
# Model Tools by Naphula
Tools to enhance LLM quantizations and merging. Merge and audit large language models on low VRAM GPUs.
# [graph_v18.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/graph_v18.py)
- Merge models in minutes instead of hours on low VRAM. On a 3060/3060 Ti, this script enables merges that would otherwise be impossible without OOM (70B models, or large 7B merges with `--cuda`). [More details here](https://huggingface.co/spaces/Naphula/model_tools/blob/main/mergekit_low-VRAM-graph_patch.md)
# config.py
- Replace line 13, `ScalarOrGradient: TypeAlias = Union[float, List[float]]`, with `ScalarOrGradient: TypeAlias = Union[float, List[float], str, bool]` to allow custom filepath strings in parameter settings.
# [embed_12B.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/embed_12B.py) and [embed_24B.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/embed_24B.py)
- An alternate solution for cases where `--fix-mistral-regex` and `tokensurgeon` fail, such as `della` or `passthrough` merges between models with mismatched `vocab_size`. Read [the guide](https://huggingface.co/spaces/Naphula/model_tools/blob/main/Mergekit-Robustness-Patch-embed_v2.md), then download either file and save it as `mergekit-main\mergekit\tokenizer\embed.py`. One version targets Mistral Nemo 12B (v2d), the other Mistral Small 24B (v2a).
- Note that the default `embed.py` sometimes works best, so keep a copy of it as well; if it fails, try the 12B or 24B version.
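To illustrate the underlying problem, here is a generic sketch of aligning embedding matrices whose `vocab_size` differs. This is not mergekit's actual `embed.py` logic; `align_embeddings` is a hypothetical helper, and mean-initialization of new rows is just one common heuristic:

```python
import numpy as np

def align_embeddings(embed: np.ndarray, target_vocab: int) -> np.ndarray:
    """Pad or truncate an embedding matrix to target_vocab rows.

    New rows are initialized to the mean of the existing embeddings,
    a common heuristic for freshly added tokens.
    """
    vocab, dim = embed.shape
    if vocab == target_vocab:
        return embed
    if vocab > target_vocab:
        return embed[:target_vocab]  # drop trailing rows
    pad = np.tile(embed.mean(axis=0), (target_vocab - vocab, 1))
    return np.vstack([embed, pad])

# Example: grow a (4, 8) matrix to 6 rows, or shrink it to 3
e = np.random.rand(4, 8)
grown = align_embeddings(e, 6)
shrunk = align_embeddings(e, 3)
```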
# [enable_fix_mistral_regex_true.md](https://huggingface.co/spaces/Naphula/model_tools/blob/main/enable_fix_mistral_regex_true.md)
- Merge models with extreme tokenizer incompatibility. This requires modifying the `tokenizer` section of `mergekit.yaml` and adding `--fix-mistral-regex` to your merge commands. (Note: do not use `token_surgeon.py`, `gen_id_patcher.py`, or `vocab_id_patcher.py` with this; they are now obsolete.) Configured for MN 12B by default. Follow the steps in the guide to modify these scripts:
- `mergekit/merge.py`
# [arcee_fusion_salience_scanner.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/arcee_fusion_salience_scanner.py)
- Scan the salience percentage of your arcee_fusion merges. The default `tukey_fence` value of 1.5 yields about 12.5% salience, but [it can be adjusted (see the guide here)](modify_arcee_fusion_tukey_fence_parameter.md).
- An updated version is available: [arcee_fusion_salience_scanner_v3.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/arcee_fusion_salience_scanner_v3.py)
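For reference, this is the standard Tukey fence rule that the `tukey_fence` parameter presumably controls (a generic sketch; the scanner's internals may differ, and `tukey_upper_fence` is an illustrative name):

```python
import statistics

def tukey_upper_fence(values, fence: float = 1.5) -> float:
    """Upper Tukey fence: Q3 + fence * IQR.

    Values above this threshold count as outliers, i.e. the
    salient parameter deltas a fusion scan would flag.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 + fence * (q3 - q1)

data = [1, 2, 3, 4, 5, 100]
# Raising the fence raises the threshold, so fewer values are flagged
low = tukey_upper_fence(data, 1.5)
high = tukey_upper_fence(data, 3.0)
```

A larger `fence` therefore lowers the reported salience percentage, which matches the guide's premise that the 12.5% default can be tuned.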
# [eos_scanner.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner.py)
- Updated! This tool scans the tokenizer JSON files to detect EOS token mismatches, which cause early-termination bugs. You can then use [gen_id_patcher.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/gen_id_patcher.py) and [vocab_id_patcher.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/vocab_id_patcher.py), or [chatml_to_mistral.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/chatml_to_mistral.py), to patch a missing `generation_config.json` EOS token. See [this post](https://huggingface.co/Naphula/Q0_Bench/discussions/1?not-for-all-audiences=true#6987717c762f0a45f672e250) as well as the [EOS Scanner ReadMe](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner_readme.md) for more info.
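A minimal sketch of the kind of check such a scanner performs (this is not the actual `eos_scanner.py` code; `eos_mismatch` is a hypothetical helper, and only the `eos_token_id` fields of the standard Hugging Face config files are compared):

```python
import json

def eos_mismatch(config_path: str, gen_config_path: str):
    """Compare eos_token_id between config.json and generation_config.json.

    A mismatch, or a missing generation_config.json, can make a merged
    model terminate generation too early.
    """
    with open(config_path) as f:
        model_eos = json.load(f).get("eos_token_id")
    try:
        with open(gen_config_path) as f:
            gen_eos = json.load(f).get("eos_token_id")
    except FileNotFoundError:
        return ("missing generation_config.json", model_eos, None)
    status = "ok" if model_eos == gen_eos else "mismatch"
    return (status, model_eos, gen_eos)
```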