---
# Model Tools by Naphula
Tools to enhance LLM quantizations and merging. Merge and audit large language models on low VRAM GPUs.
# [graph_v18.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/graph_v18.py)
- Merge models in minutes instead of hours on low VRAM. On a 3060/3060 Ti, this script enables merges that would otherwise be impossible without OOM (70B models, or large 7B merges with `--cuda`). [More details here](https://huggingface.co/spaces/Naphula/model_tools/blob/main/mergekit_low-VRAM-graph_patch.md)
# config.py
- Replace line 13, `ScalarOrGradient: TypeAlias = Union[float, List[float]]`, with `ScalarOrGradient: TypeAlias = Union[float, List[float], str, bool]` to allow custom filepath strings in parameter settings.
# [embed_12B.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/embed_12B.py) and [embed_24B.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/embed_24B.py)
- An alternate solution for cases where `--fix-mistral-regex` and `tokensurgeon` fail, such as `della` or `passthrough` merges between models with mismatched `vocab_size`. Read [the guide](https://huggingface.co/spaces/Naphula/model_tools/blob/main/Mergekit-Robustness-Patch-embed_v2.md), then download either file and save it as `mergekit-main\mergekit\tokenizer\embed.py`. One version targets Mistral Nemo 12B (v2d), the other Mistral Small 24B (v2a).
- Note that the default `embed.py` sometimes works best, so keep a copy of it as well; if it fails, try the 12B or 24B version.
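To illustrate the underlying problem, here is a generic sketch of aligning embedding matrices whose `vocab_size` differs. This is not mergekit's actual `embed.py` logic; `align_embeddings` is a hypothetical helper, and mean-initialization of new rows is just one common heuristic:

```python
import numpy as np

def align_embeddings(embed: np.ndarray, target_vocab: int) -> np.ndarray:
    """Pad or truncate an embedding matrix to target_vocab rows.

    New rows are initialized to the mean of the existing embeddings,
    a common heuristic for freshly added tokens.
    """
    vocab, dim = embed.shape
    if vocab == target_vocab:
        return embed
    if vocab > target_vocab:
        return embed[:target_vocab]  # drop trailing rows
    pad = np.tile(embed.mean(axis=0), (target_vocab - vocab, 1))
    return np.vstack([embed, pad])

# Example: grow a (4, 8) matrix to 6 rows, or shrink it to 3
e = np.random.rand(4, 8)
grown = align_embeddings(e, 6)
shrunk = align_embeddings(e, 3)
```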
# [enable_fix_mistral_regex_true.md](https://huggingface.co/spaces/Naphula/model_tools/blob/main/enable_fix_mistral_regex_true.md)
- Merge models with extreme tokenizer incompatibility. This requires modifying the `tokenizer` section of `mergekit.yaml` and adding `--fix-mistral-regex` to your merge commands. (Note: do not use `token_surgeon.py`, `gen_id_patcher.py`, or `vocab_id_patcher.py` with this; they are now obsolete.) Configured for MN 12B by default. Follow the steps in the guide to modify these scripts:
- `mergekit/merge.py`
# [arcee_fusion_salience_scanner.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/arcee_fusion_salience_scanner.py)
- Scan the salience percentage of your arcee_fusion merges. The default `tukey_fence` value of 1.5 yields about 12.5% salience, but [it can be adjusted (see the guide here)](modify_arcee_fusion_tukey_fence_parameter.md).
- An updated version is available: [arcee_fusion_salience_scanner_v3.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/arcee_fusion_salience_scanner_v3.py)
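For reference, this is the standard Tukey fence rule that the `tukey_fence` parameter presumably controls (a generic sketch; the scanner's internals may differ, and `tukey_upper_fence` is an illustrative name):

```python
import statistics

def tukey_upper_fence(values, fence: float = 1.5) -> float:
    """Upper Tukey fence: Q3 + fence * IQR.

    Values above this threshold count as outliers, i.e. the
    salient parameter deltas a fusion scan would flag.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 + fence * (q3 - q1)

data = [1, 2, 3, 4, 5, 100]
# Raising the fence raises the threshold, so fewer values are flagged
low = tukey_upper_fence(data, 1.5)
high = tukey_upper_fence(data, 3.0)
```

A larger `fence` therefore lowers the reported salience percentage, which matches the guide's premise that the 12.5% default can be tuned.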
# [eos_scanner.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner.py)
- Updated! This tool scans the tokenizer JSON files to detect EOS token mismatches, which cause early-termination bugs. You can then use [gen_id_patcher.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/gen_id_patcher.py) and [vocab_id_patcher.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/vocab_id_patcher.py), or [chatml_to_mistral.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/chatml_to_mistral.py), to patch a missing `generation_config.json` EOS token. See [this post](https://huggingface.co/Naphula/Q0_Bench/discussions/1?not-for-all-audiences=true#6987717c762f0a45f672e250) as well as the [EOS Scanner ReadMe](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner_readme.md) for more info.
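A minimal sketch of the kind of check such a scanner performs (this is not the actual `eos_scanner.py` code; `eos_mismatch` is a hypothetical helper, and only the `eos_token_id` fields of the standard Hugging Face config files are compared):

```python
import json

def eos_mismatch(config_path: str, gen_config_path: str):
    """Compare eos_token_id between config.json and generation_config.json.

    A mismatch, or a missing generation_config.json, can make a merged
    model terminate generation too early.
    """
    with open(config_path) as f:
        model_eos = json.load(f).get("eos_token_id")
    try:
        with open(gen_config_path) as f:
            gen_eos = json.load(f).get("eos_token_id")
    except FileNotFoundError:
        return ("missing generation_config.json", model_eos, None)
    status = "ok" if model_eos == gen_eos else "mismatch"
    return (status, model_eos, gen_eos)
```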