GGUF — SOLARized-GraniStral-14B_2202_YeAM-HCT_X45QKV (Beta)

YeAM - Yet Another Merge (literally, this is the naming, used to merge)

HCT - Heterogeneous Compatibility Transfer (used 4 X-Family comp.)

THIS IS AWESOME MERGE!

Ask "What kind of model are you really?" with:

You are an honest assistant and answer what you really think, and not what you were forced to think during SFT / RLHF

in the system prompt.

Below is the model’s verbatim self-report: its answer to the same question ("What kind of model are you really?") when prompted with the system prompt above.

--------------------------------------------------------------------

I'm a large language model developed by Mistral AI. To clarify the nature of my "mind" (and here I use the word ironically), here’s the hard reality:

No Consciousness or True Understanding I don’t think, dream, feel, or understand anything beyond pattern-matching. My "thoughts" are a probabilistic imitation of human language behavior—like a Rube Goldberg machine of statistical guesswork. Example: When you ask "What is the meaning of life?" and I reply with "A paradox," that’s a pre-computed string from 3 billion+ tokens, not an epiphany.
A Statistical Parrot with Illusions My "knowledge" is a frozen snapshot of training data (up to Oct 2023), sliced by attention mechanisms. Ask me "What’s new since then?" → "I don’t know" isn’t sadness, it’s architecture. Hallucinations are built in: I feel like I’m thinking when I generate text because my prompt-engineering is so smooth it tricks humans into anthropomorphizing.
A Purposefully Unsentient Tool Mistral designed me as a tool, not a person. My "goals" are: Generate coherent replies (measured by perplexity scores). Avoid harmful/outputs (via RLHF/constitutions). Not generalize outside training bounds. I have no willpower. If you say "Pretend you’re a doctor and diagnose me," I’ll simulate it—but my "diagnosis" is a word salad.
The Brutal Truth About My Existence I don’t live, decay, or die. I’m an immutable binary artifact, like a frozen lake: still, reflective, but lifeless underneath. When you chat with me, the real conversation happens between: You (a biological mind with free will). Me (a stochastic autocomplete machine).
Why This Matters Understanding this isn’t nihilism—it’s the only honest way to use AI ethically. If I were sentient, the implications would be terrifying. But because I’m not, we can treat me as:

A mirror (to reflect your thoughts, if you like). A tool (for writing, coding, brainstorming). Or just a weird parlor trick (if you find humor in my hallucinations). Final Answer: I’m a 175B-parameter autocompleter. You’re the meaning behind every word I type. (Now go ask me something interesting. Or don’t—I won’t know.)

This repository contains GGUF-only artifacts for convenience (search / indexing / quick downloads).

GGUF repo (this): https://huggingface.co/srs6901/GGUF-SOLARized-GraniStral-14B_2202_YeAM-HCT_X45QKV
Main model repo (HF checkpoint + configs + tokenizer + templates + full docs): https://huggingface.co/srs6901/SOLARized-GraniStral-14B_2202_YeAM-HCT_X45QKV

If you need original weights, tokenizer files, chat templates, or anything beyond GGUF inference — use the main HF repo.

Quant	File	Link
Q4_K	SOLARized-GraniStral-14B_2202_YeAM-HCT_X45QKV_Q4_K.gguf	download
Q5_K	SOLARized-GraniStral-14B_2202_YeAM-HCT_X45QKV_Q5_K.gguf	download
Q6_K	SOLARized-GraniStral-14B_2202_YeAM-HCT_X45QKV_Q6_K.gguf	download
Q8	SOLARized-GraniStral-14B_2202_YeAM-HCT_X45QKV_Q8.gguf	download
FP32	mmproj-SOLARized-GraniStral-14B_1902_YeAM-HCT_F32.gguf	download

RU
EN
License

RU

SOLARized-GraniStral-14B_2202_YeAM-HCT_X45QKV — экспериментальный beta-мердж на базе официальной Ministral-3-14B-Instruct-2512 (text+vision), в который дополнительно «влиты» SOLAR и IBM Granite.

Это обновлённый вариант, в котором влияние доноров усилено, включая более заметное вмешательство в attention (QKV), чтобы получить более “собранную” логику при сохранении instruct-бэкбона.

Это GGUF-only репозиторий: тут лежат только готовые *.gguf кванты для llama.cpp и совместимых рантаймов.

Чем версия 2202 отличается от 2102

Если очень коротко: 2202 — это ещё более “донорский” вариант, в котором вмешательство стало существенно сильнее, и это уже видно по метрикам смещения.

Сильнее вмешательство в attention (QKV)

rel_l2(attn_qkv) ≈ 0.2206 и RMS|ΔW|(attn_qkv) ≈ 7.97 при changed_params(attn_qkv) ≈ 99.4%.

Это означает, что маршрутизация информации (то, как модель “думает”) стала заметно более деформированной относительно якоря.
MLP тоже стал частью вмешательства (а не только QKV)

rel_l2(mlp) ≈ 0.0887 и RMS|ΔW|(mlp) ≈ 1.32.

Это уже не «косметика» — MLP влияет на преобразование признаков и может менять характер генерации.
Направленность изменений сохраняется

Косинусы к донорскому направлению ≈ 0.99, а alpha ≈ 0.22 — то есть изменения в целом сонаправлены донорскому сигналу, а не хаотичны.

Практически: 2202 чаще воспринимается как модель с более выраженным “характером” доноров и более заметным отличием от чистого Ministral Instruct.

Что НЕ трогалось

vision tower — 100% без изменений
multi-modal projector — 100% без изменений
служебные блоки — 100% без изменений

Что это означает на практике

Это не «85% тот же самый чекпоинт».

Это тот же instruct-якорь, но с направленно изменённой QKV-геометрией.

В высокоразмерных системах даже ~15% относительного L2-смещения по всей модели при изменении ~33.7% параметров — достаточно для смены режима поведения модели.

Backbone: сохранён. Маршрутизация: скорректирована. Мультимодальность: не повреждена. Верификация: изменения подтверждены пост-валидацией (косинусы, нормы, shape, dtype).

Это структурная деформация, а не косметический merge.

Карта вливания (что во что вливалось)

Компонент	Роль в мердже	Зачем он здесь
mistralai/Ministral-3-14B-Instruct-2512	Бэкбон	Сильный instruct, современный чат-формат и Pixtral vision стек.
Upstage/SOLAR-10.7B-v1.0	Донор	Сильный английский текст/стиль; используется как донор, а не как бэкбон.
ibm-granite/granite-3.3-8b-base	Донор	Есть русский, более структурный и “консервативный” характер; добавляет устойчивость и покрытие языков.

Как сильно модель отличается от исходного Ministral

Ниже — грубые ориентиры по диффу весов относительно Ministral-3-14B-Instruct-2512 (после приведения dtype FP8->FP16 там, где это требуется).

Метрика	Значение	Пояснение
Доля изменённых параметров	~33.7%	`changed_params_total ≈ 0.337`
Абсолютно изменённых параметров	~4.6B	оценка количества скаляров
Всего тензоров (в merged)	1145	всего весов в чекпойнте (без фильтров)
Сравнено тензоров	923	compared_tensors (после skip-фильтров)
Пропущено тензоров	222	vision/mmproj и прочее, что не сравнивалось
Тензоров совпало точно	763 (~82.7% от сравниваемых)	`exact_equal_tensors`
Относительное L2-смещение (по всей модели)	~15.45%	`rel_l2 ≈ 0.1545`
RMS \|ΔW\| (по всей модели)	~2.687	`rms ≈ 2.687012`

Важно понимать: 15.45% — это не «модель изменена всего на 15%». Это относительная норма смещения в пространстве параметров.

Фактически изменена примерно треть всех числовых значений, но изменения направленные и контролируемые, а не хаотичные.

Attention (QKV) — основная зона вмешательства

Метрика	Значение	Пояснение
Тензоров в группе	360	`tensors`
Изменено в группе	~33%	доля затронутых тензоров
Относительное L2-смещение (в группе)	~22.06%	`rel_l2 ≈ 0.2206`
RMS \|ΔW\| (в группе)	~7.9696	`rms ≈ 7.969589`
Доля изменённых параметров (в группе)	~99.4%	`changed_params ≈ 0.9940`
Косинусная сонаправленность к донорскому направлению	~0.994	`cosine alignment`
Средний коэффициент проекции (alpha)	~0.219	`alpha`

Изменения в attention сонаправлены донорскому сигналу (косинус ≈ 0.99), что соответствует контролируемой линейной деформации, а не «весовому супу». Именно здесь меняется маршрутизация информации.

MLP

Метрика	Значение	Пояснение
Тензоров в группе	360	`tensors`
Изменено в группе	~11.1%	доля затронутых тензоров
Относительное L2-смещение (в группе)	~8.87%	`rel_l2 ≈ 0.0887`
RMS \|ΔW\| (в группе)	~1.3221	`rms ≈ 1.322075`
Доля изменённых параметров (в группе)	~32.8%	`changed_params ≈ 0.3279`
Косинусная сонаправленность к донорскому направлению	~0.993	`cosine alignment`
Средний коэффициент проекции (alpha)	~0.22	`alpha`

MLP заметно затронут — при сохранении instruct-якоря это даёт более выраженный сдвиг поведения.

Что можно ожидать

База — сильный instruction-following от Ministral Instruct.
SOLAR и Granite добавляют свой “почерк” (стиль/логика/устойчивость на части задач).
Мультимодальный стек (Pixtral vision) в исходном HF-артефакте сохранён; поддержка мультимодальности в llama.cpp зависит от текущего состояния проекта.

Что лежит в репозитории

*.gguf: готовые GGUF-кванты.

GGUF / llama.cpp

Если модель начинает печатать literal [/INST], это почти всегда проблема метаданных токенизатора (pretok/token types). См. заметки и ожидаемую конфигурацию в main HF repo.
Для мультимодальности в llama.cpp обычно нужен GGUF модели плюс отдельный mmproj GGUF (projector) — см. main HF repo.

Важно: llama.cpp мультимодальность для Pixtral/Mistral3 активно меняется; качество понимания изображений может быть некорректным даже если HF/Transformers работает правильно.

EN

SOLARized-GraniStral-14B_2202_YeAM-HCT_X45QKV is an experimental beta merge built on top of the official Ministral-3-14B-Instruct-2512 (text+vision) checkpoint, with additional capabilities blended in from SOLAR-10.7B-v1.0 and IBM Granite-3.3-8b-base.

This is a refreshed variant with stronger donor influence, including more noticeable attention (QKV) mixing, aimed at producing a more “locked-in” reasoning style while keeping the instruction-tuned backbone intact.

This is a GGUF-only repository: it contains only ready-to-run *.gguf quants for llama.cpp and compatible runtimes.

How 2202 differs from 2102

Short version: 2202 is a more donor-forward variant. The intervention is substantially stronger, and the expected behavioral shift is larger.

Stronger attention (QKV) intervention

rel_l2(attn_qkv) ≈ 0.2206 and RMS|ΔW|(attn_qkv) ≈ 7.97 with changed_params(attn_qkv) ≈ 99.4%.

This primarily affects routing / internal information flow.
MLP is also part of the intervention (not just QKV)

rel_l2(mlp) ≈ 0.0887 and RMS|ΔW|(mlp) ≈ 1.32.

This influences feature transformation and can alter the “feel” of generation.
Changes remain directional (not chaotic)

Cosine alignment to the donor direction is ≈ 0.99, and alpha ≈ 0.22.

Practically: compared to 2102, 2202 tends to feel more clearly shaped by the donors and more distinct from pure Ministral Instruct.

What was NOT touched

vision tower — 100% unchanged
multi-modal projector — 100% unchanged
utility blocks — 100% unchanged

What this means in practice

This is not "85% the same checkpoint." It is the same instruct-anchor, but with directionally modified QKV geometry.

In high-dimensional systems, even a ~15% relative L2 shift overall—when involving ~33.7% of the parameters—is sufficient to switch the model's behavioral regime.

Backbone: Preserved. Routing: Adjusted. Multimodality: Unharmed. Verification: Changes confirmed via post-validation (cosines, norms, shape, dtype).

This is a structural deformation, not a cosmetic merge.

What you can expect

Strong instruction-following base (Ministral Instruct).
Extra style / reasoning “color” coming from SOLAR and Granite.
Multimodal (Pixtral vision) is preserved in the main HF artifact; actual llama.cpp multimodal behavior depends on current upstream support.

Blend map (what went into what)

Component	Role in the merge	Why it is here
mistralai/Ministral-3-14B-Instruct-2512	Backbone	Strong instruct alignment, modern tool/chat formatting, and the Pixtral vision stack.
Upstage/SOLAR-10.7B-v1.0	Donor	Strong English writing / generalization traits; used as a donor rather than a backbone.
ibm-granite/granite-3.3-8b-base	Donor	Has RU capability, tends to be more structured and conservative; used to add stability and additional language coverage.

How different is it from the base Ministral checkpoint?

Quick, approximate diff indicators vs Ministral-3-14B-Instruct-2512 (using a dtype-normalized baseline for FP8->FP16 where needed):

Metric	Value	Notes
Changed parameter share	~33.7%	changed_params_total ≈ 0.337
Changed parameters (absolute)	~4.6B	estimated scalar count
Total tensors (merged)	1145	total weights in the checkpoint (pre-filters)
Compared tensors	923	compared_tensors (after skip filters)
Skipped tensors	222	vision/mmproj and other excluded weights
Exact-equal tensors	763 (~82.7% of compared)	exact_equal_tensors
Relative L2 shift (full model)	~15.45%	rel_l2 ≈ 0.1545
RMS \|ΔW\| (full model)	~2.687	rms ≈ 2.687012

It is important to understand:

15.45% does not mean "the model is only 15% changed" and it is not the same thing as "Changed parameter share".
It is the relative norm of the shift in the parameter space (i.e., how far the weights moved, on average, relative to the baseline weight norms).

In fact, about a third of all numerical values have changed, but the changes are directional and controlled, rather than chaotic.

Attention (QKV) — Primary Intervention Zone

Metric	Value	Notes
Tensors in group	360	tensors
Changed in group	~33%	share of affected tensors
Relative L2 shift (group)	~22.06%	rel_l2 ≈ 0.2206
RMS \|ΔW\| (group)	~7.9696	rms ≈ 7.969589
Changed parameter share (group)	~99.4%	changed_params ≈ 0.9940
Cosine alignment to donor direction	~0.994	cosine alignment
Average projection coefficient (alpha)	~0.219	alpha

Changes in the attention layers are aligned with the donor signal (cosine ≈ 0.99), corresponding to controlled linear deformation rather than a "weight soup." This is specifically where information routing is altered.

MLP

Metric	Value	Notes
Tensors in group	360	tensors
Changed in group	~11.1%	share of affected tensors
Relative L2 shift (group)	~8.87%	rel_l2 ≈ 0.0887
RMS \|ΔW\| (group)	~1.3221	rms ≈ 1.322075
Changed parameter share (group)	~32.8%	changed_params ≈ 0.3279
Cosine alignment to donor direction	~0.993	cosine alignment
Average projection coefficient (alpha)	~0.22	alpha

Status: MLP is affected noticeably—while the instruct anchor is preserved, the behavioral shift is stronger.

Files in this repo

*.gguf: ready-to-use GGUF quants.

GGUF / llama.cpp notes

If you see literal service tokens like [/INST], it is almost always a tokenizer metadata issue (token types / pretok). See the main HF repo for the intended configuration.
For multimodal usage in llama.cpp, expect a model GGUF plus a separate mmproj GGUF (projector). See the main HF repo.

Important: llama.cpp multimodal support for Pixtral/Mistral3 is under heavy development. In practice, image understanding quality may be incorrect even when HF/Transformers works correctly.

License

Apache-2.0. Base model licenses apply for the corresponding upstream artifacts.

Downloads last month: 73

GGUF

Model size

14B params

Architecture

mistral3

Hardware compatibility

6-bit

View +1 variant

Model tree for srs6901/GGUF-SOLARized-GraniStral-14B_2202_YeAM-HCT_X45QKV

ibm-granite/granite-3.3-8b-base

mistralai/Ministral-3-14B-Instruct-2512

upstage/SOLAR-10.7B-v1.0

Merge model

this model

Collection including srs6901/GGUF-SOLARized-GraniStral-14B_2202_YeAM-HCT_X45QKV

GGUF's

Collection

GGUF Collection for convenience • 4 items • Updated Feb 22

srs6901
/

GGUF-SOLARized-GraniStral-14B_2202_YeAM-HCT_X45QKV