Vikras — Experimental Family of Language Models


Table of Contents

Project overview
What this repository is
Current Release: HCT/YeAM (12-18B)
Vikra MixedPrc (MixP_4.9b_S)
MixP_4.9b_S: details
Artifacts
HCT (architecture) / YeAM (implementation invariant)

Project overview

Vikras is an experimental family of language models exploring how:

representation geometry
quantization
hybrid merges

affect transformer numerical dynamics.

The Vikras project is not tied to a single base model or architecture. It is a family of models united by the idea of numerical invariance across experiments.

Vikra_% — a specific model
Vikras — the experimental family
S / M / L — aggressiveness and bit allocation variants
MixP / FullP / HCT — quantization / merge invariants

What this repository is

This is a curated 12–18B showcase for the Vikras family. Only larger releases and their GGUF artifacts are mirrored here.

Main repository (lab and full catalog):

https://huggingface.co/srs6901/Vikras-MixP

The main repository is updated faster and contains the complete set of experiments.

Current Release: HCT/YeAM (12-18B)

Releases

Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B
- Merge recipe: Vikhr-Nemo-12B + Gemma-3-1b (HCT/YeAM-derived release)
- HF: https://huggingface.co/srs6901/Vikras-12-to-18b-collection/tree/main/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B
- GGUF: https://huggingface.co/srs6901/Vikras-12-to-18b-collection/blob/main/Vikra-HCT-YeAM-Vikhr-NemoGemma-12B_plus_1B-Q6_K.gguf

Vikra MixedPrc (MixP_4.9b_S)

GGUF: https://huggingface.co/srs6901/Vikras-12-to-18b-collection/blob/main/Vikra-MixP_4.9b_S.gguf
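One way to fetch this file programmatically is through the `huggingface_hub` client. The sketch below is an illustration rather than part of the release: `split_hf_blob_url` and `download_gguf` are hypothetical helper names, and the download step assumes `huggingface_hub` is installed and the file is publicly readable.

```python
from urllib.parse import urlparse

GGUF_URL = (
    "https://huggingface.co/srs6901/Vikras-12-to-18b-collection/"
    "blob/main/Vikra-MixP_4.9b_S.gguf"
)


def split_hf_blob_url(url):
    """Split a Hub 'blob' URL into (repo_id, revision, filename)."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected layout: owner / repo / "blob" / revision / path-to-file
    owner, repo, kind, revision = parts[:4]
    if kind != "blob":
        raise ValueError(f"not a blob URL: {url}")
    filename = "/".join(parts[4:])
    return f"{owner}/{repo}", revision, filename


def download_gguf(url):
    """Download the file behind a blob URL and return its local cache path."""
    # Imported lazily so the URL helper works without the library installed.
    from huggingface_hub import hf_hub_download

    repo_id, revision, filename = split_hf_blob_url(url)
    return hf_hub_download(repo_id=repo_id, filename=filename, revision=revision)


# Usage (performs a multi-gigabyte download):
# local_path = download_gguf(GGUF_URL)
```

The returned path points into the local Hub cache and can be passed to any GGUF-compatible runtime.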

MixP_4.9b_S: details

Architecture (for the MixP_4.9b_S release)

Parameter	Value
Architecture	Mistral-based
Params	~12.25B
Layers	40
Hidden size	5120
FFN size	14336
Heads	32 (8 KV heads, GQA)
Context	1,024,000
Vocab	131,072 (Tekken BPE)
RoPE theta	1,000,000

MixP_4.9b_S — quantization scheme

A hybrid mixed-precision scheme with per-tensor type allocation.

Tensor group	Quant type	BPW
token_embd, output	BF16	16
attn_norm, ffn_norm, output_norm	F32	32
attn_q	Q4_K	4.5
attn_k	Q5_K	5.5
attn_v	Q3_K	3.44
attn_output	Q4_K	4.5
ffn_gate	Q3_K	3.44
ffn_up	Q5_K	5.5
ffn_down	Q5_K / Q6_K	5.5–6.56

Totals:

Quantized layers only: ~4.89 BPW
Full model average: ~6.11 BPW
File size: ~8.71 GiB
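The totals can be cross-checked from the architecture and quantization tables. The sketch below is a back-of-the-envelope estimate under two assumptions that are not stated in this card: a head dimension of 128 (`HEAD_DIM`), and `ffn_down` split evenly between Q5_K and Q6_K across layers (`Q6_DOWN_FRACTION`). It uses the exact bits per weight of the llama.cpp K-quant block layouts, which the table rounds to two decimals.

```python
# Shapes taken from the architecture table above.
HIDDEN, FFN, LAYERS, VOCAB = 5120, 14336, 40, 131072
HEADS, KV_HEADS = 32, 8
HEAD_DIM = 128          # assumption: not listed in the table
Q6_DOWN_FRACTION = 0.5  # assumption: share of layers with ffn_down in Q6_K

# Exact bits per weight (K-quants store 256 weights per super-block).
BPW = {"Q3_K": 3.4375, "Q4_K": 4.5, "Q5_K": 5.5, "Q6_K": 6.5625,
       "BF16": 16.0, "F32": 32.0}

down_bpw = (1 - Q6_DOWN_FRACTION) * BPW["Q5_K"] + Q6_DOWN_FRACTION * BPW["Q6_K"]

# (parameter count, bits per weight) for each quantized tensor in one layer.
per_layer = {
    "attn_q": (HIDDEN * HEADS * HEAD_DIM, BPW["Q4_K"]),
    "attn_k": (HIDDEN * KV_HEADS * HEAD_DIM, BPW["Q5_K"]),
    "attn_v": (HIDDEN * KV_HEADS * HEAD_DIM, BPW["Q3_K"]),
    "attn_output": (HEADS * HEAD_DIM * HIDDEN, BPW["Q4_K"]),
    "ffn_gate": (HIDDEN * FFN, BPW["Q3_K"]),
    "ffn_up": (HIDDEN * FFN, BPW["Q5_K"]),
    "ffn_down": (FFN * HIDDEN, down_bpw),
}

quant_params = LAYERS * sum(n for n, _ in per_layer.values())
quant_bits = LAYERS * sum(n * bpw for n, bpw in per_layer.values())

embed_params = 2 * VOCAB * HIDDEN            # token_embd + output, BF16
norm_params = LAYERS * 2 * HIDDEN + HIDDEN   # attn_norm, ffn_norm, output_norm, F32

total_params = quant_params + embed_params + norm_params
total_bits = quant_bits + embed_params * BPW["BF16"] + norm_params * BPW["F32"]

quant_bpw = quant_bits / quant_params
full_bpw = total_bits / total_params
size_gib = total_bits / 8 / 2**30  # tensor data only, GGUF metadata ignored

print(f"params: {total_params / 1e9:.2f}B")
print(f"quantized layers: {quant_bpw:.2f} BPW")
print(f"full model: {full_bpw:.2f} BPW")
print(f"tensor data: {size_gib:.2f} GiB")
```

Under these assumptions the estimate lands close to the ~12.25B, ~4.89 BPW, and ~6.11 BPW figures above, and the byte count comes to about 8.71 when expressed in GiB (2^30 bytes). Changing `Q6_DOWN_FRACTION` moves the quantized-layer average between roughly 4.75 and 5.04 BPW.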

Core idea of MixP

MixP is not “compress everything equally”.

It is anisotropic quantization of information channels:

Q/K remain in higher precision
V and gate are intentionally quantized down to Q3_K
Norms and the output layer remain in high precision (F32 and BF16, respectively)

This redistribution changes the numerical dynamics of the model:

increased structural sparsification
shifts in hidden norm distribution
changes in logit entropy
regime sensitivity

This is not a new architecture. It is a modification of the numerical geometry of an existing one.

Observed effects

preservation of top-1 predictions on simple tasks
increased entropy without collapse of maximum probability
expansion of hidden norms on complex tasks
regime bifurcation: simple tasks are approximately invariant, complex tasks are sensitive

These effects are interpreted as a geometric shift of representations rather than a universal quality improvement.
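To make "increased entropy without collapse of maximum probability" concrete, the following sketch compares two logit vectors. The numbers are invented for illustration and are not measurements from any Vikras model; `baseline_logits` and `shifted_logits` are hypothetical names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def top1(logits):
    """Index of the largest logit."""
    return max(range(len(logits)), key=logits.__getitem__)


# Illustrative values only: the second vector is flatter but keeps the same argmax.
baseline_logits = [6.0, 2.0, 1.0, 0.5]
shifted_logits = [5.0, 2.5, 1.8, 1.2]

for name, logits in [("baseline", baseline_logits), ("shifted", shifted_logits)]:
    probs = softmax(logits)
    print(f"{name}: top-1={top1(logits)} "
          f"p_max={max(probs):.3f} entropy={entropy(probs):.3f} nats")
```

In this toy case the top-1 token is unchanged and still holds most of the probability mass, while the entropy of the distribution goes up. This is the pattern the list above describes for simple tasks.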

math_subattention (working hypothesis)

Experiments show an effect informally referred to as:

“math_subattention”

This describes:

reduced contribution of small V components
dominance of stronger residual directions
increased inertia from the previous token's state
reduced frequency of small logit switching

This is not an architectural claim. It is a working hypothesis about the dynamics that emerge under Q3_K symmetric quantization.

The term is used descriptively.
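The first item, reduced contribution of small V components, can be illustrated with a deliberately simplified quantizer. The sketch below is a plain symmetric round-to-nearest scheme with one scale per block. It is not the Q3_K format, which is more elaborate (super-blocks with their own quantized scales), and `quantize_symmetric` is a hypothetical helper.

```python
def quantize_symmetric(block, bits=3):
    """Round-to-nearest symmetric quantization with a single per-block scale.

    Returns the dequantized values, so the rounding error is directly visible.
    """
    qmax = 2 ** (bits - 1) - 1          # 3 for a 3-bit signed grid
    amax = max(abs(x) for x in block)
    if amax == 0:
        return [0.0 for _ in block]
    scale = amax / qmax                 # the largest magnitude maps to qmax
    return [round(x / scale) * scale for x in block]


# Illustrative block: two large components, one medium, three small.
block = [0.9, -0.6, 0.05, -0.04, 0.3, 0.02]
dequantized = quantize_symmetric(block)

for original, restored in zip(block, dequantized):
    print(f"{original:+.2f} -> {restored:+.2f}")
```

With a 3-bit grid the step size is a third of the largest magnitude in the block, so every component smaller than half a step rounds to zero, while the dominant components survive almost unchanged. This is one plausible mechanism behind the hypothesis, not a demonstration of it.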

Perplexity

Measured on wikitext-2-raw-test (full):

Model	Precision	PPL
Vikra MixP_4.9b_S	6.11 BPW	5.50 ± 0.03
Baseline BF16	Full	6.02 ± 0.03
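For reference, perplexity is the exponential of the mean negative log-likelihood per token. The sketch below shows that definition only. It does not reproduce the evaluation tool or settings used for the table, and the log-probabilities in the example are synthetic.

```python
import math


def perplexity(token_logprobs):
    """Perplexity from natural-log probabilities of the reference tokens."""
    if not token_logprobs:
        raise ValueError("need at least one token")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Synthetic example: a model that assigns probability 1/4 to every token
# is as uncertain as a uniform choice among four options.
uniform_quarter = [math.log(0.25)] * 8
print(perplexity(uniform_quarter))
```

Lower values mean the model assigns higher probability to the reference text; a difference of a few tenths on wikitext-2 corresponds to a small change in the mean per-token log-likelihood.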

Artifacts

Vikra-MXFP4.gguf

HCT (architecture) / YeAM (implementation invariant)

HCT (Heterogeneous Compatibility Transfer) is an architectural invariant: a practical way to assemble compatible checkpoints and derived releases when moving across bases and model families.

YeAM (Yet Another Merge) is an implementation invariant of HCT and a standalone HF→HF merge scheme: it is not “just another SLERP/DARE/TIES” and not a cosmetic variant of averaging.

YeAM produces a standard HF output (safetensors + index) and supports:

direct weight-to-weight merging
targeted knowledge injection into a chosen model (knowledge distillation mode), aligned across multiple sources
an additional Attention-layer merge as a second technique on top of YeAM
merging smaller models into larger ones (scale-up merge) while keeping a compatible HF format

YeAM operates in a real 4D formulation: updates are encoded geometrically and aligned via ray intersections in parameter space. This produces controlled merges that preserve structure instead of collapsing into naive averaging.
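For context, the baselines that the text sets YeAM apart from can be written in a few lines. The sketch below shows naive linear averaging and SLERP on toy vectors. It is not YeAM, whose algorithm is not specified in this card, and the vectors stand in for flattened weight tensors.

```python
import math


def lerp(a, b, t):
    """Naive linear interpolation (t = 0.5 is plain averaging)."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation between two vectors."""
    na, nb = norm(a), norm(b)
    if na == 0 or nb == 0:
        return lerp(a, b, t)
    cos_omega = sum(x * y for x, y in zip(a, b)) / (na * nb)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    if math.sin(omega) < eps:
        # Nearly parallel vectors: SLERP degenerates to linear interpolation.
        return lerp(a, b, t)
    wa = math.sin((1 - t) * omega) / math.sin(omega)
    wb = math.sin(t * omega) / math.sin(omega)
    return [wa * x + wb * y for x, y in zip(a, b)]


# Toy example: two orthogonal unit vectors.
a, b = [1.0, 0.0], [0.0, 1.0]
print("lerp norm: ", norm(lerp(a, b, 0.5)))
print("slerp norm:", norm(slerp(a, b, 0.5)))
```

Averaging two orthogonal unit vectors shrinks the norm to about 0.707, while SLERP keeps it at 1. That shrinkage is one example of the structure loss that merge schemes beyond naive averaging try to avoid.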
