	# LuminaRS — Lightweight Recursive Art Image Generator

	A novel ~90M-parameter image generation model for art and illustration, built to run on mobile devices with 2-4GB of VRAM.

	## Why LuminaRS?

	| Problem | Current Solutions | LuminaRS |
	|---------|-------------------|----------|
	| Heavy models (6-12GB) | SDXL, Flux | ~90M params, <500MB |
	| Can't run on mobile | Quantized SD (quality loss) | Designed small from scratch |
	| Poor prompt adherence | SD 1.5 | TRM-style recursive reasoning |
	| No art specialization | General photo models | Art-focused training stages |
	| Unstable training | Diffusion (score matching) | Flow matching (stable ODE) |

	## Architecture (Novel Contributions)

	### 1. Recursive Shared-Weight Refinement (from TRM)
	Inspired by [Tiny Recursive Models](https://arxiv.org/abs/2510.04871), where a 7M-parameter model is reported to beat LLMs more than 200x its size on puzzle-reasoning benchmarks.
	```python
	for _ in range(T):
	    z = z + unet(z, text, t)  # same unet weights on every pass
	```
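	Effective depth is T x L (T passes through an L-layer network), while the parameter count stays that of a single L-layer network instead of growing T-fold.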
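
To make the depth-versus-parameters trade concrete, here is a minimal pure-Python sketch. `ToyBlock` and `refine` are hypothetical stand-ins, not part of the `luminars` package: the block holds L scalar weights in place of a real UNet, and the loop reuses it T times.

```python
from dataclasses import dataclass


@dataclass
class ToyBlock:
    """Hypothetical stand-in for the shared UNet: L layers, one scalar weight each."""
    weights: list

    def __call__(self, z):
        out = list(z)
        for w in self.weights:  # pass through all L layers
            out = [w * v for v in out]
        return out


def refine(block, z, T):
    """Apply the same block T times with a residual update, as in the loop above."""
    for _ in range(T):
        update = block(z)
        z = [a + b for a, b in zip(z, update)]
    return z


block = ToyBlock(weights=[0.5, 0.5])  # L = 2 layers, so 2 parameters
T = 3
z = refine(block, [1.0, -2.0], T)

num_params = len(block.weights)           # stays at L
effective_depth = T * len(block.weights)  # grows to T x L
print(z, num_params, effective_depth)
```

Raising `T` deepens the computation without touching `num_params`, which is the property the recursive design relies on.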

	### 2. Flow Matching (instead of Diffusion)
	- Target velocity: v(x_t, t) = x_clean - x_noise, constant along the straight line from noise to image
	- 10-12 inference steps vs 50+ for diffusion
	- No score matching instability
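
The training target behind these bullets can be sketched in plain Python. This assumes the common convention that t = 0 is pure noise and t = 1 is the clean sample, with x_t a linear interpolation between them. The function names are illustrative and are not taken from the `luminars` code.

```python
import random


def interpolate(x_noise, x_clean, t):
    """Point on the straight line from noise (t=0) to data (t=1)."""
    return [(1.0 - t) * n + t * c for n, c in zip(x_noise, x_clean)]


def target_velocity(x_noise, x_clean):
    """Time derivative of the linear path; it does not depend on t."""
    return [c - n for n, c in zip(x_noise, x_clean)]


def flow_matching_loss(predict_v, x_clean, rng):
    """Mean squared error between predicted and target velocity for one sample."""
    x_noise = [rng.gauss(0.0, 1.0) for _ in x_clean]
    t = rng.random()
    x_t = interpolate(x_noise, x_clean, t)
    v_pred = predict_v(x_t, t)
    v_true = target_velocity(x_noise, x_clean)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(x_clean)


rng = random.Random(0)
zero_model = lambda x_t, t: [0.0] * len(x_t)  # placeholder for the denoiser
loss = flow_matching_loss(zero_model, [1.0, 4.0], rng)
print(loss)
```

Because the regression target is a plain difference of two known tensors, the loss needs no noise schedule or score estimate.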

	### 3. ConvNeXt + MQA Cross-Attention
	Each block combines a depthwise 7x7 convolution, Adaptive LayerNorm conditioned on the timestep, multi-query (MQA) cross-attention over the text embeddings, and a GELU MLP.
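
As an illustration of the multi-query part, the sketch below computes cross-attention in which every query head attends to one shared set of text keys and values. It is a pure-Python toy with no learned projections, and the function name and data layout are assumptions made for this example.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mqa_cross_attention(q_heads, keys, values):
    """q_heads: [n_heads][n_img][d]; keys and values: [n_txt][d], shared by all heads."""
    d = len(keys[0])
    out = []
    for head_queries in q_heads:  # each head has its own queries
        head_out = []
        for q in head_queries:
            scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
            weights = softmax(scores)
            head_out.append([
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ])
        out.append(head_out)
    return out


q_heads = [
    [[1.0, 0.0]],  # head 0, one image token
    [[0.0, 0.0]],  # head 1, one image token
]
keys = [[1.0, 0.0], [0.0, 1.0]]  # two text tokens, shared by both heads
values = [[10.0, 0.0], [0.0, 10.0]]
out = mqa_cross_attention(q_heads, keys, values)
print(out)
```

Since keys and values are shared, only the queries differ per head, which is what shrinks the key/value projections relative to standard multi-head attention.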

	### 4. Staged Freeze/Thaw Training
	| Stage | What's Trained | LR |
	|-------|----------------|------|
	| 1 | All denoiser params | 1e-4 |
	| 2 | Cross-attention only | 1e-5 |
	| 3 | All params, joint | 1e-6 |

	The VAE and CLIP stay frozen in every stage.
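
One way to express this schedule in code is a small table of stages plus a predicate that decides which parameters receive gradients. The sketch assumes cross-attention parameters can be recognized by a `cross_attn` substring in their names. That naming, the `vae.` and `clip.` prefixes, and the example parameter names are all hypothetical.

```python
STAGES = {
    1: {"train": "all", "lr": 1e-4},
    2: {"train": "cross_attn", "lr": 1e-5},
    3: {"train": "all", "lr": 1e-6},
}

FROZEN_PREFIXES = ("vae.", "clip.")  # never trained, in any stage


def is_trainable(param_name, stage):
    """Return True if the named parameter should be updated in the given stage."""
    if param_name.startswith(FROZEN_PREFIXES):
        return False
    if STAGES[stage]["train"] == "cross_attn":
        return "cross_attn" in param_name
    return True


names = [
    "vae.decoder.conv",
    "clip.text.embed",
    "denoiser.enc.0.conv",
    "denoiser.enc.0.cross_attn.q",
]
for stage, spec in STAGES.items():
    trainable = [n for n in names if is_trainable(n, stage)]
    print(stage, spec["lr"], trainable)
```

In PyTorch, the same predicate could drive `param.requires_grad_(...)` while iterating over `model.named_parameters()`.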

	## Parameter Budget
	| Component | Params |
	|-----------|--------|
	| Encoder | ~35M |
	| Bottleneck | ~15M |
	| Decoder | ~35M |
	| Embeddings | ~5M |
	| Total trainable | ~90M |
	| VAE (frozen) | ~83M |
	| CLIP (frozen) | ~303M |
	| Inference VRAM (batch size 1) | ~1.5-2GB |
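
The budget can be sanity-checked with a little arithmetic. The sketch below sums the table and estimates weight storage only (no activations), assuming 2 bytes per parameter for fp16 and 4 for fp32. The dictionary and function names are introduced here for illustration.

```python
trainable_m = {"encoder": 35, "bottleneck": 15, "decoder": 35, "embeddings": 5}
frozen_m = {"vae": 83, "clip": 303}


def weights_mb(params_millions, bytes_per_param):
    """Weight storage in MB (1 MB = 1e6 bytes) for a parameter count given in millions."""
    return params_millions * bytes_per_param


total_trainable = sum(trainable_m.values())
total_all = total_trainable + sum(frozen_m.values())

print(total_trainable)                 # 90
print(weights_mb(total_trainable, 4))  # 360 (denoiser alone, fp32)
print(weights_mb(total_all, 2))        # 952 (denoiser + VAE + CLIP, fp16)
print(weights_mb(total_all, 4))        # 1904 (denoiser + VAE + CLIP, fp32)
```

Under these assumptions the denoiser alone fits within the <500MB figure quoted earlier, while the full stack with the frozen VAE and CLIP accounts for roughly 1-2GB of weights before any activation memory.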

	## Quick Start
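	```python
	from luminars.model import LuminaRS
	from luminars.config import LuminaRSConfig
	from luminars.sampler import sample_flow
	cfg = LuminaRSConfig()  # default configuration
	model = LuminaRS(cfg)   # build the denoiser from the config
	# text_emb: CLIP text embeddings for the prompt, computed beforehand (not shown here)
	# (1, 16, 32, 32) is the latent shape; 12 is presumably the number of sampling steps
	latents = sample_flow(model, text_emb, (1, 16, 32, 32), 12)
	```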
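
For readers curious what a flow sampler does internally, here is a minimal Euler integrator in plain Python. It is a sketch of the general technique, not the actual `sample_flow` implementation, and it assumes time runs from noise at t = 0 to the image at t = 1 in equal steps. `euler_sample` and `velocity_fn` are hypothetical names.

```python
def euler_sample(velocity_fn, x, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed-size Euler steps."""
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


x_noise = [0.3, -1.2]
x_clean = [1.0, 2.0]
# Oracle velocity for a straight path; a trained model would predict this instead.
oracle = lambda x, t: [c - n for n, c in zip(x_noise, x_clean)]
x_final = euler_sample(oracle, x_noise, num_steps=12)
print(x_final)
```

With a perfectly straight path, a handful of Euler steps already lands on the target up to floating-point error, which is the intuition behind the low step counts quoted for flow matching.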

	## Files
	- luminars/ -- model, config, loss, sampler, train helpers
	- train.py -- main training script
	- LuminaRS_Colab.ipynb -- Colab notebook

	## Research Foundations
	- TRM (Jolicoeur-Martineau, 2025): Recursive reasoning
	- SnapGen (2024): Mobile UNet design
	- ZigMa (2024): Mamba diffusion
	- Flow Matching (Lipman et al., 2023): Stable ODE generation
	- MQA (Shazeer, 2019): Multi-query attention
	- ConvNeXt (Liu et al., 2022): Modernized CNN

	MIT License