Simo76 committed on
Commit 5ccbada · verified · 1 Parent(s): 4e7d5ad

Update README.md

Files changed (1):
  1. README.md +19 -24

README.md CHANGED
@@ -6,6 +6,7 @@ tags:
  - adaptive
  - research
  - nested-lora
  - rank-adaptation
  library_name: transformers
  datasets:
@@ -15,52 +16,46 @@ pipeline_tag: text-classification

  # Unified-LoRA

- **Adaptive rank controller for LoRA fine-tuning via nested orbital slicing.**

- ⚠️ **This is NOT a pretrained model.** Unified-LoRA is a training method/controller for LoRA.

  👉 **Code**: [github.com/Sva76/Unified-LoRa](https://github.com/Sva76/Unified-LoRa)
  👉 **Demo**: [unified_lora_demo.ipynb](https://github.com/Sva76/Unified-LoRa/blob/main/notebooks/unified_lora_demo.ipynb)

  ## What It Does

- Instead of fixing `rank=8` and hoping it works, Unified-LoRA allocates a single LoRA matrix pair at max rank and controls active capacity via **matrix slicing** (r4 ⊂ r8 ⊂ r16). An OrbitalController monitors gradient stress per layer and promotes/demotes rank using adaptive thresholds (μ ± kσ).

- **Key properties:**
- - Zero cold-start on rank transitions (lower ranks are subsets of higher ranks)
- - Per-layer independence (each adapter finds its own optimal rank)
- - ~100 lines of code, no SVD, negligible overhead
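The adaptive-threshold rule described above (μ ± kσ over per-layer gradient stress) can be illustrated in a few lines. This is a hypothetical reconstruction, not the repo's code: `rank_decision`, its signature, and the choice of comparing the latest stress value against the layer's own history are all assumptions.

```python
import statistics

def rank_decision(stress_history, current_rank, levels=(4, 8, 16), k=1.0):
    """Promote or demote a layer's rank from its gradient-stress history.

    Hypothetical sketch: stress above mu + k*sigma of the layer's own
    history promotes to the next rank level; stress below mu - k*sigma
    demotes to the previous one. Otherwise the rank is kept.
    """
    mu = statistics.mean(stress_history)
    sigma = statistics.pstdev(stress_history)
    latest = stress_history[-1]
    i = levels.index(current_rank)
    if latest > mu + k * sigma and i < len(levels) - 1:
        return levels[i + 1]  # promote: layer is under stress
    if latest < mu - k * sigma and i > 0:
        return levels[i - 1]  # demote: layer has settled
    return current_rank
```

Because each layer keeps its own history, thresholds adapt per layer rather than relying on a single global cut-off.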
 
- ## Results

- **GLUE (DistilBERT, 67M):** Comparable or better on 3/4 tasks with 33–56% rank reduction.

- | Task | Baseline (r=16) | Adaptive  | Rank Reduction |
- |------|-----------------|-----------|----------------|
- | MRPC | 0.882 F1        | **0.886** | 42%            |
- | CoLA | 0.488 MCC       | **0.491** | 56%            |
- | RTE  | 0.556 Acc       | **0.592** | 33%            |

- **Noise resilience (validated use case):** +31 F1 points at 50% label noise, 9× lower variance vs fixed rank. No benefit on clean data. Pattern confirmed at 67M, 1.1B, and 3B scales.

- **NestedLoRA stress tests:** Performance parity with baseline, ~15% rank saving, zero cold-start degradation.

  ## Quick Start

  ```python
  from controller import setup_unified_lora

- adapters, ctrl = setup_unified_lora(
-     model,
-     target_modules=["q_proj", "v_proj"],
-     max_rank=16,
-     rank_levels=[4, 8, 16],
- )

  for batch in dataloader:
      loss = model(**batch).loss
      loss.backward()
-     ctrl.step()
      optimizer.step()
      optimizer.zero_grad()
  ```
@@ -70,7 +65,7 @@ for batch in dataloader:
  ```bibtex
  @software{unified_lora_2025,
    author = {Simona Vargiu},
-   title = {Unified-LoRA: Adaptive Rank Controller via Nested Orbital Slicing},
    year = {2025},
    url = {https://github.com/Sva76/Unified-LoRa}
  }
 
  - adaptive
  - research
  - nested-lora
+ - synaptic-plasticity
  - rank-adaptation
  library_name: transformers
  datasets:
 
  # Unified-LoRA

+ **LoRA fine-tuning with synaptic plasticity: a neurobiologically inspired controller that switches between qualitatively different operational modes based on training stress.**

+ ⚠️ **This is NOT a pretrained model.** Unified-LoRA is a training method/controller.

  👉 **Code**: [github.com/Sva76/Unified-LoRa](https://github.com/Sva76/Unified-LoRa)
  👉 **Demo**: [unified_lora_demo.ipynb](https://github.com/Sva76/Unified-LoRa/blob/main/notebooks/unified_lora_demo.ipynb)

  ## What It Does

+ A composite synaptic stress signal **φ(t) = f(Convergence, Entropy, Stress)** drives a 3-state FSM:

+ | Mode   | φ range       | Rank | Behavior                                    |
+ |--------|---------------|------|---------------------------------------------|
+ | SINGLE | φ < 0.3       | r=4  | Efficient cruise                            |
+ | MULTI  | 0.3 ≤ φ < 0.7 | r=8  | Active learning                             |
+ | MIRROR | φ ≥ 0.7       | r=16 | Max capacity + weight snapshot for rollback |
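The φ-to-mode mapping in the table reduces to a small threshold selector. A minimal sketch assuming the 0.3/0.7 cut-offs above; `Mode` and `select_mode` are illustrative names, not the repo's API:

```python
from enum import Enum

class Mode(Enum):
    SINGLE = 4   # efficient cruise
    MULTI = 8    # active learning
    MIRROR = 16  # max capacity + snapshot for rollback

def select_mode(phi: float, low: float = 0.3, high: float = 0.7) -> Mode:
    """Map the composite stress signal phi(t) to an operational mode.

    Thresholds follow the table: phi < low -> SINGLE,
    low <= phi < high -> MULTI, phi >= high -> MIRROR.
    """
    if phi < low:
        return Mode.SINGLE
    if phi < high:
        return Mode.MULTI
    return Mode.MIRROR
```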

+ Rank transitions use **nested matrix slicing** (r4 ⊂ r8 ⊂ r16): zero cold-start, zero re-allocation.
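One way to realize such nested slicing is to allocate the LoRA factors once at `max_rank` and treat the active rank as a slice bound. A hypothetical PyTorch sketch, not the repo's implementation; the class name, init scheme, and α/rank scaling convention are assumptions:

```python
import torch
import torch.nn as nn

class NestedLoRALinear(nn.Module):
    """Sketch of nested-slice LoRA: A and B are allocated once at max_rank.

    The active rank is just a slice bound, so moving between r=4, 8, 16
    reuses the same memory (r4 is the leading sub-block of r8, etc.)
    with no re-allocation and no cold start.
    """

    def __init__(self, base: nn.Linear, max_rank: int = 16, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.max_rank = max_rank
        self.active_rank = max_rank
        self.scaling = alpha / max_rank
        # A: (max_rank, in_features), B: (out_features, max_rank); B starts at
        # zero so the adapter is initially a no-op, as in standard LoRA.
        self.lora_A = nn.Parameter(torch.randn(max_rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, max_rank))

    def set_rank(self, r: int) -> None:
        # Zero-cost transition: only the slice bound changes.
        assert 1 <= r <= self.max_rank
        self.active_rank = r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        r = self.active_rank
        delta = (x @ self.lora_A[:r].T) @ self.lora_B[:, :r].T
        return self.base(x) + self.scaling * delta
```

Because lower ranks are leading sub-blocks of higher ranks, `set_rank` never discards learned weights: demoting masks the tail components and promoting re-exposes them.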
+
+ Mirror mode saves a weight snapshot on entry. On exit, if the weights drifted less than 5% (transient noise), the snapshot is restored; if the drift was significant (a real signal), the new weights are kept.
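The snapshot/rollback step can be sketched as follows. The 5% threshold comes from the text above, but the drift metric (relative L2 norm) and the function names are assumptions made for the sketch:

```python
import copy

def enter_mirror(weights: dict) -> dict:
    """Snapshot all adapter weights when entering MIRROR mode."""
    return copy.deepcopy(weights)

def exit_mirror(weights: dict, snapshot: dict, threshold: float = 0.05):
    """Decide, on MIRROR exit, whether the stress episode was real.

    Relative L2 drift below `threshold` is treated as transient noise and
    the snapshot is restored; otherwise the new weights are kept. Weights
    are plain {name: [float, ...]} dicts to keep the sketch dependency-free.
    """
    num = sum((w - s) ** 2 for k in weights for w, s in zip(weights[k], snapshot[k]))
    den = sum(s ** 2 for k in snapshot for s in snapshot[k])
    drift = (num / max(den, 1e-12)) ** 0.5
    if drift < threshold:
        return snapshot, drift  # transient noise: roll back
    return weights, drift       # real signal: keep the update
```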

+ ## Results

+ **GLUE (DistilBERT):** Equal or better on 3/4 tasks with 33–56% rank reduction.

+ **Noise resilience:** +31 F1 at 50% label noise, 9× lower variance. No benefit on clean data. Confirmed at 67M–3B.

+ **Stress-recovery cycle (Tinker/Llama-3.2-1B):** φ returns to its pre-shock baseline (0.33 → 0.83 → 0.33), demonstrating fully reversible stress handling.

  ## Quick Start

  ```python
  from controller import setup_unified_lora

+ adapters, ctrl = setup_unified_lora(model, target_modules=["q_proj", "v_proj"])

  for batch in dataloader:
      loss = model(**batch).loss
      loss.backward()
+     ctrl.step(loss=loss.item())  # φ(t) needs the loss for the convergence signal
      optimizer.step()
      optimizer.zero_grad()
  ```
 
  ```bibtex
  @software{unified_lora_2025,
    author = {Simona Vargiu},
+   title = {Unified-LoRA: Synaptic Plasticity Controller for Adaptive LoRA Fine-Tuning},
    year = {2025},
    url = {https://github.com/Sva76/Unified-LoRa}
  }