CCRss committed on
Commit 3dee25e · verified · 1 Parent(s): 3c4700c

README: embed system overview figure with caption at top

Files changed (1)
  1. README.md +6 -1
README.md CHANGED
@@ -20,7 +20,12 @@ library_name: transformers
 
  > **A 0.6B parameter edge LLM trained to emit a calibrated verbalized confidence score before its answer, enabling efficient edge–cloud routing without an external router.**
 
- FogGen is a small, self-aware edge model that knows when to answer locally and when to defer to a stronger cloud model. The model emits a discrete confidence score (one of `0.0, 0.25, 0.5, 0.75, 1.0`) before producing its answer, and a routing threshold `τ` decides whether to keep the local answer or escalate to the cloud.
+ ![FogGen overview: (a) self-aware routing at inference, (b) self-evolving training loop](./foggen_overview.png)
+
+ **At a glance.** FogGen is a small, self-aware edge model that knows when to answer locally and when to defer to a stronger cloud model. The figure above summarizes the two halves of the recipe:
+
+ - **(a) Inference — self-aware routing.** The edge model `M_N` (Qwen3-0.6B) processes a query and emits two output spans in one forward pass: a *confidence span* (positions 1–8, e.g. `Confidence: 0.75`) followed by an *answer span* (positions 10–13, e.g. `Final answer: B`). The routing decision compares the parsed confidence `c` to a threshold `τ`: if `c ≥ τ` the edge answer is returned; otherwise the query is routed to the cloud model.
+ - **(b) Training — self-evolving data loop.** Each round consumes a cloud-labeled dataset (Stage 1), uses the current checkpoint `M_N` to self-sample 8 generations per question at T=0.7 to derive a confidence bucket via the `k correct → bucket` mapping (Stage 2), then trains the next checkpoint `M_{N+1}` on the resulting `(question, confidence, answer)` triples via SFT with LoRA merge (Stage 3).
 
  The released checkpoint is the endpoint (`R14`) of a 14-round continual-learning chain that trained the model across seven domains: finance, science, coding, law, math, Kazakh culture, and medicine.
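
The routing rule and the bucket derivation described in the added README text can be sketched in a few lines of Python. This is a minimal illustration, not the project's code. The function names, the regular expressions, the `cloud_fn` callable, and the choice to escalate when the output cannot be parsed are all assumptions. The README names a `k correct → bucket` mapping but does not spell it out, so the nearest-bucket rule below (ties resolved downward) is a hypothetical stand-in.

```python
import re
from typing import Callable, Optional, Tuple

# Confidence levels listed in the README line removed by this commit.
BUCKETS = (0.0, 0.25, 0.5, 0.75, 1.0)

# Assumed output format, based on the README examples
# `Confidence: 0.75` and `Final answer: B`.
_CONF_RE = re.compile(r"Confidence:\s*([01](?:\.\d+)?)")
_ANS_RE = re.compile(r"Final answer:\s*(.+)")


def parse_output(text: str) -> Tuple[Optional[float], Optional[str]]:
    """Extract the confidence c and the answer from one edge-model output."""
    conf = _CONF_RE.search(text)
    ans = _ANS_RE.search(text)
    c = float(conf.group(1)) if conf else None
    a = ans.group(1).strip() if ans else None
    return c, a


def route(query: str, edge_output: str, tau: float,
          cloud_fn: Callable[[str], str]) -> Tuple[str, str]:
    """Return ("edge", answer) if c >= tau, else ("cloud", cloud answer).

    Assumption: an unparseable edge output is escalated to the cloud.
    """
    c, a = parse_output(edge_output)
    if c is not None and a is not None and c >= tau:
        return "edge", a
    return "cloud", cloud_fn(query)


def bucket_from_samples(k_correct: int, n_samples: int = 8) -> float:
    """Hypothetical `k correct -> bucket` mapping: snap k/n to the nearest
    bucket, breaking ties toward the lower (more conservative) bucket."""
    frac = k_correct / n_samples
    return min(BUCKETS, key=lambda b: (abs(b - frac), b))
```

With `tau = 0.5`, an output of `Confidence: 0.75` followed by `Final answer: B` would be kept on the edge, while `Confidence: 0.25` would trigger a call to `cloud_fn`. Raising `tau` sends more queries to the cloud model and lowering it keeps more answers local.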