Update README.md
README.md
CHANGED
|
@@ -1,123 +1,242 @@
|
| 1 |
---
|
| 2 |
-
|
| 3 |
-
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
---
|
| 9 |
|
| 10 |
-
# VeriLoop
|
| 11 |
|
| 12 |
-
|
|
|
|
| 13 |
|
| 14 |
-
|
| 15 |
-
VeriLoop explores a different path: one where open-weight models can be upgraded into an **agentic runtime system** through **context engineering as the primary control surface**, reinforced only by **minimal, targeted PEFT** where necessary.
|
| 16 |
|
| 17 |
-
|
| 18 |
|
| 19 |
-
|
| 20 |
-
VeriLoop treats the base model as a **replaceable cognitive substrate**.
|
| 21 |
|
| 22 |
-
|
| 23 |
|
| 24 |
-
|
| 25 |
-
- **evidence–conclusion alignment**
|
| 26 |
-
- **execution-trigger discipline**
|
| 27 |
-
- **revision fidelity under contradiction**
|
| 28 |
-
- **near-zero tolerance for high-risk fabrication**
|
| 29 |
|
| 30 |
-
|
|
|
|
| 31 |
|
| 32 |
-
|
| 33 |
-
- **evidence-
|
| 34 |
-
- **external verification loops**
|
| 35 |
-
- **rollback and revision governance**
|
| 36 |
-
- **persistent runtime contracts**
|
| 37 |
-
- **context-engineered agentic behavior**
|
| 38 |
|
| 39 |
-
|
| 40 |
-
It is a different answer to what a “foundation model” should be.
|
| 41 |
|
| 42 |
-
|
|
|
|
| 43 |
|
| 44 |
-
|
| 45 |
-
|
| 46 |
|
| 47 |
-
|
| 48 |
It should be able to:
|
| 49 |
|
| 50 |
1. form a working hypothesis,
|
| 51 |
-
2.
|
| 52 |
-
3. retrieve or execute when
|
| 53 |
-
4. detect contradiction,
|
| 54 |
-
5. revise minimally
|
| 55 |
-
6.
|
| 56 |
|
| 57 |
-
|
| 58 |
|
| 59 |
-
|
| 60 |
|
| 61 |
VeriLoop is designed to work **with** open-weight ecosystems, not against them.
|
| 62 |
|
| 63 |
-
We believe
|
| 64 |
-
|
| 65 |
|
| 66 |
-
|
| 67 |
-
|
| 68 |
|
| 69 |
-
|
|
|
|
| 70 |
|
| 71 |
-
|
| 72 |
|
| 73 |
-
|
| 74 |
|
| 75 |
-
|
| 76 |
|
| 77 |
-
|
| 78 |
|
| 79 |
-
|
|
|
|
| 80 |
|
| 81 |
-
|
| 82 |
|
| 83 |
-
|
| 84 |
-
- to a **runtime-upgradable reasoning substrate**
|
| 85 |
|
| 86 |
-
|
| 87 |
-
- to **stateful evidence-governed control**
|
| 88 |
|
| 89 |
-
-
|
| 90 |
-
-
|
| 91 |
|
| 92 |
-
|
| 93 |
-
|
| 94 |
|
| 95 |
-
|
| 96 |
|
| 97 |
-
|
| 98 |
|
| 99 |
-
|
| 100 |
|
| 101 |
-
|
| 102 |
-
It will come from building the right control architecture on top of open intelligence.
|
| 103 |
|
| 104 |
-
|
| 105 |
|
| 106 |
-
|
| 107 |
|
| 108 |
-
|
| 109 |
|
| 110 |
-
|
| 111 |
-
|
| 112 |
-
|
| 113 |
-
|
| 114 |
|
| 115 |
-
|
| 116 |
|
| 117 |
-
|
| 118 |
-
|
| 119 |
|
| 120 |
---
|
| 121 |
|
| 122 |
**VeriLoop (循证)**
|
| 123 |
-
*Evidence-driven runtime intelligence for the open-weight era.*
|
|
|
|
| 1 |
+
# VeriLoop
|
| 2 |
+
|
| 3 |
+
**VeriLoop (循证)** is an evidence-driven model family and runtime initiative built around **E³-Loop**, a closed-loop reasoning architecture created and designed by **Libo Wang**.
|
| 4 |
+
|
| 5 |
+
VeriLoop is not positioned as another general chat model optimized only for response fluency or benchmark-facing conversation quality. It is built to turn open-weight models into **evidence-governed, executable, verifiable, revisable, and auditable runtime systems** that can operate inside real workflows.
|
| 6 |
+
|
| 7 |
---
|
| 8 |
+
|
| 9 |
+
## Overview
|
| 10 |
+
|
| 11 |
+
Most model efforts still treat the model checkpoint as the final product.
|
| 12 |
+
VeriLoop takes a different view: the checkpoint is only the **cognitive substrate**. The real system value emerges when reasoning, evidence, execution, validation, revision, rollback, and stopping criteria are organized into one runtime control architecture.
|
| 13 |
+
|
| 14 |
+
At the center of the VeriLoop family is **E³-Loop**, which shifts the goal of language models:
|
| 15 |
+
|
| 16 |
+
- from best-effort generation to **budget-bounded convergence toward truth**
|
| 17 |
+
- from static prompting to **stateful evidence-governed control**
|
| 18 |
+
- from isolated outputs to **closed-loop execution and verification**
|
| 19 |
+
- from one-off fine-tuning cycles to **portable runtime capability transfer**
|
| 20 |
+
|
| 21 |
+
VeriLoop is therefore not only a model family. It is a proposal for redefining what a base model should become in the open-weight era.
|
| 22 |
+
|
| 23 |
---
|
| 24 |
|
| 25 |
+
## Why VeriLoop Is Different
|
| 26 |
|
| 27 |
+
VeriLoop is not simply building another model that can chat.
|
| 28 |
+
It is building an **evidence-driven, executable, rollback-capable, and auditable runtime architecture**.
|
| 29 |
|
| 30 |
+
Many model stacks are still primarily compared by parameter count, conversational smoothness, or single-turn benchmark performance. VeriLoop is built to solve a different problem: **how a model can retrieve evidence, make decisions, execute actions, detect contradiction, revise minimally, and stop under explicit budget constraints**.
|
|
|
|
| 31 |
|
| 32 |
+
That is the difference between generating an answer and running a **truth-seeking closed loop**.
|
| 33 |
|
| 34 |
+
At the same time, VeriLoop is not a single undifferentiated assistant. It is organized as a **professional model family aligned to distinct workflow classes**:
|
|
|
|
| 35 |
|
| 36 |
+
- **VeriLoop Coder** — high-intensity software engineering, repository understanding, test repair, CI debugging, and toolchain-closed execution
|
| 37 |
+
- **VeriLoop Interaction** — high-quality multi-turn interaction, long-context continuity, intent understanding, and controllable tool collaboration
|
| 38 |
+
- **VeriLoop Skills** — robotic and agent task orchestration, converting natural-language goals into executable skill sequences, action constraints, and step-level planning
|
| 39 |
+
- **VeriLoop VLA** — embodied perception-to-action convergence for real-world visuomotor execution
|
| 40 |
+
- **VeriLoop Scientist** — hypothesis generation, literature-grounded evidence gathering, contradiction discovery, simulation-based verification, differential revision, and research-plan formation
|
| 41 |
+
- **VeriLoop Computer Use** — enterprise knowledge work and digital-interface execution across browsers, desktop software, and document systems, with retrieval, action, validation, and rollback loops
|
| 42 |
+
|
| 43 |
+
For this reason, VeriLoop should not be understood as a generic assistant.
|
| 44 |
+
It is a **family of specialized execution-oriented models** designed to deliver results inside real workflows.
|
| 45 |
+
|
| 46 |
+
---
|
| 47 |
|
| 48 |
+
## If Someone Asks: “Why not just use Doubao?”
|
| 49 |
|
| 50 |
+
If the evaluation criterion is only casual conversation quality or generic question-answering scores, then comparing directly to a mainstream assistant is natural.
|
| 51 |
+
But **that is not the point of VeriLoop**.
|
| 52 |
|
| 53 |
+
VeriLoop does not aim to become “another general chat model.”
|
| 54 |
+
Its purpose is to provide a **runtime substrate** that can attach to compatible open-weight backbones and elevate them through **evidence gating, sandbox verification, differential revision, rollback discipline, budget governance, and API-oriented runtime control**.
|
| 55 |
|
| 56 |
+
In other words:
|
|
|
|
| 57 |
|
| 58 |
+
- a general assistant is mainly judged by how well it answers;
|
| 59 |
+
- **VeriLoop is judged by whether it can complete a high-value workflow with evidence, validation, rollback, and auditability preserved**.
|
| 60 |
|
| 61 |
+
This is why VeriLoop matters even when compared against strong general assistants.
|
| 62 |
+
The target is not superficial similarity. The target is **reliable closed-loop execution**.
|
| 63 |
|
| 64 |
+
---
|
| 65 |
+
|
| 66 |
+
## The E³-Loop View
|
| 67 |
+
|
| 68 |
+
E³-Loop is the architectural core of the VeriLoop family.
|
| 69 |
+
It is not a thin wrapper around an LLM, and it is not a cosmetic agent shell.
|
| 70 |
+
It is a **runtime control plane** that organizes:
|
| 71 |
+
|
| 72 |
+
- state
|
| 73 |
+
- uncertainty
|
| 74 |
+
- budget
|
| 75 |
+
- evidence
|
| 76 |
+
- claims
|
| 77 |
+
- actions
|
| 78 |
+
- execution
|
| 79 |
+
- rollback
|
| 80 |
+
- trace logging
|
| 81 |
+
- termination
|
| 82 |
+
|
| 83 |
+
into one auditable reasoning loop.
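
As a rough illustration of what such a control plane might track, the sketch below groups the listed concerns into a single Python record. It is not the E³-Loop schema: `LoopState`, `TraceEvent`, every field name, and the checkpoint-based rollback are hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TraceEvent:
    """One entry in the trace log (hypothetical shape)."""
    step: int
    kind: str      # e.g. "checkpoint", "rollback", "terminate"
    detail: str


@dataclass
class LoopState:
    """Illustrative container for the concerns listed above."""
    claims: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)
    actions: list[str] = field(default_factory=list)
    uncertainty: float = 1.0              # 1.0 means nothing has been verified yet
    budget: int = 5                       # remaining steps the loop may spend
    trace: list[TraceEvent] = field(default_factory=list)
    terminated_because: Optional[str] = None
    _checkpoints: list[tuple] = field(default_factory=list, repr=False)

    def log(self, kind: str, detail: str) -> None:
        # Every state change is appended to the trace so it can be audited later.
        self.trace.append(TraceEvent(len(self.trace), kind, detail))

    def checkpoint(self) -> None:
        # Save copies so later mutations do not alter the saved snapshot.
        snapshot = (list(self.claims), list(self.evidence), self.uncertainty)
        self._checkpoints.append(snapshot)
        self.log("checkpoint", f"saved {len(self.claims)} claim(s)")

    def rollback(self) -> bool:
        # Restore the most recent snapshot; report False if there is none.
        if not self._checkpoints:
            return False
        self.claims, self.evidence, self.uncertainty = self._checkpoints.pop()
        self.log("rollback", f"restored {len(self.claims)} claim(s)")
        return True

    def terminate(self, reason: str) -> None:
        self.terminated_because = reason
        self.log("terminate", reason)
```

Keeping rollback and termination as explicit, logged operations is one way to make each of them visible in the same trace that an auditor would read.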
|
| 84 |
+
|
| 85 |
+
In the VeriLoop view, a capable model should not simply produce plausible text.
|
| 86 |
It should be able to:
|
| 87 |
|
| 88 |
1. form a working hypothesis,
|
| 89 |
+
2. determine whether additional external evidence is required,
|
| 90 |
+
3. retrieve, remember, or execute when necessary,
|
| 91 |
+
4. detect contradiction or incompleteness,
|
| 92 |
+
5. revise minimally instead of regenerating blindly,
|
| 93 |
+
6. terminate when further cost no longer justifies further truth-seeking gain.
|
| 94 |
+
|
| 95 |
+
This is the operational meaning of **循证** inside VeriLoop.
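
The six steps above can be sketched as a small budget-bounded loop. This is an illustrative toy under stated assumptions, not the E³-Loop implementation: `run_loop`, `Evidence`, the `retrieve` and `revise` callables, and the integer step budget are all names and simplifications introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Evidence:
    source: str
    supports: bool   # True if the evidence agrees with the current hypothesis


def run_loop(
    hypothesis: str,
    retrieve: Callable[[str], Optional[Evidence]],
    revise: Callable[[str, Evidence], str],
    budget: int = 4,
) -> tuple[str, list[str]]:
    """Toy loop: hypothesize, gather evidence, revise on contradiction, stop."""
    trace = [f"hypothesis: {hypothesis}"]                    # step 1
    while budget > 0:                                        # step 6: bounded by budget
        budget -= 1
        evidence = retrieve(hypothesis)                      # steps 2-3: external evidence
        if evidence is None:
            trace.append("stop: no further evidence available")
            break
        if evidence.supports:
            trace.append(f"confirmed by {evidence.source}")
            break
        trace.append(f"contradicted by {evidence.source}")   # step 4
        hypothesis = revise(hypothesis, evidence)            # step 5: minimal revision
        trace.append(f"revised: {hypothesis}")
    else:
        # The while condition failed without a break: the budget ran out.
        trace.append("stop: budget exhausted")
    return hypothesis, trace


def toy_retrieve(h: str) -> Optional[Evidence]:
    # Hypothetical evidence source standing in for a real test run or lookup.
    return Evidence("unit test", supports=("3.12" in h))


def toy_revise(h: str, ev: Evidence) -> str:
    # Change only the contradicted detail instead of regenerating the claim.
    return h.replace("3.10", "3.12")


answer, trace = run_loop("project targets Python 3.10", toy_retrieve, toy_revise)
```

In this toy run the first check contradicts the hypothesis, one targeted revision is applied, and the second check confirms it, so the loop stops with budget to spare. A real runtime would replace the integer budget with whatever cost model it actually enforces.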
|
| 96 |
+
|
| 97 |
+
---
|
| 98 |
+
|
| 99 |
+
## Harness-First Technical Direction
|
| 100 |
+
|
| 101 |
+
VeriLoop is built around a **Harness Engineering-first** strategy.
|
| 102 |
+
|
| 103 |
+
We use the term **Harness Engineering** to describe the system-level discipline that makes model behavior converge more reliably inside real environments: structured state control, hard constraints, knowledge entry points, execution harnesses, verification loops, failure signals, and completion criteria.
|
| 104 |
+
|
| 105 |
+
Under this strategy:
|
| 106 |
|
| 107 |
+
- **Harness Engineering** is the primary driver of system behavior and workflow reliability
|
| 108 |
+
- **Context Engineering** remains important, but functions as one controlled layer inside the larger runtime harness
|
| 109 |
+
- **PEFT is used selectively and minimally**, only where targeted stabilization is necessary
|
| 110 |
+
- repeated large-scale fine-tuning is intentionally avoided whenever the same goal can be achieved through better runtime control
|
| 111 |
|
| 112 |
+
This direction is not anti-model.
|
| 113 |
+
It is anti-fragility.
|
| 114 |
+
|
| 115 |
+
The VeriLoop thesis is that many of the most expensive failure modes in model development do not come from missing raw capability, but from poor control over:
|
| 116 |
+
|
| 117 |
+
- state continuity,
|
| 118 |
+
- evidence discipline,
|
| 119 |
+
- tool-use boundaries,
|
| 120 |
+
- verification feedback,
|
| 121 |
+
- revision fidelity,
|
| 122 |
+
- and termination criteria.
|
| 123 |
+
|
| 124 |
+
Harness Engineering is the system response to that problem.
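
A minimal sketch of two of these controls, tool-use boundaries and verification feedback, is shown below. It assumes a simple allow-list as the hard constraint and a caller-supplied check as the verifier; `ToolHarness`, `run`, and the `failures` list are hypothetical names, not part of any published VeriLoop interface.

```python
from typing import Callable


class ToolHarness:
    """Toy harness: hard constraint before execution, verification after."""

    def __init__(self, allowed_tools: set[str]) -> None:
        self.allowed_tools = allowed_tools
        self.failures: list[str] = []   # failure signals the caller can inspect

    def run(
        self,
        tool: str,
        action: Callable[[], object],
        verify: Callable[[object], bool],
    ) -> object:
        if tool not in self.allowed_tools:       # hard constraint: never executed
            self.failures.append(f"blocked: {tool}")
            return None
        result = action()                        # execution inside the harness
        if not verify(result):                   # verification feedback
            self.failures.append(f"verification failed: {tool}")
            return None
        return result


harness = ToolHarness(allowed_tools={"calc"})
ok = harness.run("calc", lambda: 2 + 2, lambda r: r == 4)
blocked = harness.run("shell", lambda: "unsafe", lambda r: True)
```

The point of the design is that a blocked or unverified action leaves a recorded failure signal rather than silently producing output.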
|
| 125 |
+
|
| 126 |
+
---
|
| 127 |
+
|
| 128 |
+
## Minimal PEFT, Not Fine-Tuning Dependency
|
| 129 |
+
|
| 130 |
+
VeriLoop does not reject parameter-efficient tuning.
|
| 131 |
+
It rejects **fine-tuning dependency as the default answer to every problem**.
|
| 132 |
+
|
| 133 |
+
Our current direction is to use **minimal, targeted PEFT** only where it creates stable interfaces for the runtime system, such as:
|
| 134 |
+
|
| 135 |
+
- identity stabilization
|
| 136 |
+
- uncertainty calibration
|
| 137 |
+
- evidence-binding discipline
|
| 138 |
+
- tool-spec alignment
|
| 139 |
+
- revision and rollback fidelity
|
| 140 |
+
|
| 141 |
+
The goal is not to rebuild the entire model distribution.
|
| 142 |
+
The goal is to create a better substrate for a **Harness-first, evidence-driven runtime**.
|
| 143 |
+
|
| 144 |
+
This is important because model versions change quickly.
|
| 145 |
+
If every backbone upgrade forces a full retraining cycle, engineering investment becomes brittle.
|
| 146 |
+
VeriLoop is designed to preserve more value across model generations.
|
| 147 |
+
|
| 148 |
+
---
|
| 149 |
+
|
| 150 |
+
## Open-Weight Compatibility by Design
|
| 151 |
|
| 152 |
VeriLoop is designed to work **with** open-weight ecosystems, not against them.
|
| 153 |
|
| 154 |
+
We believe long-term value should not be trapped inside one permanently fixed checkpoint.
|
| 155 |
+
Instead, the E³-Loop runtime is designed to let multiple compatible backbones participate in the VeriLoop paradigm through:
|
| 156 |
|
| 157 |
+
- runtime adaptation,
|
| 158 |
+
- harness-controlled execution,
|
| 159 |
+
- context and evidence integration,
|
| 160 |
+
- and minimal targeted alignment where necessary.
|
| 161 |
|
| 162 |
+
That means model evolution should not automatically erase prior engineering work.
|
| 163 |
+
The intended outcome is **open-weight continuity under a stable runtime architecture**.
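
One way to picture this continuity is a narrow interface that the runtime depends on, with each backbone supplied behind an adapter. The sketch below is hypothetical: `Backbone`, `EchoBackbone`, `UpperBackbone`, and `answer_with_evidence` are illustrative names, and the toy adapters return canned strings instead of calling a real model.

```python
from typing import Protocol


class Backbone(Protocol):
    """The only surface the runtime is assumed to need from a model."""
    name: str

    def generate(self, prompt: str) -> str: ...


class EchoBackbone:
    name = "toy-echo"

    def generate(self, prompt: str) -> str:
        return f"draft({prompt})"


class UpperBackbone:
    name = "toy-upper"

    def generate(self, prompt: str) -> str:
        return f"DRAFT({prompt.upper()})"


def answer_with_evidence(backbone: Backbone, question: str, evidence: list[str]) -> dict:
    """Runtime logic that stays the same whichever backbone is plugged in."""
    context = "\n".join(evidence)
    prompt = f"{context}\n{question}" if evidence else question
    return {
        "backbone": backbone.name,
        "draft": backbone.generate(prompt),
        "evidence_count": len(evidence),
    }


first = answer_with_evidence(EchoBackbone(), "q", ["e1", "e2"])
second = answer_with_evidence(UpperBackbone(), "q", [])
```

Swapping `EchoBackbone` for `UpperBackbone` changes the draft but not the runtime function, which is the property the section describes.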
|
| 164 |
|
| 165 |
+
---
|
| 166 |
|
| 167 |
+
## API-First Service Vision
|
| 168 |
|
| 169 |
+
VeriLoop is being built with an **API-first service direction**.
|
| 170 |
|
| 171 |
+
Our long-term goal is to offer VeriLoop's runtime capabilities as a technical service layer that upgrades compatible model backbones into **evidence-driven closed-loop runtime systems**.
|
| 172 |
|
| 173 |
+
The strategic value is not only in owning checkpoints.
|
| 174 |
+
It is in building the right runtime architecture on top of open intelligence.
|
| 175 |
|
| 176 |
+
---
|
| 177 |
|
| 178 |
+
## Current Public Product Lines
|
|
|
|
| 179 |
|
| 180 |
+
The current public VeriLoop family is organized around six application lines:
|
|
|
|
| 181 |
|
| 182 |
+
- **VeriLoop Coder**
|
| 183 |
+
- **VeriLoop Interaction**
|
| 184 |
+
- **VeriLoop Skills**
|
| 185 |
+
- **VeriLoop VLA**
|
| 186 |
+
- **VeriLoop Scientist**
|
| 187 |
+
- **VeriLoop Computer Use**
|
| 188 |
|
| 189 |
+
These names identify application-facing product lines.
|
| 190 |
+
They do **not** imply permanent binding to one fixed underlying backbone.
|
| 191 |
|
| 192 |
+
---
|
| 193 |
|
| 194 |
+
## Development Status
|
| 195 |
|
| 196 |
+
VeriLoop is an active research and engineering initiative.
|
| 197 |
|
| 198 |
+
Current work focuses on the control-plane and runtime foundations required for evidence-driven closed-loop operation, including:
|
|
|
|
| 199 |
|
| 200 |
+
- state and schema contracts
|
| 201 |
+
- evidence memory and contradiction management
|
| 202 |
+
- sandbox-linked verification
|
| 203 |
+
- harness-controlled execution
|
| 204 |
+
- trace and audit ledgers
|
| 205 |
+
- targeted PEFT for interface stabilization
|
| 206 |
+
- backbone adaptation across different open-weight families
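
To make the "trace and audit ledgers" item concrete, here is a minimal sketch of an append-only ledger in which each record stores a hash of the previous one, so later tampering is detectable. Hash chaining is an assumption made for this illustration, not a description of VeriLoop's ledger; `AuditLedger`, `append`, and `verify` are hypothetical names.

```python
import hashlib
import json


class AuditLedger:
    """Toy append-only ledger with hash-chained records."""

    def __init__(self) -> None:
        self.records: list[dict] = []

    @staticmethod
    def _digest(prev_hash: str, event: dict) -> str:
        # sort_keys gives a stable serialization, so equal events hash equally.
        payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, event: dict) -> str:
        prev_hash = self.records[-1]["hash"] if self.records else "genesis"
        record = {
            "prev": prev_hash,
            "event": event,
            "hash": self._digest(prev_hash, event),
        }
        self.records.append(record)
        return record["hash"]

    def verify(self) -> bool:
        # Recompute every hash from the start; any edited record breaks the chain.
        prev_hash = "genesis"
        for record in self.records:
            if record["prev"] != prev_hash:
                return False
            if record["hash"] != self._digest(prev_hash, record["event"]):
                return False
            prev_hash = record["hash"]
        return True


ledger = AuditLedger()
ledger.append({"step": 1, "kind": "retrieve"})
ledger.append({"step": 2, "kind": "verify", "ok": True})
```

Calling `ledger.verify()` on the untouched ledger returns `True`; editing any stored event afterwards makes it return `False`.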
|
| 207 |
|
| 208 |
+
---
|
| 209 |
|
| 210 |
+
## Open-Weight and License Notice
|
| 211 |
|
| 212 |
+
VeriLoop is built in the open-weight ecosystem and may be developed on top of, adapted from, or interoperable with upstream open-weight backbones.
|
| 213 |
+
Where applicable, upstream attribution, license terms, and third-party notices must be preserved in downstream releases.
|
| 214 |
+
|
| 215 |
+
Current default backbone mapping for the public VeriLoop product lines is as follows:
|
| 216 |
+
|
| 217 |
+
- **VeriLoop Coder** → Qwen3-Coder-Next
|
| 218 |
+
- **VeriLoop Interaction** → Qwen3.5-27B
|
| 219 |
+
- **VeriLoop Skills** → Kimi-K2-Thinking
|
| 220 |
+
- **VeriLoop VLA** → Psi-Zero
|
| 221 |
+
- **VeriLoop Scientist** → S1-Base-1.5-32B-128K
|
| 222 |
+
- **VeriLoop Computer Use** → Qwen3.5-35B-A3B
|
| 223 |
|
| 224 |
+
These mappings describe the **current default backbone choices** and may evolve over time as the VeriLoop runtime is validated across additional open-weight models.
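
Because the mapping is a default rather than a fixed binding, it can be pictured as plain configuration with per-deployment overrides. The sketch below reuses the product-line and backbone names from the list above; `DEFAULT_BACKBONES`, `resolve_backbone`, and the override mechanism are hypothetical and only illustrate the idea.

```python
from typing import Optional

# Default mapping copied from the list above.
DEFAULT_BACKBONES = {
    "VeriLoop Coder": "Qwen3-Coder-Next",
    "VeriLoop Interaction": "Qwen3.5-27B",
    "VeriLoop Skills": "Kimi-K2-Thinking",
    "VeriLoop VLA": "Psi-Zero",
    "VeriLoop Scientist": "S1-Base-1.5-32B-128K",
    "VeriLoop Computer Use": "Qwen3.5-35B-A3B",
}


def resolve_backbone(product_line: str, overrides: Optional[dict] = None) -> str:
    """Return the backbone for a product line, letting overrides win over defaults."""
    merged = {**DEFAULT_BACKBONES, **(overrides or {})}
    try:
        return merged[product_line]
    except KeyError:
        raise ValueError(f"unknown product line: {product_line!r}") from None


default_choice = resolve_backbone("VeriLoop Coder")
swapped_choice = resolve_backbone(
    "VeriLoop Coder", overrides={"VeriLoop Coder": "some-other-backbone"}
)
```

Here `some-other-backbone` is a placeholder string, not a real model name.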
|
| 225 |
|
| 226 |
+
---
|
| 227 |
+
|
| 228 |
+
## Founder and Architecture Origin
|
| 229 |
+
|
| 230 |
+
**Libo Wang** is the creator and architectural designer of the **E³-Loop** framework that defines the VeriLoop family.
|
| 231 |
+
|
| 232 |
+
VeriLoop exists to explore a new paradigm for model systems:
|
| 233 |
+
|
| 234 |
+
- more rigorous than prompt-only interaction,
|
| 235 |
+
- more reusable than backbone-specific repeated fine-tuning,
|
| 236 |
+
- more auditable than opaque agent stacks,
|
| 237 |
+
- and more economically realistic for the open-weight era.
|
| 238 |
|
| 239 |
---
|
| 240 |
|
| 241 |
**VeriLoop (循证)**
|
| 242 |
+
*Evidence-driven closed-loop runtime intelligence for the open-weight era.*
|