Gradience now has a stable use case
That is a narrower claim than “merge optimizer,” and a more useful one.
The practical question it now answers well is:
Before I spend real evaluation budget on merge experiments, how do I shrink a mixed adapter pool into a smaller, more defensible set of candidates?
That is the center of the project now.
The workflow
The stable workflow is simple:
- run source QA
- run pair reports
- run inventory summary / neighborhoods
- inspect task-boundary advisories
- identify same-task safe zones and cross-task caution zones
- reduce the candidate set
- only then run expensive evaluation
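The workflow above can be sketched as a single preflight function. Everything here is illustrative: the function names, the `qa_score` field, and the 0.5 threshold are placeholders, not Gradience's actual API.

```python
# Hypothetical preflight pipeline; all function and field names are
# illustrative placeholders, not Gradience's actual API.

def preflight(adapters):
    """Shrink a mixed adapter pool before expensive evaluation."""
    # 1. Source QA: drop weak or low-value sources up front.
    sources = [a for a in adapters if a["qa_score"] >= 0.5]

    # 2. Pair reports: enumerate candidate merge pairs.
    pairs = [(a, b) for i, a in enumerate(sources)
                    for b in sources[i + 1:]]

    # 3. Task-boundary advisory: split into a same-task safe zone
    #    and a cross-task caution zone.
    safe = [(a, b) for a, b in pairs if a["task"] == b["task"]]
    caution = [(a, b) for a, b in pairs if a["task"] != b["task"]]

    # 4. Reduce: spend eval budget on the safe zone first; only
    #    revisit the caution zone if budget remains.
    return safe, caution

pool = [
    {"name": "qa-a", "task": "qa", "qa_score": 0.9},
    {"name": "qa-b", "task": "qa", "qa_score": 0.8},
    {"name": "ner-a", "task": "ner", "qa_score": 0.7},
    {"name": "weak", "task": "qa", "qa_score": 0.2},
]
safe, caution = preflight(pool)
print(len(safe), len(caution))  # → 1 2
```

The point of the sketch is the ordering: cheap structural filters run first, and every pair that survives them is one the expensive evaluator actually has to price.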
The point is not to produce more reports. The point is to make the next decision better.
What the evidence now supports
The recent work clarified where Gradience is actually strong.
Same-task pairs are broadly safe
On small encoder models, same-task pairs stayed broadly safe across multiple targeted studies, including:
- training-style variation
- domain shift inside a high-transfer task
- source-strength asymmetry
That means same-task rescue logic is not the center of the project.
Task identity is the key boundary
The strongest and most stable distinction is same-task vs. cross-task. That is now the main regime boundary in Gradience.
The task-relationship advisory is established infrastructure
The advisory is now part of the stable interpretive layer.
Across the current evidence base:
- 132+ pairs
- 2 backbones
- 0 false positives
It behaves exactly as it should:
- silent on same-task pairs
- active on cross-task pairs
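That silent/active contract is simple enough to state as code. This is a sketch of the contract only, under the assumption that the advisory keys off task identity; the function name and fields are hypothetical, not Gradience's implementation.

```python
# Illustrative sketch of the advisory contract: silent (None) on
# same-task pairs, active (a message) on cross-task pairs.
# Names and fields are hypothetical, not Gradience's API.

def task_relationship_advisory(pair):
    """Return an advisory for cross-task pairs, None otherwise."""
    a, b = pair
    if a["task"] == b["task"]:
        return None  # silent: same-task pairs pass without comment
    return f"cross-task merge ({a['task']} x {b['task']}): risky regime"

same = ({"task": "qa"}, {"task": "qa"})
cross = ({"task": "qa"}, {"task": "ner"})
print(task_relationship_advisory(same))   # → None
print(task_relationship_advisory(cross))  # active advisory string
```

The zero-false-positive record reported above corresponds to the first branch: the advisory never fires on a same-task pair.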
Mixed-task inventory preflight is where the system earns its keep
The highest-value use case is now very clear: mixed-task inventories where pair-risk alone leaves too many plausible candidates alive.
That is where Gradience helps most.
Utility result
A recent preflight utility round asked a simple question:
How much does Gradience reduce wasted merge exploration before evaluation begins?
The answer was strong.
In mixed-task inventories, Gradience reduced the candidate space by 65–90%, with 81% average reduction in cases where the task-relationship advisory was the main discriminator.
That is the clearest practical result in the repo right now.
Gradience is not just tagging pairs with extra metadata. It is helping collapse a flat pool of possibilities into a much smaller evaluation subset.
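To make the reduction numbers concrete, here is a worked example of what an 81% reduction means for a mixed pool. The pool size and survivor count are illustrative choices that happen to land on 81%; they are not figures from the study.

```python
from math import comb

# Worked example: an inventory of 20 adapters has C(20, 2) = 190
# candidate merge pairs. If preflight leaves 36 pairs alive, the
# candidate space has shrunk by 81%. (Numbers are illustrative.)
n = 20                      # adapters in the inventory
all_pairs = comb(n, 2)      # 190 candidate merge pairs
surviving = 36              # pairs left after preflight
reduction = 1 - surviving / all_pairs
print(f"{reduction:.0%}")   # → 81%
```

Because the pair count grows quadratically with pool size, even a modest per-adapter filter compounds into a large reduction at the pair level.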
What that means in practice
The current stable stack is strongest when it can do some combination of:
- exclude weak or low-value sources
- identify a same-task safe zone
- identify a cross-task caution zone
- use neighborhoods to make a larger pair matrix legible again
- reduce the number of pairs worth actually evaluating
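The neighborhood step in particular is what makes a larger matrix legible: instead of reasoning about every adapter pair, you reason about neighborhood pairs plus the small within-neighborhood blocks. A minimal sketch, assuming neighborhoods simply group adapters by task (the real grouping signal may be richer):

```python
from collections import defaultdict
from math import comb

# Sketch: grouping 6 adapters into 3 task neighborhoods turns a flat
# 15-pair matrix into 3 cross-neighborhood cells plus 4 within-
# neighborhood pairs. Grouping by task label is an assumption here.
adapters = {
    "qa-a": "qa", "qa-b": "qa", "qa-c": "qa",
    "ner-a": "ner", "ner-b": "ner",
    "sum-a": "summarization",
}

neighborhoods = defaultdict(list)
for name, task in adapters.items():
    neighborhoods[task].append(name)

flat_pairs = comb(len(adapters), 2)                             # 15
cross_cells = comb(len(neighborhoods), 2)                       # 3
within = sum(comb(len(m), 2) for m in neighborhoods.values())   # 4
print(flat_pairs, cross_cells, within)  # → 15 3 4
```

Seven units to reason about instead of fifteen, and the three cross-neighborhood cells are exactly where the caution-zone advisories concentrate.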
That is the value proposition now.
Not “predict every merge outcome.”
Not “solve cross-task severity.”
Just:
turn a messy mixed-task inventory into a smaller, more defensible plan before expensive evaluation begins.
What Gradience does not yet solve
One important problem is still open:
- cross-task severity grading
Gradience can now reliably tell you when you have crossed into the risky regime. It does not yet reliably tell you how severe a given cross-task merge will be across backbones.
That remains open research.
Recent follow-up studies tested:
- exact task-pair identity
- core-space shared-basis as a severity signal
Neither turned out to be stable enough across backbones to justify featurization.
So the current position is pretty clean:
- boundary detection is solved
- search-space reduction is solved enough to be useful
- severity grading remains open
Bottom line
The repo, examples, docs, and utility evidence are now aligned around one practical use case:
run Gradience before merge evaluation on a mixed-task inventory if you want to shrink the candidate space and avoid spending eval budget on the wrong region of the pool.
That is the strongest current version of the project.