- **2026-04-28** – KSA technical report is released on arXiv: [arXiv:2604.24432](https://arxiv.org/abs/2604.24432).
- **2026-04-28** – Code, training recipes, the block-sparse kernel, and the HuggingFace `trust_remote_code` template are open-sourced in this repository.
- **2026-05-08** – [KSA-4B-base](https://huggingface.co/OpenOneRec/KSA-4B-base) (CPT from Qwen3-4B, 128K context) weights are released on HuggingFace.

## ✨ Highlights

## 🤗 Model Zoo

Pretrained checkpoints are published on HuggingFace:

| Model       | Backbone | Parameters | Context | Training              | Link |
| :---------- | :------- | :--------- | :------ | :-------------------- | :--- |
| KSA-4B-base | Qwen3-4B | 4B         | 128K    | Continual pretraining | [🤗 OpenOneRec/KSA-4B-base](https://huggingface.co/OpenOneRec/KSA-4B-base) |

The 1.9B *from-scratch* configuration is provided as a reproducible recipe only; no 1.9B weights will be released.
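The released checkpoint loads through the standard HuggingFace `AutoModelForCausalLM` path with `trust_remote_code`. A minimal sketch (the model id comes from the table above; the prompt and generation settings are illustrative assumptions, not recommended defaults):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# KSA ships custom modeling code on the Hub, so trust_remote_code=True is required.
model_id = "OpenOneRec/KSA-4B-base"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # place layers on available GPUs/CPU
)

prompt = "Block-sparse attention enables long-context modeling because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Requires `transformers` and `torch`; the first call downloads the weights from the Hub.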

We are actively working on:

- [x] Technical report on arXiv ([arXiv:2604.24432](https://arxiv.org/abs/2604.24432)).
- [x] Release the 4B continual-pretraining checkpoint ([KSA-4B-base](https://huggingface.co/OpenOneRec/KSA-4B-base)).
- [ ] Expanded evaluation scripts for RULER / NIAH / LongBench v2 reproduction.
- [ ] A reference serving stack with the ring-buffer KV cache.
- [ ] Additional ablations and tutorials.

- **HuggingFace Transformers** – for the model / tokenizer / generation abstractions that make `trust_remote_code` deployment painless.
- **PyTorch distributed training** – for FSDP, DCP, and the communication primitives that make large-scale pretraining tractable.

We sincerely thank these projects for their outstanding work.