varunrandery committed on
Commit afed5b9 · verified · 1 Parent(s): 825ca3a

Update README.md

Files changed (1)
  1. README.md +6 -12
README.md CHANGED
@@ -1,11 +1,9 @@
  ---
  library_name: vllm
  inference: false
- base_model:
- - poolside/Laguna-XS.2-base
  extra_gated_description: >-
    To learn more about how we process your personal data, please read our <a
-   href="https://poolside.ai/privacy">Privacy Policy</a>.
  tags:
  - laguna-xs.2
  license: apache-2.0
@@ -28,9 +26,7 @@ pipeline_tag: text-generation
  Laguna XS.2 is a 33B total parameter Mixture-of-Experts model with 3B activated parameters per token designed for agentic coding and long-horizon work on a local machine. It uses Sliding Window Attention with per-head gating in 30 out of 40 layers for fast inference and low KV cache requirements.

  > [!NOTE]
- > This is the final model with native reasoning support and interleaved thinking. For the base model, see [Laguna XS.2-base](https://huggingface.co/poolside/Laguna-XS.2-base).
-
- For more details on how we trained this model, including on data automixing and async off-policy agent RL, check out our [release blog post](https://poolside.ai/blog/laguna-a-deeper-dive).

  ## Highlights
  - **Mixed SWA and global attention layout**: Laguna XS.2 uses sigmoid gating with per-layer rotary scales, enabling mixed SWA (Sliding Window Attention) and global attention layers in a 3:1 ratio (across 40 total layers)
@@ -51,7 +47,7 @@ For more details on how we trained this model, including on data automixing and
  - Sliding Window: 512 tokens
  - Modality: text-to-text
  - Context window: 131,072 tokens
- - Reasoning support: thinking default enabled; interleaved thinking with preserved thinking supported

  ## Benchmark results
 
@@ -107,8 +103,6 @@ Launch and *Log in with Poolside* to get a free API key.
  pool
  ```

- [Placeholder for screenshot]
-
  Use in any [ACP client](https://agentclientprotocol.com/get-started/clients). Configure Zed and JetBrains automatically:

  ```shell
@@ -187,8 +181,8 @@ response = client.chat.completions.create(
  reasoning, content, tool_calls = "", "", []
  for chunk in response:
      delta = chunk.choices[0].delta
-     if hasattr(delta, "reasoning") and delta.reasoning:
-         reasoning += delta.reasoning
      if hasattr(delta, "content") and delta.content:
          content += delta.content
      if hasattr(delta, "tool_calls") and delta.tool_calls:
@@ -206,7 +200,7 @@ print(f"Reasoning: {reasoning}\nContent: {content}\nTool calls: {tool_calls}\n")
  messages.append({
      "role": "assistant",
      "content": content,
-     "reasoning": reasoning,
      "tool_calls": [{"id": tc["id"], "type": "function", "function": tc["function"]} for tc in tool_calls]
  })
 
 
  ---
  library_name: vllm
  inference: false
  extra_gated_description: >-
    To learn more about how we process your personal data, please read our <a
+   href="https://poolside.ai/legal/privacy">Privacy Policy</a>.
  tags:
  - laguna-xs.2
  license: apache-2.0
 
  Laguna XS.2 is a 33B total parameter Mixture-of-Experts model with 3B activated parameters per token designed for agentic coding and long-horizon work on a local machine. It uses Sliding Window Attention with per-head gating in 30 out of 40 layers for fast inference and low KV cache requirements.

  > [!NOTE]
+ > For more details on how we trained this model, including on data automixing and async off-policy agent RL, check out our [release blog post](https://poolside.ai/blog/laguna-a-deeper-dive).

  ## Highlights
  - **Mixed SWA and global attention layout**: Laguna XS.2 uses sigmoid gating with per-layer rotary scales, enabling mixed SWA (Sliding Window Attention) and global attention layers in a 3:1 ratio (across 40 total layers)
 
  - Sliding Window: 512 tokens
  - Modality: text-to-text
  - Context window: 131,072 tokens
+ - Reasoning support: interleaved thinking with preserved thinking

  ## Benchmark results
 
 
  pool
  ```

  Use in any [ACP client](https://agentclientprotocol.com/get-started/clients). Configure Zed and JetBrains automatically:

  ```shell
 
  reasoning, content, tool_calls = "", "", []
  for chunk in response:
      delta = chunk.choices[0].delta
+     if hasattr(delta, "reasoning_content") and delta.reasoning_content:
+         reasoning += delta.reasoning_content
      if hasattr(delta, "content") and delta.content:
          content += delta.content
      if hasattr(delta, "tool_calls") and delta.tool_calls:
 
  messages.append({
      "role": "assistant",
      "content": content,
+     "reasoning_content": reasoning,
      "tool_calls": [{"id": tc["id"], "type": "function", "function": tc["function"]} for tc in tool_calls]
  })
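
The rename in the last two hunks (`reasoning` to `reasoning_content`, both on the streamed delta and on the assistant message) can be exercised without a running server. The following is a minimal sketch, not the README's actual code: `fake_chunk` and `accumulate_stream` are hypothetical helpers, the chunks are stand-in objects built with `SimpleNamespace`, and tool-call handling is left out because the diff does not show that part of the loop.

```python
from types import SimpleNamespace


def fake_chunk(**delta_fields):
    """Build a stand-in for one streamed chunk (hypothetical helper, not a client-library API)."""
    delta = SimpleNamespace(**delta_fields)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def accumulate_stream(chunks):
    """Concatenate reasoning text and answer text across streamed deltas."""
    reasoning, content = "", ""
    for chunk in chunks:
        delta = chunk.choices[0].delta
        # getattr with a default covers deltas that omit a field entirely;
        # `or ""` covers fields that are present but None.
        reasoning += getattr(delta, "reasoning_content", None) or ""
        content += getattr(delta, "content", None) or ""
    return reasoning, content


# Simulated stream: reasoning arrives first, then the answer text.
chunks = [
    fake_chunk(reasoning_content="Check the "),
    fake_chunk(reasoning_content="input first.", content=None),
    fake_chunk(content="Done."),
]
reasoning, content = accumulate_stream(chunks)

# Assistant turn appended to the history, carrying the reasoning under the renamed key.
assistant_message = {
    "role": "assistant",
    "content": content,
    "reasoning_content": reasoning,
}
print(assistant_message)
```

Using `getattr` with a default is one way to fold the `hasattr(...) and ...` check from the diff into a single expression; the behavior for missing or empty fields is intended to be the same.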