alexrs committed on
Commit 780fa22 · verified · 1 Parent(s): f2360bf

Update README via Huggy

Files changed (1)
  1. README.md +13 -17
README.md CHANGED
@@ -67,17 +67,16 @@ Command A+ is an open source model with 25 billion active parameters and 218B to

  Developed by: [Cohere](https://cohere.com/) and [Cohere Labs](https://cohere.com/research)

- * Point of Contact: [**Cohere Labs**](https://cohere.com/research)
- * License: [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
- * Model: command-a-plus-05-2026
- * Model Size: 25B active parameters, 218B total parameters
  * Context length: 128K input

- For more details about this model, please check out our [blog post](http://cohere.com/blog/command-a-plus).

  You can try out Command A+ before downloading the weights in our hosted [Hugging Face Space](https://huggingface.co/spaces/CohereLabs/command-a-plus-05-2026).

-
  **Available quantizations**

  The following quantizations are available with example minimum GPU requirements
@@ -90,8 +89,7 @@ The following quantizations are available with example minimum GPU requirements

  All three quantizations show negligible differences in benchmark quality and performance. **Our recommended quantization for most uses is [W4A4](https://huggingface.co/CohereLabs/command-a-plus-05-2026-w4a4) which boasts superior speed and latency characteristics alongside a smaller hardware footprint.**

- For more details, please check out our [blog post](http://cohere.com/blog/command-a-plus).
-

  **Usage**

@@ -117,9 +115,9 @@ input_ids = tokenizer.apply_chat_template(
  )

  gen_tokens = model.generate(
- input_ids,
- max_new_tokens=4096,
- do_sample=True,
  temperature=0.6,
  top_p=0.95
  )
@@ -171,10 +169,10 @@ print(outputs[0]["generated_text"][-1])

  **vLLM**

- You can also run the model in vLLM. `vllm>=0.21.0` is required for Command A+ and accurate response parsing also requires installing [Cohere’s `melody` library](https://pypi.org/project/cohere-melody/).

  ```
- uv pip install vllm>=0.21.0
  uv pip install transformers uv pip install cohere_melody>=0.9.0
  ```

@@ -188,9 +186,9 @@ Then the vllm server can be started with the following command:

  **Input**: Text and images.

- **Output**: Model generates text.

- **Model Architecture**: Command A+ is a decoder-only Sparse Mixture-of-Experts Transformer Model. With 25B active parameters and 218B total parameters, it has 128 experts, out of which 8 are active per token, and a single shared expert is applied to all tokens. The attention layers interleave sliding-window attention layers with Rotational Positional Embeddings and global attention layers without positional embeddings in a 3:1 ratio, as first introduced in Command A. The sparse MoE layer is trained in a fully dropless manner and uses a token-choice router. We use additive-bias-based load balancing to encourage balanced token load across all experts, and swap out the softmax router activation function with a normalized sigmoid over the topk expert logits per token.

  **Languages covered:** The model has been trained on 48 languages: English, Arabic, Bulgarian, Bengali, Catalan, Czech, Danish, German, Greek, Spanish, Estonian, Persian, Finnish, Filipino, French, Irish, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Icelandic, Italian, Japanese, Korean, Lithuanian, Latvian, Malay, Maltese, Dutch, Norwegian, Punjabi, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Serbian, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Vietnamese, Chinese.

@@ -289,5 +287,3 @@ For errors or additional questions about details in this model card, contact \[[
  **Try it now:**

  You can try Command A+ in the [playground](https://dashboard.cohere.com/playground/chat?model=command-a-plus-05-2026). You can also use it in our dedicated [Hugging Face Space](https://huggingface.co/spaces/CohereLabs/command-a-plus-05-2026).
-
-
 

  Developed by: [Cohere](https://cohere.com/) and [Cohere Labs](https://cohere.com/research)

+ * Point of Contact: [**Cohere Labs**](https://cohere.com/research)
+ * License: [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+ * Model: command-a-plus-05-2026
+ * Model Size: 25B active parameters, 218B total parameters
  * Context length: 128K input

+ For more details about this model, please check out our [blog post](http://cohere.com/blog/command-a-plus).

  You can try out Command A+ before downloading the weights in our hosted [Hugging Face Space](https://huggingface.co/spaces/CohereLabs/command-a-plus-05-2026).

  **Available quantizations**

  The following quantizations are available with example minimum GPU requirements
 

  All three quantizations show negligible differences in benchmark quality and performance. **Our recommended quantization for most uses is [W4A4](https://huggingface.co/CohereLabs/command-a-plus-05-2026-w4a4) which boasts superior speed and latency characteristics alongside a smaller hardware footprint.**

+ For more details, please check out our [blog post](http://cohere.com/blog/command-a-plus).

  **Usage**

 
  )

  gen_tokens = model.generate(
+ input_ids,
+ max_new_tokens=4096,
+ do_sample=True,
  temperature=0.6,
  top_p=0.95
  )
 

  **vLLM**

+ You can also run the model in vLLM. `vllm>=0.21.0` is required for Command A+ and accurate response parsing also requires installing [Cohere’s `melody` library](https://pypi.org/project/cohere-melody/).

  ```
+ uv pip install vllm>=0.21.0
  uv pip install transformers uv pip install cohere_melody>=0.9.0
  ```

 

  **Input**: Text and images.

+ **Output**: Model generates text.

+ **Model Architecture**: Command A+ is a decoder-only Sparse Mixture-of-Experts Transformer Model. With 25B active parameters and 218B total parameters, it has 128 experts, out of which 8 are active per token, and a single shared expert is applied to all tokens. The attention layers interleave sliding-window attention layers with Rotational Positional Embeddings and global attention layers without positional embeddings in a 3:1 ratio, as first introduced in Command A. The sparse MoE layer is trained in a fully dropless manner and uses a token-choice router. We use additive-bias-based load balancing to encourage balanced token load across all experts, and swap out the softmax router activation function with a normalized sigmoid over the topk expert logits per token.

  **Languages covered:** The model has been trained on 48 languages: English, Arabic, Bulgarian, Bengali, Catalan, Czech, Danish, German, Greek, Spanish, Estonian, Persian, Finnish, Filipino, French, Irish, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Icelandic, Italian, Japanese, Korean, Lithuanian, Latvian, Malay, Maltese, Dutch, Norwegian, Punjabi, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Serbian, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Vietnamese, Chinese.

 
  **Try it now:**

  You can try Command A+ in the [playground](https://dashboard.cohere.com/playground/chat?model=command-a-plus-05-2026). You can also use it in our dedicated [Hugging Face Space](https://huggingface.co/spaces/CohereLabs/command-a-plus-05-2026).
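
The routing scheme named in the **Model Architecture** paragraph of this README (a token-choice router, 8 of 128 experts active per token, a normalized sigmoid over the top-k expert logits, additive-bias load balancing) can be illustrated with a small sketch. This is not Cohere's implementation. The function name `route_token`, the random logits, and the choice to apply the bias only when selecting experts (not when computing gate weights) are assumptions made for illustration, and the shared expert is left out.

```python
import math
import random

NUM_EXPERTS = 128  # routed experts, per the model card
TOP_K = 8          # experts active per token, per the model card


def route_token(logits, balance_bias, top_k=TOP_K):
    """Pick top_k experts for one token and return (expert_ids, gates).

    Assumption: the additive bias only influences which experts are
    selected; the gate values are computed from the unbiased logits.
    """
    biased = [logit + bias for logit, bias in zip(logits, balance_bias)]
    # Token-choice routing: each token picks its own top_k experts.
    ranked = sorted(range(len(logits)), key=lambda e: biased[e], reverse=True)
    expert_ids = ranked[:top_k]
    # Sigmoid over the selected logits only, then normalize to sum to 1.
    scores = [1.0 / (1.0 + math.exp(-logits[e])) for e in expert_ids]
    total = sum(scores)
    gates = [score / total for score in scores]
    return expert_ids, gates


random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # hypothetical router scores
no_bias = [0.0] * NUM_EXPERTS
expert_ids, gates = route_token(logits, no_bias)
print(expert_ids, round(sum(gates), 6))
```

In this sketch, raising one expert's bias makes tokens more likely to select it without changing how the selected experts are weighted. That is one common way to read "additive-bias-based load balancing", but the model card does not specify the details.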