Question about expert selection
#186 opened about 1 year ago
by
BXset
Hardware Requirements to run the original model - 671B params
👍 1
4
#185 opened about 1 year ago
by
EdilCamil
Holding paper in hand
1
#184 opened about 1 year ago
by
Loveyl
Update config.json
#182 opened about 1 year ago
by
Empolean2640
Regression in Reasoning Tag Output - Missing <think> in Model Responses
2
#181 opened about 1 year ago
by
divinerapier
Delete model.safetensors.index.json
#180 opened about 1 year ago
by
Huggingfaceliaj
Unknown quantization type, got fp8
#179 opened about 1 year ago
by
DenisFavaCerchiaro
How to disable/skip the <think></think> process.
3
#178 opened about 1 year ago
by
yech520
Request: DOI
🤗 1
#177 opened about 1 year ago
by
Tamwyn
Request: DOI
#176 opened about 1 year ago
by
saathwik
Request: DOI
#175 opened about 1 year ago
by
Paulabad
Draft model as accelerator for DeepSeek-R1?
4
#174 opened about 1 year ago
by
inputout
Deploying production ready service with Unsloth GGUF quants on your AWS account. (4 x L40S)
🔥 2
8
#171 opened about 1 year ago
by
samagra-tensorfuse
Could you take a look at the "r1-1776" model released by Perplexity?
4
#170 opened about 1 year ago
by
yanyihan
Just crossed 10,000 likes!
1
#169 opened about 1 year ago
by
clem
Unable to download flash_attn on Mac
#168 opened about 1 year ago
by
earlyIsLate
Can this model be used for commercial use?
2
#167 opened about 1 year ago
by
henrycwf
90+ tokens per second for MI300x8 using batch_size = 1
1
#166 opened about 1 year ago
by
ghostplant
"aha moment" comment deleted by Perplexity (recovered)
👍 1
3
#164 opened about 1 year ago
by
FalconNet
'num_hidden_layers': 61, but layer 62 has weights.
#162 opened about 1 year ago
by
xinhe
Upload GTG Breaking every Limit
#161 opened about 1 year ago
by
GTGenesis
support prefix complete
❤️👍 3
3
#158 opened about 1 year ago
by
HuggineAllen
Create app.py
#157 opened about 1 year ago
by
SpaceAgeRobotics
Brokersponsor
#155 opened about 1 year ago
by
Brokersponsor
Update README.md
#154 opened about 1 year ago
by
egegvner
Upload IMG_4530.png
#152 opened about 1 year ago
by
Noemie202586
Upload IMG_1745.JPG
#151 opened about 1 year ago
by
Ladib
Create Clara
1
#150 opened about 1 year ago
by
Clblinks
If I understand correctly, evaluating MATH-500 requires 64*500 model calls?
1
#149 opened about 1 year ago
by
Rorschaaaach
Request: DOI
🚀 1
#148 opened about 1 year ago
by
Tarush-Appreciate
Update README.md
#147 opened about 1 year ago
by
tekno-power
Update README.md
#146 opened about 1 year ago
by
Ekimnedops6969
Update README.md
❤️ 1
1
#143 opened about 1 year ago
by
MuhammadEhsan
Request for Information on Purchasing Reasoning API Key
2
#142 opened about 1 year ago
by
brahamaandai
ssss
🔥 1
1
#140 opened about 1 year ago
by
DZGT
Update model_max_length in tokenizer_config.json
👍 3
1
#139 opened about 1 year ago
by
kkokkie2360
Host of the model
3
#138 opened about 1 year ago
by
henrycwf
Lite version for DeepSeek-R1?
👍👀 6
1
#137 opened about 1 year ago
by
haili-tian
[Bug] assert not self.training
4
#136 opened about 1 year ago
by
Gaie
Upload IMG_0253.HEIC
#134 opened about 1 year ago
by
rynty
Upload comment-sample.xlsx
#133 opened about 1 year ago
by
faham123
non-reasoning data
#132 opened about 1 year ago
by
mccatec
Could you release some 4-bit weights? None of the GPUs I have on hand support FP8
🔥 2
1
#131 opened about 1 year ago
by
zhnagchenchne
For the universe! DeepPhaser.py DeepCoralX.py and DeepSynapse.py
❤️👀 2
3
#129 opened about 1 year ago
by
karmikovic
Request: Create distill of Mistral Small 24B
3
#128 opened about 1 year ago
by
Kenshiro-28
Which vision model does R1 use for text extraction from images or PDFs?
2
#127 opened about 1 year ago
by
ashutoshroy02