Hugging Face
Deqing Fu
PRO
deqing
13 followers · 17 following
https://deqingfu.github.io
DeqingFu
DeqingFu
AI & ML interests
None yet
Recent Activity
Updated a model about 3 hours ago: deqing/vanilla-llama-3.2-1B-fineweb-sample-100BT-v4
Updated a model about 4 hours ago: deqing/fone-llama-3.2-1B-fineweb-sample-100BT-fone3d-hybrid-tile-v4
Liked a model 3 days ago: deqing/vanilla-llama-3.2-1B-fineweb-sample-100BT-v4
Organizations
deqing's models (129), sorted by recently updated
deqing/convergent-llama-300M-adamw-window_2 • Text Generation • 0.3B • Updated 16 days ago • 343
deqing/convergent-llama-300M-adamw-swap_numbers • Text Generation • 0.3B • Updated 16 days ago • 354
deqing/convergent-llama-300M-adamw-isolate • Text Generation • 0.3B • Updated 16 days ago • 344
deqing/convergent-llama-300M-adamw-unigram • Text Generation • 0.3B • Updated 16 days ago • 340
deqing/convergent-mamba2-300M-muon-original • Text Generation • 0.3B • Updated 16 days ago • 380
deqing/llama-window-4-old • Text Generation • 0.3B • Updated 16 days ago • 343
deqing/llama-window-2-old • Text Generation • 0.3B • Updated 16 days ago • 327
deqing/convergent-llama-300M-muon-unk_number • Text Generation • 0.3B • Updated 16 days ago • 315
deqing/convergent-llama-300M-muon-swap_numbers • Text Generation • 0.3B • Updated 16 days ago • 304
deqing/llama-isolate-old • Text Generation • 0.3B • Updated 16 days ago • 309
deqing/convergent-llama-300M-muon-fivegram • Text Generation • 0.3B • Updated 16 days ago • 287
deqing/convergent-llama-300M-muon-permute • Text Generation • 0.3B • Updated 16 days ago • 273
deqing/convergent-llama-300M-muon-bigram • Text Generation • 0.3B • Updated 16 days ago • 261
deqing/convergent-llama-300M-muon-unigram • Text Generation • 0.3B • Updated 16 days ago • 271
deqing/mamba2-300M-v5-mamba2 • Text Generation • 0.3B • Updated 16 days ago • 4.95k
deqing/lstm-12layer-v5 • 0.2B • Updated 17 days ago • 779
deqing/llama-300M-v5-original • Text Generation • 0.3B • Updated 18 days ago • 3.39k
deqing/llama-300M-v5-unk_number • Text Generation • 0.3B • Updated 20 days ago • 2k
deqing/llama-300M-v5-addition_3digit_adamw • 0.3B • Updated 21 days ago • 1.38k
deqing/llama-300M-v5-addition_3digit • 0.3B • Updated 21 days ago • 1.88k
deqing/llama-300M-v5-addition • Text Generation • 0.3B • Updated 21 days ago • 4.17k
deqing/llama-300M-v5-addition_adamw • Text Generation • 0.3B • Updated 21 days ago • 4.27k
deqing/llama-300M-v5-addition_adamw-old • 0.3B • Updated 24 days ago • 380
deqing/llama-300M-v5-addition_3digit-old • 0.3B • Updated 24 days ago • 3
deqing/llama-300M-v5-adamw-addition_3digit_adamw-old • 0.3B • Updated 24 days ago • 3
deqing/llama-300M-v5-original-random_init_sft • Updated 24 days ago • 1
deqing/llama-300M-v5-isolate_sft • Updated 25 days ago • 1
deqing/llama-300M-v5-swap_numbers_sft • Updated 25 days ago
deqing/llama-300M-v5-addition-old • 0.3B • Updated 25 days ago • 1.65k
deqing/llama-300M-v5-original_sft • Updated 25 days ago • 5