RWKV7-G1 "GooseOne" pure RNN reasoning model

These are BASE models (pretrained on web/code/synthetic data plus instruction/chat/reasoning data), suitable for post-training and fine-tuning (see https://huggingface.co/spaces/Jellyfish042/UncheatableEval for their language-modeling performance).

More info & Gradio demo: https://rwkv.com/


Search "RWKV Chat" in the Play Store / App Store for our local-inference app

RWKV Chat: https://rwkv.halowang.cloud/ (local inference for mobile/desktop) and https://github.com/RWKV-APP/RWKV_APP

GGUF: https://huggingface.co/collections/shoumenchougou/rwkv7-gxx-gguf

Ollama GGUF: https://ollama.com/mollysama

RWKV-7 pth => GGUF script: https://github.com/MollySophia/rwkv-mobile/blob/master/converter/convert_rwkv_pth_to_gguf.py

Training: https://github.com/BlinkDL/RWKV-LM and https://github.com/Joluck/RWKV-PEFT


Efficient inference: https://github.com/BlinkDL/Albatross (decoding speed & VRAM stay constant regardless of context length)

  • 145+ token/s RWKV-7 7.2B fp16 bsz1 decoding @ RTX5090
  • 9650+ token/s RWKV-7 7.2B fp16 bsz320 decoding @ RTX5090
  • 10250+ token/s RWKV-7 7.2B fp16 bsz960 decoding @ RTX5090
  • 11289 token/s RWKV-7 7.2B fp16 bsz1 prefill @ RTX5090

pip inference: https://pypi.org/project/rwkv/

mobile inference: https://github.com/MollySophia/rwkv-mobile


Please always use the latest models (those with the newest date); they are better at everything.

Note: rwkv7a models include DeepEmbed.

Decoding suggestions (note: these are for the RWKV pip package, which applies temperature after top-p):

Chat: temp 1, topp 0.5, alpha_presence 2, alpha_frequency 0.1, alpha_decay 0.99

Creative (great for fiction etc.): temp 0.6, topp 0.6 ~ 0.8, alpha_presence 2, alpha_frequency 0.2, alpha_decay 0.99

Your input must not end with whitespace (so strip it), or you will upset the tokenizer and may see non-English responses.
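As a minimal sketch (the parameter names below follow the sampling fields of the RWKV pip package's `PIPELINE_ARGS`; treat them as assumptions), the two presets above plus the trailing-whitespace rule look like:

```python
# Suggested sampling presets (values from the model card; starting points, not gospel).
CHAT_ARGS = dict(temperature=1.0, top_p=0.5,
                 alpha_presence=2.0, alpha_frequency=0.1, alpha_decay=0.99)
CREATIVE_ARGS = dict(temperature=0.6, top_p=0.7,  # top_p anywhere in 0.6 ~ 0.8
                     alpha_presence=2.0, alpha_frequency=0.2, alpha_decay=0.99)

def sanitize_input(text: str) -> str:
    """Strip trailing whitespace: a trailing space/newline upsets the tokenizer."""
    return text.rstrip()

print(sanitize_input("Hello, world! "))  # -> "Hello, world!"
```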

Chat prompt (note: replace every \n\n in USER_PROMPT with \n, since \n\n is used as the chat-round separator in the pretraining data):

System: YOU_CAN_USE_SYSTEM_IF_NEEDED

User: PREVIOUS_STUFF

Assistant: PREVIOUS_STUFF

User: USER_PROMPT

Assistant:
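The chat template above can be assembled with a small helper. `build_chat_prompt` below is a hypothetical function (not part of any RWKV package) that applies the \n\n → \n replacement, joins rounds with blank lines, and leaves no trailing whitespace after the final "Assistant:":

```python
def build_chat_prompt(rounds, system=None):
    """rounds: list of (user, assistant) pairs; pass None as the last assistant
    to leave the prompt open for generation. \n\n inside messages is collapsed
    to \n because \n\n is the chat-round separator in the pretraining data."""
    clean = lambda s: s.replace("\n\n", "\n").strip()
    parts = []
    if system:
        parts.append(f"System: {clean(system)}")
    for user, assistant in rounds:
        parts.append(f"User: {clean(user)}")
        parts.append("Assistant:" if assistant is None
                     else f"Assistant: {clean(assistant)}")
    # No trailing whitespace: the model completes right after "Assistant:"
    return "\n\n".join(parts)

print(build_chat_prompt([("Hi", None)]))
# User: Hi
#
# Assistant:
```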

Think prompt (for hard prompts):

User: USER_PROMPT

Assistant: <think

Fake think prompt (great result, highly recommended):

User: USER_PROMPT

Assistant: <think></think

Think prompt, alternative style, for G1c and newer models. Note the space between USER_PROMPT and "(think)":

User: USER_PROMPT (think)

Assistant: <think

Shorter think, same style:

User: USER_PROMPT (think a bit)

Assistant: <think

Longer think, same style:

User: USER_PROMPT (think a lot)

Assistant: <think
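The four think variants above differ only in the suffix after USER_PROMPT and the assistant prefix. `build_think_prompt` below is a hypothetical helper (not an official API) covering them:

```python
def build_think_prompt(user_prompt, mode=None):
    """Hypothetical helper for the think-prompt styles.
    mode=None   -> plain think prompt:  Assistant: <think
    mode="fake" -> fake think:          Assistant: <think></think
    mode="think" / "think a bit" / "think a lot"
                -> G1c-style suffix:    USER_PROMPT (mode) ... Assistant: <think
    """
    user_prompt = user_prompt.replace("\n\n", "\n").rstrip()
    if mode == "fake":
        return f"User: {user_prompt}\n\nAssistant: <think></think"
    suffix = f" ({mode})" if mode else ""  # note the leading space before "("
    return f"User: {user_prompt}{suffix}\n\nAssistant: <think"
```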

FIM prompt (for G1c and newer models; works for text, code, and everything else):

✿prefix✿When I was young, I only liked to✿suffix✿and that’s how first I got interested in AI research.✿middle✿

Better (recommended):

✿prefix✿✿suffix✿and that’s how first I got interested in AI research.✿middle✿When I was young, I only liked to

Note: "✿" always tokenizes to a single token in the RWKV tokenizer, which is why it was chosen.
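Both FIM orderings can be sketched as a small hypothetical helper (`build_fim_prompt` is illustrative, not an official API); `recommended=True` uses the prefix-after-middle ordering shown above:

```python
SENTINEL = "\u273f"  # "✿": a single token in the RWKV tokenizer

def build_fim_prompt(prefix, suffix, recommended=True):
    """Build a fill-in-the-middle prompt for G1c and newer models."""
    if recommended:
        # Recommended ordering: prefix text moved after the middle marker.
        return (f"{SENTINEL}prefix{SENTINEL}{SENTINEL}suffix{SENTINEL}{suffix}"
                f"{SENTINEL}middle{SENTINEL}{prefix}")
    # Plain ordering: prefix, suffix, then generate the middle.
    return (f"{SENTINEL}prefix{SENTINEL}{prefix}{SENTINEL}suffix{SENTINEL}{suffix}"
            f"{SENTINEL}middle{SENTINEL}")
```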

Gxx = Data Version

G0x = trained for less than 1 epoch, as training a large model for a full epoch is expensive :(
G0 G0a G0a2 G0a3 ... G0b ... = progressively adding more (newer and better) data, so G0a has higher-quality (but less) data than G1

G1x = trained for more than 1 epoch
G1 G1a G1a2 G1a3 ... G1b ... = progressively adding more (newer and better) data; note G1a has higher-quality (and more) data than G0a