A newer version of this model is available: pvlabs/Chytrej1.5-90M-Base

Chytrej1-90M-Base

The first model in the Chytrej series: a language model pretrained fully from scratch using the LLaMA architecture.

Chytrej (Czech slang for "clever/smart") is a long-term model series by PingVortex Labs. Every model in the series is fully custom pretrained from scratch; instruction fine-tuned variants may then be built on top of those custom bases. The ongoing goal: every release must at least know the capital of France.

Built by PingVortex Labs.

Discord


Model Details

  • Parameters: 90M (89.1M exactly)
  • Context length: 8,192 tokens
  • Language: English only
  • Format: base model (safetensors, BF16)
  • Architecture: LLaMA
  • License: Apache 2.0
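At this scale, a LLaMA-style parameter budget is dominated by the embedding tables and the per-layer attention/MLP projections. As a rough illustration only (the dimensions below are hypothetical choices that land near ~89M, not the actual Chytrej1 config), the budget can be estimated like this:

```python
# Hypothetical LLaMA-style dimensions chosen to land near ~89M parameters;
# the real Chytrej1-90M-Base config may differ.
vocab_size = 32_000
hidden = 512
layers = 16
intermediate = 1_600  # MLP width

embeddings = vocab_size * hidden   # input embedding table
lm_head = vocab_size * hidden      # output projection (if untied)
attention = 4 * hidden * hidden    # Q, K, V, O projections per layer
mlp = 3 * hidden * intermediate    # gate, up, down projections per layer
# (RMSNorm weights add a negligible amount and are ignored here)

total = embeddings + lm_head + layers * (attention + mlp)
print(f"~{total / 1e6:.1f}M parameters")  # → ~88.9M parameters
```

Note that the two embedding tables alone account for roughly a third of the total, which is typical for models this small.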

Benchmarks

Evaluated with lm-eval-harness, 0-shot:

Task      Metric    Score
ARC-Easy  acc       39.73%
ARC-Easy  acc_norm  34.47%
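The two ARC-Easy numbers differ only in how each answer choice is scored: acc picks the choice with the highest total log-likelihood, while acc_norm first normalizes by the length of the choice (lm-eval-harness normalizes by byte length), removing the bias toward short answers. A minimal sketch with made-up scores:

```python
def pick_choice(loglikelihoods, lengths, normalize=False):
    """Return the index of the winning answer choice.

    loglikelihoods: total log-likelihood the model assigns to each choice.
    lengths: length of each choice (lm-eval-harness uses byte length).
    """
    scores = [
        ll / n if normalize else ll
        for ll, n in zip(loglikelihoods, lengths)
    ]
    return max(range(len(scores)), key=scores.__getitem__)

# Made-up example: a short choice can win on raw log-likelihood
# even when its per-length score is worse.
lls = [-4.0, -9.0]   # total log-likelihoods of two choices
lens = [2, 10]       # choice lengths

print(pick_choice(lls, lens))                  # acc-style: 0
print(pick_choice(lls, lens, normalize=True))  # acc_norm-style: 1
```

This is why the two metrics can rank the same model differently, as they do in the table above.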

Usage

from transformers import LlamaForCausalLM, PreTrainedTokenizerFast

model = LlamaForCausalLM.from_pretrained("pvlabs/Chytrej1-90M-Base")
tokenizer = PreTrainedTokenizerFast.from_pretrained("pvlabs/Chytrej1-90M-Base")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, repetition_penalty=1.3)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# response: The capital of France is the city of Paris...
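The repetition_penalty=1.3 above does real work: a 90M base model loops easily, and the penalty discourages tokens that have already been generated. Conceptually, it works like this (a simplified plain-Python sketch of what transformers' RepetitionPenaltyLogitsProcessor does, ignoring batching and tensors):

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.3):
    """Down-weight every token that already appears in the output.

    Positive logits are divided by the penalty and negative logits are
    multiplied by it, so a repeated token always becomes less likely.
    """
    adjusted = list(logits)
    for tid in set(generated_ids):
        if adjusted[tid] > 0:
            adjusted[tid] /= penalty
        else:
            adjusted[tid] *= penalty
    return adjusted

# Token 2 was already generated, so its logit drops from 2.6 to 2.0
# while the untouched logits keep their values.
print(apply_repetition_penalty([0.5, -1.0, 2.6], generated_ids=[2]))
```

Values above 1.0 penalize repeats more strongly; 1.0 disables the penalty entirely.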

The Chytrej Series plan:

  1. Fully custom pretrained base models at various scales
  2. Instruction fine-tuned variants only on top of our base models
  3. Every release must know the capital of France, as a basic sanity check that it has absorbed some knowledge (with possible exceptions)
  4. No fine-tuned existing models, everything from scratch

Made by PingVortex.
