Aitana-2B-S-base-IP-1.0

Model description
Intended uses and limitations
How to use
Training
Technical specifications
Additional information

Model description

Aitana-2B-S-base-IP-1.0 is a generative language model with a decoder-only architecture. This repository contains the base checkpoint, intended for causal language modeling and for further adaptation or task-specific fine-tuning.

According to the files shipped in this repository, the checkpoint uses the Llama architecture and is packaged for the Hugging Face Transformers library. The configuration file specifies:

architecture: LlamaForCausalLM
hidden size: 2048
layers: 24
attention heads: 16
vocabulary size: 256000
context length: 8192 tokens
tensor dtype in config: bfloat16
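
As a rough illustration of what these values imply at inference time, the sketch below estimates the key/value cache size for a single sequence. It assumes standard multi-head attention in which the number of key/value heads equals the number of attention heads (16), a head dimension equal to hidden size divided by attention heads, and 2 bytes per element for bfloat16. The function name `kv_cache_bytes` is hypothetical, and the figure ignores model weights and activations.

```python
def kv_cache_bytes(seq_len, layers=24, hidden_size=2048, attention_heads=16,
                   kv_heads=16, bytes_per_element=2, batch_size=1):
    """Estimate key/value cache size in bytes (assumption-based sketch)."""
    # Per-head dimension: 2048 / 16 = 128.
    head_dim = hidden_size // attention_heads
    # One key and one value vector per layer and per key/value head.
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_element
    return batch_size * seq_len * per_token


full_context = kv_cache_bytes(8192)
print(f"KV cache at 8192 tokens: {full_context / 2**30:.2f} GiB")
```

Under these assumptions the cache grows linearly with sequence length and batch size, so halving the context halves the estimate.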

Intended uses and limitations

Aitana-2B-S-base-IP-1.0 is a base model that can be used for causal language modeling and text generation. As with other base checkpoints, it is generally more useful as a starting point for instruction-tuning, domain adaptation, or downstream fine-tuning than as a final end-user assistant model.

This repository currently exposes only the model artifacts, not a full training report. Domain coverage, language balance, safety behavior, and benchmark performance are therefore undocumented here, and claims about them should be added only once the model authors have confirmed them.

How to use

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gplsi/Aitana-2B-S-base-IP-1.0"

# Load the tokenizer and the weights in bfloat16, matching the dtype in the config.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # place the weights on the available device(s)
)

# Tokenize the prompt and move the tensors to the model's device.
prompt = "Escriu un breu resum sobre la importància de la llengua."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a continuation with nucleus sampling.
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    top_p=0.9,
    temperature=0.7,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
)

# The decoded output contains the prompt followed by the generated text.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
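
When prompts are long, the prompt plus the requested new tokens has to fit within the 8192-token context length listed in the configuration. The following is a minimal sketch of one way to enforce that budget on a list of token ids before generation. The helper name `fit_prompt` and the truncation policy (keep the most recent tokens, preserving a leading BOS token with id 1 if one is present) are assumptions for illustration, not something prescribed by this repository.

```python
def fit_prompt(token_ids, max_new_tokens, context_length=8192, bos_token_id=1):
    """Trim a prompt so that prompt + max_new_tokens fits in the context window."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    token_ids = list(token_ids)
    if len(token_ids) <= budget:
        return token_ids  # already fits, nothing to trim
    if token_ids[0] == bos_token_id:
        # Keep BOS, then the most recent (budget - 1) tokens.
        keep_from = len(token_ids) - (budget - 1)
        return [bos_token_id] + token_ids[keep_from:]
    # No leading BOS: keep the most recent `budget` tokens.
    return token_ids[len(token_ids) - budget:]


# Toy example with a tiny context window so the effect is visible.
ids = [1] + list(range(10, 20))
print(fit_prompt(ids, max_new_tokens=3, context_length=8))
```

Dropping the oldest tokens is only one possible policy; summarizing or chunking the prompt may suit some applications better.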

Training

Base model

TO-DO: document the original parent checkpoint or initialization source for Aitana-2B-S-base-IP-1.0.

Training data

TO-DO: document the training corpora, language distribution, preprocessing steps, deduplication policy, anonymization steps, and data filtering criteria.

Training hyperparameters

TO-DO: document the effective batch size, learning rate schedule, optimizer setup, number of epochs or tokens seen, sequence length used during training, and hardware.

Technical specifications

Model architecture and objective

architecture: decoder-only causal language model
implementation class: LlamaForCausalLM
hidden size: 2048
intermediate size: 5440
layers: 24
attention heads: 16
key/value heads: 16
maximum position embeddings: 8192
vocabulary size: 256000
BOS token id: 1
EOS token id: 2
PAD token id: 3
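
The values above are enough for a back-of-the-envelope parameter count. The sketch below assumes the usual Llama layout: query, key, value, and output projections without bias terms, a gated MLP with three weight matrices, and two normalization weight vectors per layer plus a final one. Whether the input embeddings and the output head share weights is not stated in this card, so the hypothetical `tie_embeddings` flag covers both cases.

```python
def estimate_params(hidden=2048, intermediate=5440, layers=24,
                    vocab=256000, tie_embeddings=False):
    """Approximate parameter count for a Llama-style decoder (assumption-based)."""
    embed = vocab * hidden                 # token embedding matrix
    attention = 4 * hidden * hidden        # q, k, v, o projections (kv heads == heads)
    mlp = 3 * hidden * intermediate        # gate, up, and down projections
    norms = 2 * hidden                     # two normalization weight vectors per layer
    per_layer = attention + mlp + norms
    final_norm = hidden
    lm_head = 0 if tie_embeddings else vocab * hidden
    return embed + layers * per_layer + final_norm + lm_head


print(f"untied head: {estimate_params(tie_embeddings=False):,}")
print(f"tied head:   {estimate_params(tie_embeddings=True):,}")
```

With these assumptions the estimate is about 2.25 billion parameters with an untied output head and about 1.73 billion with a tied one; the large vocabulary makes the embedding matrices a substantial share of the total in either case.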

Tokenizer

The tokenizer files in this repository define:

BOS token: <s>
EOS token: </s>
PAD token: <pad>
UNK token: <unk>
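
For batched generation with decoder-only models, shorter prompts are commonly padded on the left so that every sequence ends at the same position. The sketch below shows the idea on plain lists of token ids, using the PAD token id 3 listed in the technical specifications. The helper name `left_pad` is hypothetical; with the Transformers tokenizer, a similar result is usually obtained by setting `tokenizer.padding_side = "left"` and passing `padding=True`.

```python
def left_pad(sequences, pad_token_id=3):
    """Left-pad token id lists to equal length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        # Padding goes in front; the mask marks real tokens with 1.
        input_ids.append([pad_token_id] * n_pad + list(seq))
        attention_mask.append([0] * n_pad + [1] * len(seq))
    return input_ids, attention_mask


ids, mask = left_pad([[1, 5, 6], [1, 7]])
print(ids)
print(mask)
```

The attention mask tells the model to ignore the padded positions, which matters because the PAD token is otherwise an ordinary vocabulary entry.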

Hardware and software

The repository is packaged for the Hugging Face transformers library. Specific training hardware and training software details should be documented by the model authors if they are intended to be part of the public model card.

Additional information

Author

TO-DO: confirm the author list and institutional attribution to be displayed in the public model card.

Contact

TO-DO: add a contact email or project contact point.

License

TO-DO: confirm the license for this checkpoint and add it both here and in config.json if desired.

Funding

TO-DO: add funding information if this checkpoint is part of a funded project.

Disclaimer

This repository contains a base language model checkpoint. Base models can reflect biases present in their training data and may generate inaccurate, misleading, or unsafe content. Anyone deploying this model, or systems built on top of it, is responsible for evaluating those risks and ensuring compliance with applicable legal, ethical, and operational requirements.
