# Model Card for Apertus-8B_pruned-latin-94237

## Model Summary
This model is a vocabulary-pruned English-only version of swiss-ai/Apertus-8B-Instruct-2509. It was created as part of an academic project in Machine Learning to investigate the effects of vocabulary reduction on model size and performance.
- **Base Model:** `swiss-ai/Apertus-8B-Instruct-2509`
- **Developer (Base Model):** Swiss AI Initiative (ETH Zurich, EPFL, CSCS)
- **Pruning Method:** Vocabulary pruning (see details below)
## Vocabulary Pruning Details
The pruned vocabulary was obtained by retaining only tokens whose characters belong to the "Latin", "Common", and "Inherited" Unicode scripts. More information can be found in the project report.
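To illustrate the idea, here is a minimal, standard-library-only sketch of a script-based token filter. It approximates the "Latin" script via Unicode character names and the "Common"/"Inherited" scripts via general categories; the names `char_allowed` and `keep_token` are illustrative, and a faithful implementation would consult real Unicode script property data (e.g. `\p{Script=...}` matching in the third-party `regex` module).

```python
import unicodedata

def char_allowed(ch: str) -> bool:
    """Rough stand-in for a proper Unicode script lookup."""
    name = unicodedata.name(ch, "")
    if name.startswith("LATIN "):
        return True  # approximates Script=Latin
    cat = unicodedata.category(ch)
    # Digits (N), punctuation (P), symbols (S), separators (Z) and
    # combining marks (M) roughly cover the Common/Inherited scripts.
    return cat and cat[0] in ("N", "P", "S", "Z", "M")

def keep_token(token: str) -> bool:
    """A token survives pruning only if every character is allowed."""
    return all(char_allowed(ch) for ch in token)

print(keep_token("Hello"))       # Latin letters -> True
print(keep_token("héllo, 42!"))  # accented Latin, digits, punctuation -> True
print(keep_token("こんにちは"))   # Hiragana -> False
```

After filtering the vocabulary this way, the corresponding rows of the embedding and output-projection matrices can be dropped and the tokenizer re-indexed.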
## Intended Use
This model is intended for academic research and educational purposes, specifically to study:
- The impact of language restriction on multilingual LLM performance.
- Efficiency gains in memory usage and inference speed.
- Comparative analysis between full-scale and pruned models.
For general-purpose instruction following or production use, we recommend the original `swiss-ai/Apertus-8B-Instruct-2509` instead.
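The memory savings come mostly from the embedding and output-projection matrices, whose size scales with the vocabulary. A back-of-the-envelope estimate, where the original vocabulary size and hidden dimension below are assumptions rather than official Apertus-8B figures (only the pruned size 94,237 is taken from the model name):

```python
# Illustrative assumptions, not official model figures.
V_ORIG = 131_072   # assumed original vocabulary size
V_PRUNED = 94_237  # pruned vocabulary size (from the model name)
HIDDEN = 4_096     # assumed hidden dimension

# Input embeddings plus an untied output projection each hold
# vocab_size * hidden parameters.
saved = (V_ORIG - V_PRUNED) * HIDDEN * 2
print(f"Parameters removed: {saved:,}")
print(f"Memory saved in bf16 (2 bytes/param): {saved * 2 / 2**20:.1f} MiB")
```

If the model ties its input and output embeddings, only one copy of the matrix exists and the saving is half of the above.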
## How to Use
You can load this model with the `transformers` library. Make sure you are using a recent version of `transformers`.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "epfl-ml-ytf/apertus-8b-pruned-latin-94237"

# Load the pruned tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Example generation
messages = [
    {"role": "user", "content": "Explain the concept of vocabulary pruning in one sentence."}
]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```