# exp7_sdpa_1epoch_lr1e4_500k_vngr_corpus_2-epoch-kaggle

This repository contains a causal language model trained using the lm-pretrain framework.
Source code: https://github.com/canbingol/lm-pretrain

Detailed experiment logs, ablations, and comparisons:
https://docs.google.com/spreadsheets/d/10dbABNIMc_WL85ba0rfGwrkbU-VHu3aRa9tnuOAGpyc/edit?usp=sharing

## Usage

### Download the model file

```python
from huggingface_hub import hf_hub_download

# Download the standalone model implementation (model.py) into the current directory.
hf_hub_download(
    repo_id="canbingol/exp7_sdpa_1epoch_lr1e4_500k_vngr_corpus_2-epoch-kaggle",
    filename="model.py",
    repo_type="model",
    local_dir="./"
)
```

### Load the model and generate

```python
import torch
from transformers import AutoTokenizer
from model import DecoderCausalLM  # defined in the model.py downloaded above

model_path = "canbingol/exp7_sdpa_1epoch_lr1e4_500k_vngr_corpus_2-epoch-kaggle"

device = "cuda" if torch.cuda.is_available() else "cpu"

model = DecoderCausalLM.from_pretrained(model_path).to(device=device, dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_path)

input_ids = tokenizer.encode("selam ben", return_tensors="pt").to(device)

# Inference only, so disable gradient tracking.
with torch.no_grad():
    out_tokens = model.generate(input_ids)
generated_text = tokenizer.decode(out_tokens.flatten())

print(generated_text)
```
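For readers unfamiliar with what `generate` does internally: a causal LM produces text autoregressively, repeatedly scoring candidate next tokens and appending one until a length limit or an end-of-sequence token is reached. The sketch below illustrates the greedy variant of that loop in plain Python with a toy scorer standing in for the model's forward pass; it is an illustration of the general technique, not the actual `DecoderCausalLM.generate` implementation.

```python
def greedy_generate(score_next, input_ids, max_new_tokens=5, eos_id=None):
    """Greedy decoding: repeatedly append the highest-scoring next token.

    score_next(ids) -> dict mapping candidate token id to score
    (a toy stand-in for one forward pass of the language model).
    """
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        scores = score_next(ids)
        next_id = max(scores, key=scores.get)  # greedy: take the argmax
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids

# Toy scorer: always prefers (last token + 1), wrapping at 10.
toy = lambda ids: {t: (1.0 if t == (ids[-1] + 1) % 10 else 0.0) for t in range(10)}
print(greedy_generate(toy, [3], max_new_tokens=4))  # [3, 4, 5, 6, 7]
```

Real decoders typically add sampling, temperature, or beam search on top of this loop, but the autoregressive structure is the same.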

## Notes

- The `DecoderCausalLM` implementation is included in the model files (`model.py`).