# exp7_sdpa_1epoch_lr1e4_500k_vngr_corpus_2-epoch-kaggle

This repository contains a causal language model trained using the lm-pretrain framework.
Source code: https://github.com/canbingol/lm-pretrain

Detailed experiment logs, ablations, and comparisons:
https://docs.google.com/spreadsheets/d/10dbABNIMc_WL85ba0rfGwrkbU-VHu3aRa9tnuOAGpyc/edit?usp=sharing

## Usage

### Download the model file

```python
from huggingface_hub import hf_hub_download

# Download the standalone model implementation (model.py) into the current directory.
hf_hub_download(
    repo_id="canbingol/exp7_sdpa_1epoch_lr1e4_500k_vngr_corpus_2-epoch-kaggle",
    filename="model.py",
    repo_type="model",
    local_dir="./"
)
```

### Load the model and generate

```python
import torch
from transformers import AutoTokenizer
from model import DecoderCausalLM  # defined in the model.py downloaded above

model_path = "canbingol/exp7_sdpa_1epoch_lr1e4_500k_vngr_corpus_2-epoch-kaggle"

device = "cuda" if torch.cuda.is_available() else "cpu"

model = DecoderCausalLM.from_pretrained(model_path).to(device=device, dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_path)

input_ids = tokenizer.encode("selam ben", return_tensors="pt").to(device)

# Inference only, so disable gradient tracking.
with torch.no_grad():
    out_tokens = model.generate(input_ids)
generated_text = tokenizer.decode(out_tokens.flatten())

print(generated_text)
```
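For readers unfamiliar with what `generate` does internally: a causal LM produces text autoregressively, repeatedly scoring candidate next tokens and appending one until a length limit or an end-of-sequence token is reached. The sketch below illustrates the greedy variant of that loop in plain Python with a toy scorer standing in for the model's forward pass; it is an illustration of the general technique, not the actual `DecoderCausalLM.generate` implementation.

```python
def greedy_generate(score_next, input_ids, max_new_tokens=5, eos_id=None):
    """Greedy decoding: repeatedly append the highest-scoring next token.

    score_next(ids) -> dict mapping candidate token id to score
    (a toy stand-in for one forward pass of the language model).
    """
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        scores = score_next(ids)
        next_id = max(scores, key=scores.get)  # greedy: take the argmax
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids

# Toy scorer: always prefers (last token + 1), wrapping at 10.
toy = lambda ids: {t: (1.0 if t == (ids[-1] + 1) % 10 else 0.0) for t in range(10)}
print(greedy_generate(toy, [3], max_new_tokens=4))  # [3, 4, 5, 6, 7]
```

Real decoders typically add sampling, temperature, or beam search on top of this loop, but the autoregressive structure is the same.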

## Notes

- The `DecoderCausalLM` implementation is included in the model files (`model.py`).