Model Card for ambrosfitz/neural-history-chat-v1.5

An updated version of Neural History Chat, fine-tuned from the previous version (v1.0) on the mighty-history-merge dataset.

Model Details

Run history:

train/epoch	▁▁▂▂▃▃▃▄▄▅▅▅▆▆▇▇▇██
train/global_step	▁▁▂▂▃▃▃▄▄▅▅▅▆▆▇▇▇██
train/learning_rate	▂▃▅▆▇█▇▇▆▆▅▄▄▃▃▂▂▁
train/loss	█▆▄▃▃▃▃▃▂▃▂▂▁▁▁▁▂▁
train/total_flos	▁
train/train_loss	▁
train/train_runtime	▁
train/train_samples_per_second	▁
train/train_steps_per_second	▁

Run summary:

train/epoch	1.98
train/global_step	92
train/learning_rate	0.0
train/loss	0.7792
train/total_flos	1.756453697101824e+16
train/train_loss	1.30356
train/train_runtime	1176.2194
train/train_samples_per_second	10.068
train/train_steps_per_second	0.078
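
For orientation, here is a minimal sketch of the kind of Hugging Face `Trainer` configuration that would produce a run like the one logged above. The hyperparameters are assumptions inferred from the summary (about 2 epochs, 92 steps, a warmup-then-decay learning-rate schedule, an effective batch of roughly samples/s ÷ steps/s ≈ 128), not the actual training recipe.

```python
# Hypothetical sketch of a Trainer setup that would log a run like the one
# above (~2 epochs, 92 steps, LR warmup then decay to 0, W&B reporting).
# Every value is an assumption unless the comment says otherwise.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="neural-history-chat-v1.5",
    num_train_epochs=2,                # matches the ~2-epoch run summary
    per_device_train_batch_size=16,    # assumed split; the summary's
    gradient_accumulation_steps=8,     # samples/s / steps/s implies an
                                       # effective batch of ~128
    learning_rate=2e-4,                # assumed peak LR
    lr_scheduler_type="cosine",        # consistent with the LR rising,
    warmup_ratio=0.3,                  # then decaying to 0.0 in the sparkline
    logging_steps=5,                   # ~18 log points over 92 steps
    fp16=True,                         # checkpoint is stored as F16
    report_to="wandb",                 # source of the run history/summary
)
```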

Training Explained

We went with a shorter training run of roughly 2 epochs for testing and evaluation. More steps/epochs may come in the future, but Colab pricing is steep. Merging the PEFT adapter back into the base model currently requires roughly 40 GB of GPU RAM, so renting a Google Colab A100 is necessary, and that runs through credits quickly.
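
For reference, a minimal sketch of that merge step using the `peft` library; the base-model and adapter paths below are placeholders, not the actual repos:

```python
# Minimal sketch of merging a PEFT/LoRA adapter back into its base model.
# Paths are placeholders, not the actual checkpoints used for this run.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "base-model-id",            # placeholder: the v1.0 base checkpoint
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, "adapter-dir")  # placeholder adapter path

# merge_and_unload() folds the LoRA weights into the base weights and
# returns a plain transformers model; holding the full model in memory
# for this step is what drives the ~40 GB figure mentioned above.
merged = model.merge_and_unload()

merged.save_pretrained("neural-history-chat-v1.5", safe_serialization=True)
AutoTokenizer.from_pretrained("base-model-id").save_pretrained("neural-history-chat-v1.5")
```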

Model size: 7B parameters (Safetensors, F16)
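This model isn't deployed by any hosted inference provider, so it has to be run locally. A minimal sketch with `transformers`, assuming the checkpoint behaves as a standard causal LM (the prompt format is also an assumption, since the card doesn't specify one):

```python
# Minimal local-inference sketch; the prompt style is an assumption,
# as the card does not document a chat/prompt template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ambrosfitz/neural-history-chat-v1.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the F16 tensor type above
    device_map="auto",
)

prompt = "What caused the fall of the Western Roman Empire?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```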

Datasets used to train ambrosfitz/neural-history-chat-v1.5

mighty-history-merge