Cheng98/llama-39m

@@ -5,6 +5,17 @@ license: llama2
 # Toy LLaMA-39M
 - This is a tiny LLaMA model pretrained on [Recag/Rp_C4_55](https://huggingface.co/datasets/Recag/Rp_C4_55), a small subset of C4 with `seq_len=512`.
+  - Model architecture
+      ```json
+      {
+        "hidden_size": 512,
+        "intermediate_size": 2048,
+        "max_position_embeddings": 2048,
+        "num_attention_heads": 8,
+        "num_hidden_layers": 2,
+        "num_key_value_heads": 8
+      }
+      ```
   - Load model and tokenizer:
       ```python
       from transformers import AutoTokenizer, AutoModelForCausalLM