nlp-prachathai67k-text-classification

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on the prachathai67k dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1609
  • Accuracy: 0.9349
  • F1: 0.7418
  • Precision: 0.8033
  • Recall: 0.6890
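Since this is a multi-label task, the F1, precision, and recall above are aggregated over all per-label decisions. The card does not state the averaging mode or evaluation script; the sketch below shows micro-averaging, one common choice, in pure Python for illustration.

```python
# Illustrative micro-averaged precision/recall/F1 for multi-label
# predictions. This is a sketch, not the card's actual evaluation code.

def micro_prf(y_true, y_pred):
    """y_true, y_pred: lists of equal-length 0/1 label vectors."""
    tp = fp = fn = 0
    for true_vec, pred_vec in zip(y_true, y_pred):
        for t, p in zip(true_vec, pred_vec):
            tp += t == 1 and p == 1  # predicted and correct
            fp += t == 0 and p == 1  # predicted but wrong
            fn += t == 1 and p == 0  # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy example: 2 documents, 3 labels each
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
print(micro_prf(y_true, y_pred))
```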

Labels

id2label = {
    0:  'politics',
    1:  'human_rights',
    2:  'quality_of_life',
    3:  'international',
    4:  'social',
    5:  'environment',
    6:  'economics',
    7:  'culture',
    8:  'labor',
    9:  'national_security',
    10: 'ict',
    11: 'education'
}
label2id = {
    'politics':          0,
    'human_rights':      1,
    'quality_of_life':   2,
    'international':     3,
    'social':            4,
    'environment':       5,
    'economics':         6,
    'culture':           7,
    'labor':             8,
    'national_security': 9,
    'ict':               10,
    'education':         11
}
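Because the head is configured for multi-label classification, a single article can carry several of these labels at once. A minimal sketch of turning per-label sigmoid scores into label names with `id2label` (the 0.5 threshold is a common default, not something the card specifies):

```python
# Sketch: map per-label sigmoid scores to label names via id2label.
# The 0.5 cutoff is an assumed default, not stated in the card.

id2label = {
    0: 'politics', 1: 'human_rights', 2: 'quality_of_life', 3: 'international',
    4: 'social', 5: 'environment', 6: 'economics', 7: 'culture',
    8: 'labor', 9: 'national_security', 10: 'ict', 11: 'education',
}

def decode(scores, threshold=0.5):
    """scores: sequence of 12 sigmoid outputs, indexed by label id."""
    return [id2label[i] for i, s in enumerate(scores) if s >= threshold]

scores = [0.91, 0.12, 0.05, 0.03, 0.64, 0.02, 0.08, 0.01, 0.04, 0.33, 0.02, 0.07]
print(decode(scores))  # → ['politics', 'social']
```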

Usage

With the pipeline API

from peft import PeftModel, PeftConfig
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model_id = "tonkaew131/nlp-prachathai67k-text-classification"
config = PeftConfig.from_pretrained(model_id)

# id2label / label2id are the mappings from the Labels section above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    config.base_model_name_or_path,
    problem_type="multi_label_classification",
    num_labels=12,
    id2label=id2label,
    label2id=label2id
)
# The fine-tuned tokenizer's vocabulary differs from the base model's
# (e.g. an added padding token), so resize the embeddings to match
# before loading the adapter
model.resize_token_embeddings(len(tokenizer))

lora_model = PeftModel.from_pretrained(model, model_id)
classifier = pipeline(
    "text-classification",
    model=lora_model,
    tokenizer=tokenizer,
    top_k=None  # return scores for all 12 labels, not just the top one
)

text = "<news-content>"
results = classifier(text)
print(results)
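With `top_k=None`, the pipeline returns one `{'label', 'score'}` dict per class rather than a single prediction. A small post-processing sketch that keeps only labels clearing a threshold (the 0.5 cutoff is an assumption, and depending on the transformers version the output for a single input may be nested one list level deeper):

```python
# Sketch: filter pipeline output down to labels above a score threshold.
# The 0.5 threshold is an assumed default, not stated in the card.

def filter_predictions(results, threshold=0.5):
    # Unwrap [[{...}, ...]] → [{...}, ...] if the pipeline nested the output
    if results and isinstance(results[0], list):
        results = results[0]
    return [r["label"] for r in results if r["score"] >= threshold]

# Example output shape for one input (scores are made up for illustration)
example = [
    {"label": "politics", "score": 0.93},
    {"label": "economics", "score": 0.58},
    {"label": "culture", "score": 0.07},
]
print(filter_predictions(example))  # → ['politics', 'economics']
```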

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 3
  • eval_batch_size: 3
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 12
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
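These hyperparameters map roughly onto a transformers `TrainingArguments` configuration like the following. This is a sketch reconstructed from the list above; the actual training script is not included with the card, and `output_dir` is a hypothetical path.

```python
# Sketch of a TrainingArguments setup matching the listed hyperparameters.
# The real training script is not published; output_dir is hypothetical.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="nlp-prachathai67k-text-classification",  # hypothetical
    learning_rate=2e-5,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=3,
    seed=42,
    gradient_accumulation_steps=4,   # effective train batch size: 3 * 4 = 12
    lr_scheduler_type="linear",
    num_train_epochs=2,
)
```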

Training results

Training Loss | Epoch  | Step | Validation Loss | Accuracy | F1     | Precision | Recall
0.1687        | 0.9998 | 4531 | 0.1698          | 0.9302   | 0.7251 | 0.7787    | 0.6784
0.1513        | 1.9997 | 9062 | 0.1609          | 0.9349   | 0.7418 | 0.8033    | 0.6890

Framework versions

  • PEFT 0.13.0
  • Transformers 4.44.2
  • Pytorch 2.4.0
  • Datasets 3.0.0
  • Tokenizers 0.19.1