FLAN-T5 small-GeoNames

This model is a fine-tuned version of flan-t5-small on the GeoNames dataset.

Model description

The model is trained to classify a term into one of 660 classes describing geographical locations.

The model also works well as part of a Retrieval-Augmented Generation (RAG) pipeline that draws on an external knowledge source, specifically GeoNames Semantic Primes.

Intended uses and limitations

This model is intended to be used to generate a type (class) for an input term.

Training and evaluation data

The training and evaluation data can be found here.

The training set size is 8,078,865.

The test set size is 702,510.

Example

Here is an example of the model's capabilities:

input:
- Lexical Term L: Pic de Font Blanca
output:
- Type: peak
input:
- Lexical Term L: Roc Mele
output:
- Type: mountain
input:
- Lexical Term L: Estany de les Abelletes
output:
- Type: lake
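
The input/output pairs above could be reproduced with a short inference script. The following is a minimal sketch, not the author's own code: it assumes the checkpoint is published as `HannaAbiAkl/flan-t5-small-geonames`, that the prompt is exactly the `Lexical Term L: ...` string shown in the examples, and that the `transformers` and `torch` packages are installed. The helper names `build_prompt` and `classify` are hypothetical.

```python
# Assumed repository name for this checkpoint on the Hugging Face Hub.
MODEL_ID = "HannaAbiAkl/flan-t5-small-geonames"


def build_prompt(term: str) -> str:
    """Format a term like the examples in this card (assumed prompt format)."""
    return f"Lexical Term L: {term.strip()}"


def classify(terms, max_new_tokens=16):
    """Generate one type string per input term."""
    # Imported lazily so build_prompt stays usable without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize the prompts as one padded batch of PyTorch tensors.
    prompts = [build_prompt(term) for term in terms]
    inputs = tokenizer(prompts, return_tensors="pt", padding=True)

    # Type labels are short, so a small generation budget should be enough.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `classify(["Pic de Font Blanca", "Roc Mele"])` downloads the checkpoint on first use and returns one generated string per term. Whether the output carries a `Type:` prefix or only the bare label depends on how the model was trained, so any post-processing should be checked against real outputs.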

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
num_epochs: 5
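
For readers who want to set up a comparable run, the values above can be mapped onto `Seq2SeqTrainingArguments` from `transformers`. This is a sketch under assumptions: the card does not say which trainer class was used, the batch sizes are taken to be per-device values, `output_dir` is a hypothetical path, and every setting not listed above (optimizer, scheduler, and so on) is left at the library default.

```python
# Hyperparameters listed in this card, renamed to the argument names
# used by transformers' Seq2SeqTrainingArguments.
HYPERPARAMETERS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 4,  # assumed to match train_batch_size
    "per_device_eval_batch_size": 4,   # assumed to match eval_batch_size
    "seed": 42,
    "num_train_epochs": 5,
}


def make_training_args(output_dir="flan-t5-small-geonames-run"):
    """Build training arguments; output_dir is a hypothetical location."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

Keeping the values in a plain dictionary makes it easy to log them alongside results or to override a single entry for a sweep.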

Training results

Training Loss	Epoch	Step	Validation Loss
2.6223	1.0	1000	1.5223
2.1430	2.0	2000	1.3764
1.9100	3.0	3000	1.2825
1.7642	4.0	4000	1.2102
1.6607	5.0	5000	1.1488

@misc{akl2024dstillms4ol2024task,
      title={DSTI at LLMs4OL 2024 Task A: Intrinsic versus extrinsic knowledge for type classification}, 
      author={Hanna Abi Akl},
      year={2024},
      eprint={2408.14236},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.14236}, 
}

Model size: 77M params (Safetensors, tensor type F32)

Paper for HannaAbiAkl/flan-t5-small-geonames: DSTI at LLMs4OL 2024 Task A: Intrinsic versus extrinsic knowledge for type classification (arXiv:2408.14236, published Aug 26, 2024)