Token indices sequence length is longer than the specified maximum sequence length for this model (548 > 512). Running this sequence through the model will result in indexing errors.
Some weights of RobertaForTokenClassification were not initialized from the model checkpoint at jhgan/ko-sroberta-multitask and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
/home/deep/gitprojects/ko-sroberta-korean-time-expression-classifier/src/time_expression_classifier/train_token_classifier.py:251: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `Trainer.__init__`. Use `processing_class` instead.
  trainer = Trainer(
Skipped 1 records with TIMEX spans beyond max_length=256.
  0%|          | 0/45526 [00:00
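The "Skipped 1 records with TIMEX spans beyond max_length=256" line suggests a pre-tokenization filter that drops records whose TIMEX annotations fall past the model's sequence limit. A minimal sketch of such a filter is shown below; the record schema, field names, and function name are illustrative assumptions, not taken from the repository's code.

```python
# Hypothetical sketch: drop records whose TIMEX spans end beyond max_length.
# The {"timex_spans": [(start, end), ...]} schema is an assumption for
# illustration, not the actual format used by train_token_classifier.py.

def filter_records(records, max_length=256):
    """Keep only records whose every TIMEX span fits within max_length tokens."""
    kept, skipped = [], 0
    for rec in records:
        # Each span is a (start, end) token-index pair, end exclusive.
        if all(end <= max_length for _, end in rec["timex_spans"]):
            kept.append(rec)
        else:
            skipped += 1
    if skipped:
        print(f"Skipped {skipped} records with TIMEX spans "
              f"beyond max_length={max_length}.")
    return kept

records = [
    {"text": "내일 오후 3시에 보자", "timex_spans": [(0, 5)]},    # fits
    {"text": "...", "timex_spans": [(250, 300)]},                # past the limit
]
kept = filter_records(records, max_length=256)
# Only the first record survives; the second is counted as skipped.
```

Filtering before tokenization (rather than silently truncating) avoids labels that point past the truncated sequence, which would otherwise surface as the "indexing errors" the first warning describes.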