diff --git "a/train.log" "b/train.log"
new file mode 100644
--- /dev/null
+++ "b/train.log"
@@ -0,0 +1,1669 @@
+Token indices sequence length is longer than the specified maximum sequence length for this model (548 > 512). Running this sequence through the model will result in indexing errors
+Some weights of RobertaForTokenClassification were not initialized from the model checkpoint at jhgan/ko-sroberta-multitask and are newly initialized: ['classifier.bias', 'classifier.weight']
+You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
+/home/deep/gitprojects/ko-sroberta-korean-time-expression-classifier/src/time_expression_classifier/train_token_classifier.py:251: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `Trainer.__init__`. Use `processing_class` instead.
+  trainer = Trainer(
+Skipped 1 records with TIMEX spans beyond max_length=256.
+  0%|          | 0/45526 [00:00