Upload 26 files
- 9.txt +30 -0
- Readme.txt +13 -0
- Sample Dataset/conv_1.txt +45 -0
- Sample Dataset/conv_1001.txt +48 -0
- Sample Dataset/conv_1002.txt +55 -0
- Sample Dataset/conv_1003.txt +39 -0
- Sample Dataset/conv_2.txt +45 -0
- Sample Dataset/conv_3.txt +47 -0
- code_bart.py +30 -0
- code_pegasus.py +31 -0
- fine_tune_model_bart_large_25/config.json +70 -0
- fine_tune_model_bart_large_25/generation_config.json +16 -0
- fine_tune_model_bart_large_25/model.safetensors +3 -0
- fine_tune_model_pegasus_50/config.json +124 -0
- fine_tune_model_pegasus_50/generation_config.json +12 -0
- fine_tune_model_pegasus_50/model.safetensors +3 -0
- fine_tune_tokenizer_bart_large_25/merges.txt +0 -0
- fine_tune_tokenizer_bart_large_25/special_tokens_map.json +15 -0
- fine_tune_tokenizer_bart_large_25/tokenizer.json +0 -0
- fine_tune_tokenizer_bart_large_25/tokenizer_config.json +57 -0
- fine_tune_tokenizer_bart_large_25/vocab.json +0 -0
- fine_tune_tokenizer_pegasus_50/special_tokens_map.json +110 -0
- fine_tune_tokenizer_pegasus_50/spiece.model +3 -0
- fine_tune_tokenizer_pegasus_50/tokenizer.json +0 -0
- fine_tune_tokenizer_pegasus_50/tokenizer_config.json +967 -0
- requirements.txt +2 -0
9.txt
ADDED
@@ -0,0 +1,30 @@
+Doctor: What is your patient ID?
+Patient: 8
+Doctor: What is your age?
+Patient: 27
+Doctor: What is your gender?
+Patient: Male
+Doctor: Please describe your social life at the IISERB campus. Are you actively participating in extracurricular activities, interacting with others, or taking initiative to socialize with others?
+Patient: I love participating in such activities. Nowdays, due to PhD workload, don't get much time to indulge in extracurriculars, so i compensate it by spending more time with friends.
+Doctor: Describe your typical daily Mood?
+Patient: I am mostly compsed, worried about the progress of my PHD sometimes. Mostly, i have high morale and i indulge in work enthusiastically.
+Doctor: Does your Mood remain steady or goes up and down throughout the day without any reason or on trivial matters?
+Patient: My mood almost remain steady, trivial matters does not affect much. Sometimes, unethical conduct of my colleagues bother me, but for a very small duration.
+Doctor: How do you handle day-to-day irritations or frustrations?
+Patient: I sleep well, talk to my friends. Always think of a better future. I tell myself if a work with compassion today, the coming years will be much better.
+Doctor: How do you handle pressure related to academics?
+Patient: I try to separate my personal life from professional one. I believe academics is full of pressure, however you perform, pressure will always be there. So, don't let academic pressure destroy your personal space and mental peace, just deal it as any other profession.
+Doctor: Describe your ability to attend to the task at hand or concentrate on daily tasks (academic, non-academic)?
+Patient: I can concentrate on tasks quite well. Although i am not able to follow the pre-decided timeline of the tasks, i do concentrate well at the task at hand.
+Doctor: Have you noticed any difficulties with memory, such as unable to register new information, forgetting recent events, or not able to recall older personal/factual events?
+Patient: I don't have any such difficulty with memory.
+Doctor: What do you do to feel better? For example, some people take caffeine, talk with people, or watch movies to feel better.
+Patient: I am a hardcore tea-lover, 5-6 cups daily. So to feel better, tea is the first choice, second option i consider is to talk with people, rarely I watch movies to feel better.
+Doctor: Describe how supported you feel by others (e.g., friends, family) around you and how they help you?
+Patient: I am lucky to have best of friends and family. Friends listen to my concerns and always provide some solution, family gives me the opportunity to focus on my career without worrying about anything else.
+Doctor: What do you usually do when you have a bad day or when you are not able to concentrate on work?
+Patient: I take rest, sleep properly (thankfully tension does not disturb my sleep cycle), clean my room, working area, wash clothes, call family and friends.
+Doctor: Are you experiencing symptoms of stress, anxiety, or depression? If yes, describe the symptoms?
+Patient: No
+Doctor: Are you doing anything (by self or help seeking) for the ongoing stress, anxiety, or depression, if any? If yes, what?
+Patient: No
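The conversation files in this commit follow a simple "Speaker: utterance" layout, one turn per line, with optional blank lines and (in the translated files) quoted utterances. A minimal sketch of how such a file could be split into turns is given here. The function name `parse_turns` and the policy of appending unlabelled lines to the previous turn are assumptions for illustration and are not part of the committed scripts, which pass the raw file text straight to the summarization pipeline.

```python
import re
from typing import List, Tuple

# Matches a line such as 'Doctor: "Hello"' or 'Patient: 27'.
TURN_RE = re.compile(r'^(Doctor|Patient):\s*(.*)$')


def parse_turns(text: str) -> List[Tuple[str, str]]:
    """Split a transcript into (speaker, utterance) pairs, skipping blank lines."""
    turns: List[Tuple[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank separator lines carry no content
        match = TURN_RE.match(line)
        if match is None:
            # Assumed policy: an unlabelled line continues the previous turn.
            if turns:
                speaker, utterance = turns[-1]
                turns[-1] = (speaker, f"{utterance} {line}")
            continue
        speaker, utterance = match.groups()
        # The translated files wrap utterances in double quotes; drop them.
        if len(utterance) >= 2 and utterance[0] == '"' and utterance[-1] == '"':
            utterance = utterance[1:-1]
        turns.append((speaker, utterance))
    return turns


sample = 'Doctor: What is your age?\nPatient: 27\n\nDoctor: "Hello"\n'
print(parse_turns(sample))
```

Keeping consecutive turns by the same speaker as separate entries preserves the structure of the translated files, where a patient often answers across several lines.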
Readme.txt
ADDED
@@ -0,0 +1,13 @@
+To run the sample, follow these instructions:
+
+1. Using the Pegasus Model
+   i. Open `code_pegasus.py`
+   ii. Run it (the path to the sample data file 9.txt is already set)
+   iii. To run on other conversation files, set the path to the desired test file. (Sample conversations are in the Sample Dataset directory.)
+
+2. Using the BART-large Model
+   i. Open `code_bart.py`
+   ii. Run it (the path to the sample data file 9.txt is already set)
+   iii. To run on other conversation files, set the path to the desired test file. (Sample conversations are in the Sample Dataset directory.)
+
+Note: The Sample Dataset directory includes anonymized samples from our own collected dataset, named conv_1, conv_2, and conv_3, along with samples from a Chinese-language dataset (D4) translated to English, named conv_1001, conv_1002, and conv_1003.
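The Readme asks the user to edit the file path by hand for each conversation. As a possible alternative, the sketch below loops over every `conv_*.txt` file in a directory and applies a caller-supplied summarizer. The helper name `summarize_directory` and the `summarize` callable are hypothetical and do not appear in the committed scripts; the file pattern is taken from the names listed in the Readme.

```python
from pathlib import Path
from typing import Callable, Dict


def summarize_directory(data_dir: str, summarize: Callable[[str], str]) -> Dict[str, str]:
    """Apply `summarize` to each conv_*.txt file in `data_dir`, keyed by file name."""
    results: Dict[str, str] = {}
    # Sort so the output order is stable across runs and platforms.
    for path in sorted(Path(data_dir).glob("conv_*.txt")):
        text = path.read_text(encoding="utf-8")
        results[path.name] = summarize(text)
    return results
```

With the pipeline object built in `code_pegasus.py` or `code_bart.py`, the callable could be `lambda t: custom_summarization_pipeline(t)[0]['summary_text']`, and `data_dir` would be `"./Sample Dataset"`.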
Sample Dataset/conv_1.txt
ADDED
@@ -0,0 +1,45 @@
+Doctor: What is your patient ID?
+Patient: 101
+
+Doctor: What is your age?
+Patient: 19
+
+Doctor: What is your gender?
+Patient: Male
+
+Doctor: Please describe your social life at the *anonymized* campus. Are you actively participating in extracurricular activities, interacting with others, or taking initiative to socialize with others?
+Patient: I'm don't participate in any extracurricular activities, however I have taken part in organizing some festivals. I interact with a lot of people on a daily basis, and sometimes I take initiative to socialize with others, but generally don't initiate talk with people I've not met yet.
+
+Doctor: Describe your typical daily Mood?
+Patient: My typical daily mood starts positive, sometimes neutral, and tends to be neutral by the end of the day
+
+Doctor: Does your Mood remain steady or goes up and down throughout the day without any reason or on trivial matters?
+Patient: My mood has rarely fluctuated without reason, however it does tend to go up for trivial reasons, my mood generally doesn't do down easily.
+
+Doctor: How do you handle day-to-day irritations or frustrations?
+Patient: I try to fix whatever is causing me to be irritated quickly and make sure that it doesn't bother me again. If I'm not able to fix the cause of my frustration, I try to ignore it to the best of my ability and try to not bother others.
+
+Doctor: How do you handle pressure related to academics?
+Patient: I try to create different options to study and different ways to make sure I'm studying. I have changed my timetable many times and changed my study location to try and study better.
+
+Doctor: Describe your ability to attend to the task at hand or concentrate on daily tasks (academic, non-academic)?
+Patient: I tend to get distracted fairly easily if I'm not too concentrated on my task at hand, however, I tend to be extremely efficient with my work if I actually concentrate on what I'm doing, studying I sometimes I have to force myself but I'm able to do this in the end.
+
+Doctor: Have you noticed any difficulties with memory, such as unable to register new information, forgetting recent events, or not able to recall older personal/factual events?
+Patient: Nothing such that it affects my daily life, however I do tend to forget information about events and such if I'm not interested in them.
+
+Doctor: What do you do to feel better? For example, some people take caffeine, talk with people, or watch movies to feel better.
+Patient: I tend to just hang out and talk with friends to feel better if I'm feeling particularly sad, but I also play video games or watch youtube videos to feel better in general.
+
+Doctor: Describe how supported you feel by others (e.g., friends, family) around you and how they help you?
+Patient: I feel very supported by my friends and family, they are always there to take care of me and always help me when I ask for it
+
+Doctor: What do you usually do when you have a bad day or when you are not able to concentrate on work?
+Patient: I spend the day trying to do something new, some activity that I have not done before or very little. If that doesn't work, I just hang out with friends and hope tomorrow will be better.
+
+Doctor: Are you experiencing symptoms of stress, anxiety, or depression? If yes, describe the symptoms?
+Patient: I sometimes feel socially anxious, talking to new people is always scary and feels like an impossible task, sometimes I feel that people talk about me behind my back, and sometimes I just feel uncomfortable.
+
+Doctor: Are you doing anything (by self or help seeking) for the ongoing stress, anxiety, or depression, if any? If yes, what?
+Patient: I try my best to make sure that anxiety doesn't overwhelm me, I talk to myself, encourage myself and make sure that any unnecessary thoughts are gone from my mind.
+
Sample Dataset/conv_1001.txt
ADDED
@@ -0,0 +1,48 @@
+Doctor: What is your patient ID?
+Patient: 1001
+
+Doctor: What is your age?
+Patient: 32
+
+Doctor: What is your gender?
+Patient: Female
+
+Patient: "Okay"
+
+Doctor: "Hello"
+
+Doctor: "What are your main problems recently?"
+Patient: "I haven't been feeling well recently, and I feel a little tight in my chest"
+
+Doctor: "Have you ever gone to the hospital to see a doctor?"
+
+Patient: "Not yet, I don't have much time recently"
+Patient: "Maybe it will take two weeks to go"
+Doctor: "Hmm, let's take some time to see if you have any emotional problems recently"
+
+Patient: "There's nothing wrong with my mood, I just feel mentally tired recently"
+Doctor: "Do you feel tired without doing anything?"
+
+Patient: "I feel like this, I don't want to move"
+Doctor: "Then do you feel like you don't want to work?"
+
+Patient: "I don't have enough energy to work"
+Patient: "Yes"
+Patient: "But I have to work"
+Patient: "It's quite stressful"
+Doctor: "Have you ever felt that you have lost interest in your past hobbies?"
+
+Patient: "I should still be interested"
+Patient: "I just don't have enough time to develop my hobbies"
+Doctor: "Sleep, eat, etc."
+
+Doctor: "Is everything normal?"
+Patient: "fairly normal"
+Doctor: "Will you feel dizzy or nauseous?"
+Patient: "I get dizzy occasionally"
+Doctor: "Do you feel lack of confidence? You are always worried about not doing well"
+
+Patient: "No, I don't have time to worry about this or that"
+Doctor: "It sounds like you are doing well lately"
+
+Doctor: "The consultation ends here"
Sample Dataset/conv_1002.txt
ADDED
@@ -0,0 +1,55 @@
+Doctor: What is your patient ID?
+Patient: 1002
+
+Doctor: What is your age?
+Patient: 22
+
+Doctor: What is your gender?
+Patient: Female
+
+Patient: "Hello doctor"
+
+Doctor: "Hello"
+
+Doctor: "Let's get started"
+Doctor: "What problems have you encountered recently?"
+Patient: "I am preparing for the exam recently, but I get sleepy while studying"
+
+Doctor: "Understand, do you feel tired every day and have no energy?"
+
+Patient: "Not very tired"
+Patient: "If you don't study, you will be very energetic"
+Patient: "I feel sleepy after reading a book for a while, especially in the morning when I get up early"
+Doctor: "You know, do you feel that you are not interested in learning?"
+
+Patient: "Because I have to endorse, it's really a bit boring. It's not as fun as watching TV or something."
+Doctor: "Hmm, besides watching TV, are you still interested in other things that you were interested in before?"
+
+Patient: "interested"
+Doctor: "Okay, I just mentioned that you will easily feel sleepy when studying after getting up early. How have you been sleeping recently?"
+
+Patient: "Sleep is about the same as before"
+Doctor: "How long does it take from lying down in bed to falling asleep at night?"
+
+Patient: "About 15 minutes"
+Doctor: "It sounds like you fell asleep well. Will you wake up easily?"
+
+Patient: "Maybe I'll get up for a WC"
+Doctor: "I understand, do you usually wake up earlier than the alarm clock in the morning?"
+
+Patient: "I used to set an alarm clock for 8 o'clock. After a while, I would automatically wake up a few minutes earlier every time. Gradually, the alarm clock has been changed to 7:30."
+Patient: "I don't know if I have formed a biological clock"
+Doctor: "I understand, so I sleep fairly regularly every day"
+
+Doctor: "In addition to being easily sleepy while studying, are there any other problems?"
+
+Patient: "No other problems for the time being"
+Doctor: "Okay, I understand. How are you feeling recently?"
+
+Patient: "Okay"
+Doctor: "I understand. It sounds like you have been feeling sleepy while studying recently due to preparations for exams, which makes you feel a little troubled. Don't worry too much. You can combine work and rest through exercise or other entertainment activities in your daily study life, and pay attention to relaxation."
+
+Doctor: "Okay, then our consultation ends here"
+
+Patient: "Okay"
+Patient: "Thank you"
Sample Dataset/conv_1003.txt
ADDED
@@ -0,0 +1,39 @@
+Doctor: What is your patient ID?
+Patient: 1003
+
+Doctor: What is your age?
+Patient: 38
+
+Doctor: What is your gender?
+Patient: Female
+Patient: "Hello, Dr. Wu"
+Doctor: "Hello!"
+Doctor: "What are your main problems recently?"
+Patient: "I haven't been in a good mood the past two days"
+Patient: "The child's grades have not improved"
+Doctor: "Okay, let's talk."
+Doctor: "How long have you been feeling this way?"
+Patient: "Just these two days, my child's grades came down yesterday. He didn't do well in the test."
+Doctor: "Children's affairs are indeed important, but parents' emotions also have a certain impact on their children. How have you been feeling in the past two weeks?"
+Patient: "It was pretty good the past few days. I bought a stock before and it was doing pretty well."
+Doctor: "May I ask, is stock trading an interest of yours?"
+Patient: "It's okay to say it's just interest. After all, I'm in the finance industry."
+Doctor: "Children's performance matters are staged. Don't rush for success. Emotions cannot change the current situation. I think you can take some events to accompany your child recently to see where the performance problems lie."
+Patient: "Yeah, good doctor. I think so too"
+Doctor: "For you, you may be too busy at work. Why don't you try something you are interested in to adjust your mood?"
+Patient: "I have always liked dancing, and I even signed up for a dance class some time ago"
+Patient: "After all, I am nearly 40 years old, so I have to keep in shape"
+Doctor: "This is a great hobby. Have you persisted with it recently?"
+Patient: "Yes, I will take two classes every weekend"
+Patient: "When I have nothing to do, I will also practice with this mirror"
+Doctor: "Okay, I hope dancing can bring you a good mood."
+Doctor: "Okay, how is your sleep quality recently?"
+Patient: "It's very good. Maybe because of dancing, the quality of sleep is quite high"
+Doctor: "Okay, do you have a good appetite for all three meals?"
+Patient: "I have a good appetite and my husband is a good cook"
+Doctor: "Okay, it seems that you are very happy in life!"
+Doctor: "So, don't worry too much about your children, how about you adjust your own state?"
+Patient: "Life is quite happy."
+Patient: "Yeah, okay. Thank you doctor"
+Doctor: "Okay, okay, if there are no other questions, our consultation will end here."
+Patient: "Yeah, thank you for your hard work, Dr. Wu."
Sample Dataset/conv_2.txt
ADDED
@@ -0,0 +1,45 @@
+Doctor: What is your patient ID?
+Patient: 102
+
+Doctor: What is your age?
+Patient: 19
+
+Doctor: What is your gender?
+Patient: Female
+
+Doctor: Please describe your social life at the *anonymized* campus. Are you actively participating in extracurricular activities, interacting with others, or taking initiative to socialize with others?
+Patient: I am part of football team and a core member in Physics club and Singularity working team. I also make contacts with my seniors and other staffs for both personal and official discussions. I volunteer for community fests and other initiatives.
+
+Doctor: Describe your typical daily Mood?
+Patient: I always try to find happiness in every single moment of my life. But at times I turnout t be moody.
+
+Doctor: Does your Mood remain steady or goes up and down throughout the day without any reason or on trivial matters?
+Patient: My mood is dynamic. It goes up and down for both valid and unknown reasons . I get upset on simple jokes and responses from my close circle.
+
+Doctor: How do you handle day-to-day irritations or frustrations?
+Patient: I try to connect more with the Almighty through daily prayers. But mostly I prefer sleeping with no disturbance for hours. Nowadays I try to engage myself with a busy schedule and locations.
+
+Doctor: How do you handle pressure related to academics?
+Patient: lately I started purposeful ignorance of academic pressure. I will engage my times studying or with close friend. I also try to phone my parents when I feel so exhausted.
+
+Doctor: Describe your ability to attend to the task at hand or concentrate on daily tasks (academic, non-academic)?
+Patient: I am mostly able to focus on my task and complete on time. But when I am in a bad mood I will distract myself from the task with social media and resume when I feel fine.
+
+Doctor: Have you noticed any difficulties with memory, such as unable to register new information, forgetting recent events, or not able to recall older personal/factual events?
+Patient: Yes I do, and only very lately. I find it very difficult to comprehend what I see and try reading. I also noticed forgetting recent events which where not very important but still to be considered. I also have difficulty in recalling but the least.
+
+Doctor: What do you do to feel better? For example, some people take caffeine, talk with people, or watch movies to feel better.
+Patient: Sleep mostly. But if it is with communication gap, I only settle after conveying my last note. I also sing a song or try dancing in my room but I prefer privacy for this
+
+Doctor: Describe how supported you feel by others (e.g., friends, family) around you and how they help you?
+Patient: I feel supported very less even from family. And so I don't expect any support from anyone and try to figure out all alone.
+
+Doctor: What do you usually do when you have a bad day or when you are not able to concentrate on work?
+Patient: I sleep for hours or the entire day. I also get some ease after crying or talking about it. I used talk to myself which helped me figure out the situation and motivated to push through.
+
+Doctor: Are you experiencing symptoms of stress, anxiety, or depression? If yes, describe the symptoms?
+Patient: Yes, all stress, anxiety and depression
+
+Doctor: Are you doing anything (by self or help seeking) for the ongoing stress, anxiety, or depression, if any? If yes, what?
+Patient: Yes, I'm reading books on self-development and self-improvement.
+
Sample Dataset/conv_3.txt
ADDED
@@ -0,0 +1,47 @@
+Doctor: What is your patient ID?
+Patient: 103
+
+Doctor: What is your age?
+Patient: 18
+
+Doctor: What is your gender?
+Patient: Male
+
+Doctor: Please describe your social life at the *anonymized* campus. Are you actively participating in extracurricular activities, interacting with others, or taking initiative to socialize with others?
+Patient: yes i have been envolve in various talks organised by the different council.I am trying to explore different sports.especially lawn tennis and football. I have made some good friends and interacting seniors helps a lot .
+
+
+Doctor: Describe your typical daily Mood?
+Patient: Confused at times to how to move ahead what to follow .Otherwise pretty chill when i am with myy friends which helps a lot
+
+Doctor: Does your Mood remain steady or goes up and down throughout the day without any reason or on trivial matters?
+Patient: not really if i am indulge in doing some work and have a plan of doing something throuhout the it kind just go strainght and remains steady
+
+Doctor: How do you handle day-to-day irritations or frustrations?
+Patient: Just talking to someone either your friends or family helps to deal all these situations pretty well. Finding solution to the problems which is causing you troubles and just talk to yourself helps me finding peace
+
+
+Doctor: How do you handle pressure related to academics?
+Patient: Regular studies and doing right things on daily basis helps in pressure handling in academics.All i can say is having a regular plan and consistency helps a lot
+
+Doctor: Describe your ability to attend to the task at hand or concentrate on daily tasks (academic, non-academic)?
+Patient: Having pateince and belief in your work you are doing helps me to concentrate on daily tasks. Trying to explore diffenrent things and making it interesting also helps
+
+Doctor: Have you noticed any difficulties with memory, such as unable to register new information, forgetting recent events, or not able to recall older personal/factual events?
+Patient: yes these things happens at some point of time in your day t day life. recalling recnts events is i say has become much common
+
+Doctor: What do you do to feel better? For example, some people take caffeine, talk with people, or watch movies to feel better.
+Patient: I like talking to people either my friends or my family which helps me feel better. Music my kind of way to escape from the hectic day and have some peace
+
+Doctor: Describe how supported you feel by others (e.g., friends, family) around you and how they help you?
+Patient: My family is very supportive in my recent yars where things wasn't going the way which i could have wished for but my family on my back helped me out from those loomy days
+
+Doctor: What do you usually do when you have a bad day or when you are not able to concentrate on work?
+Patient: I just put my headphones up and just go for a walk and trying to speak with myself r just chill with my friends which refereshes the mood
+
+Doctor: Are you experiencing symptoms of stress, anxiety, or depression? If yes, describe the symptoms?
+Patient: nope
+
+Doctor: Are you doing anything (by self or help seeking) for the ongoing stress, anxiety, or depression, if any? If yes, what?
+Patient: nope
+
code_bart.py
ADDED
@@ -0,0 +1,30 @@
+from transformers import BartForConditionalGeneration, BartTokenizer
+from transformers import pipeline
+
+# Replace *Path* with the model path
+model_path = "./fine_tune_model_bart_large_25"
+
+# Replace **Path** with the tokenizer path
+tokenizer_path = "./fine_tune_tokenizer_bart_large_25"
+
+# Load the fine-tuned model
+model = BartForConditionalGeneration.from_pretrained(model_path, ignore_mismatched_sizes=True)
+
+# Load the tokenizer associated with the fine-tuned model
+tokenizer = BartTokenizer.from_pretrained(tokenizer_path)
+
+# Ensure the model is in evaluation mode
+model.eval()
+
+
+# Create a custom summarization pipeline
+gen_kwargs = {"length_penalty": 1.0, "num_beams": 8, "max_length": 700}
+custom_summarization_pipeline = pipeline('summarization', model=model, tokenizer=tokenizer, **gen_kwargs)
+
+file_path = "./9.txt" # Replace with the conversation file path to check others
+with open(file_path, 'r') as file:
+    text = file.read()
+
+# Call the custom summarization pipeline
+summary = custom_summarization_pipeline(text)
+print('Summary:\n',summary[0]['summary_text'])
code_pegasus.py
ADDED
@@ -0,0 +1,31 @@
+from transformers import PegasusForConditionalGeneration, PegasusTokenizer
+from transformers import pipeline
+
+# Replace *Path* with the model path
+model_path = "./fine_tune_model_pegasus_50"
+
+# Replace **Path** with the tokenizer path
+tokenizer_path = "./fine_tune_tokenizer_pegasus_50"
+
+# Load the fine-tuned model
+model = PegasusForConditionalGeneration.from_pretrained(model_path, ignore_mismatched_sizes=True)
+
+# Load the tokenizer associated with the fine-tuned model
+tokenizer = PegasusTokenizer.from_pretrained(tokenizer_path)
+
+# Ensure the model is in evaluation mode
+model.eval()
+
+# Create a custom summarization pipeline
+gen_kwargs = {"length_penalty": 1.0, "num_beams": 8, "max_length": 700}
+custom_summarization_pipeline = pipeline('summarization', model=model, tokenizer=tokenizer, **gen_kwargs)
+
+
+file_path = "./9.txt" # Replace with the conversation file path to check others
+with open(file_path, 'r') as file:
+    text = file.read()
+
+# Call the custom summarization pipeline
+summary = custom_summarization_pipeline(text)
+
+print('Summary:\n',summary[0]['summary_text'])
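The two scripts differ only in the model and tokenizer classes and in the directory names, and both hard-code the input path. One way to avoid editing the files for each run is a small command-line front end, sketched below. The flag names `--model` and `--input`, the `MODEL_DIRS` table, and the helper functions are assumptions for illustration; only the directory names and the default `./9.txt` path come from the committed scripts.

```python
import argparse
from typing import Dict, Optional, Sequence

# Directory names as used in code_bart.py and code_pegasus.py.
MODEL_DIRS = {
    "bart": ("./fine_tune_model_bart_large_25", "./fine_tune_tokenizer_bart_large_25"),
    "pegasus": ("./fine_tune_model_pegasus_50", "./fine_tune_tokenizer_pegasus_50"),
}


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the (hypothetical) command-line flags."""
    parser = argparse.ArgumentParser(description="Summarize a doctor-patient conversation file.")
    parser.add_argument("--model", choices=sorted(MODEL_DIRS), default="pegasus")
    parser.add_argument("--input", default="./9.txt", help="path to a conversation file")
    return parser.parse_args(argv)


def resolve_paths(args: argparse.Namespace) -> Dict[str, str]:
    """Map the parsed flags to the three path variables the scripts define."""
    model_dir, tokenizer_dir = MODEL_DIRS[args.model]
    return {"model_path": model_dir, "tokenizer_path": tokenizer_dir, "file_path": args.input}


print(resolve_paths(parse_args(["--model", "bart", "--input", "Sample Dataset/conv_1.txt"])))
```

The returned dictionary uses the same variable names as the scripts, so the loading and pipeline code could stay unchanged apart from selecting the matching model and tokenizer classes.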
fine_tune_model_bart_large_25/config.json
ADDED
@@ -0,0 +1,70 @@
+{
+  "_name_or_path": "facebook/bart-large-cnn",
+  "_num_labels": 3,
+  "activation_dropout": 0.0,
+  "activation_function": "gelu",
+  "add_final_layer_norm": false,
+  "architectures": [
+    "BartForConditionalGeneration"
+  ],
+  "attention_dropout": 0.0,
+  "bos_token_id": 0,
+  "classif_dropout": 0.0,
+  "classifier_dropout": 0.0,
+  "d_model": 1024,
+  "decoder_attention_heads": 16,
+  "decoder_ffn_dim": 4096,
+  "decoder_layerdrop": 0.0,
+  "decoder_layers": 12,
+  "decoder_start_token_id": 2,
+  "dropout": 0.1,
+  "early_stopping": true,
+  "encoder_attention_heads": 16,
+  "encoder_ffn_dim": 4096,
+  "encoder_layerdrop": 0.0,
+  "encoder_layers": 12,
+  "eos_token_id": 2,
+  "force_bos_token_to_be_generated": true,
+  "forced_bos_token_id": 0,
+  "forced_eos_token_id": 2,
+  "gradient_checkpointing": false,
+  "id2label": {
+    "0": "LABEL_0",
+    "1": "LABEL_1",
+    "2": "LABEL_2"
+  },
+  "init_std": 0.02,
+  "is_encoder_decoder": true,
+  "label2id": {
+    "LABEL_0": 0,
+    "LABEL_1": 1,
+    "LABEL_2": 2
+  },
+  "length_penalty": 2.0,
+  "max_length": 142,
+  "max_position_embeddings": 1024,
+  "min_length": 56,
+  "model_type": "bart",
+  "no_repeat_ngram_size": 3,
+  "normalize_before": false,
+  "num_beams": 4,
+  "num_hidden_layers": 12,
+  "output_past": true,
+  "pad_token_id": 1,
+  "prefix": " ",
+  "scale_embedding": false,
+  "task_specific_params": {
+    "summarization": {
+      "early_stopping": true,
+      "length_penalty": 2.0,
+      "max_length": 142,
+      "min_length": 56,
+      "no_repeat_ngram_size": 3,
+      "num_beams": 4
+    }
+  },
+  "torch_dtype": "float32",
+  "transformers_version": "4.37.2",
+  "use_cache": true,
+  "vocab_size": 50264
+}
fine_tune_model_bart_large_25/generation_config.json
ADDED
@@ -0,0 +1,16 @@
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.37.2"
}
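The `"_from_model_config": true` flag suggests that these generation settings were derived from `fine_tune_model_bart_large_25/config.json`. The sketch below checks that reading by comparing the two files key by key. `diff_against_model_config` and `BOOKKEEPING_KEYS` are hypothetical names introduced for illustration, and the `model_config` dictionary is an excerpt limited to the generation-related keys of `config.json`.

```python
import json

# Excerpt of fine_tune_model_bart_large_25/config.json (generation-related keys only).
model_config = {
    "bos_token_id": 0,
    "decoder_start_token_id": 2,
    "early_stopping": True,
    "eos_token_id": 2,
    "forced_bos_token_id": 0,
    "forced_eos_token_id": 2,
    "length_penalty": 2.0,
    "max_length": 142,
    "min_length": 56,
    "no_repeat_ngram_size": 3,
    "num_beams": 4,
    "pad_token_id": 1,
    "transformers_version": "4.37.2",
}

# Full content of fine_tune_model_bart_large_25/generation_config.json.
generation_config = json.loads("""
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "forced_bos_token_id": 0,
  "forced_eos_token_id": 2,
  "length_penalty": 2.0,
  "max_length": 142,
  "min_length": 56,
  "no_repeat_ngram_size": 3,
  "num_beams": 4,
  "pad_token_id": 1,
  "transformers_version": "4.37.2"
}
""")

# Keys that describe the file itself rather than a generation setting.
BOOKKEEPING_KEYS = {"_from_model_config"}


def diff_against_model_config(gen_cfg, model_cfg):
    """Return {key: (generation value, model value)} for keys that disagree."""
    return {
        key: (value, model_cfg.get(key))
        for key, value in gen_cfg.items()
        if key not in BOOKKEEPING_KEYS and model_cfg.get(key) != value
    }


differences = diff_against_model_config(generation_config, model_config)
print(differences or "generation_config.json mirrors config.json")
```

An empty result means every generation setting in this file repeats the value already present in `config.json`.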
fine_tune_model_bart_large_25/model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bb484458437e4416722396975f77f3829b2c4830e9ece28dbb45b92f4d57893c
size 1625422896
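The three lines of `model.safetensors` are a Git LFS pointer, not the weights themselves. As a rough illustration, the sketch below parses the pointer and estimates how many `float32` values the real file could hold, given that `config.json` lists `"torch_dtype": "float32"`. `parse_lfs_pointer` is a hypothetical helper, and the estimate ignores the safetensors header, so it is an upper bound rather than an exact parameter count.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict (minimal sketch)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:bb484458437e4416722396975f77f3829b2c4830e9ece28dbb45b92f4d57893c
size 1625422896
"""

pointer = parse_lfs_pointer(pointer_text)
size_gib = pointer["size"] / 2**30

# Assumption: 4 bytes per float32 value, header overhead ignored.
approx_values = pointer["size"] // 4

print(f"{size_gib:.2f} GiB, about {approx_values / 1e6:.0f}M float32 values")
```

The same helper applies to the other pointer files in this commit, since they share the `version` / `oid` / `size` layout.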
fine_tune_model_pegasus_50/config.json
ADDED
@@ -0,0 +1,124 @@
{
  "_name_or_path": "google/pegasus-large",
  "activation_dropout": 0.1,
  "activation_function": "relu",
  "add_bias_logits": false,
  "add_final_layer_norm": true,
  "architectures": [
    "PegasusForConditionalGeneration"
  ],
  "attention_dropout": 0.1,
  "bos_token_id": 0,
  "classif_dropout": 0.0,
  "classifier_dropout": 0.0,
  "d_model": 1024,
  "decoder_attention_heads": 16,
  "decoder_ffn_dim": 4096,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 16,
  "decoder_start_token_id": 0,
  "dropout": 0.1,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 16,
  "eos_token_id": 1,
  "extra_pos_embeddings": 1,
  "force_bos_token_to_be_generated": false,
  "forced_eos_token_id": 1,
  "gradient_checkpointing": false,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2"
  },
  "init_std": 0.02,
  "is_encoder_decoder": true,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_2": 2
  },
  "length_penalty": 0.8,
  "max_length": 256,
  "max_position_embeddings": 1024,
  "model_type": "pegasus",
  "normalize_before": true,
  "normalize_embedding": false,
  "num_beams": 8,
  "num_hidden_layers": 16,
  "pad_token_id": 0,
  "scale_embedding": true,
  "static_position_embeddings": true,
  "task_specific_params": {
    "summarization_aeslc": {
      "length_penalty": 0.6,
      "max_length": 32,
      "max_position_embeddings": 512
    },
    "summarization_arxiv": {
      "length_penalty": 0.8,
      "max_length": 256,
      "max_position_embeddings": 1024
    },
    "summarization_big_patent": {
      "length_penalty": 0.7,
      "max_length": 256,
      "max_position_embeddings": 1024
    },
    "summarization_billsum": {
      "length_penalty": 0.6,
      "max_length": 256,
      "max_position_embeddings": 1024
    },
    "summarization_cnn_dailymail": {
      "length_penalty": 0.8,
      "max_length": 128,
      "max_position_embeddings": 1024
    },
    "summarization_gigaword": {
      "length_penalty": 0.6,
      "max_length": 32,
      "max_position_embeddings": 128
    },
    "summarization_large": {
      "length_penalty": 0.8,
      "max_length": 256,
      "max_position_embeddings": 1024
    },
    "summarization_multi_news": {
      "length_penalty": 0.8,
      "max_length": 256,
      "max_position_embeddings": 1024
    },
    "summarization_newsroom": {
      "length_penalty": 0.8,
      "max_length": 128,
      "max_position_embeddings": 512
    },
    "summarization_pubmed": {
      "length_penalty": 0.8,
      "max_length": 256,
      "max_position_embeddings": 1024
    },
    "summarization_reddit_tifu": {
      "length_penalty": 0.6,
      "max_length": 128,
      "max_position_embeddings": 512
    },
    "summarization_wikihow": {
      "length_penalty": 0.6,
      "max_length": 256,
      "max_position_embeddings": 512
    },
    "summarization_xsum": {
      "length_penalty": 0.8,
      "max_length": 64,
      "max_position_embeddings": 512
    }
  },
  "torch_dtype": "float32",
  "transformers_version": "4.37.2",
  "use_cache": true,
  "vocab_size": 96103
}
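The `task_specific_params` block holds one preset per dataset, and the top-level `length_penalty`, `max_length`, and `max_position_embeddings` match the `summarization_large` preset. The sketch below shows one way such a preset could be overlaid on the top-level values. `apply_task_preset` is a hypothetical helper for illustration, the excerpt keeps only two presets from the file, and this is not a claim about how `transformers` consumes the block internally.

```python
import json

# Excerpt of fine_tune_model_pegasus_50/config.json (two presets kept for brevity).
config_excerpt = json.loads("""
{
  "length_penalty": 0.8,
  "max_length": 256,
  "max_position_embeddings": 1024,
  "num_beams": 8,
  "task_specific_params": {
    "summarization_large": {
      "length_penalty": 0.8,
      "max_length": 256,
      "max_position_embeddings": 1024
    },
    "summarization_xsum": {
      "length_penalty": 0.8,
      "max_length": 64,
      "max_position_embeddings": 512
    }
  }
}
""")


def apply_task_preset(config, task):
    """Return top-level settings with one task preset layered on top."""
    settings = {k: v for k, v in config.items() if k != "task_specific_params"}
    settings.update(config["task_specific_params"][task])
    return settings


large = apply_task_preset(config_excerpt, "summarization_large")
xsum = apply_task_preset(config_excerpt, "summarization_xsum")

print(large)
print(xsum)
```

Applying `summarization_large` leaves the top-level values unchanged, while `summarization_xsum` shortens `max_length` to 64 and `max_position_embeddings` to 512.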
fine_tune_model_pegasus_50/generation_config.json
ADDED
@@ -0,0 +1,12 @@
{
  "_from_model_config": true,
  "bos_token_id": 0,
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "forced_eos_token_id": 1,
  "length_penalty": 0.8,
  "max_length": 256,
  "num_beams": 8,
  "pad_token_id": 0,
  "transformers_version": "4.37.2"
}
fine_tune_model_pegasus_50/model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b6e47f9f260ed65c81b6c49fccc4fe01c129c1c08920ae92896d303e80d13e41
size 2283652852
fine_tune_tokenizer_bart_large_25/merges.txt
ADDED
The diff for this file is too large to render.
fine_tune_tokenizer_bart_large_25/special_tokens_map.json
ADDED
@@ -0,0 +1,15 @@
{
  "bos_token": "<s>",
  "cls_token": "<s>",
  "eos_token": "</s>",
  "mask_token": {
    "content": "<mask>",
    "lstrip": true,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "<pad>",
  "sep_token": "</s>",
  "unk_token": "<unk>"
}
fine_tune_tokenizer_bart_large_25/tokenizer.json
ADDED
The diff for this file is too large to render.
fine_tune_tokenizer_bart_large_25/tokenizer_config.json
ADDED
@@ -0,0 +1,57 @@
{
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<s>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "50264": {
      "content": "<mask>",
      "lstrip": true,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": true,
  "cls_token": "<s>",
  "eos_token": "</s>",
  "errors": "replace",
  "mask_token": "<mask>",
  "model_max_length": 1024,
  "pad_token": "<pad>",
  "sep_token": "</s>",
  "tokenizer_class": "BartTokenizer",
  "trim_offsets": true,
  "unk_token": "<unk>"
}
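The token IDs in `added_tokens_decoder` should line up with the `bos_token_id`, `pad_token_id`, and `eos_token_id` recorded in `fine_tune_model_bart_large_25/config.json`. The sketch below cross-checks the two files using values copied from them. `check_special_token_ids` is a hypothetical helper for illustration. It also reports whether the `<mask>` ID falls below `vocab_size`, since the ID 50264 equals the `vocab_size` listed in `config.json`.

```python
# ID -> token content, copied from added_tokens_decoder in tokenizer_config.json.
added_tokens_decoder = {
    "0": "<s>",
    "1": "<pad>",
    "2": "</s>",
    "3": "<unk>",
    "50264": "<mask>",
}

# Role -> token content, copied from tokenizer_config.json.
tokenizer_roles = {"bos_token": "<s>", "pad_token": "<pad>", "eos_token": "</s>"}

# Values copied from fine_tune_model_bart_large_25/config.json.
model_ids = {"bos_token_id": 0, "pad_token_id": 1, "eos_token_id": 2}
vocab_size = 50264


def check_special_token_ids(decoder, roles, ids):
    """Return {role: (expected id, actual id)} for roles whose IDs disagree."""
    token_to_id = {content: int(idx) for idx, content in decoder.items()}
    mismatches = {}
    for role, token in roles.items():
        expected = ids[f"{role}_id"]
        actual = token_to_id.get(token)
        if actual != expected:
            mismatches[role] = (expected, actual)
    return mismatches


mismatches = check_special_token_ids(added_tokens_decoder, tokenizer_roles, model_ids)
mask_id = next(int(i) for i, tok in added_tokens_decoder.items() if tok == "<mask>")
mask_id_below_vocab_size = mask_id < vocab_size

print(mismatches or "special token IDs agree")
print("mask ID below vocab_size:", mask_id_below_vocab_size)
```

With the values above the three role IDs agree, and the `<mask>` ID is not below `vocab_size`, which is worth keeping in mind if `<mask>` is ever passed to the model.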
fine_tune_tokenizer_bart_large_25/vocab.json
ADDED
The diff for this file is too large to render.
fine_tune_tokenizer_pegasus_50/special_tokens_map.json
ADDED
@@ -0,0 +1,110 @@
{
  "additional_special_tokens": [
    "<mask_1>",
    "<unk_2>",
    "<unk_3>",
    "<unk_4>",
    "<unk_5>",
    "<unk_6>",
    "<unk_7>",
    "<unk_8>",
    "<unk_9>",
    "<unk_10>",
    "<unk_11>",
    "<unk_12>",
    "<unk_13>",
    "<unk_14>",
    "<unk_15>",
    "<unk_16>",
    "<unk_17>",
    "<unk_18>",
    "<unk_19>",
    "<unk_20>",
    "<unk_21>",
    "<unk_22>",
    "<unk_23>",
    "<unk_24>",
    "<unk_25>",
    "<unk_26>",
    "<unk_27>",
    "<unk_28>",
    "<unk_29>",
    "<unk_30>",
    "<unk_31>",
    "<unk_32>",
    "<unk_33>",
    "<unk_34>",
    "<unk_35>",
    "<unk_36>",
    "<unk_37>",
    "<unk_38>",
    "<unk_39>",
    "<unk_40>",
    "<unk_41>",
    "<unk_42>",
    "<unk_43>",
    "<unk_44>",
    "<unk_45>",
    "<unk_46>",
    "<unk_47>",
    "<unk_48>",
    "<unk_49>",
    "<unk_50>",
    "<unk_51>",
    "<unk_52>",
    "<unk_53>",
    "<unk_54>",
    "<unk_55>",
    "<unk_56>",
    "<unk_57>",
    "<unk_58>",
    "<unk_59>",
    "<unk_60>",
    "<unk_61>",
    "<unk_62>",
    "<unk_63>",
    "<unk_64>",
    "<unk_65>",
    "<unk_66>",
    "<unk_67>",
    "<unk_68>",
    "<unk_69>",
    "<unk_70>",
    "<unk_71>",
    "<unk_72>",
    "<unk_73>",
    "<unk_74>",
    "<unk_75>",
    "<unk_76>",
    "<unk_77>",
    "<unk_78>",
    "<unk_79>",
    "<unk_80>",
    "<unk_81>",
    "<unk_82>",
    "<unk_83>",
    "<unk_84>",
    "<unk_85>",
    "<unk_86>",
    "<unk_87>",
    "<unk_88>",
    "<unk_89>",
    "<unk_90>",
    "<unk_91>",
    "<unk_92>",
    "<unk_93>",
    "<unk_94>",
    "<unk_95>",
    "<unk_96>",
    "<unk_97>",
    "<unk_98>",
    "<unk_99>",
    "<unk_100>",
    "<unk_101>",
    "<unk_102>"
  ],
  "eos_token": "</s>",
  "mask_token": "<mask_2>",
  "pad_token": "<pad>",
  "unk_token": "<unk>"
}
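The `additional_special_tokens` list follows a regular pattern: `<mask_1>` followed by `<unk_2>` through `<unk_102>`. As a small illustration, the sketch below rebuilds the list programmatically, which can help when checking a saved tokenizer against this file. `pegasus_additional_special_tokens` is a hypothetical helper and not part of `transformers`.

```python
def pegasus_additional_special_tokens(first_unk=2, last_unk=102):
    """Rebuild the additional_special_tokens list shown in special_tokens_map.json."""
    return ["<mask_1>"] + [f"<unk_{i}>" for i in range(first_unk, last_unk + 1)]


tokens = pegasus_additional_special_tokens()

# Summarize the list instead of printing every entry.
print(len(tokens), tokens[0], tokens[1], tokens[-1])
```

The defaults reproduce the 102 entries in the file; the two parameters exist only so the range is explicit.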
fine_tune_tokenizer_pegasus_50/spiece.model
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0015189ef36359283fec8b93cf6d9ce51bca37eb1101defc68a53b394913b96c
size 1912529
fine_tune_tokenizer_pegasus_50/tokenizer.json
ADDED
The diff for this file is too large to render.
fine_tune_tokenizer_pegasus_50/tokenizer_config.json
ADDED
@@ -0,0 +1,967 @@
| 1 |
+
{
|
| 2 |
+
"added_tokens_decoder": {
|
| 3 |
+
"0": {
|
| 4 |
+
"content": "<pad>",
|
| 5 |
+
"lstrip": false,
|
| 6 |
+
"normalized": false,
|
| 7 |
+
"rstrip": false,
|
| 8 |
+
"single_word": false,
|
| 9 |
+
"special": true
|
| 10 |
+
},
|
| 11 |
+
"1": {
|
| 12 |
+
"content": "</s>",
|
| 13 |
+
"lstrip": false,
|
| 14 |
+
"normalized": false,
|
| 15 |
+
"rstrip": false,
|
| 16 |
+
"single_word": false,
|
| 17 |
+
"special": true
|
| 18 |
+
},
|
| 19 |
+
"2": {
|
| 20 |
+
"content": "<mask_1>",
|
| 21 |
+
"lstrip": false,
|
| 22 |
+
"normalized": false,
|
| 23 |
+
"rstrip": false,
|
| 24 |
+
"single_word": false,
|
| 25 |
+
"special": true
|
| 26 |
+
},
|
| 27 |
+
"3": {
|
| 28 |
+
"content": "<mask_2>",
|
| 29 |
+
"lstrip": false,
|
| 30 |
+
"normalized": false,
|
| 31 |
+
"rstrip": false,
|
| 32 |
+
"single_word": false,
|
| 33 |
+
"special": true
|
| 34 |
+
},
|
| 35 |
+
"4": {
|
| 36 |
+
"content": "<unk_2>",
|
| 37 |
+
"lstrip": false,
|
| 38 |
+
"normalized": false,
|
| 39 |
+
"rstrip": false,
|
| 40 |
+
"single_word": false,
|
| 41 |
+
"special": true
|
| 42 |
+
},
|
| 43 |
+
"5": {
|
| 44 |
+
"content": "<unk_3>",
|
| 45 |
+
"lstrip": false,
|
| 46 |
+
"normalized": false,
|
| 47 |
+
"rstrip": false,
|
| 48 |
+
"single_word": false,
|
| 49 |
+
"special": true
|
| 50 |
+
},
|
| 51 |
+
"6": {
|
| 52 |
+
"content": "<unk_4>",
|
| 53 |
+
"lstrip": false,
|
| 54 |
+
"normalized": false,
|
| 55 |
+
"rstrip": false,
|
| 56 |
+
"single_word": false,
|
| 57 |
+
"special": true
|
| 58 |
+
},
|
| 59 |
+
"7": {
|
| 60 |
+
"content": "<unk_5>",
|
| 61 |
+
"lstrip": false,
|
| 62 |
+
"normalized": false,
|
| 63 |
+
"rstrip": false,
|
| 64 |
+
"single_word": false,
|
| 65 |
+
"special": true
|
| 66 |
+
},
|
| 67 |
+
"8": {
|
| 68 |
+
"content": "<unk_6>",
|
| 69 |
+
"lstrip": false,
|
| 70 |
+
"normalized": false,
|
| 71 |
+
"rstrip": false,
|
| 72 |
+
"single_word": false,
|
| 73 |
+
"special": true
|
| 74 |
+
},
|
| 75 |
+
"9": {
|
| 76 |
+
"content": "<unk_7>",
|
| 77 |
+
"lstrip": false,
|
| 78 |
+
"normalized": false,
|
| 79 |
+
"rstrip": false,
|
| 80 |
+
"single_word": false,
|
| 81 |
+
"special": true
|
| 82 |
+
},
|
| 83 |
+
"10": {
|
| 84 |
+
"content": "<unk_8>",
|
| 85 |
+
"lstrip": false,
|
| 86 |
+
"normalized": false,
|
| 87 |
+
"rstrip": false,
|
| 88 |
+
"single_word": false,
|
| 89 |
+
"special": true
|
| 90 |
+
},
|
| 91 |
+
"11": {
|
| 92 |
+
"content": "<unk_9>",
|
| 93 |
+
"lstrip": false,
|
| 94 |
+
"normalized": false,
|
| 95 |
+
"rstrip": false,
|
| 96 |
+
"single_word": false,
|
| 97 |
+
"special": true
|
| 98 |
+
},
|
| 99 |
+
"12": {
|
| 100 |
+
"content": "<unk_10>",
|
| 101 |
+
"lstrip": false,
|
| 102 |
+
"normalized": false,
|
| 103 |
+
"rstrip": false,
|
| 104 |
+
"single_word": false,
|
| 105 |
+
"special": true
|
| 106 |
+
},
|
| 107 |
+
"13": {
|
| 108 |
+
"content": "<unk_11>",
|
| 109 |
+
"lstrip": false,
|
| 110 |
+
"normalized": false,
|
| 111 |
+
"rstrip": false,
|
| 112 |
+
"single_word": false,
|
| 113 |
+
"special": true
|
| 114 |
+
},
|
| 115 |
+
"14": {
|
| 116 |
+
"content": "<unk_12>",
|
| 117 |
+
"lstrip": false,
|
| 118 |
+
"normalized": false,
|
| 119 |
+
"rstrip": false,
|
| 120 |
+
"single_word": false,
|
| 121 |
+
"special": true
|
| 122 |
+
},
|
| 123 |
+
"15": {
|
| 124 |
+
"content": "<unk_13>",
|
| 125 |
+
"lstrip": false,
|
| 126 |
+
"normalized": false,
|
| 127 |
+
"rstrip": false,
|
| 128 |
+
"single_word": false,
|
| 129 |
+
"special": true
|
| 130 |
+
},
|
| 131 |
+
"16": {
|
| 132 |
+
"content": "<unk_14>",
|
| 133 |
+
"lstrip": false,
|
| 134 |
+
"normalized": false,
|
| 135 |
+
"rstrip": false,
|
| 136 |
+
"single_word": false,
|
| 137 |
+
"special": true
|
| 138 |
+
},
|
| 139 |
+
"17": {
|
| 140 |
+
"content": "<unk_15>",
|
| 141 |
+
"lstrip": false,
|
| 142 |
+
"normalized": false,
|
| 143 |
+
"rstrip": false,
|
| 144 |
+
"single_word": false,
|
| 145 |
+
"special": true
|
| 146 |
+
},
|
| 147 |
+
"18": {
|
| 148 |
+
"content": "<unk_16>",
|
| 149 |
+
"lstrip": false,
|
| 150 |
+
"normalized": false,
|
| 151 |
+
"rstrip": false,
|
| 152 |
+
"single_word": false,
|
| 153 |
+
"special": true
|
| 154 |
+
},
|
| 155 |
+
"19": {
|
| 156 |
+
"content": "<unk_17>",
|
| 157 |
+
"lstrip": false,
|
| 158 |
+
"normalized": false,
|
| 159 |
+
"rstrip": false,
|
| 160 |
+
"single_word": false,
|
| 161 |
+
"special": true
|
| 162 |
+
},
|
| 163 |
+
"20": {
|
| 164 |
+
"content": "<unk_18>",
|
| 165 |
+
"lstrip": false,
|
| 166 |
+
"normalized": false,
|
| 167 |
+
"rstrip": false,
|
| 168 |
+
"single_word": false,
|
| 169 |
+
"special": true
|
| 170 |
+
},
|
| 171 |
+
"21": {
|
| 172 |
+
"content": "<unk_19>",
|
| 173 |
+
"lstrip": false,
|
| 174 |
+
"normalized": false,
|
| 175 |
+
"rstrip": false,
|
| 176 |
+
"single_word": false,
|
| 177 |
+
"special": true
|
| 178 |
+
},
|
| 179 |
+
"22": {
|
| 180 |
+
"content": "<unk_20>",
|
| 181 |
+
"lstrip": false,
|
| 182 |
+
"normalized": false,
|
| 183 |
+
"rstrip": false,
|
| 184 |
+
"single_word": false,
|
| 185 |
+
"special": true
|
| 186 |
+
},
|
| 187 |
+
"23": {
|
| 188 |
+
"content": "<unk_21>",
|
| 189 |
+
"lstrip": false,
|
| 190 |
+
"normalized": false,
|
| 191 |
+
"rstrip": false,
|
| 192 |
+
"single_word": false,
|
| 193 |
+
"special": true
|
| 194 |
+
},
|
| 195 |
+
"24": {
|
| 196 |
+
"content": "<unk_22>",
|
| 197 |
+
"lstrip": false,
|
| 198 |
+
"normalized": false,
|
| 199 |
+
"rstrip": false,
|
| 200 |
+
"single_word": false,
|
| 201 |
+
"special": true
|
| 202 |
+
},
|
| 203 |
+
"25": {
|
| 204 |
+
"content": "<unk_23>",
|
| 205 |
+
"lstrip": false,
|
| 206 |
+
"normalized": false,
|
| 207 |
+
"rstrip": false,
|
| 208 |
+
"single_word": false,
|
| 209 |
+
"special": true
|
| 210 |
+
},
|
| 211 |
+
"26": {
|
| 212 |
+
"content": "<unk_24>",
|
| 213 |
+
"lstrip": false,
|
| 214 |
+
"normalized": false,
|
| 215 |
+
"rstrip": false,
|
| 216 |
+
"single_word": false,
|
| 217 |
+
"special": true
|
| 218 |
+
},
|
| 219 |
+
"27": {
|
| 220 |
+
"content": "<unk_25>",
|
| 221 |
+
"lstrip": false,
|
| 222 |
+
"normalized": false,
|
| 223 |
+
"rstrip": false,
|
| 224 |
+
"single_word": false,
|
| 225 |
+
"special": true
|
| 226 |
+
},
|
| 227 |
+
"28": {
|
| 228 |
+
"content": "<unk_26>",
|
| 229 |
+
"lstrip": false,
|
| 230 |
+
"normalized": false,
|
| 231 |
+
"rstrip": false,
|
| 232 |
+
"single_word": false,
|
| 233 |
+
"special": true
|
| 234 |
+
},
|
| 235 |
+
"29": {
|
| 236 |
+
"content": "<unk_27>",
|
| 237 |
+
"lstrip": false,
|
| 238 |
+
"normalized": false,
|
| 239 |
+
"rstrip": false,
|
| 240 |
+
"single_word": false,
|
| 241 |
+
"special": true
|
| 242 |
+
},
|
| 243 |
+
"30": {
|
| 244 |
+
"content": "<unk_28>",
|
| 245 |
+
"lstrip": false,
|
| 246 |
+
"normalized": false,
|
| 247 |
+
"rstrip": false,
|
| 248 |
+
"single_word": false,
|
| 249 |
+
"special": true
|
| 250 |
+
},
|
| 251 |
+
"31": {
|
| 252 |
+
"content": "<unk_29>",
|
| 253 |
+
"lstrip": false,
|
| 254 |
+
"normalized": false,
|
| 255 |
+
"rstrip": false,
|
| 256 |
+
"single_word": false,
|
| 257 |
+
"special": true
|
| 258 |
+
},
|
| 259 |
+
"32": {
|
| 260 |
+
"content": "<unk_30>",
|
| 261 |
+
"lstrip": false,
|
| 262 |
+
"normalized": false,
|
| 263 |
+
"rstrip": false,
|
| 264 |
+
"single_word": false,
|
| 265 |
+
"special": true
|
| 266 |
+
},
|
| 267 |
+
"33": {
|
| 268 |
+
"content": "<unk_31>",
|
| 269 |
+
"lstrip": false,
|
| 270 |
+
"normalized": false,
|
| 271 |
+
"rstrip": false,
|
| 272 |
+
"single_word": false,
|
| 273 |
+
"special": true
|
| 274 |
+
},
|
| 275 |
+
"34": {
|
| 276 |
+
"content": "<unk_32>",
|
| 277 |
+
"lstrip": false,
|
| 278 |
+
"normalized": false,
|
| 279 |
+
"rstrip": false,
|
| 280 |
+
"single_word": false,
|
| 281 |
+
"special": true
|
| 282 |
+
},
|
| 283 |
+
"35": {
|
| 284 |
+
"content": "<unk_33>",
|
| 285 |
+
"lstrip": false,
|
| 286 |
+
"normalized": false,
|
| 287 |
+
"rstrip": false,
|
| 288 |
+
"single_word": false,
|
| 289 |
+
"special": true
|
| 290 |
+
},
|
| 291 |
+
"36": {
|
| 292 |
+
"content": "<unk_34>",
|
| 293 |
+
"lstrip": false,
|
| 294 |
+
"normalized": false,
|
| 295 |
+
"rstrip": false,
|
| 296 |
+
"single_word": false,
|
| 297 |
+
"special": true
|
| 298 |
+
},
|
| 299 |
+
"37": {
|
| 300 |
+
"content": "<unk_35>",
|
| 301 |
+
"lstrip": false,
|
| 302 |
+
"normalized": false,
|
| 303 |
+
"rstrip": false,
|
| 304 |
+
"single_word": false,
|
| 305 |
+
"special": true
|
| 306 |
+
},
|
| 307 |
+
"38": {
|
| 308 |
+
"content": "<unk_36>",
|
| 309 |
+
"lstrip": false,
|
| 310 |
+
"normalized": false,
|
| 311 |
+
"rstrip": false,
|
| 312 |
+
"single_word": false,
|
| 313 |
+
"special": true
|
| 314 |
+
},
|
| 315 |
+
"39": {
|
| 316 |
+
"content": "<unk_37>",
|
| 317 |
+
"lstrip": false,
|
| 318 |
+
"normalized": false,
|
| 319 |
+
"rstrip": false,
|
| 320 |
+
"single_word": false,
|
| 321 |
+
"special": true
|
| 322 |
+
},
|
| 323 |
+
"40": {
|
| 324 |
+
"content": "<unk_38>",
|
| 325 |
+
"lstrip": false,
|
| 326 |
+
"normalized": false,
|
| 327 |
+
"rstrip": false,
|
| 328 |
+
"single_word": false,
|
| 329 |
+
"special": true
|
| 330 |
+
},
|
| 331 |
+
"41": {
|
| 332 |
+
"content": "<unk_39>",
|
| 333 |
+
"lstrip": false,
|
| 334 |
+
"normalized": false,
|
| 335 |
+
"rstrip": false,
|
| 336 |
+
"single_word": false,
|
| 337 |
+
"special": true
|
| 338 |
+
},
|
| 339 |
+
"42": {
|
| 340 |
+
"content": "<unk_40>",
|
| 341 |
+
"lstrip": false,
|
| 342 |
+
"normalized": false,
|
| 343 |
+
"rstrip": false,
|
| 344 |
+
"single_word": false,
|
| 345 |
+
"special": true
|
| 346 |
+
},
|
| 347 |
+
"43": {
|
| 348 |
+
"content": "<unk_41>",
|
| 349 |
+
"lstrip": false,
|
| 350 |
+
"normalized": false,
|
| 351 |
+
"rstrip": false,
|
| 352 |
+
"single_word": false,
|
| 353 |
+
"special": true
|
| 354 |
+
},
|
| 355 |
+
"44": {
|
| 356 |
+
"content": "<unk_42>",
|
| 357 |
+
"lstrip": false,
|
| 358 |
+
"normalized": false,
|
| 359 |
+
"rstrip": false,
|
| 360 |
+
"single_word": false,
|
| 361 |
+
"special": true
|
| 362 |
+
},
|
| 363 |
+
"45": {
|
| 364 |
+
"content": "<unk_43>",
|
| 365 |
+
"lstrip": false,
|
| 366 |
+
"normalized": false,
|
| 367 |
+
"rstrip": false,
|
| 368 |
+
"single_word": false,
|
| 369 |
+
"special": true
|
| 370 |
+
},
|
| 371 |
+
"46": {
|
| 372 |
+
"content": "<unk_44>",
|
| 373 |
+
"lstrip": false,
|
| 374 |
+
"normalized": false,
|
| 375 |
+
"rstrip": false,
|
| 376 |
+
"single_word": false,
|
| 377 |
+
"special": true
|
| 378 |
+
},
|
| 379 |
+
"47": {
|
| 380 |
+
"content": "<unk_45>",
|
| 381 |
+
"lstrip": false,
|
| 382 |
+
"normalized": false,
|
| 383 |
+
"rstrip": false,
|
| 384 |
+
"single_word": false,
|
| 385 |
+
"special": true
|
| 386 |
+
},
|
| 387 |
+
"48": {
|
| 388 |
+
"content": "<unk_46>",
|
| 389 |
+
"lstrip": false,
|
| 390 |
+
"normalized": false,
|
| 391 |
+
"rstrip": false,
|
| 392 |
+
"single_word": false,
|
| 393 |
+
"special": true
|
| 394 |
+
},
|
| 395 |
+
"49": {
|
| 396 |
+
"content": "<unk_47>",
|
| 397 |
+
"lstrip": false,
|
| 398 |
+
"normalized": false,
|
| 399 |
+
"rstrip": false,
|
| 400 |
+
"single_word": false,
|
| 401 |
+
"special": true
|
| 402 |
+
},
|
| 403 |
+
"50": {
|
| 404 |
+
"content": "<unk_48>",
|
| 405 |
+
"lstrip": false,
|
| 406 |
+
"normalized": false,
|
| 407 |
+
"rstrip": false,
|
| 408 |
+
"single_word": false,
|
| 409 |
+
"special": true
|
| 410 |
+
},
|
| 411 |
+
"51": {
|
| 412 |
+
"content": "<unk_49>",
|
| 413 |
+
"lstrip": false,
|
| 414 |
+
"normalized": false,
|
| 415 |
+
"rstrip": false,
|
| 416 |
+
"single_word": false,
|
| 417 |
+
"special": true
|
| 418 |
+
},
|
| 419 |
+
"52": {
|
| 420 |
+
"content": "<unk_50>",
|
| 421 |
+
"lstrip": false,
|
| 422 |
+
"normalized": false,
|
| 423 |
+
"rstrip": false,
|
| 424 |
+
"single_word": false,
|
| 425 |
+
"special": true
|
| 426 |
+
},
|
| 427 |
+
"53": {
|
| 428 |
+
"content": "<unk_51>",
|
| 429 |
+
"lstrip": false,
|
| 430 |
+
"normalized": false,
|
| 431 |
+
"rstrip": false,
|
| 432 |
+
"single_word": false,
|
| 433 |
+
"special": true
|
| 434 |
+
},
|
| 435 |
+
"54": {
|
| 436 |
+
"content": "<unk_52>",
|
| 437 |
+
"lstrip": false,
|
| 438 |
+
"normalized": false,
|
| 439 |
+
"rstrip": false,
|
| 440 |
+
"single_word": false,
|
| 441 |
+
"special": true
|
| 442 |
+
},
|
| 443 |
+
"55": {
|
| 444 |
+
"content": "<unk_53>",
|
| 445 |
+
"lstrip": false,
|
| 446 |
+
"normalized": false,
|
| 447 |
+
"rstrip": false,
|
| 448 |
+
"single_word": false,
|
| 449 |
+
"special": true
|
| 450 |
+
},
|
| 451 |
+
"56": {
|
| 452 |
+
"content": "<unk_54>",
|
| 453 |
+
"lstrip": false,
|
| 454 |
+
"normalized": false,
|
| 455 |
+
"rstrip": false,
|
| 456 |
+
"single_word": false,
|
| 457 |
+
"special": true
|
| 458 |
+
},
|
| 459 |
+
"57": {
|
| 460 |
+
"content": "<unk_55>",
|
| 461 |
+
"lstrip": false,
|
| 462 |
+
"normalized": false,
|
| 463 |
+
"rstrip": false,
|
| 464 |
+
"single_word": false,
|
| 465 |
+
"special": true
|
| 466 |
+
},
|
| 467 |
+
"58": {
|
| 468 |
+
"content": "<unk_56>",
|
| 469 |
+
"lstrip": false,
|
| 470 |
+
"normalized": false,
|
| 471 |
+
"rstrip": false,
|
| 472 |
+
"single_word": false,
|
| 473 |
+
"special": true
|
| 474 |
+
},
|
| 475 |
+
"59": {
|
| 476 |
+
"content": "<unk_57>",
|
| 477 |
+
"lstrip": false,
|
| 478 |
+
"normalized": false,
|
| 479 |
+
"rstrip": false,
|
| 480 |
+
"single_word": false,
|
| 481 |
+
"special": true
|
| 482 |
+
},
|
| 483 |
+
"60": {
|
| 484 |
+
"content": "<unk_58>",
|
| 485 |
+
"lstrip": false,
|
| 486 |
+
"normalized": false,
|
| 487 |
+
"rstrip": false,
|
| 488 |
+
"single_word": false,
|
| 489 |
+
"special": true
|
| 490 |
+
},
|
| 491 |
+
"61": {
|
| 492 |
+
"content": "<unk_59>",
|
| 493 |
+
"lstrip": false,
|
| 494 |
+
"normalized": false,
|
| 495 |
+
"rstrip": false,
|
| 496 |
+
"single_word": false,
|
| 497 |
+
"special": true
|
| 498 |
+
},
|
| 499 |
+
"62": {
|
| 500 |
+
"content": "<unk_60>",
|
| 501 |
+
"lstrip": false,
|
| 502 |
+
"normalized": false,
|
| 503 |
+
"rstrip": false,
|
| 504 |
+
"single_word": false,
|
| 505 |
+
"special": true
|
| 506 |
+
},
|
| 507 |
+
"63": {
|
| 508 |
+
"content": "<unk_61>",
|
| 509 |
+
"lstrip": false,
|
| 510 |
+
"normalized": false,
|
| 511 |
+
"rstrip": false,
|
| 512 |
+
"single_word": false,
|
| 513 |
+
"special": true
|
| 514 |
+
},
|
| 515 |
+
"64": {
|
| 516 |
+
"content": "<unk_62>",
|
| 517 |
+
"lstrip": false,
|
| 518 |
+
"normalized": false,
|
| 519 |
+
"rstrip": false,
|
| 520 |
+
"single_word": false,
|
| 521 |
+
"special": true
|
| 522 |
+
},
|
| 523 |
+
"65": {
|
| 524 |
+
"content": "<unk_63>",
|
| 525 |
+
"lstrip": false,
|
| 526 |
+
"normalized": false,
|
| 527 |
+
"rstrip": false,
|
| 528 |
+
"single_word": false,
|
| 529 |
+
"special": true
|
| 530 |
+
},
|
| 531 |
+
"66": {
|
| 532 |
+
"content": "<unk_64>",
|
| 533 |
+
"lstrip": false,
|
| 534 |
+
"normalized": false,
|
| 535 |
+
"rstrip": false,
|
| 536 |
+
"single_word": false,
|
| 537 |
+
"special": true
|
| 538 |
+
},
|
| 539 |
+
"67": {
|
| 540 |
+
"content": "<unk_65>",
|
| 541 |
+
"lstrip": false,
|
| 542 |
+
"normalized": false,
|
| 543 |
+
"rstrip": false,
|
| 544 |
+
"single_word": false,
|
| 545 |
+
"special": true
|
| 546 |
+
},
|
| 547 |
+
"68": {
|
| 548 |
+
"content": "<unk_66>",
|
| 549 |
+
"lstrip": false,
|
| 550 |
+
"normalized": false,
|
| 551 |
+
"rstrip": false,
|
| 552 |
+
"single_word": false,
|
| 553 |
+
"special": true
|
| 554 |
+
},
|
| 555 |
+
"69": {
|
| 556 |
+
"content": "<unk_67>",
|
| 557 |
+
"lstrip": false,
|
| 558 |
+
"normalized": false,
|
| 559 |
+
"rstrip": false,
|
| 560 |
+
"single_word": false,
|
| 561 |
+
"special": true
|
| 562 |
+
},
|
| 563 |
+
"70": {
|
| 564 |
+
"content": "<unk_68>",
|
| 565 |
+
"lstrip": false,
|
| 566 |
+
"normalized": false,
|
| 567 |
+
"rstrip": false,
|
| 568 |
+
"single_word": false,
|
| 569 |
+
"special": true
|
| 570 |
+
},
|
| 571 |
+
"71": {
|
| 572 |
+
"content": "<unk_69>",
|
| 573 |
+
"lstrip": false,
|
| 574 |
+
"normalized": false,
|
| 575 |
+
"rstrip": false,
|
| 576 |
+
"single_word": false,
|
| 577 |
+
"special": true
|
| 578 |
+
},
|
| 579 |
+
"72": {
|
| 580 |
+
"content": "<unk_70>",
|
| 581 |
+
"lstrip": false,
|
| 582 |
+
"normalized": false,
|
| 583 |
+
"rstrip": false,
|
| 584 |
+
"single_word": false,
|
| 585 |
+
"special": true
|
| 586 |
+
},
|
| 587 |
+
"73": {
|
| 588 |
+
"content": "<unk_71>",
|
| 589 |
+
"lstrip": false,
|
| 590 |
+
"normalized": false,
|
| 591 |
+
"rstrip": false,
|
| 592 |
+
"single_word": false,
|
| 593 |
+
"special": true
|
| 594 |
+
},
|
| 595 |
+
"74": {
|
| 596 |
+
"content": "<unk_72>",
|
| 597 |
+
"lstrip": false,
|
| 598 |
+
"normalized": false,
|
| 599 |
+
"rstrip": false,
|
| 600 |
+
"single_word": false,
|
| 601 |
+
"special": true
|
| 602 |
+
},
|
| 603 |
+
"75": {
|
| 604 |
+
"content": "<unk_73>",
|
| 605 |
+
"lstrip": false,
|
| 606 |
+
"normalized": false,
|
| 607 |
+
"rstrip": false,
|
| 608 |
+
"single_word": false,
|
| 609 |
+
"special": true
|
| 610 |
+
},
|
| 611 |
+
"76": {
|
| 612 |
+
"content": "<unk_74>",
|
| 613 |
+
"lstrip": false,
|
| 614 |
+
"normalized": false,
|
| 615 |
+
"rstrip": false,
|
| 616 |
+
"single_word": false,
|
| 617 |
+
"special": true
|
| 618 |
+
},
|
| 619 |
+
"77": {
|
| 620 |
+
"content": "<unk_75>",
|
| 621 |
+
"lstrip": false,
|
| 622 |
+
"normalized": false,
|
| 623 |
+
"rstrip": false,
|
| 624 |
+
"single_word": false,
|
| 625 |
+
"special": true
|
| 626 |
+
},
|
| 627 |
+
"78": {
|
| 628 |
+
"content": "<unk_76>",
|
| 629 |
+
"lstrip": false,
|
| 630 |
+
"normalized": false,
|
| 631 |
+
"rstrip": false,
|
| 632 |
+
"single_word": false,
|
| 633 |
+
"special": true
|
| 634 |
+
},
|
| 635 |
+
"79": {
|
| 636 |
+
"content": "<unk_77>",
|
| 637 |
+
"lstrip": false,
|
| 638 |
+
"normalized": false,
|
| 639 |
+
"rstrip": false,
|
| 640 |
+
"single_word": false,
|
| 641 |
+
"special": true
|
| 642 |
+
},
|
| 643 |
+
"80": {
|
| 644 |
+
"content": "<unk_78>",
|
| 645 |
+
"lstrip": false,
|
| 646 |
+
"normalized": false,
|
| 647 |
+
"rstrip": false,
|
| 648 |
+
"single_word": false,
|
| 649 |
+
"special": true
|
| 650 |
+
},
|
| 651 |
+
"81": {
|
| 652 |
+
"content": "<unk_79>",
|
| 653 |
+
"lstrip": false,
|
| 654 |
+
"normalized": false,
|
| 655 |
+
"rstrip": false,
|
| 656 |
+
"single_word": false,
|
| 657 |
+
"special": true
|
| 658 |
+
},
|
| 659 |
+
"82": {
|
| 660 |
+
"content": "<unk_80>",
|
| 661 |
+
"lstrip": false,
|
| 662 |
+
"normalized": false,
|
| 663 |
+
"rstrip": false,
|
| 664 |
+
"single_word": false,
|
| 665 |
+
"special": true
|
| 666 |
+
},
|
| 667 |
+
"83": {
|
| 668 |
+
"content": "<unk_81>",
|
| 669 |
+
"lstrip": false,
|
| 670 |
+
"normalized": false,
|
| 671 |
+
"rstrip": false,
|
| 672 |
+
"single_word": false,
|
| 673 |
+
"special": true
|
| 674 |
+
},
|
| 675 |
+
"84": {
|
| 676 |
+
"content": "<unk_82>",
|
| 677 |
+
"lstrip": false,
|
| 678 |
+
"normalized": false,
|
| 679 |
+
"rstrip": false,
|
| 680 |
+
"single_word": false,
|
| 681 |
+
"special": true
|
| 682 |
+
},
|
| 683 |
+
"85": {
|
| 684 |
+
"content": "<unk_83>",
|
| 685 |
+
"lstrip": false,
|
| 686 |
+
"normalized": false,
|
| 687 |
+
"rstrip": false,
|
| 688 |
+
"single_word": false,
|
| 689 |
+
"special": true
|
| 690 |
+
},
|
| 691 |
+
"86": {
|
| 692 |
+
"content": "<unk_84>",
|
| 693 |
+
"lstrip": false,
|
| 694 |
+
"normalized": false,
|
| 695 |
+
"rstrip": false,
|
| 696 |
+
"single_word": false,
|
| 697 |
+
"special": true
|
| 698 |
+
},
|
| 699 |
+
"87": {
|
| 700 |
+
"content": "<unk_85>",
|
| 701 |
+
"lstrip": false,
|
| 702 |
+
"normalized": false,
|
| 703 |
+
"rstrip": false,
|
| 704 |
+
"single_word": false,
|
| 705 |
+
"special": true
|
| 706 |
+
},
|
| 707 |
+
"88": {
|
| 708 |
+
"content": "<unk_86>",
|
| 709 |
+
"lstrip": false,
|
| 710 |
+
"normalized": false,
|
| 711 |
+
"rstrip": false,
|
| 712 |
+
"single_word": false,
|
| 713 |
+
"special": true
|
| 714 |
+
},
|
| 715 |
+
"89": {
|
| 716 |
+
"content": "<unk_87>",
|
| 717 |
+
"lstrip": false,
|
| 718 |
+
"normalized": false,
|
| 719 |
+
"rstrip": false,
|
| 720 |
+
"single_word": false,
|
| 721 |
+
"special": true
|
| 722 |
+
},
|
| 723 |
+
"90": {
|
| 724 |
+
"content": "<unk_88>",
|
| 725 |
+
"lstrip": false,
|
| 726 |
+
"normalized": false,
|
| 727 |
+
"rstrip": false,
|
| 728 |
+
"single_word": false,
|
| 729 |
+
"special": true
|
| 730 |
+
},
|
| 731 |
+
"91": {
|
| 732 |
+
"content": "<unk_89>",
|
| 733 |
+
"lstrip": false,
|
| 734 |
+
"normalized": false,
|
| 735 |
+
"rstrip": false,
|
| 736 |
+
"single_word": false,
|
| 737 |
+
"special": true
|
| 738 |
+
},
|
| 739 |
+
"92": {
|
| 740 |
+
"content": "<unk_90>",
|
| 741 |
+
"lstrip": false,
|
| 742 |
+
"normalized": false,
|
| 743 |
+
"rstrip": false,
|
| 744 |
+
"single_word": false,
|
| 745 |
+
"special": true
|
| 746 |
+
},
|
| 747 |
+
"93": {
|
| 748 |
+
"content": "<unk_91>",
|
| 749 |
+
"lstrip": false,
|
| 750 |
+
"normalized": false,
|
| 751 |
+
"rstrip": false,
|
| 752 |
+
"single_word": false,
|
| 753 |
+
"special": true
|
| 754 |
+
},
|
| 755 |
+
"94": {
|
| 756 |
+
"content": "<unk_92>",
|
| 757 |
+
"lstrip": false,
|
| 758 |
+
"normalized": false,
|
| 759 |
+
"rstrip": false,
|
| 760 |
+
"single_word": false,
|
| 761 |
+
"special": true
|
| 762 |
+
},
|
| 763 |
+
"95": {
|
| 764 |
+
"content": "<unk_93>",
|
| 765 |
+
"lstrip": false,
|
| 766 |
+
"normalized": false,
|
| 767 |
+
"rstrip": false,
|
| 768 |
+
"single_word": false,
|
| 769 |
+
"special": true
|
| 770 |
+
},
|
| 771 |
+
"96": {
|
| 772 |
+
"content": "<unk_94>",
|
| 773 |
+
"lstrip": false,
|
| 774 |
+
"normalized": false,
|
| 775 |
+
"rstrip": false,
|
| 776 |
+
"single_word": false,
|
| 777 |
+
"special": true
|
| 778 |
+
},
|
| 779 |
+
"97": {
|
| 780 |
+
"content": "<unk_95>",
|
| 781 |
+
"lstrip": false,
|
| 782 |
+
"normalized": false,
|
| 783 |
+
"rstrip": false,
|
| 784 |
+
"single_word": false,
|
| 785 |
+
"special": true
|
| 786 |
+
},
|
| 787 |
+
"98": {
|
| 788 |
+
"content": "<unk_96>",
|
| 789 |
+
"lstrip": false,
|
| 790 |
+
"normalized": false,
|
| 791 |
+
"rstrip": false,
|
| 792 |
+
"single_word": false,
|
| 793 |
+
"special": true
|
| 794 |
+
},
|
| 795 |
+
"99": {
|
| 796 |
+
"content": "<unk_97>",
|
| 797 |
+
"lstrip": false,
|
| 798 |
+
"normalized": false,
|
| 799 |
+
"rstrip": false,
|
| 800 |
+
"single_word": false,
|
| 801 |
+
"special": true
|
| 802 |
+
},
|
| 803 |
+
"100": {
|
| 804 |
+
"content": "<unk_98>",
|
| 805 |
+
"lstrip": false,
|
| 806 |
+
"normalized": false,
|
| 807 |
+
"rstrip": false,
|
| 808 |
+
"single_word": false,
|
| 809 |
+
"special": true
|
| 810 |
+
},
|
| 811 |
+
"101": {
|
| 812 |
+
"content": "<unk_99>",
|
| 813 |
+
"lstrip": false,
|
| 814 |
+
"normalized": false,
|
| 815 |
+
"rstrip": false,
|
| 816 |
+
"single_word": false,
|
| 817 |
+
"special": true
|
| 818 |
+
},
|
| 819 |
+
"102": {
|
| 820 |
+
"content": "<unk_100>",
|
| 821 |
+
"lstrip": false,
|
| 822 |
+
"normalized": false,
|
| 823 |
+
"rstrip": false,
|
| 824 |
+
"single_word": false,
|
| 825 |
+
"special": true
|
| 826 |
+
},
|
| 827 |
+
"103": {
|
| 828 |
+
"content": "<unk_101>",
|
| 829 |
+
"lstrip": false,
|
| 830 |
+
"normalized": false,
|
| 831 |
+
"rstrip": false,
|
| 832 |
+
"single_word": false,
|
| 833 |
+
"special": true
|
| 834 |
+
},
|
| 835 |
+
"104": {
|
| 836 |
+
"content": "<unk_102>",
|
| 837 |
+
"lstrip": false,
|
| 838 |
+
"normalized": false,
|
| 839 |
+
"rstrip": false,
|
| 840 |
+
"single_word": false,
|
| 841 |
+
"special": true
|
| 842 |
+
},
|
| 843 |
+
"105": {
|
| 844 |
+
"content": "<unk>",
|
| 845 |
+
"lstrip": false,
|
| 846 |
+
"normalized": false,
|
| 847 |
+
"rstrip": false,
|
| 848 |
+
"single_word": false,
|
| 849 |
+
"special": true
|
| 850 |
+
}
|
| 851 |
+
},
|
| 852 |
+
"additional_special_tokens": [
|
| 853 |
+
"<mask_1>",
|
| 854 |
+
"<unk_2>",
|
| 855 |
+
"<unk_3>",
|
| 856 |
+
"<unk_4>",
|
| 857 |
+
"<unk_5>",
|
| 858 |
+
"<unk_6>",
|
| 859 |
+
"<unk_7>",
|
| 860 |
+
"<unk_8>",
|
| 861 |
+
"<unk_9>",
|
| 862 |
+
"<unk_10>",
|
| 863 |
+
"<unk_11>",
|
| 864 |
+
"<unk_12>",
|
| 865 |
+
"<unk_13>",
|
| 866 |
+
"<unk_14>",
|
| 867 |
+
"<unk_15>",
|
| 868 |
+
"<unk_16>",
|
| 869 |
+
"<unk_17>",
|
| 870 |
+
"<unk_18>",
|
| 871 |
+
"<unk_19>",
|
| 872 |
+
"<unk_20>",
|
| 873 |
+
"<unk_21>",
|
| 874 |
+
"<unk_22>",
|
| 875 |
+
"<unk_23>",
|
| 876 |
+
"<unk_24>",
|
| 877 |
+
"<unk_25>",
|
| 878 |
+
"<unk_26>",
|
| 879 |
+
"<unk_27>",
|
| 880 |
+
"<unk_28>",
|
| 881 |
+
"<unk_29>",
|
| 882 |
+
"<unk_30>",
|
| 883 |
+
"<unk_31>",
|
| 884 |
+
"<unk_32>",
|
| 885 |
+
"<unk_33>",
|
| 886 |
+
"<unk_34>",
|
| 887 |
+
"<unk_35>",
|
| 888 |
+
"<unk_36>",
|
| 889 |
+
"<unk_37>",
|
| 890 |
+
"<unk_38>",
|
| 891 |
+
"<unk_39>",
|
| 892 |
+
"<unk_40>",
|
| 893 |
+
"<unk_41>",
|
| 894 |
+
"<unk_42>",
|
| 895 |
+
"<unk_43>",
|
| 896 |
+
"<unk_44>",
|
| 897 |
+
"<unk_45>",
|
| 898 |
+
"<unk_46>",
|
| 899 |
+
"<unk_47>",
|
| 900 |
+
"<unk_48>",
|
| 901 |
+
"<unk_49>",
|
| 902 |
+
"<unk_50>",
|
| 903 |
+
"<unk_51>",
|
| 904 |
+
"<unk_52>",
|
| 905 |
+
"<unk_53>",
|
| 906 |
+
"<unk_54>",
|
| 907 |
+
"<unk_55>",
|
| 908 |
+
"<unk_56>",
|
| 909 |
+
"<unk_57>",
|
| 910 |
+
"<unk_58>",
|
| 911 |
+
"<unk_59>",
|
| 912 |
+
"<unk_60>",
|
| 913 |
+
"<unk_61>",
|
| 914 |
+
"<unk_62>",
|
| 915 |
+
"<unk_63>",
|
| 916 |
+
"<unk_64>",
|
| 917 |
+
"<unk_65>",
|
| 918 |
+
"<unk_66>",
|
| 919 |
+
"<unk_67>",
|
| 920 |
+
"<unk_68>",
|
| 921 |
+
"<unk_69>",
|
| 922 |
+
"<unk_70>",
|
| 923 |
+
"<unk_71>",
|
| 924 |
+
"<unk_72>",
|
| 925 |
+
"<unk_73>",
|
| 926 |
+
"<unk_74>",
|
| 927 |
+
"<unk_75>",
|
| 928 |
+
"<unk_76>",
|
| 929 |
+
"<unk_77>",
|
| 930 |
+
"<unk_78>",
|
| 931 |
+
"<unk_79>",
|
| 932 |
+
"<unk_80>",
|
| 933 |
+
"<unk_81>",
|
| 934 |
+
"<unk_82>",
|
| 935 |
+
"<unk_83>",
|
| 936 |
+
"<unk_84>",
|
| 937 |
+
"<unk_85>",
|
| 938 |
+
"<unk_86>",
|
| 939 |
+
"<unk_87>",
|
| 940 |
+
"<unk_88>",
|
| 941 |
+
"<unk_89>",
|
| 942 |
+
"<unk_90>",
|
| 943 |
+
"<unk_91>",
|
| 944 |
+
"<unk_92>",
|
| 945 |
+
"<unk_93>",
|
| 946 |
+
"<unk_94>",
|
| 947 |
+
"<unk_95>",
|
| 948 |
+
"<unk_96>",
|
| 949 |
+
"<unk_97>",
|
| 950 |
+
"<unk_98>",
|
| 951 |
+
"<unk_99>",
|
| 952 |
+
"<unk_100>",
|
| 953 |
+
"<unk_101>",
|
| 954 |
+
"<unk_102>"
|
| 955 |
+
],
|
| 956 |
+
"clean_up_tokenization_spaces": true,
|
| 957 |
+
"eos_token": "</s>",
|
| 958 |
+
"full_tokenizer_file": null,
|
| 959 |
+
"mask_token": "<mask_2>",
|
| 960 |
+
"mask_token_sent": "<mask_1>",
|
| 961 |
+
"model_max_length": 1024,
|
| 962 |
+
"offset": 103,
|
| 963 |
+
"pad_token": "<pad>",
|
| 964 |
+
"sp_model_kwargs": {},
|
| 965 |
+
"tokenizer_class": "PegasusTokenizer",
|
| 966 |
+
"unk_token": "<unk>"
|
| 967 |
+
}
|
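The reserved-token layout visible in this file can be sanity-checked with a short standard-library sketch. It reproduces only a few of the entries shown in the diff (the rest are elided), and it assumes two relations read off those entries rather than taken from any library documentation: a placeholder `<unk_N>` sits at id `N + 2`, and `<unk>` sits at `offset + 2`. The helper names `reserved_token_id` and `check_fragment` are hypothetical.

```python
import json

# Hand-copied fragment of the tokenizer_config.json tail shown in the diff.
# Only a few visible entries are reproduced; everything else is elided.
CONFIG_FRAGMENT = json.loads("""
{
  "added_tokens_decoder": {
    "42":  {"content": "<unk_40>",  "special": true},
    "104": {"content": "<unk_102>", "special": true},
    "105": {"content": "<unk>",     "special": true}
  },
  "eos_token": "</s>",
  "mask_token": "<mask_2>",
  "mask_token_sent": "<mask_1>",
  "offset": 103,
  "pad_token": "<pad>",
  "unk_token": "<unk>"
}
""")


def reserved_token_id(token):
    """Map a placeholder such as "<unk_40>" to its id.

    Assumption (read off the entries in the diff): id = N + 2.
    """
    number = int(token.strip("<>").split("_")[1])
    return number + 2


def check_fragment(config):
    """Return a list of human-readable mismatches; empty means consistent."""
    problems = []
    for token_id, entry in config["added_tokens_decoder"].items():
        content = entry["content"]
        if content.startswith("<unk_"):
            expected = reserved_token_id(content)
        elif content == config["unk_token"]:
            # Assumption: the plain unknown token follows the reserved block.
            expected = config["offset"] + 2
        else:
            continue
        if int(token_id) != expected:
            problems.append(f"{content}: id {token_id}, expected {expected}")
    return problems


if __name__ == "__main__":
    issues = check_fragment(CONFIG_FRAGMENT)
    print("consistent" if not issues else "\n".join(issues))
```

Running the same check against the full file would only require replacing `CONFIG_FRAGMENT` with the parsed contents of `fine_tune_tokenizer_pegasus_50/tokenizer_config.json`.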
requirements.txt
ADDED
|
@@ -0,0 +1,2 @@
|
| 1 |
+
torch==2.0.1
|
| 2 |
+
transformers==4.40.1
|
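Assuming the two entries in requirements.txt are meant as exact version pins (which pip spells `name==version`), a small standard-library sketch can read such a file and report what is actually installed. The function names `parse_pins` and `compare_with_installed` are hypothetical, and lines that do not use `==` are simply skipped.

```python
import re
from importlib import metadata

# Matches lines of the form "name==version"; anything else is ignored.
PIN_PATTERN = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*==\s*([^\s#]+)")


def parse_pins(text):
    """Return {package_name: pinned_version} for every exact pin in text."""
    pins = {}
    for line in text.splitlines():
        match = PIN_PATTERN.match(line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def compare_with_installed(pins):
    """Return {name: (pinned, installed_or_None)} without raising."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[name] = (wanted, found)
    return report


if __name__ == "__main__":
    pins = parse_pins("torch==2.0.1\ntransformers==4.40.1\n")
    for name, (wanted, found) in compare_with_installed(pins).items():
        print(f"{name}: pinned {wanted}, installed {found}")
```

The sketch only reports versions side by side. Installed builds may carry a local suffix (for example a CUDA tag on torch), so any stricter comparison is left to the reader.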