Questions_to_Query_Templates_LORA / custom_generation_config_bigModel.yaml

	# Config for running the InferenceRecipe in generate.py to generate output from an LLM
	#
	# To launch, pass the path to this file to the generate recipe:
	#    tune run generate --config ./custom_generation_config_bigModel.yaml

	# Model arguments
	model:
	  _component_: torchtune.models.llama3.llama3_8b

	checkpointer:
	  _component_: torchtune.utils.FullModelMetaCheckpointer

	  checkpoint_dir: /home/aorogat/q_to_template/
	  checkpoint_files: [
	    meta_model_0.pt
	  ]
	  output_dir: /home/aorogat/q_to_template/
	  model_type: LLAMA3

	device: cuda
	dtype: bf16

	seed: 1234

	# Tokenizer arguments
	tokenizer:
	  _component_: torchtune.models.llama3.llama3_tokenizer
	  path: /home/aorogat/Meta-Llama-3-8B/original/tokenizer.model

	# Generation arguments; defaults taken from gpt-fast
	#prompt: "### Instruction: \nYou are a powerful model trained to convert questions to tagged questions. Use the tags as follows: \n<qt> to surround question keywords like 'What', 'Who', 'Which', 'How many', 'Return' or any word that represents requests. \n<o> to surround entities as an object like person name, place name, etc. It must be a noun or a noun phrase. \n<s> to surround entities as a subject like person name, place name, etc. The difference between <s> and <o>, <s> only appear in yes/no questions as in the training data you saw before. \n<cc> to surround coordinating conjunctions that connect two or more phrases like 'and', 'or', 'nor', etc. \n<p> to surround predicates that may be an entity attribute or a relationship between two entities. It can be a verb phrase or a noun phrase. The question must contain at least one predicate. \n<off> for offset in questions asking for the second, third, etc. For example, the question 'What is the second largest country?', <off> will be located as follows. 'What is the <off>second</off> largest country?' \n<t> to surround entity types like person, place, etc. \n<op> to surround operators that compare quantities or values, like 'greater than', 'more than', etc. \n<ref> to indicate a reference within the question that requires a cycle to refer back to an entity (e.g., 'Who is the CEO of a company founded by himself?' where 'himself' would be tagged as <ref>himself</ref>). Then, convert the tagged question to a sparql query template with placeholdes []. \nInput: How many persons live in the capital of Canada? \nTagged Question: \n```html"
	prompt: "### Instruction: \nYou are a powerful model trained to convert questions to tagged questions. Use the tags as follows: \n<qt> to surround question keywords like 'What', 'Who', 'Which', 'How many', 'Return' or any word that represents requests. \n<o> to surround entities as an object like person name, place name, etc. It must be a noun or a noun phrase. \n<s> to surround entities as a subject like person name, place name, etc. The difference between <s> and <o>, <s> only appear in yes/no questions as in the training data you saw before. \n<cc> to surround coordinating conjunctions that connect two or more phrases like 'and', 'or', 'nor', etc. \n<p> to surround predicates that may be an entity attribute or a relationship between two entities. It can be a verb phrase or a noun phrase. The question must contain at least one predicate. \n<off> for offset in questions asking for the second, third, etc. For example, the question 'What is the second largest country?', <off> will be located as follows. 'What is the <off>second</off> largest country?' \n<t> to surround entity types like person, place, etc. \n<op> to surround operators that compare quantities or values, like 'greater than', 'more than', etc. \n<ref> to indicate a reference within the question that requires a cycle to refer back to an entity (e.g., 'Who is the CEO of a company founded by himself?' where 'himself' would be tagged as <ref>himself</ref>). Then, convert the tagged question to a sparql query template with placeholdes []. \nInput: Which film directed by Garry Marshall, starring both Julia Roberts and Richard Gere, has a runtime of over 100 minutes? \nTagged Question: \n```html"
	max_new_tokens: 250
	temperature: 0.6 # 0.8 and 0.6 are popular values to try
	top_k: 1
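	# Note: with top_k: 1 only the single most likely token survives top-k filtering,
	# so decoding is effectively greedy and the temperature above has no practical effect.
	# A hedged sketch, not part of the original config: to sample more diversely, widen
	# top_k. The value below is an illustrative assumption, not a tuned setting.
	# top_k: 300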

	quantizer: null