Multi-Task Dataset: SST-2 + MNLI + QQP (Modified for LLaMA 1B)

This dataset combines three GLUE tasks for multi-task learning: SST-2 (sentiment classification), MNLI (natural language inference), and QQP (duplicate question detection).

Modifications:

  • Each example includes a task prefix:
    • SST-2: "Task: SST2 | Sentence: ..."
    • MNLI: "Task: MNLI | Premise: ... Hypothesis: ..."
    • QQP: "Task: QQP | Q1: ... Q2: ..."
  • Labels are standardized to integer format.
  • Inputs are tokenized using the LLaMA-1B tokenizer.
  • Maximum sequence length is 128 tokens.
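
The task prefixes listed above can be reproduced with a small formatting helper. This is a minimal sketch, not the script used to build the dataset: the helper name format_example and the TEMPLATES table are hypothetical, and the field names (sentence, premise, hypothesis, question1, question2) are assumed to match the column names of the original GLUE tasks. Tokenization and truncation to 128 tokens are not shown.

```python
# Hypothetical templates mirroring the prefix formats described in this card.
# Field names are assumed to follow the original GLUE column names.
TEMPLATES = {
    "SST2": "Task: SST2 | Sentence: {sentence}",
    "MNLI": "Task: MNLI | Premise: {premise} Hypothesis: {hypothesis}",
    "QQP": "Task: QQP | Q1: {question1} Q2: {question2}",
}


def format_example(task, fields):
    """Return the task-prefixed input string for one raw example."""
    try:
        template = TEMPLATES[task]
    except KeyError:
        raise ValueError(f"Unknown task: {task!r}") from None
    # Extra keys in `fields` (e.g. a label column) are ignored by str.format.
    return template.format(**fields)


if __name__ == "__main__":
    print(format_example("SST2", {"sentence": "a charming and funny film"}))
    print(format_example("MNLI", {"premise": "A man is sleeping.",
                                  "hypothesis": "A person is awake."}))
    print(format_example("QQP", {"question1": "How do I learn Python?",
                                 "question2": "What is the best way to learn Python?"}))
```

Keeping the templates in a single table makes it straightforward to check that every task uses the same "Task: NAME | ..." layout.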

Dataset Usage:

from datasets import load_dataset
dataset = load_dataset("emirhanboge/sst2_mnli_qqp_llama1b_modified")