RAID Dual Neologism Tokens

This repo contains two learned token embeddings trained on matched RAID GPT-4 vs human text pairs.

Tokens:

<neo_detect_ai_900001>
<neo_detect_human_900002>

Artifacts:

concept_900001_token.pt: AI-style token checkpoint
concept_900002_token.pt: human-style token checkpoint
raid_eval_summary.json: held-out RAID dual-token evaluation
beemo_eval_summary.json: cross-dataset transfer evaluation on Beemo model outputs

The checkpoints are small PyTorch payloads with keys:

concept_id
new_token
new_token_id
embedding

Loading

import torch
payload = torch.load('concept_900001_token.pt', map_location='cpu')
token = payload['new_token']
embedding = payload['embedding']

These are not full finetuned model weights. They are learned embedding rows intended to be inserted into the base tokenizer/model vocabulary used during training.

Repo: danielfein/raid-dual-neologism-tokens

Downloads last month: -; Downloads are not tracked for this model. How to track