gemma4-31b-journalist

An investigative journalism model fine-tuned from google/gemma-4-31B-it, specializing in OSINT tool selection, verification methodology, financial investigation, digital security, and media ethics.

Built by Buried Signals to power OSINT Navigator and coJournalist, two AI tools for investigative reporters.

Training

  • Method: QLoRA (4-bit NF4 + LoRA r=16, alpha=32) via Oumi
  • Data: 700 instruction/response pairs across 10 categories (see training dataset)
  • Epochs: 1
  • Hardware: A100 (HF Jobs)
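
To make the LoRA settings above concrete, here is a small arithmetic sketch. It assumes the conventional LoRA scaling of alpha / r (some variants, such as rank-stabilized LoRA, scale differently, and the card does not say which was used). The 4096 x 4096 layer shape is a hypothetical value chosen for illustration only, not the real Gemma projection size.

```python
# Hyperparameters stated in the model card.
LORA_RANK = 16
LORA_ALPHA = 32

# Hypothetical layer shape, for illustration only (not taken from the model).
D_IN = 4096
D_OUT = 4096


def lora_scaling(rank: int, alpha: int) -> float:
    """Conventional LoRA scaling applied to the low-rank update: alpha / r."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted linear layer.

    LoRA adds two matrices: A with shape (rank, d_in) and B with shape
    (d_out, rank), so the count is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


scale = lora_scaling(LORA_RANK, LORA_ALPHA)
adapter_params = lora_param_count(D_IN, D_OUT, LORA_RANK)
frozen_params = D_IN * D_OUT

print(f"scaling factor: {scale}")
print(f"adapter params per layer: {adapter_params}")
print(f"fraction of the frozen layer: {adapter_params / frozen_params:.4%}")
```

Under these assumptions the adapter update is scaled by 2.0, and each adapted layer trains well under 1% of the parameters it wraps. That small footprint is what makes a 31B model trainable on a single A100 when the frozen weights are held in 4-bit NF4.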

Sources

Training data was synthesized from the following primary sources:

OSINT & tool databases

  • OSINT Navigator Tool Database: 7,524 tools from 9 open-source toolkit sources (Buried Signals)
  • Indicator Media briefing tools: 101 tool entries from 37 briefings (used with permission)

Investigation methodology

  • Buried Signals skill repositories: OSINT methodology, Follow the Money, social media intelligence, OPSEC, evidence grounding, Spotlight investigator & fact-checker agents
  • Bellingcat how-to guides (geolocation, flight tracking, corruption OSINT)
  • GIJN resource center and investigation manuals

Handbooks & manuals (PDFs)

  • Al Jazeera Media Institute, Investigative Journalism Handbook
  • UNESCO, Story-Based Inquiry (Mark Lee Hunter) and Global Investigative Journalism Casebook
  • CiFAR, Investigate - The Manual (illicit financial flows & asset recovery)
  • CIPE, Investigative Reporting: A Toolkit for Reporters
  • GIJN, Citizen Investigations: A Practical Guide
  • EJF / TEMPO Institute, Investigative Journalism Training Manual

Ethics & legal

  • SPJ Code of Ethics
  • Reporters Committee for Freedom of the Press (RCFP) digital evidence & shield law resources
  • European Journalism Centre, Verification Handbook 3

Synthetic generation

  • 700 Q&A pairs generated by Claude Opus 4.6, grounded in the above sources with an embedded system prompt enforcing SIFT verification, evidence standards, and proportionality ethics.

Full attribution: SOURCES.md

GGUF

This model was converted to GGUF format using Unsloth.

Example usage:

  • Text only: llama-cli -hf tomvaillant/gemma4-31b-journalist --jinja
  • Multimodal: llama-mtmd-cli -hf tomvaillant/gemma4-31b-journalist --jinja
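
For scripted use, the two commands above can be assembled programmatically. The following is a minimal sketch that only builds and prints the command line; it does not launch anything. The helper name `build_command` is hypothetical, and the only flags used are the ones shown in this card.

```python
import shlex

# Repository name as given in the model card.
MODEL_REPO = "tomvaillant/gemma4-31b-journalist"


def build_command(multimodal: bool = False) -> list[str]:
    """Return the argv list for the text-only or multimodal llama.cpp CLI.

    Uses only the binaries and flags listed in the card:
    llama-cli / llama-mtmd-cli, -hf <repo>, and --jinja.
    """
    binary = "llama-mtmd-cli" if multimodal else "llama-cli"
    return [binary, "-hf", MODEL_REPO, "--jinja"]


# Print shell-ready command lines rather than executing them, since running
# either command would download and load the full model.
print(shlex.join(build_command()))
print(shlex.join(build_command(multimodal=True)))
```

The returned list can be passed to `subprocess.run` once llama.cpp is installed and there is enough disk space and memory for the chosen GGUF file.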

Available files

  • gemma-4-31B-it.Q4_K_M.gguf
  • gemma-4-31B-it.BF16-mmproj.gguf
  • gemma-4-31B-it.BF16-00002-of-00002.gguf

This was trained 2x faster with Unsloth
