Overview

This repository contains three sets of yearly word embedding models, one model per year from 1900 to 1999, trained on the Japanese National Diet Library (NDL) Ngram dataset: a book-only model, a magazine-only model, and a combined model trained on books and magazines together. The paper's main analysis uses the combined model because it had better overall embedding quality than the single-source models.

Combined model

The combined model is a series of yearly skip-gram with negative sampling (SGNS) word embeddings trained on the merged NDL Ngram corpus of books and magazines for 1900–1999. It is the primary model in the paper's main analysis because its embedding quality was the most reliable across years.

Book-only model

The book-only model contains yearly SGNS word embeddings trained only on the book portion of the NDL Ngram dataset.

Magazine-only model

The magazine-only model contains yearly SGNS word embeddings trained only on the magazine portion of the NDL Ngram dataset.
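Usage sketch

The following is a minimal sketch of one way yearly embeddings like these might be compared, not the procedure used in the paper. It measures how much a word's nearest neighbours overlap between two years. It works on neighbour sets rather than raw vectors because, assuming each yearly model was trained independently, vectors from different years may not share a coordinate system. The file format and loading step are not described here, so the sketch uses small toy arrays; the tokens `w0` to `w3`, the variable names `year_1900` and `year_1950`, and both function names are hypothetical.

```python
import numpy as np

def top_k_neighbors(word, vocab, vectors, k=3):
    """Return the k words closest to `word` by cosine similarity."""
    # Normalise each row so that a dot product equals cosine similarity.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    idx = vocab.index(word)
    sims = unit @ unit[idx]
    sims[idx] = -np.inf  # exclude the query word itself
    order = np.argsort(-sims)[:k]
    return [vocab[i] for i in order]

def neighbor_overlap(word, year_a, year_b, k=3):
    """Jaccard overlap of a word's top-k neighbour sets in two years."""
    a = set(top_k_neighbors(word, *year_a, k=k))
    b = set(top_k_neighbors(word, *year_b, k=k))
    return len(a & b) / len(a | b)

# Toy stand-ins for two yearly models: (vocabulary, embedding matrix).
# Real models would be loaded from this repository's files instead.
vocab = ["w0", "w1", "w2", "w3"]
year_1900 = (vocab, np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]))
year_1950 = (vocab, np.array([[0.0, 1.0], [1.0, 0.0], [0.1, 0.9], [0.9, 0.1]]))

print(top_k_neighbors("w0", *year_1900, k=1))
print(neighbor_overlap("w0", year_1900, year_1950, k=1))
```

An overlap near 1 suggests that the word keeps similar neighbours in both years, while an overlap near 0 suggests that its neighbourhood changed. Each year carries its own vocabulary list, so the same functions apply when vocabularies differ between years, provided the query word occurs in both.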
