Abstract
SHARE models are causal language models pretrained specifically for the social sciences and humanities that come close to general-purpose model performance, while MIRROR provides a text review interface that preserves critical engagement without generating content.
This intermediate technical report introduces the SHARE family of base models and the MIRROR user interface. The SHARE models are the first causal language models fully pretrained by and for the social sciences and humanities (SSH). Their performance in modelling SSH texts is close to that of general-purpose models (Phi-4) that use 100 times more tokens, as shown by our custom SSH Cloze benchmark. The MIRROR user interface is designed for reviewing text inputs from the SSH disciplines while preserving critical engagement. By prototyping a generative AI interface that does not generate any text, we propose a way to harness the capabilities of the SHARE models without compromising the integrity of SSH principles and norms.
Community
SHARE is a family of causal LLMs pretrained specifically for the social sciences and humanities. Besides intermediate checkpoints of the models, we also release a new Cloze benchmark and a user interface. Any feedback is much appreciated, especially because we are still at the start of the project!