Duplicate weight files across branches: step_100==step_1000, step_200==step_2000, step_300==main

by emirhanboge - opened Mar 17 (edited Mar 17)

Summary

Three pairs of branches contain identical model weight files (verified via LFS SHA-256):

| Branch A | Branch B | LFS SHA-256 (shard 1, first 16 hex chars) |
| --- | --- | --- |
| `step_100` | `step_1000` | `e5f78246eb9773f0` |
| `step_200` | `step_2000` | `53d993f3c56a3ec1` |
| `step_300` | `main` | `7714e11a8d367ebc` |

Reproduction

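from collections import defaultdict

from huggingface_hub import HfApi
from huggingface_hub.utils import HfHubHTTPError

REPO_ID = 'allenai/Olmo-3-7B-RL-Zero-Code'

api = HfApi()
shard1_hashes = {}  # revision -> first 16 hex chars of shard 1's LFS SHA-256

branches = [f'step_{i}' for i in range(100, 3001, 100)] + ['main']
for rev in branches:
    try:
        # Iterate inside the try so errors raised while listing are caught
        for f in api.list_repo_tree(REPO_ID, revision=rev):
            name = getattr(f, 'rfilename', getattr(f, 'path', ''))
            if 'model-00001' in name and getattr(f, 'lfs', None):
                shard1_hashes[rev] = f.lfs.sha256[:16]
    except HfHubHTTPError as err:
        # Report missing or inaccessible revisions instead of silently skipping them
        print(f'SKIPPED {rev}: {err}')

# Group revisions by shard-1 hash; a group with more than one member is a duplicate
groups = defaultdict(list)
for rev, h in shard1_hashes.items():
    groups[h].append(rev)

for h, revs in groups.items():
    if len(revs) > 1:
        print(f'DUPLICATE: {revs} -> {h}')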

Output:

DUPLICATE: ['step_100', 'step_1000'] -> e5f78246eb9773f0
DUPLICATE: ['step_200', 'step_2000'] -> 53d993f3c56a3ec1
DUPLICATE: ['step_300', 'main'] -> 7714e11a8d367ebc
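
The reproduction above compares only the first shard. A stricter check, sketched below, fingerprints every weight shard per revision and groups revisions whose full fingerprints match. The helper names `fetch_lfs_hashes` and `group_identical` are hypothetical, and the `'model-'` filename filter is an assumption generalized from the `'model-00001'` filter used above.

```python
from collections import defaultdict


def fetch_lfs_hashes(repo_id, revision):
    """Return {filename: full LFS SHA-256} for the weight shards of one revision.

    Needs network access. The import is done lazily so that group_identical
    can be used on its own without huggingface_hub installed.
    """
    from huggingface_hub import HfApi

    hashes = {}
    for entry in HfApi().list_repo_tree(repo_id, revision=revision):
        lfs = getattr(entry, "lfs", None)  # folders and non-LFS files carry no LFS metadata
        # Assumption: weight shards are the LFS files whose path contains 'model-'
        if lfs and "model-" in entry.path:
            hashes[entry.path] = lfs.sha256
    return hashes


def group_identical(hashes_by_rev):
    """Group revisions whose complete {filename: hash} mappings are identical."""
    groups = defaultdict(list)
    for rev, hashes in hashes_by_rev.items():
        if hashes:  # ignore revisions for which no shards were found
            groups[frozenset(hashes.items())].append(rev)
    return [sorted(revs) for revs in groups.values() if len(revs) > 1]
```

With the `branches` list from the reproduction, this could be called as `group_identical({rev: fetch_lfs_hashes('allenai/Olmo-3-7B-RL-Zero-Code', rev) for rev in branches})`. Two revisions are grouped only if every shard matches, which rules out a coincidental match on shard 1 alone.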

The pattern suggests a labeling error during upload: the weights for each early step X00 also appear under a later label (100→1000, 200→2000, 300→main). Researchers who use these checkpoints to study how representations evolve during RL training will see spurious patterns, such as step_1000 appearing identical to step_100.

Are the correct step_1000 and step_2000 weights available? If so, could they be re-uploaded to the correct branches?
Which training step does main correspond to? Is it intended to be step_300, or should it be the final checkpoint (step_3000)?

Environment

Verified with huggingface_hub version 0.36.2
Also confirmed via local md5sum after snapshot_download with force_download=True
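
A rough sketch of the local check, for anyone who wants to repeat it. It is not the exact commands used above: it swaps md5sum for SHA-256 so the local digests can be compared directly with the LFS hashes. The helper names, the `'model-*'` glob pattern, and the use of `local_dir` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large shards do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fingerprint_dir(directory, pattern="model-*"):
    """Map each matching file name in `directory` to its SHA-256 digest."""
    return {
        p.name: sha256_of(p)
        for p in sorted(Path(directory).glob(pattern))
        if p.is_file()
    }


def download_revision(repo_id, revision, local_dir):
    """Download one revision from scratch (needs network access).

    The import is lazy so the hashing helpers work without huggingface_hub.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id, revision=revision, local_dir=local_dir, force_download=True
    )
```

Two branches hold identical weights if `fingerprint_dir` returns equal dictionaries for their download directories, for example after calling `download_revision` once for `step_100` and once for `step_1000` with different `local_dir` values.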

Thank you for releasing intermediate checkpoints; they are extremely valuable for research.
