Add .gitignore and stop tracking Python bytecode caches

#2
by alexis779 - opened

Summary

This PR improves repo hygiene and refactors training code into a small package layout.

1. .gitignore and bytecode

  • Ignore Python bytecode and caches (__pycache__/, *.pyc, …), virtualenvs, build/packaging dirs, IDE/OS noise, and local training outputs (wandb/, output/).
  • Stop tracking previously committed .pyc files under src/**/__pycache__/.

2. Predictor refactor + packaging

  • Move encode_video, ActionConditionedPredictor, and CrossAttentionLayer from train.py into src/jepa_predictor.py; train.py imports them.
  • Wire TrainingConfig dataset paths to ROOT / FRAMES_ROOT from src/datasets/so100_lerobot.py.
  • Add pyproject.toml, uv.lock, .python-version, main.py, and src/__init__.py.

Dataset paths (optional env overrides)

  • SO100_LEROBOT_ROOT: defaults to the path baked into the original dataset adapter.
  • SO100_LEROBOT_FRAMES_ROOT: defaults to {ROOT}/frames.

Set these if your dataset lives outside the default location.

Rupesh386 changed pull request status to merged
