Spaces: anugrah55/opensleuth-training-gemini-cli (Paused)
Branch: main
opensleuth-training-gemini-cli, 50.9 kB
1 contributor
History: 13 commits
Latest commit: 78575eb (verified), anugrah55, 12 days ago
  trainer v0.4: switch to Qwen2.5-3B-Instruct, dynamic task discovery, delegated probe sampling, difficulty-weighted rollouts, push to opensleuth-qwen2.5-3b-grpo-v2; sentinel cleared on FORCE_TRAIN=1.
Files (name, size, last updated; last commit message on the following line):

__pycache__ (12 days ago)
  Overhaul trainer: TRL GRPO with env-backed reward, Qwen2.5-0.5B 4bit+LoRA, slim PyTorch CUDA base, heartbeat HTTP for HF Spaces health probe
opensleuth_train (12 days ago)
  trainer v0.4: switch to Qwen2.5-3B-Instruct, dynamic task discovery, delegated probe sampling, difficulty-weighted rollouts, push to opensleuth-qwen2.5-3b-grpo-v2; sentinel cleared on FORCE_TRAIN=1.
.gitattributes, 1.52 kB (13 days ago)
  initial commit
Dockerfile, 992 Bytes (12 days ago)
  Drop deprecated TRANSFORMERS_CACHE env var
README.md, 1.98 kB (12 days ago)
  Overhaul trainer: TRL GRPO with env-backed reward, Qwen2.5-0.5B 4bit+LoRA, slim PyTorch CUDA base, heartbeat HTTP for HF Spaces health probe
entrypoint.sh, 2.68 kB (12 days ago)
  trainer v0.4: switch to Qwen2.5-3B-Instruct, dynamic task discovery, delegated probe sampling, difficulty-weighted rollouts, push to opensleuth-qwen2.5-3b-grpo-v2; sentinel cleared on FORCE_TRAIN=1.
requirements.txt, 401 Bytes (12 days ago)
  Bump TRL to 0.16.1 (adds GRPOTrainer); transformers 4.51.3, peft 0.14, accelerate 1.4, bnb 0.45.5
train.py, 10.8 kB (12 days ago)
  trainer v0.4: switch to Qwen2.5-3B-Instruct, dynamic task discovery, delegated probe sampling, difficulty-weighted rollouts, push to opensleuth-qwen2.5-3b-grpo-v2; sentinel cleared on FORCE_TRAIN=1.
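
The "Overhaul trainer" commit message mentions a heartbeat HTTP endpoint for the HF Spaces health probe. The trainer's own code is not shown in this listing, so the following is only a minimal sketch of how such a heartbeat might look, using the Python standard library. The names `HeartbeatHandler` and `start_heartbeat` are hypothetical, and port 7860 is assumed because it is the default `app_port` for Docker Spaces; the real Space may differ on all of these.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HeartbeatHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with 200 so a health probe passes."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence per-request logging so training logs stay readable.
        pass


def start_heartbeat(port=7860):
    """Serve the heartbeat from a daemon thread and return the server.

    Port 7860 is an assumption (the Docker Spaces default app_port).
    """
    server = ThreadingHTTPServer(("0.0.0.0", port), HeartbeatHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Under these assumptions, a training script could call `start_heartbeat()` once at startup and then run its training loop in the main thread; because the server thread is a daemon, it exits together with the process.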