
Exploration

Ideas, technology research, and ad-hoc investigation notes. This is a scratchpad: content here is not the system of record.

Diataxis type: Exploration (learning-oriented, not yet distilled)

What Goes Here

  • Technology evaluations and comparisons
  • Prototype findings
  • External API exploration
  • Performance investigations
  • Ideation and backlog notes

What Does NOT Go Here

  • Durable learnings (go to docs/learnings/)
  • Design decisions (go to docs/design-docs/)
  • Implementation specs (go to specs/)
  • Operational how-to guides (go to docs/guides/)

Exploration Index

| Topic | Type | Date | Summary |
| --- | --- | --- | --- |
| grpo-collapse-analysis.md | Investigation | 2026-04 | Post-mortem on Qwen3-1.7B GRPO collapse into degenerate null-argument tool calls |
| grpo-plateau-plan.md | Investigation | 2026-04 | Interventions to push past 30-40% accuracy plateau in GRPO training |
| grpo-training-session-log.md | Investigation | 2026-04 | Running log of SFT warmup + GRPO training sessions on Colab L4 |
| rl-vs-icl-research.md | Comparison | 2026-04 | When GRPO training adds value over pure prompting for small SQL agents |
| train-grpo-walkthrough.md | Prototype | 2026-04 | Step-by-step companion guide for train_grpo.ipynb |

Types

  • Tech Eval: Evaluating a library, framework, or service
  • Prototype: Findings from exploratory prototyping
  • Investigation: Deep dive into a specific problem
  • Comparison: Side-by-side analysis of options

Graduating Content

When exploration produces durable insights:

  1. Extract patterns to docs/learnings/<category>.md
  2. Create reference files in docs/references/ for agent context
  3. Create how-to guides in docs/guides/ for operational procedures