# πŸ›Έ Orbit Wars - Kaggle Competition Agent (v2 Adaptive)
**Competition:** [Orbit Wars](https://www.kaggle.com/competitions/orbit-wars) ($50,000 prize pool)
**Deadline:** June 23, 2026
**Current ELO:** ~1100 and climbing
## Overview
This is a highly competitive **adaptive** agent for the Orbit Wars Kaggle competition β€” a real-time strategy game where 2 or 4 AI agents compete to conquer planets orbiting a central sun in continuous 2D space.
**v2 adds real-time in-match opponent profiling and adaptive parameter tuning β€” the agent learns opponent playstyle during each game and adjusts its strategy accordingly.**
## Architecture
### Core: Composite Rule-Based Engine
The core engine combines techniques from the **five top public agents** on the leaderboard:
| Source | LB Rating | Key Feature |
|--------|-----------|-------------|
| tamrazov-starwars (base) | LB 1224 | Gang-up attacks, weakest enemy targeting, elimination missions |
| ykhnkf | LB #1 | Hostile reinforcement prediction |
| pascal v14 | High-rated | 4-source coordinated swarm attacks |
| pilkwang | LB ~1000 | Structured decision architecture |
| yuriygreben | Architect | Physics-aware multi-phase strategy |
### v2: In-Match Opponent Profiling & Adaptation
A new adaptive layer monitors opponent behavior in real time:
**What it tracks (EMA-smoothed):**
- **Aggression** β€” fleet launch frequency (how often they attack)
- **Expansion rate** β€” planet capture speed
- **Relative strength** β€” ship/planet differential
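The EMA smoothing above can be sketched as follows. This is a minimal illustration, not the repo's actual profiler: the class name, attributes, and the smoothing factor `alpha=0.1` are hypothetical.

```python
def ema(prev, obs, alpha=0.1):
    """Exponential moving average: blend a new observation into the running estimate."""
    return (1 - alpha) * prev + alpha * obs

class OpponentProfile:
    """Tracks opponent playstyle metrics, smoothed with an EMA per step."""

    def __init__(self):
        # Start neutral; estimates drift toward observed behavior over the match
        self.aggression = 0.5   # how often the opponent launches fleets
        self.expansion = 0.5    # how often the opponent captures planets

    def update(self, launched_fleet: bool, captured_planet: bool):
        # Each step contributes a 0/1 observation to the smoothed metric
        self.aggression = ema(self.aggression, 1.0 if launched_fleet else 0.0)
        self.expansion = ema(self.expansion, 1.0 if captured_planet else 0.0)
```

With `alpha=0.1`, a burst of attacks pushes `aggression` above the 0.6 "very aggressive" threshold within a few dozen steps, while a quiet opponent decays toward 0.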
**How it adapts:**
| Opponent Style | Agent Response |
|---------------|---------------|
| Very aggressive (aggression > 0.6) | ↑ defense ratios, ↑ reinforcement priority, ↓ attack aggression |
| Passive/turtle (aggression < 0.3) | ↑ attack multipliers, ↑ elimination bonus, ↑ expansion pressure |
| We're ahead | Play safe, consolidate, higher attack cost weighting |
| We're behind | Take risks, ↑ snipe values, ↑ finishing bonuses, lower defense |
| Enemy expanding fast | Contest neutrals more aggressively, ↓ target margins |
| Late game (step > 350) | Maximum elimination drive, ↑ finishing multipliers |
**20 parameters are dynamically tuned** during each match based on game state.
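A sketch of how the adaptation table might map a profile onto strategy parameters. The parameter names, multipliers, and function signature are illustrative assumptions; only the 0.6 / 0.3 aggression thresholds come from the table above.

```python
def adapt_params(params, profile, ahead):
    """Return a tuned copy of the strategy parameters based on the opponent profile.

    params  -- dict of baseline strategy parameters (names here are hypothetical)
    profile -- dict with an EMA-smoothed 'aggression' metric in [0, 1]
    ahead   -- True when we lead in ships/planets
    """
    p = dict(params)  # never mutate the baseline in place
    if profile["aggression"] > 0.6:
        # Very aggressive opponent: shore up defense, attack less freely
        p["defense_ratio"] *= 1.3
        p["attack_aggression"] *= 0.8
    elif profile["aggression"] < 0.3:
        # Passive/turtle opponent: press the attack
        p["attack_multiplier"] *= 1.2
    if ahead:
        # Ahead: weight attack costs higher, i.e. play safe and consolidate
        p["attack_cost_weight"] *= 1.2
    return p
```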
### Key Technical Features
1. **Hostile Reinforcement Prediction** β€” estimates enemy counterattack potential before committing fleets
2. **4-Source Coordinated Swarms** β€” synchronizes 4 fleets to overwhelm defended targets
3. **Multi-Phase Target Selection** β€” opening expansion β†’ mid-game optimization β†’ late-game elimination
4. **Sun-Aware Fleet Routing** β€” avoids solar destruction with safe detour angles
5. **Crash Exploit Detection** β€” captures planets weakened by enemy fleet collisions
6. **Doomed Planet Evacuation** β€” retreats from unsaveable positions to useful targets
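The geometric test behind sun-aware routing (feature 4) can be sketched as a point-to-segment distance check: if the straight path from source to target passes inside the sun's danger zone, the fleet needs a detour. The sun position, `danger_radius`, and function names are illustrative assumptions.

```python
import math

def segment_point_dist(ax, ay, bx, by, px, py):
    """Shortest distance from point P to the segment A-B."""
    abx, aby = bx - ax, by - ay
    # Project P onto the line A-B, clamped to the segment
    t = ((px - ax) * abx + (py - ay) * aby) / max(abx * abx + aby * aby, 1e-9)
    t = max(0.0, min(1.0, t))
    cx, cy = ax + t * abx, ay + t * aby
    return math.hypot(px - cx, py - cy)

def needs_detour(src, dst, sun=(0.0, 0.0), danger_radius=0.1):
    """True if the straight route from src to dst grazes the sun's danger zone."""
    return segment_point_dist(*src, *dst, *sun) < danger_radius
```

When `needs_detour` fires, the agent would steer via a waypoint offset from the direct line (the "safe detour angle" mentioned above) rather than flying straight through.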
## Performance
| Opponent | Win Rate | Notes |
|----------|----------|-------|
| Random | **100%** (3/3) | Eliminated by step ~94-150 |
| Nearest-Sniper | **100%** (4/4) | Eliminated by step ~88-152 |
| 3Γ— Random (4P) | **100%** | All eliminated by step ~123 |
## Usage
### Direct Kaggle Submission
```bash
wget https://huggingface.co/Builder-Neekhil/orbit-wars-agent/resolve/main/submission.py
kaggle competitions submit orbit-wars -f submission.py -m "v2 adaptive agent"
```
### Local Testing
```python
from kaggle_environments import make

# Executing the single-file submission defines the `agent` function in this namespace
exec(open('submission.py').read(), globals())

env = make("orbit_wars", configuration={"seed": 42}, debug=False)
env.run([agent, "random"])

# Inspect the final step's rewards for both players
final = env.steps[-1]
print(f"P0: {final[0].reward}, P1: {final[1].reward}")
```
## Self-Play PPO Training (Optional)
The repo includes a PPO self-play training pipeline for further improvement:
```bash
pip install torch numpy pyyaml kaggle-environments huggingface_hub
# Train (requires GPU for reasonable speed, ~10h on T4)
TOTAL_UPDATES=500 EPISODES_PER_UPDATE=4 python train_efficient.py
```
**Training approach** (based on the Artificial Generals Intelligence paper, arXiv:2507.06825):
- **Phase 1 (0-20%)**: Train vs random opponents (fast, learn basics)
- **Phase 2 (20-50%)**: Train vs baseline agent (harder, learn tactics)
- **Phase 3 (50-100%)**: Self-play with opponent pool (N=3, argmax opponents)
- **Reward**: Potential-based shaping (planets + ships + production differential)
- **Architecture**: 128-d MLP controller that outputs 20 parameter adjustments
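The potential-based reward shaping above can be written as `r = gamma * phi(s') - phi(s)`, which leaves the optimal policy unchanged while densifying the signal. The weights below are hypothetical, not the values used in `train_efficient.py`.

```python
def potential(planets, ships, production, w=(1.0, 0.1, 0.5)):
    """State potential phi(s): weighted sum of planet, ship, and production
    differentials (our totals minus the opponent's). Weights are illustrative."""
    return w[0] * planets + w[1] * ships + w[2] * production

def shaped_reward(prev_state, state, gamma=0.99):
    """Potential-based shaping term: gamma * phi(s') - phi(s).

    Adding this to the sparse win/loss reward preserves the optimal policy
    while rewarding intermediate progress (capturing planets, building ships).
    """
    return gamma * potential(*state) - potential(*prev_state)
```

For example, capturing one planet and gaining two ships between steps yields a small positive shaping reward, while losses yield a negative one.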
## Files
- `submission.py` β€” Complete adaptive agent (single-file, ready for Kaggle)
- `train_efficient.py` β€” PPO self-play training script
- `generate_submission.py` β€” Packages trained controller into submission file
## License
MIT