# πŸ›Έ Orbit Wars - Kaggle Competition Agent (v2 Adaptive)
**Competition:** [Orbit Wars](https://www.kaggle.com/competitions/orbit-wars) ($50,000 prize pool)
**Deadline:** June 23, 2026
**Current ELO:** ~1100 and climbing
## Overview
This is a highly competitive **adaptive** agent for the Orbit Wars Kaggle competition β€” a real-time strategy game where 2 or 4 AI agents compete to conquer planets orbiting a central sun in continuous 2D space.
**v2 adds real-time in-match opponent profiling and adaptive parameter tuning β€” the agent learns opponent playstyle during each game and adjusts its strategy accordingly.**
## Architecture
### Core: Composite Rule-Based Engine
The core engine combines techniques from the **five top public agents** on the leaderboard:
| Source | LB Rating | Key Feature |
|--------|-----------|-------------|
| tamrazov-starwars (base) | LB 1224 | Gang-up attacks, weakest enemy targeting, elimination missions |
| ykhnkf | LB #1 | Hostile reinforcement prediction |
| pascal v14 | High-rated | 4-source coordinated swarm attacks |
| pilkwang | LB ~1000 | Structured decision architecture |
| yuriygreben | Architect | Physics-aware multi-phase strategy |
### v2: In-Match Opponent Profiling & Adaptation
A new adaptive layer monitors opponent behavior in real time:
**What it tracks (EMA-smoothed):**
- **Aggression** β€” fleet launch frequency (how often they attack)
- **Expansion rate** β€” planet capture speed
- **Relative strength** β€” ship/planet differential
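The EMA smoothing above can be sketched as follows. This is a minimal illustration, not the repo's actual profiler: the class name, attributes, and the smoothing factor `alpha=0.1` are hypothetical.

```python
def ema(prev, obs, alpha=0.1):
    """Exponential moving average: blend a new observation into the running estimate."""
    return (1 - alpha) * prev + alpha * obs

class OpponentProfile:
    """Tracks opponent playstyle metrics, smoothed with an EMA per step."""

    def __init__(self):
        # Start neutral; estimates drift toward observed behavior over the match
        self.aggression = 0.5   # how often the opponent launches fleets
        self.expansion = 0.5    # how often the opponent captures planets

    def update(self, launched_fleet: bool, captured_planet: bool):
        # Each step contributes a 0/1 observation to the smoothed metric
        self.aggression = ema(self.aggression, 1.0 if launched_fleet else 0.0)
        self.expansion = ema(self.expansion, 1.0 if captured_planet else 0.0)
```

With `alpha=0.1`, a burst of attacks pushes `aggression` above the 0.6 "very aggressive" threshold within a few dozen steps, while a quiet opponent decays toward 0.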
**How it adapts:**
| Opponent Style | Agent Response |
|---------------|---------------|
| Very aggressive (aggression > 0.6) | ↑ defense ratios, ↑ reinforcement priority, ↓ attack aggression |
| Passive/turtle (aggression < 0.3) | ↑ attack multipliers, ↑ elimination bonus, ↑ expansion pressure |
| We're ahead | Play safe, consolidate, higher attack cost weighting |
| We're behind | Take risks, ↑ snipe values, ↑ finishing bonuses, lower defense |
| Enemy expanding fast | Contest neutrals more aggressively, ↓ target margins |
| Late game (step > 350) | Maximum elimination drive, ↑ finishing multipliers |
**20 parameters are dynamically tuned** during each match based on game state.
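A sketch of how the adaptation table might map a profile onto strategy parameters. The parameter names, multipliers, and function signature are illustrative assumptions; only the 0.6 / 0.3 aggression thresholds come from the table above.

```python
def adapt_params(params, profile, ahead):
    """Return a tuned copy of the strategy parameters based on the opponent profile.

    params  -- dict of baseline strategy parameters (names here are hypothetical)
    profile -- dict with an EMA-smoothed 'aggression' metric in [0, 1]
    ahead   -- True when we lead in ships/planets
    """
    p = dict(params)  # never mutate the baseline in place
    if profile["aggression"] > 0.6:
        # Very aggressive opponent: shore up defense, attack less freely
        p["defense_ratio"] *= 1.3
        p["attack_aggression"] *= 0.8
    elif profile["aggression"] < 0.3:
        # Passive/turtle opponent: press the attack
        p["attack_multiplier"] *= 1.2
    if ahead:
        # Ahead: weight attack costs higher, i.e. play safe and consolidate
        p["attack_cost_weight"] *= 1.2
    return p
```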
### Key Technical Features
1. **Hostile Reinforcement Prediction** β€” estimates enemy counterattack potential before committing fleets
2. **4-Source Coordinated Swarms** β€” synchronizes 4 fleets to overwhelm defended targets
3. **Multi-Phase Target Selection** β€” opening expansion β†’ mid-game optimization β†’ late-game elimination
4. **Sun-Aware Fleet Routing** β€” avoids solar destruction with safe detour angles
5. **Crash Exploit Detection** β€” captures planets weakened by enemy fleet collisions
6. **Doomed Planet Evacuation** β€” retreats from unsaveable positions to useful targets
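The geometric test behind sun-aware routing (feature 4) can be sketched as a point-to-segment distance check: if the straight path from source to target passes inside the sun's danger zone, the fleet needs a detour. The sun position, `danger_radius`, and function names are illustrative assumptions.

```python
import math

def segment_point_dist(ax, ay, bx, by, px, py):
    """Shortest distance from point P to the segment A-B."""
    abx, aby = bx - ax, by - ay
    # Project P onto the line A-B, clamped to the segment
    t = ((px - ax) * abx + (py - ay) * aby) / max(abx * abx + aby * aby, 1e-9)
    t = max(0.0, min(1.0, t))
    cx, cy = ax + t * abx, ay + t * aby
    return math.hypot(px - cx, py - cy)

def needs_detour(src, dst, sun=(0.0, 0.0), danger_radius=0.1):
    """True if the straight route from src to dst grazes the sun's danger zone."""
    return segment_point_dist(*src, *dst, *sun) < danger_radius
```

When `needs_detour` fires, the agent would steer via a waypoint offset from the direct line (the "safe detour angle" mentioned above) rather than flying straight through.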
## Performance
| Opponent | Win Rate | Notes |
|----------|----------|-------|
| Random | **100%** (3/3) | Eliminated by step ~94-150 |
| Nearest-Sniper | **100%** (4/4) | Eliminated by step ~88-152 |
| 3Γ— Random (4P) | **100%** | All eliminated by step ~123 |
## Usage
### Direct Kaggle Submission
```bash
wget https://huggingface.co/Builder-Neekhil/orbit-wars-agent/resolve/main/submission.py
kaggle competitions submit orbit-wars -f submission.py -m "v2 adaptive agent"
```
### Local Testing
```python
from kaggle_environments import make

# Executing the single-file submission defines the `agent` function in this namespace
exec(open('submission.py').read(), globals())

env = make("orbit_wars", configuration={"seed": 42}, debug=False)
env.run([agent, "random"])

# Inspect the final step's rewards for both players
final = env.steps[-1]
print(f"P0: {final[0].reward}, P1: {final[1].reward}")
```
## Self-Play PPO Training (Optional)
The repo includes a PPO self-play training pipeline for further improvement:
```bash
pip install torch numpy pyyaml kaggle-environments huggingface_hub
# Train (requires GPU for reasonable speed, ~10h on T4)
TOTAL_UPDATES=500 EPISODES_PER_UPDATE=4 python train_efficient.py
```
**Training approach** (based on the Artificial Generals Intelligence paper, arXiv:2507.06825):
- **Phase 1 (0-20%)**: Train vs random opponents (fast, learn basics)
- **Phase 2 (20-50%)**: Train vs baseline agent (harder, learn tactics)
- **Phase 3 (50-100%)**: Self-play with opponent pool (N=3, argmax opponents)
- **Reward**: Potential-based shaping (planets + ships + production differential)
- **Architecture**: 128-d MLP controller that outputs 20 parameter adjustments
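The potential-based reward shaping above can be written as `r = gamma * phi(s') - phi(s)`, which leaves the optimal policy unchanged while densifying the signal. The weights below are hypothetical, not the values used in `train_efficient.py`.

```python
def potential(planets, ships, production, w=(1.0, 0.1, 0.5)):
    """State potential phi(s): weighted sum of planet, ship, and production
    differentials (our totals minus the opponent's). Weights are illustrative."""
    return w[0] * planets + w[1] * ships + w[2] * production

def shaped_reward(prev_state, state, gamma=0.99):
    """Potential-based shaping term: gamma * phi(s') - phi(s).

    Adding this to the sparse win/loss reward preserves the optimal policy
    while rewarding intermediate progress (capturing planets, building ships).
    """
    return gamma * potential(*state) - potential(*prev_state)
```

For example, capturing one planet and gaining two ships between steps yields a small positive shaping reward, while losses yield a negative one.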
## Files
- `submission.py` β€” Complete adaptive agent (single-file, ready for Kaggle)
- `train_efficient.py` β€” PPO self-play training script
- `generate_submission.py` β€” Packages trained controller into submission file
## License
MIT