# Unity ML-Agents Custom Trainers Plugin

As an attempt to bring a wider variety of reinforcement learning algorithms to our users, we have added custom trainer
capabilities. We introduce an extensible plugin system to define new trainers based on the high-level trainer API
in the `ml-agents` package. This allows rerouting the `mlagents-learn` CLI to custom trainers and extending the config files
with hyperparameters specific to your new trainers. We expose high-level, extensible trainer (both on-policy
and off-policy), optimizer, and hyperparameter classes, with documentation, for use with this plugin system. For more
information on how the Python plugin system works, see [Plugin interfaces](Training-Plugins.md).
## Overview
Model-free RL algorithms generally fall into two broad categories: on-policy and off-policy. On-policy algorithms perform updates based on data gathered from the current policy. Off-policy algorithms learn a Q function from a buffer of previous data, then use this Q function to make decisions. Off-policy algorithms have two key benefits in the context of ML-Agents: they tend to use fewer samples than on-policy algorithms, since they can pull and re-use data from the buffer many times, and they allow player demonstrations to be inserted in-line with RL data into the buffer, enabling new ways of doing imitation learning by streaming player data.

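To make the distinction concrete, here is a minimal sketch (not ML-Agents code; `policy`, `q_function`, and `replay_buffer` are hypothetical stand-ins) of how each family consumes experience:

```python
import random

def on_policy_update(policy, env, batch_size=64):
    # On-policy: collect fresh experience with the *current* policy,
    # update once, then discard the batch.
    batch = [policy.step(env) for _ in range(batch_size)]
    policy.update(batch)

def off_policy_update(q_function, replay_buffer, num_updates=4, batch_size=64):
    # Off-policy: repeatedly sample from a buffer of past experience,
    # which may also contain streamed player demonstrations.
    for _ in range(num_updates):
        batch = random.sample(replay_buffer, batch_size)
        q_function.update(batch)
```
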
To add new custom trainers to ML-Agents, you need to create a new Python package.
To give you an idea of how to structure your package, we have created an example
[mlagents_trainer_plugin](../ml-agents-trainer-plugin) package ourselves, with implementations of the `A2C` and `DQN`
algorithms. You need a `setup.py` file to list extra requirements and to register the new RL algorithm in the
ML-Agents ecosystem, so that the `mlagents-learn` CLI can be called with your customized configuration
(a minimal `setup.py` sketch follows the directory layout below).

```shell
├── mlagents_trainer_plugin
│   ├── __init__.py
│   ├── a2c
│   │   ├── __init__.py
│   │   ├── a2c_3DBall.yaml
│   │   ├── a2c_optimizer.py
│   │   └── a2c_trainer.py
│   └── dqn
│       ├── __init__.py
│       ├── dqn_basic.yaml
│       ├── dqn_optimizer.py
│       └── dqn_trainer.py
└── setup.py
```
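For reference, here is a minimal `setup.py` sketch in the spirit of the example package. The entry-point group name `mlagents.trainer_type` and the `get_type_and_setting` accessors are assumptions modeled on the example plugin; check the constants in `mlagents.plugins` and the example's own `setup.py` for the authoritative names.

```python
from setuptools import find_packages, setup

setup(
    name="mlagents_trainer_plugin",
    version="0.0.1",
    packages=find_packages(),
    # Extra requirements for your algorithm go here.
    install_requires=["mlagents"],
    # Assumed entry-point group: ML-Agents discovers custom trainers
    # registered under this group when mlagents-learn starts.
    entry_points={
        "mlagents.trainer_type": [
            "a2c=mlagents_trainer_plugin.a2c.a2c_trainer:get_type_and_setting",
            "dqn=mlagents_trainer_plugin.dqn.dqn_trainer:get_type_and_setting",
        ]
    },
)
```
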
## Installation and Execution
If you haven't already, follow the [installation instructions](Installation.md). Once you have the `ml-agents-envs` and `ml-agents` packages installed, you can install the plugin package. From the repository's root directory, install `ml-agents-trainer-plugin` (or replace it with the name of your own plugin folder):

```sh
pip3 install -e <./ml-agents-trainer-plugin>
```

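To confirm that the plugin was registered, you can list the entry points in the assumed `mlagents.trainer_type` group (hypothetical group name, as above):

```python
from importlib.metadata import entry_points

# Python 3.10+ API; on 3.8/3.9 use entry_points()["mlagents.trainer_type"].
for ep in entry_points(group="mlagents.trainer_type"):
    print(ep.name, "->", ep.value)
```
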
Following the previous installation, your package is added as an entry point and you can use a config file with the new
trainers:
```sh
mlagents-learn ml-agents-trainer-plugin/mlagents_trainer_plugin/a2c/a2c_3DBall.yaml \
    --run-id <run-id-name> --env <env-executable>
```

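The config file selects your trainer through the `trainer_type` key, using the name registered by your entry point. The snippet below is a hypothetical sketch (illustrative values, not the contents of the shipped `a2c_3DBall.yaml`); the available hyperparameters depend on the settings class your plugin defines:

```yaml
behaviors:
  3DBall:
    trainer_type: a2c          # name registered by the plugin's entry point
    hyperparameters:           # fields defined by your plugin's settings class
      learning_rate: 0.0003    # illustrative values only
      batch_size: 64
    network_settings:
      hidden_units: 128
      num_layers: 2
    max_steps: 500000
```
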
## Tutorial
Here's a step-by-step [tutorial](Tutorial-Custom-Trainer-Plugin.md) on how to write a setup file and extend the ML-Agents trainers, optimizers, and
hyperparameter settings. To extend the ML-Agents classes, see the references on
[trainers](Python-On-Off-Policy-Trainer-Documentation.md) and the [Optimizer](Python-Optimizer-Documentation.md).