| # Using an Environment Executable |
|
|
| This section will help you create and use built environments rather than the |
| Editor to interact with an environment. Using an executable has some advantages |
| over using the Editor: |
|
|
| - You can exchange executables with other people without having to share your |
| entire repository. |
| - You can put your executable on a remote machine for faster training. |
| - You can use `Server Build` (`Headless`) mode for faster training (as long as the executable does not need rendering). |
| - You can keep using the Unity Editor for other tasks while the agents are |
| training. |
|
|
| ## Building the 3DBall environment |
|
|
| The first step is to open the Unity scene containing the 3D Balance Ball |
| environment: |
|
|
| 1. Launch Unity. |
| 1. On the Projects dialog, choose the **Open** option at the top of the window. |
| 1. Using the file dialog that opens, locate the `Project` folder within the |
| ML-Agents project and click **Open**. |
| 1. In the **Project** window, navigate to the folder |
| `Assets/ML-Agents/Examples/3DBall/Scenes/`. |
| 1. Double-click the `3DBall` file to load the scene containing the Balance Ball |
| environment. |
|
|
|  |
|
|
| Next, we want to set up the scene so that it plays correctly when the training process |
| launches our environment executable. This means: |
|
|
| - The environment application runs in the background. |
| - No dialogs require interaction. |
| - The correct scene loads automatically. |
|
|
| 1. Open Player Settings (menu: **Edit** > **Project Settings** > **Player**). |
| 1. Under **Resolution and Presentation**: |
| - Ensure that **Run in Background** is checked. |
| - Ensure that **Display Resolution Dialog** is set to Disabled. (Note: this |
| setting may not be available in newer versions of the editor.) |
| 1. Open the Build Settings window (menu: **File** > **Build Settings**). |
| 1. Choose your target platform. |
| - (optional) Select "Development Build" to |
| [log debug messages](https://docs.unity3d.com/Manual/LogFiles.html). |
| 1. If any scenes are shown in the **Scenes in Build** list, make sure that the |
| 3DBall Scene is the only one checked. (If the list is empty, then only the |
| current scene is included in the build). |
| 1. Click **Build**: |
| - In the File dialog, navigate to your ML-Agents directory. |
| - Assign a file name and click **Save**. |
| - (For Windows) With Unity 2018.1, the dialog asks you to select a folder instead |
| of a file name. Create a subfolder within the root directory and select |
| that folder to build. In the following steps you will refer to this |
| subfolder's name as `env_name`. You cannot create builds in the Assets |
| folder. |
| |
|  |
|
|
| Now that we have a Unity executable containing the simulation environment, we |
| can interact with it. |
|
|
| ## Interacting with the Environment |
|
|
| If you want to use the [Python API](Python-LLAPI.md) to interact with your |
| executable, pass the name of the executable as the `file_name` argument of |
| `UnityEnvironment`. For instance: |
| |
| ```python |
| from mlagents_envs.environment import UnityEnvironment |
| # Replace <env_name> with the name and path of your executable (without extension). |
| env = UnityEnvironment(file_name="<env_name>") |
| ``` |
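An environment opened this way should also be closed, so that the executable process exits. The following is a minimal sketch rather than part of the ML-Agents Toolkit: `open_environment` is a hypothetical helper name, and the sketch assumes a `mlagents_envs` release in which `UnityEnvironment` provides `reset()` and `close()`.

```python
from contextlib import contextmanager


@contextmanager
def open_environment(file_name, **kwargs):
    """Yield a UnityEnvironment and always close it afterwards.

    Hypothetical helper for illustration; not part of mlagents_envs.
    """
    # Imported lazily so this module can be loaded without the package installed.
    from mlagents_envs.environment import UnityEnvironment

    env = UnityEnvironment(file_name=file_name, **kwargs)
    try:
        yield env
    finally:
        # Shut down the executable even if the caller raised an exception.
        env.close()
```

A caller could then write `with open_environment("<env_name>") as env:` and call `env.reset()` inside the block. The executable is shut down when the block exits.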
| |
| ## Training the Environment |
| |
| 1. Open a command or terminal window. |
| 1. Navigate to the folder where you installed the ML-Agents Toolkit. If you |
| followed the default [installation](Installation.md), then navigate to the |
| `ml-agents/` folder. |
| 1. Run |
| `mlagents-learn <trainer-config-file> --env=<env_name> --run-id=<run-identifier>` |
| Where: |
| - `<trainer-config-file>` is the file path of the trainer configuration YAML file |
| - `<env_name>` is the name and path to the executable you exported from Unity |
| (without extension) |
| - `<run-identifier>` is a string used to separate the results of different |
| training runs |
| |
| For example, if you are training with a 3DBall executable, and you saved it to |
| the directory where you installed the ML-Agents Toolkit, run: |
| |
| ```sh |
| mlagents-learn config/ppo/3DBall.yaml --env=3DBall --run-id=first-run |
| ``` |
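If you launch training from a script, the same command can be assembled from its parts. This is a minimal sketch: `build_learn_command` is a hypothetical helper, and it only uses the `--env`, `--run-id` and `--no-graphics` options that this page describes.

```python
import shlex


def build_learn_command(config_path, env_name, run_id, no_graphics=False):
    """Return the mlagents-learn invocation as a list of arguments.

    Hypothetical helper for illustration; not part of the ML-Agents Toolkit.
    """
    args = [
        "mlagents-learn",
        config_path,           # trainer configuration YAML
        f"--env={env_name}",   # executable name and path, without extension
        f"--run-id={run_id}",  # separates the results of different runs
    ]
    if no_graphics:
        # Only suitable when the environment does not need rendering.
        args.append("--no-graphics")
    return args


command = build_learn_command("config/ppo/3DBall.yaml", "3DBall", "first-run")
print(shlex.join(command))
# mlagents-learn config/ppo/3DBall.yaml --env=3DBall --run-id=first-run
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting problems when the executable path contains spaces.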
| |
| You should see something like: |
| |
| ```console |
| ml-agents$ mlagents-learn config/ppo/3DBall.yaml --env=3DBall --run-id=first-run |
| |
| |
| (Unity logo rendered as ASCII art) |
| |
| ``` |
| |
| **Note**: If you're using Anaconda, don't forget to activate the ml-agents |
| environment first. |
| |
| If `mlagents-learn` runs correctly and starts training, you should see something |
| like this: |
| |
| ```console |
| CrashReporter: initialized |
| Mono path[0] = '/Users/dericp/workspace/ml-agents/3DBall.app/Contents/Resources/Data/Managed' |
| Mono config path = '/Users/dericp/workspace/ml-agents/3DBall.app/Contents/MonoBleedingEdge/etc' |
| INFO:mlagents_envs: |
| 'Ball3DAcademy' started successfully! |
| Unity Academy name: Ball3DAcademy |
|
|
| INFO:mlagents_envs:Connected new brain: |
| Unity brain name: Ball3DLearning |
| Number of Visual Observations (per agent): 0 |
| Vector Observation space size (per agent): 8 |
| Number of stacked Vector Observation: 1 |
| INFO:mlagents_envs:Hyperparameters for the PPO Trainer of brain Ball3DLearning: |
| batch_size: 64 |
| beta: 0.001 |
| buffer_size: 12000 |
| epsilon: 0.2 |
| gamma: 0.995 |
| hidden_units: 128 |
| lambd: 0.99 |
| learning_rate: 0.0003 |
| max_steps: 5.0e4 |
| normalize: True |
| num_epoch: 3 |
| num_layers: 2 |
| time_horizon: 1000 |
| sequence_length: 64 |
| summary_freq: 1000 |
| use_recurrent: False |
| memory_size: 256 |
| use_curiosity: False |
| curiosity_strength: 0.01 |
| curiosity_enc_size: 128 |
| output_path: ./results/first-run-0/Ball3DLearning |
| INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 1000. Mean Reward: 1.242. Std of Reward: 0.746. Training. |
| INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 2000. Mean Reward: 1.319. Std of Reward: 0.693. Training. |
| INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 3000. Mean Reward: 1.804. Std of Reward: 1.056. Training. |
| INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 4000. Mean Reward: 2.151. Std of Reward: 1.432. Training. |
| INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 5000. Mean Reward: 3.175. Std of Reward: 2.250. Training. |
| INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 6000. Mean Reward: 4.898. Std of Reward: 4.019. Training. |
| INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 7000. Mean Reward: 6.716. Std of Reward: 5.125. Training. |
| INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 8000. Mean Reward: 12.124. Std of Reward: 11.929. Training. |
| INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 9000. Mean Reward: 18.151. Std of Reward: 16.871. Training. |
| INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 10000. Mean Reward: 27.284. Std of Reward: 28.667. Training. |
| ``` |
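To track progress programmatically, the progress lines in the sample output above can be parsed. The sketch below assumes the exact line format shown in that sample, which may differ between ML-Agents releases. `parse_progress` is a hypothetical helper.

```python
import re

# Pattern derived from the sample output above; treat it as an assumption.
PROGRESS_PATTERN = re.compile(
    r"Step: (?P<step>\d+)\. "
    r"Mean Reward: (?P<mean>-?\d+(?:\.\d+)?)\. "
    r"Std of Reward: (?P<std>\d+(?:\.\d+)?)\."
)


def parse_progress(line):
    """Return (step, mean_reward, std_reward), or None for other log lines."""
    match = PROGRESS_PATTERN.search(line)
    if match is None:
        return None
    return int(match["step"]), float(match["mean"]), float(match["std"])


sample = (
    "INFO:mlagents.trainers: first-run-0: Ball3DLearning: "
    "Step: 1000. Mean Reward: 1.242. Std of Reward: 0.746. Training."
)
print(parse_progress(sample))  # (1000, 1.242, 0.746)
```

Lines that do not report progress, such as the hyperparameter listing, yield `None` and can be skipped.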
| |
| You can press Ctrl+C to stop the training. Your trained model will be at |
| `results/<run-identifier>/<behavior_name>.onnx`, which corresponds to your model's |
| latest checkpoint. (**Note:** There is a known bug on Windows that causes the |
| saving of the model to fail when you terminate the training early. It is |
| recommended to wait until Step has reached the `max_steps` parameter you set in |
| your config YAML.) You can now embed this trained model into your Agent by |
| following the steps below: |
| |
| 1. Move your model file into |
| `Project/Assets/ML-Agents/Examples/3DBall/TFModels/`. |
| 1. Open the Unity Editor, and select the **3DBall** scene as described above. |
| 1. Select the **3DBall** prefab from the Project window and select **Agent**. |
| 1. Drag the `<behavior_name>.onnx` file from the Project window of the Editor to |
| the **Model** placeholder in the **Ball3DAgent** inspector window. |
| 1. Press the **Play** button at the top of the Editor. |
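Locating the model file and step 1 of the list above can also be scripted. This sketch assumes the `results/<run-identifier>/<behavior_name>.onnx` layout described above. Both function names are hypothetical.

```python
import shutil
from pathlib import Path


def trained_model_path(run_id, behavior_name, results_dir="results"):
    """Build the expected location of the exported model file."""
    return Path(results_dir) / run_id / f"{behavior_name}.onnx"


def copy_model(model_path, destination_dir):
    """Copy the model into a folder of the Unity project and return its new path."""
    model_path = Path(model_path)
    if not model_path.is_file():
        # Training may have stopped before a model was saved.
        raise FileNotFoundError(f"No trained model found at {model_path}")
    destination = Path(destination_dir)
    destination.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(model_path, destination))
```

For example, `copy_model(trained_model_path("<run-identifier>", "<behavior_name>"), "Project/Assets/ML-Agents/Examples/3DBall/TFModels")` would place the file where step 1 expects it. The remaining steps still have to be done in the Unity Editor.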
| |
| ## Training on Headless Server |
| |
| To run training on a headless server with no graphics rendering support, you need to turn off |
| graphics display in the Unity executable. There are two ways to achieve this: |
| 1. Pass the `--no-graphics` option to the `mlagents-learn` training command. This is equivalent to |
| adding `-nographics -batchmode` to the Unity executable's command line. |
| 2. Build your Unity executable with **Server Build**. You can find this setting in Build Settings |
| in the Unity Editor. |
| |
| If you want to train with graphics (for example, using camera and visual observations), you'll |
| need to set up display rendering support (e.g. xvfb) on your server machine. In our |
| [Colab Notebook Tutorials](ML-Agents-Toolkit-Documentation.md#python-tutorial-with-google-colab), the Setup section has |
| examples of setting up xvfb on servers. |
| |