The RLlib integration connects the [Ray/RLlib](https://github.com/ray-project/ray) library with CARLA so that the CARLA environment can be used for training and inference. Ray is an open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
With the integration, users can set up CARLA as a Ray environment and use it for training and inference. The integration is ready to use both locally and in the cloud using AWS.
This guide outlines the requirements for running the RLlib integration both locally and on AWS, the structure of the integration repository, how to use the library, and an example of how to set up a Ray experiment using CARLA as an environment.
- [__Before you begin__](#before-you-begin)
    - [Requirements for running locally](#requirements-for-running-locally)
    - [Requirements for running on AWS Cloud](#requirements-for-running-on-aws-cloud)
---
## Before you begin
Requirements vary depending on whether you are running locally or on AWS:
>###### Requirements for running locally
>>- [Install a package version of CARLA](https://github.com/carla-simulator/carla/releases) and import the [additional assets](https://carla.readthedocs.io/en/latest/start_quickstart/#import-additional-assets). __The recommended version is CARLA 0.9.11__ as the integration was designed and tested with this version. Other versions may be compatible but have not been fully tested, so use these at your own discretion.
>>- Navigate into the root folder of the RLlib integration repository and install the Python requirements:
pip3 install -r requirements.txt
>>- Set an environment variable to locate the CARLA package by running the command below, or add `export CARLA_ROOT=path/to/carla` to your `.bashrc` file:
export CARLA_ROOT=path/to/carla
>###### Requirements for running on AWS Cloud
>>- The requirements for running on AWS are taken care of automatically in an install script found in the RLlib integration repository. Find more details in the section ["Running on AWS"](#running-on-aws).
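For the local setup, a script can check `CARLA_ROOT` before launching anything. The sketch below is a minimal, hypothetical helper (`resolve_carla_root` is not part of the integration) that reads the variable and fails with a readable message if it is unset or does not point to an existing directory:

```python
import os
from pathlib import Path


def resolve_carla_root(environ=None):
    """Return the CARLA package directory named by CARLA_ROOT.

    Hypothetical helper for illustration; raises RuntimeError when the
    variable is unset or does not point to an existing directory.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("CARLA_ROOT")
    if not raw:
        raise RuntimeError("CARLA_ROOT is not set; export it before training.")
    root = Path(raw).expanduser()
    if not root.is_dir():
        raise RuntimeError(f"CARLA_ROOT points to a missing directory: {root}")
    return root
```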
---
## RLlib repository structure
The repository is divided into three directories:
- `rllib_integration` contains all the infrastructure related to CARLA and how to set up the CARLA server, clients and actors. This provides the basic structure that all training and testing experiments must follow.
- `aws` has the files needed to run in an AWS instance. `aws_helper.py` provides several functionalities that ease the management of EC2 instances, including instance creation and sending and receiving data.
- `dqn_example` and the `dqn_*` files in the root directory provide an easy-to-understand example on how to set up a Ray experiment using CARLA as its environment.
---
## Creating your own experiment
This section provides a general overview of how to create your own experiment. For a more specific example, see the next section ["DQN example"](#dqn-example).
You will need to create at least four files:
- The experiment class
- The environment configuration
- The training and inference scripts
#### 1. The experiment class
To use the CARLA environment you need to define a training experiment. Ray requires environments to return a specific set of information. You can see details on the CARLA environment in [`rllib-integration/rllib_integration/carla_env.py`][carlaEnv].
The information required by Ray depends on your specific experiment, so all experiments should inherit from [`BaseExperiment`][baseExperiment]. This class contains all the functions that need to be overridden for your own experiment. These are all functions related to the actions, observations and rewards of the training.
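As a rough illustration of the idea, the sketch below uses a stand-in base class rather than the real `BaseExperiment`, so every class and method name in it (`ExperimentStub`, `get_actions`, `get_observation`, `compute_reward`, `is_done`, `ToyEnv`) is an assumption made for this example; check `BaseExperiment` for the actual functions to override. It shows a toy experiment defining actions, observations and rewards, and a gym-style wrapper that returns them on each step:

```python
class ExperimentStub:
    """Stand-in for the integration's base class (hypothetical interface)."""

    def __init__(self, config):
        self.config = config

    def get_actions(self):
        raise NotImplementedError

    def get_observation(self, sensor_data):
        raise NotImplementedError

    def compute_reward(self, observation):
        raise NotImplementedError

    def is_done(self, observation):
        raise NotImplementedError


class SpeedKeepingExperiment(ExperimentStub):
    """Toy experiment: reward the agent for staying near a target speed."""

    def get_actions(self):
        # Discrete action index -> (throttle, brake)
        return {0: (0.0, 0.0), 1: (0.5, 0.0), 2: (0.0, 0.5)}

    def get_observation(self, sensor_data):
        # Keep only what the policy needs from the raw sensor dictionary.
        return {"speed": float(sensor_data["speed"])}

    def compute_reward(self, observation):
        # Negative distance to the target speed: 0 is the best possible reward.
        return -abs(observation["speed"] - self.config["target_speed"])

    def is_done(self, observation):
        return observation["speed"] > self.config["max_speed"]


class ToyEnv:
    """Gym-style wrapper showing what each step hands back to the trainer."""

    def __init__(self, experiment):
        self.experiment = experiment
        self.speed = 0.0

    def reset(self):
        self.speed = 0.0
        return self.experiment.get_observation({"speed": self.speed})

    def step(self, action):
        throttle, brake = self.experiment.get_actions()[action]
        # Crude stand-in for the simulator: speed changes with throttle and brake.
        self.speed = max(0.0, self.speed + 2.0 * throttle - 3.0 * brake)
        obs = self.experiment.get_observation({"speed": self.speed})
        reward = self.experiment.compute_reward(obs)
        done = self.experiment.is_done(obs)
        return obs, reward, done, {}
```

Keeping the actions, observations and rewards in the experiment class means the environment wrapper can stay the same across experiments.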
#### 2. The environment configuration
The experiment should be configured through a `.yaml` file. Any settings passed through the configuration file will override the default settings. The locations of the different default settings are explained below.
2. Sets up variables specific to your experiment as well as specifying town conditions and the spawning of the ego vehicle and its sensors. The default settings are found [here][defaultExperimentSettings] and provide an example of how to set up sensors.
3. Configures settings specific to [Ray's training][raySettings]. These settings are related to the specific trainer used. If you are using a built-in model, you can apply settings for it here.
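The override behaviour described above can be pictured as a recursive dictionary merge. The sketch below is only an assumption about how such a merge might work, not the integration's actual code, and the keys and values in `DEFAULTS` and `USER` are hypothetical:

```python
import copy


def merge_settings(defaults, overrides):
    """Return a new dict where values in `overrides` replace those in `defaults`.

    Nested dictionaries are merged key by key; any other value is replaced whole.
    Neither input is modified.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults and user settings, as they might look after loading YAML.
DEFAULTS = {"carla": {"timeout": 10.0, "quality": "Low"}, "experiment": {"town": "Town01"}}
USER = {"carla": {"timeout": 30.0}, "experiment": {"town": "Town05"}}

print(merge_settings(DEFAULTS, USER))
```

Only the keys present in the user file change; everything else keeps its default value.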
#### 3. The training and inference scripts
The last step is to create your own training and inference scripts. This part is completely up to you and is dependent on the Ray API. If you want to create your own specific model, check out [Ray's custom model documentation][rayCustomModel].
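One possible starting point for such a script is its command-line interface. In the sketch below, the positional configuration file and the `--auto` flag mirror the arguments passed to `dqn_train.py` in this guide's `ray submit` command; what the real scripts do with them is not shown here, so the handling (and the names `parse_args` and `main`) should be read as assumptions:

```python
import argparse


def parse_args(argv=None):
    """Parse the command line of a hypothetical training script."""
    parser = argparse.ArgumentParser(
        description="Train an agent with CARLA as the environment"
    )
    parser.add_argument("configuration_file", help="path to the experiment .yaml file")
    # Meaning assumed: the flag is only recorded here, not acted upon.
    parser.add_argument("--auto", action="store_true", help="set when run on a cluster")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # A real script would load the YAML file here and hand the resulting
    # settings to the Ray trainer of your choice.
    print(f"configuration: {args.configuration_file}, auto: {args.auto}")
    return args
```

Calling `main(["dqn_example/dqn_config.yaml", "--auto"])` parses the same arguments that the cluster command forwards to the script.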
---
## DQN example
This section builds upon the previous section to show a specific example of how to work with the RLlib integration using the [BirdView pseudosensor][birdview] and Ray's [DQNTrainer][dqntrainer].
The default configuration uses 1 GPU and 12 CPUs, so if your local machine doesn't have that capacity, lower the numbers in the [configuration file][dqnConfig].
If you experience out-of-memory problems, consider reducing the `buffer_size` parameter.
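The adjustment can also be done programmatically before the settings reach the trainer. The sketch below is a hypothetical helper: `fit_to_machine` is not part of the integration, and it assumes the configuration expresses its resource requests through RLlib's `num_gpus` and `num_workers` keys alongside `buffer_size`, which may not match the layout of the actual configuration file:

```python
def fit_to_machine(config, available_cpus, available_gpus, max_buffer_size=None):
    """Return a copy of `config` whose resource requests fit the machine.

    Assumes `num_gpus`, `num_workers` and `buffer_size` are top-level keys.
    """
    fitted = dict(config)
    fitted["num_gpus"] = min(config.get("num_gpus", 0), available_gpus)
    # Leave one CPU for the trainer process itself.
    fitted["num_workers"] = max(0, min(config.get("num_workers", 0), available_cpus - 1))
    if max_buffer_size is not None and "buffer_size" in config:
        fitted["buffer_size"] = min(config["buffer_size"], max_buffer_size)
    return fitted
```

For example, on a machine with 4 CPUs and no GPU, a request for 12 workers and 1 GPU would be reduced to 3 workers and 0 GPUs.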
---
## Running on AWS
This section explains how to use the RLlib integration to automatically run training and inference on AWS EC2 instances. To handle the scaling of instances we use the [Ray autoscaler API][rayAutoscaler].
Use the provided [`aws_helper.py`][awsHelper] script to automatically create the image needed for training by running the command below, passing in the name of the base image and the installation script `install.sh` found in [`rllib-integration/aws/install`][installsh]:
Once the image is created, there will be an output with image information. To use the Ray autoscaler, update the `<ImageId>` and `<SecurityGroupIds>` settings in your [autoscaler configuration file][autoscalerSettings] with the information from the output.
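If you prefer not to edit the file by hand, the placeholder update can be scripted. The sketch below assumes the configuration file contains the literal placeholders `<ImageId>` and `<SecurityGroupIds>`; the helper name, the YAML fragment and the ID values are all hypothetical:

```python
def fill_autoscaler_placeholders(template_text, image_id, security_group_ids):
    """Replace the <ImageId> and <SecurityGroupIds> placeholders in YAML text."""
    for placeholder in ("<ImageId>", "<SecurityGroupIds>"):
        if placeholder not in template_text:
            raise ValueError(f"placeholder {placeholder} not found in template")
    filled = template_text.replace("<ImageId>", image_id)
    # Several security groups are joined so they fit inside a YAML flow list.
    return filled.replace("<SecurityGroupIds>", ", ".join(security_group_ids))


# Hypothetical fragment; the layout of the real file may differ.
TEMPLATE = "ImageId: <ImageId>\nSecurityGroupIds: [<SecurityGroupIds>]\n"

# Made-up IDs; use the values printed when the image was created.
print(fill_autoscaler_placeholders(TEMPLATE, "ami-0123456789abcdef0", ["sg-0123456789abcdef0"]))
```

Raising an error when a placeholder is missing avoids silently submitting a configuration that was already edited or has a different layout.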
2. Update the `<ImageId>` and `<SecurityGroupIds>` settings in [`dqn_autoscaler.yaml`][dqnAutoscaler] with the information provided by the previous command.
4. (Optional) Update remote files with local changes:
ray rsync-up dqn_example/dqn_autoscaler.yaml dqn_example .
ray rsync-up dqn_example/dqn_autoscaler.yaml rllib_integration .
5. Run the training:
ray submit dqn_example/dqn_autoscaler.yaml dqn_train.py -- dqn_example/dqn_config.yaml --auto
6. (Optional) Monitor the cluster status:
ray attach dqn_example/dqn_autoscaler.yaml
watch -n 1 ray status
7. Shut down the cluster:
ray down dqn_example/dqn_autoscaler.yaml
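The sync, submit and shutdown commands listed above can be collected in a small driver script. The sketch below only builds and prints the commands; it covers just the steps shown in this section, and `cluster_commands` is a hypothetical helper name:

```python
import shlex


def cluster_commands(autoscaler_yaml, train_script, experiment_yaml, sync_dirs=()):
    """Build the `ray` commands from the steps above as argument lists."""
    commands = []
    for directory in sync_dirs:
        # Optional step: push local changes to the remote machine.
        commands.append(["ray", "rsync-up", autoscaler_yaml, directory, "."])
    commands.append(
        ["ray", "submit", autoscaler_yaml, train_script, "--", experiment_yaml, "--auto"]
    )
    commands.append(["ray", "down", autoscaler_yaml])
    return commands


for command in cluster_commands(
    "dqn_example/dqn_autoscaler.yaml",
    "dqn_train.py",
    "dqn_example/dqn_config.yaml",
    sync_dirs=("dqn_example", "rllib_integration"),
):
    # Printed rather than executed; pass each list to subprocess.run(..., check=True) to run it.
    print(shlex.join(command))
```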
---
This guide has outlined how to install and run the RLlib integration on AWS and on a local machine. If you have any questions or run into any issues while working through the guide, feel free to post in the [forum](https://forum.carla.org/) or raise an issue on [GitHub](https://github.com/carla-simulator/rllib-integration).