Imagine this. You know that reinforcement learning has been responsible for some of AI’s most significant advancements. You’re in the exploratory phase of implementing your first project. You’d love a way to evaluate whether your RL agent is appropriate for the task at hand, something not always apparent without a great deal of tedious trial and error. Let’s find out how DeepMind’s BSuite could help make that a thing of the past.
[Related Article: Best Deep Reinforcement Learning Research of 2019 So Far]
What is BSuite?
Behavior Suite for Reinforcement Learning, or BSuite, is a collection of carefully designed experiments you can use to test the core capabilities of your reinforcement learning agents in different scenarios. It serves two purposes:
- Capturing critical information about the design of efficient and general learning algorithms.
- Studying agent behavior through these benchmarks.
It automates your evaluations of RL models. Evaluating those models by hand is time-consuming and can stretch your project’s timetable from “significant” to “is this ever going to end?” That analysis is a necessary part of transferring models from one use case to another, but the stagnation involved in manually testing them can derail any exciting project.
BSuite’s collection of tests automates that step, testing logically and efficiently. You can (theoretically) get back to your project more quickly, without the heartbreak of learning that your chosen model isn’t going to cut it once you’re too far in to save things.
These carefully designed experiments test your model efficacy before you get too deep. The ultimate purpose is to build superior algorithms through standardized testing and open source collaboration.
What are the BSuite Components?
DeepMind’s BSuite consists of three components – environments, interactions, and analysis. DeepMind’s researchers defined a score-based system to tell you how well your algorithms perform across different experiments. Run your experiment, and BSuite gives you insight into your agent’s key capabilities. Each experiment in the suite is designed around five qualities: it should be targeted, simple, challenging, scalable, and fast.
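To make the score-based idea concrete, here is a toy sketch of summarizing per-experiment scores into one headline number. BSuite reports a score between 0 and 1 for each experiment; the experiment names and score values below are made up for illustration and are not real BSuite output.

```python
# Toy sketch of a BSuite-style score summary. The scores here are
# invented for illustration; real scores come from running the suite.
from statistics import mean

# Hypothetical per-experiment scores, each in [0, 1].
scores = {
    "basic": 0.9,
    "memory": 0.4,
    "exploration": 0.2,
    "credit_assignment": 0.6,
    "scale": 0.8,
}

def overall_score(scores):
    """Average the per-experiment scores into one headline number."""
    return mean(scores.values())

print(round(overall_score(scores), 2))  # → 0.58
```

A single average hides which capability is weak, which is why BSuite reports per-experiment scores rather than only one number.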
It provides insight into your agent behavior, but it’s not intended to be a replacement for grand AI challenges or a leaderboard. Instead, it helps automate those tests and give researchers a better picture overall of the chosen models. It’s a bridge between theory and practice.
How Do I Get Started?
BSuite is located on GitHub, but you can read the paper here. DeepMind chose open source to facilitate research and improve the performance of algorithms. It begins with a Colab tutorial so that you’re on board with everything. From there, install BSuite either with a single pip command or from a cloned repository.
You’ll begin with the environment for your experiment. There are two automatic logging options, though you can certainly write your own logger if you need to save results to a different storage system. The environments are all small enough to run observations on a CPU, and the system also accounts for specific dependencies (although they aren’t installed automatically).
It comes with a ready-made analysis Jupyter notebook and can generate automated reports of your agent’s capabilities. You can run agents in a single environment or across the suite, with as many running in parallel as your host machine can manage. DeepMind only asks that you cite the paper in any published experiments.
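Fanning an agent out over many environments in parallel can be sketched like this. The environment ids and the `evaluate` body are placeholders for illustration, not the real BSuite API; a compute-heavy run would typically use processes rather than threads.

```python
# Sketch of running evaluations over several environments in parallel.
# The ids and evaluate() body are illustrative placeholders.
from concurrent.futures import ThreadPoolExecutor

ENV_IDS = ["catch/0", "catch/1", "deep_sea/0"]  # illustrative ids

def evaluate(env_id):
    # A real run would load the environment by id, run the agent on it,
    # and return its score; here we just echo a dummy result.
    return env_id, 0.0

def run_suite(env_ids, max_workers=2):
    """Evaluate each environment id concurrently; return id -> score."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(evaluate, env_ids))

print(run_suite(ENV_IDS))
```

Because each environment is small, scaling out is mostly a matter of how many workers your machine can keep busy.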
Automated RL Testing
The purpose of DeepMind’s BSuite is to automate what might otherwise take you a truly tedious amount of time to test, and to encourage collaboration through open-source frameworks. Few tools automate RL testing in this standardized way, so DeepMind’s foray into the area couldn’t come at a better time. With more and more organizations using reinforcement learning to tackle huge issues, this might free up researchers to roll out faster and innovate smarter.
[Related Article: What Are a Few AI Research Labs on the West Coast?]
It’s still relatively new and, again, not intended to replace the big AI challenges currently running. However, it could begin to ease the bottleneck of sophisticated RL testing and match appropriate algorithms to use cases much faster. Instead of waiting around to test these models manually, your organization could be applying the right model to the proper use case right away. As more people use it, it can only get better.