A team from DeepMind Technologies (Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, and Hado van Hasselt) has recently published a paper on their new software package, Behavior Suite (bsuite for short). Bsuite is designed to help researchers working in deep reinforcement learning understand how their agents behave. The paper introduces the software, showcases a few example uses, and hints at future developments in the package.
The team suggests that bsuite is the software package that will let us better understand our research:
“This includes the scalability of our RL algorithms, the environments where we expect them to perform well, and the key issues outstanding in the design of a general intelligence system. We have the existence proof that a single self-learning RL agent can master the game of Go purely from self-play [56]. We do not have a clear picture of whether such a learning algorithm will perform well at driving a car, or managing a power plant.”
That is, apparently, where bsuite comes in. The paper catalogues a few example experiments to showcase this idea. One such example is memory length: the team uses bsuite to test how many steps a reinforcement learning agent can “remember” a single bit before it is asked to recall it. It’s a good example because it’s simple, targeted, challenging, and scalable. The experiment quickly gives researchers an answer that can then shape the rest of their work.
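To make the memory-length idea concrete, here is a minimal, self-contained Python sketch of such a task. It is not bsuite's own implementation: the names `MemoryChain`, `remembering_agent`, `memoryless_agent`, and `average_return` are hypothetical, and the reward scheme (+1 for a correct recall, -1 otherwise) is an assumption made for illustration. The environment shows one bit on the first step, hides it afterwards, and asks for it back after `memory_length` steps.

```python
import random


class MemoryChain:
    """Toy memory task (hypothetical, not the bsuite environment).

    A random bit is visible only in the first observation. After
    `memory_length` steps the agent's action is scored against that bit.
    """

    def __init__(self, memory_length, seed=0):
        if memory_length < 1:
            raise ValueError("memory_length must be at least 1")
        self.memory_length = memory_length
        self._rng = random.Random(seed)
        self._t = 0
        self._bit = 0

    def _observation(self):
        # The context bit is shown at step 0 and hidden (-1) afterwards.
        context = self._bit if self._t == 0 else -1
        is_query = self._t == self.memory_length
        return context, is_query

    def reset(self):
        self._t = 0
        self._bit = self._rng.randint(0, 1)
        return self._observation()

    def step(self, action):
        """Return (observation, reward, done)."""
        if self._t == self.memory_length:
            # Query step: assumed reward of +1 if the bit was recalled, else -1.
            reward = 1.0 if action == self._bit else -1.0
            return None, reward, True
        self._t += 1
        return self._observation(), 0.0, False


def remembering_agent():
    """Agent with one bit of memory: stores the context when it is visible."""
    stored = {"bit": 0}

    def act(observation):
        context, _ = observation
        if context != -1:
            stored["bit"] = context
        return stored["bit"]

    return act


def memoryless_agent(seed=0):
    """Agent with no memory: guesses at random once the context is hidden."""
    rng = random.Random(seed)

    def act(observation):
        context, _ = observation
        return context if context != -1 else rng.randint(0, 1)

    return act


def average_return(env, act, episodes=500):
    """Average total reward per episode for a given agent."""
    total = 0.0
    for _ in range(episodes):
        observation, done = env.reset(), False
        while not done:
            observation, reward, done = env.step(act(observation))
            total += reward
    return total / episodes


if __name__ == "__main__":
    for length in (1, 10, 100):
        with_memory = average_return(MemoryChain(length), remembering_agent())
        without = average_return(MemoryChain(length), memoryless_agent())
        print(length, with_memory, without)
```

Sweeping `memory_length` is what makes the task scalable: an agent that truly stores the bit is unaffected by the delay, while an agent whose memory fades will fall toward chance at some length, and that length is the quantity of interest.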
The team has made bsuite easy to use and open source, with the goal of opening up new lines of research and connecting the research community around shared experiments. Their paper points directly to the project’s GitHub repository, where you can follow step-by-step practical tutorials on installing and using the software rather than reading through pages of research articles. They’ve also designed it to fit into a variety of existing setups, and have smoothed over some of the problems that usually arise when integrating new software into neural network code or other specific uses. Taken together, these choices make the package accessible and ready to use.
Behavior Suite is in its first iteration, but the team is already planning future versions, informed by the research community’s use of the software and feedback on it. They’ve created a bsuite committee that will meet annually to review new developments and suggestions.
They say, “By collecting clear, informative and scalable experiments; and providing accessible tools for reproducible evaluation we hope to facilitate progress in reinforcement learning research.” AI agents will keep getting more capable, and bsuite should help researchers piece together better ways of evaluating them. With this combined effort, the team is working toward a deeper and more widely shared understanding of how reinforcement learning agents behave.