The BipedalWalker environment from OpenAI Gym, solved with the Asynchronous Advantage Actor-Critic (A3C) algorithm implemented in TensorFlow.
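
As a rough illustration of the "advantage" part of the algorithm, here is a minimal sketch of how n-step discounted returns and advantages are commonly computed from one worker's rollout. It is not taken from this repository: the function names `discounted_returns` and `advantages`, the `gamma` value, and the sample numbers are all assumptions chosen for illustration.

```python
def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """Return the discounted return for each step of a rollout.

    bootstrap_value is the critic's value estimate for the state that
    follows the last reward (use 0.0 if the episode terminated).
    """
    returns = []
    running = bootstrap_value
    # Walk the rollout backwards so each return reuses the next one.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage = discounted return minus the critic's value estimate."""
    return [g - v for g, v in zip(returns, values)]


# Hypothetical three-step rollout (illustrative numbers only).
rewards = [1.0, 0.0, 2.0]
values = [0.5, 0.5, 0.5]
returns = discounted_returns(rewards, bootstrap_value=0.0, gamma=0.5)
adv = advantages(returns, values)
```

In an actor-critic setup, the advantages typically weight the policy-gradient term for the actor, while the returns serve as regression targets for the critic.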