{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Sheet 5 Reinforcement Learning: Tim Racs and Derrick Hines - Task 4 - Car racing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deep Q Learning (DQN)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Install the Python packages we need by running the following cell." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: gym in /home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages (0.13.0)\n", "Requirement already satisfied: box2d in /home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages (2.3.2)\n", "Requirement already satisfied: box2d-kengz in /home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages (2.3.3)\n", "Requirement already satisfied: opencv-python in /home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages (4.1.0.25)\n", "Requirement already satisfied: h5py in /home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages (2.9.0)\n", "Requirement already satisfied: tqdm in /home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages (4.32.2)\n", "Requirement already satisfied: six in /home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages (from gym) (1.12.0)\n", "Requirement already satisfied: scipy in /home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages (from gym) (1.2.1)\n", "Requirement already satisfied: pyglet>=1.2.0 in /home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages (from gym) (1.3.2)\n", "Requirement already satisfied: numpy>=1.10.4 in /home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages (from gym) (1.16.2)\n", "Requirement already satisfied: cloudpickle~=1.2.0 in /home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages (from gym) (1.2.1)\n", "Requirement already satisfied: future in 
/home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages (from pyglet>=1.2.0->gym) (0.17.1)\n" ] } ], "source": [ "!pip install gym box2d box2d-kengz opencv-python h5py tqdm" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Using TensorFlow backend.\n" ] } ], "source": [ "import gym\n", "import numpy as np\n", "from mllab.rl.dqn import BaseQNetwork, ReplayMemory, EpsilonGreedyPolicy\n", "from mllab.rl.dqn import ProportionalPrioritizationReplayMemory #added" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Environment\n", "\n", "The action space of the car racing environment is continuous: each action is\n", "a three-dimensional real vector in $[-1, 1]\\times[0, 1]\\times[0, 1]$\n", "corresponding to steering position, amount of gas, and brake intensity.\n", "We need discrete actions, so you have to pick finitely many points from this box.\n", "\n", "An initial suggestion has been made, **feel free to modify it**." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "env = gym.make('CarRacing-mllab-v0', verbose=0)\n", "# This will open a window. Call env.close() at the end to get rid of it." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Pick a finite set of actions\n", "action_space = env.action_space.discretize((\n", "    np.array([ 0, 1, 0]), # full gas\n", "    np.array([ 0, 0, 1]), # full brake\n", "    np.array([-1, 0, 0]), # steer left\n", "    np.array([ 1, 0, 0]), # steer right\n", "    np.array([ 0, 0, 0]), # do nothing\n", "))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's watch a random policy." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "env.reset()\n", "while True:\n", "    new_state, reward, terminated, _info = env.step(action_space.sample())\n", "    env.render()\n", "    if terminated:\n", "        break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Preprocessing map\n", "\n", "The state space of the car racing environment consists of a $96\\times96$ RGB image and seven measurements:\n", "\n", "- The velocity of the car (absolute value)\n", "- The angular velocity of the four wheels\n", "- The steering angle of the front wheels\n", "- The angular velocity of the car\n", "\n", "The map `preprocess` takes a state and transforms it into a representation that is hopefully better suited as an input to the neural network. You can use the transformation as is **or change it**." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "import cv2 as cv\n", "from matplotlib import pyplot as plt\n", "from keras import backend as K\n", "\n", "\n", "def show(image):\n", "    \"\"\"\n", "    Show a greyscale image.\n", "    \n", "    Useful for debugging.\n", "    \"\"\"\n", "    fig, ax = plt.subplots(dpi=2 * 72)\n", "    if image.ndim == 3:\n", "        if image.shape[0] == 1:\n", "            image = image.reshape(image.shape[1:])\n", "        elif image.shape[-1] == 1:\n", "            image = image.reshape(image.shape[:-1])\n", "    if image.ndim == 2:\n", "        ax.imshow(image, cmap='gray')\n", "    else:\n", "        ax.imshow(image)\n", "    plt.axis('off')\n", "    plt.show()\n", "\n", "\n", "def preprocess(state):\n", "    \"\"\"\n", "    Preprocess the rendered color image of the car racing environment.\n", "\n", "    Parameters\n", "    ----------\n", "\n", "    state: (image, measurements)\n", "        image is an RGB image, more precisely a 96x96x3 array.\n", "        measurements is a 1D vector of length 7.\n", "    \"\"\"\n", "    image, measurements = state\n", "    # Convert to grayscale\n", "    gray = cv.cvtColor(image, cv.COLOR_RGB2GRAY)\n", "    # Resize the image (to save memory)\n", 
"    # Get mask for red markings in curves\n", "    curve_marks = cv.inRange(image, (250, 0, 0), (255, 0, 0))\n", "    # Replace markings with white\n", "    gray[curve_marks == 255] = 255\n", "    gray = cv.resize(gray, (0,0), fx=0.85, fy=0.85)\n", "    # Remove pattern in grass by clipping light pixels (> 130) to 130, then scale to [0, 1]\n", "    gray = cv.threshold(gray, 130, 255, cv.THRESH_TRUNC)[1] / 130\n", "    if K.image_data_format() == 'channels_first':\n", "        gray = gray.reshape((1,) + gray.shape)\n", "    else:\n", "        gray = gray.reshape(gray.shape + (1,))\n", "    measurements = np.concatenate((\n", "        measurements[:4],\n", "        np.array([np.cos(measurements[4]), np.sin(measurements[4])]),\n", "        measurements[5:],\n", "    ))\n", "    return (gray.astype(K.floatx()), measurements.astype(K.floatx()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Q-Network\n", "\n", "The Q-Network maps a (preprocessed) state to a Q-value for each action. Since our state consists of an image and measurements, we need to use Keras' functional API to build a neural network which can take mixed input.\n", "\n", "First, define two models for the scalar inputs and the image inputs:\n", "\n", "```python\n", "input_img = layers.Input(shape=...)\n", "img = layers.Conv2D(4, kernel_size=(3, 3), activation='relu')(input_img)\n", "# add more layers here (replace input_img by img)\n", "img = layers.Flatten()(img)\n", "img = keras.Model(input_img, img)\n", "\n", "input_scalar = layers.Input(shape=...)\n", "scalar = layers.Dense(8, activation='relu')(input_scalar)\n", "# as above\n", "scalar = keras.Model(input_scalar, scalar)\n", "```\n", "\n", "Then concatenate both models and create a new model:\n", "```python\n", "model = layers.concatenate([img.output, scalar.output])\n", "model = layers.Dense(num_actions, activation='linear')(model)\n", "model = keras.Model(inputs=[img.input, scalar.input], outputs=model)\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Task 4\n", "\n", "Define your model for the 
Q-network by implementing the `build_model` method.\n", "\n", "The method must return a model and an optimizer. The loss is implemented in the parent class." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "import keras\n", "import keras.layers as layers\n", "import keras.optimizers as optimizers\n", "\n", "class QNetwork(BaseQNetwork):\n", "    def build_model(self, state_shape):\n", "        num_actions = len(self.action_space)\n", "        # Build the network for the image part\n", "        img_shape, scalar_shape = state_shape\n", "\n", "        input_img = layers.Input(shape=img_shape)\n", "        img = layers.Conv2D(4, kernel_size=(3, 3), activation='relu')(input_img)\n", "\n", "        img = layers.Conv2D(16, kernel_size=(3, 3), activation='relu')(img) #added:\n", "        img = layers.Dense(16, activation='relu')(img)\n", "        img = layers.Dense(8, activation='relu')(img)\n", "        \n", "        img = layers.Flatten()(img)\n", "        img = keras.Model(input_img, img)\n", "\n", "        # Build the network for the scalar part\n", "        input_scalar = layers.Input(shape=scalar_shape)\n", "        scalar = layers.Dense(16, activation='relu')(input_scalar) #added\n", "        scalar = layers.Dense(8, activation='relu')(scalar)\n", "        scalar = keras.Model(input_scalar, scalar)\n", "\n", "        # Combine both networks\n", "        model = layers.concatenate([img.output, scalar.output])\n", "        model = layers.Dense(16, activation='relu')(model) #added\n", "        model = layers.Dense(num_actions, activation='linear')(model)\n", "        model = keras.Model(inputs=[img.input, scalar.input], outputs=model)\n", "\n", "        opt = optimizers.RMSprop(lr=0.00025 / 4, rho=0.95, epsilon=0.01)\n", "        return model, opt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Replay Memory\n", "\n", "The replay memory stores transitions. It was already implemented for you. 
To add a transition use\n", "```python\n", "replay_memory.add(state, action_index, reward, new_state)\n", "```\n", "**Important:** `state` and `new_state` must be the output of `preprocess`. If `state` is terminal, `new_state` must be `None`. The action index (not the actual action) is returned by the policy, see below.\n", "\n", "\n", "In order to sample a batch of transitions, call\n", "```python\n", "transitions, sample_weights = replay_memory.sample(importance_criterion, progress)\n", "```\n", "The parameter `importance_criterion` is a callable (e.g., a function) which gets a batch of transitions as arguments and returns a number for each transition that measures its prediction error. You should use the TD-Error\n", "$$\n", " |y - Q(s, a)| = \\bigl|\\bigl(r + \\gamma Q_\\textrm{target}(s^\\prime, \\operatorname{argmax}_{a^\\prime} Q(s^\\prime, a^\\prime))\\bigr) - Q(s, a)\\bigr|.\n", "$$\n", "For terminal states $y$ is just $r$.\n", "\n", "The arguments for `importance_criterion` are\n", "```python\n", "def my_criterion(s, actions, rewards, s2, not_terminal): ...\n", "```\n", "Where\n", "- `s` is a list of preprocessed states\n", "- `actions` is a NumPy array of action indices (the action taken in `s`)\n", "- `rewards` is a NumPy array of rewards received (the reward received after taking the action from `actions` in the state from `s`)\n", "- `s2` is a list of preprocessed states (the new states). Only non-terminal states are returned.\n", "- `not_terminal` is a NumPy array of booleans indicating which of the states in `s` were not terminal. \n", "\n", "The parameter `progress` is a float in $[0, 1]$ which represents the fraction of the total steps taken so far.\n", "\n", "The return value `transitions` of `replay_memory.sample` is a tuple which has the same entries as those given as parameters to `importance_criterion`. 
The `sample_weights` return value must be passed to the gradient step (see policy description).\n", "\n", "#### Memory requirements\n", "Depending on the size of your state, the memory requirements can be huge. For example, to store 100k transitions you need 10GB of memory or more. Check if your machine has enough memory or try a smaller replay memory." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### $\\varepsilon$-Greedy-Policy\n", "\n", "A policy class is already implemented for you. Initialize it as follows (feel free to change the parameters):\n", "```python\n", "policy = EpsilonGreedyPolicy(q_network)\n", "policy.initial_exploration = 1.0 # initial epsilon value\n", "policy.final_exploration = 0.01 # lowest epsilon value\n", "policy.evaluation_exploration = 0.001 # epsilon used during evaluation\n", "policy.final_exploration_step = 500_000 # number of steps over which epsilon is linearly decreased\n", "```\n", "Here, `q_network` is an instance of `QNetwork`.\n", "\n", "With probability $\\varepsilon$ the policy returns a random action (exploration). Otherwise, an action with maximal Q-value is returned. The probability $\\varepsilon$ is linearly decreased with the step number.\n", "\n", "You get an action from the policy by calling it (like a function):\n", "```python\n", "action_index, action = policy(preprocessed_state, step)\n", "```\n", "\n", "More methods:\n", "\n", "- `policy.copy()` creates an independent copy of the policy\n", "- `policy.gradient_step(states, actions, labels, sample_weights)` performs a gradient step. 
`states` and `actions` are the return values of the replay memory (first two elements in `transitions`), and `sample_weights` is the second return value of the replay memory.\n", "- `policy.copy_weights_from(other_policy)` copies over the weights from another policy.\n", "\n", "To compute the Q-values of the underlying network, use\n", "```python\n", "policy.q_network(states)\n", "```\n", "which returns a NumPy array where each row contains the outputs of the network. You need this to implement the label computation and for `importance_criterion`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Q-Learning Algorithm\n", "\n", "Implement the `train` method." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "from tqdm import tqdm_notebook as tqdm\n", "\n", "\n", "class DeepQLearning:\n", "    # After how many steps the weights are copied to the target-action network\n", "    target_network_update_frequency = 1_000 #n_update\n", "    discount_factor = 0.99 #gamma\n", "    # A random policy is run for that many steps to initialize the replay memory\n", "    replay_start_size = 5_000 #n_batch\n", "\n", "    def __init__(self, env, replay_memory, policy, preprocess=preprocess):\n", "        self.env = env\n", "        self.replay_memory = replay_memory\n", "        self.policy = policy\n", "        self.preprocess = preprocess\n", "        self.rewards = []\n", "        self.best_agent = None\n", "\n", "    def train(self, total_steps, replay_period, weight_filename=None, evaluate=None, double_dqn=True):\n", "        \"\"\"\n", "        Train the agent using DQN.\n", "\n", "        Parameters\n", "        ==========\n", "\n", "        total_steps: int\n", "            Number of steps the agent is trained for. #n_max\n", "        replay_period: int\n", "            Number of steps between which the network is trained. 
#n_replay\n", "        weight_filename: str or None\n", "            If not None the weights of Q-network are stored to this file during training.\n", "        evaluate: int or None\n", "            Number of episodes after which the policy is evaluated and the result is printed.\n", "        double_dqn: bool\n", "            Whether to use Double-DQN (DDQN).\n", "        \"\"\"\n", "        if len(self.replay_memory) == 0:\n", "            self.initialize_replay_memory()\n", "        action_value = self.policy # ~ theta\n", "        target_action_value = self.policy.copy() # ~ theta_hat\n", "        episode = 0\n", "        step = 0\n", "\n", "        while step < total_steps:\n", "            episode += 1\n", "            self.env.reset()\n", "            preprocessed_state = self.preprocess(self.env.state) #phi(s)\n", "            print(\"Episode {} ({} steps so far)\".format(episode, step))\n", "            for _ in tqdm(range(self.env.spec.max_episode_steps)):\n", "                step += 1\n", "                action_index, action = action_value(preprocessed_state, step)\n", "                new_state, reward, terminated, _info = self.env.step(action)\n", "                preprocessed_new_state = None\n", "                if not terminated:\n", "                    preprocessed_new_state = self.preprocess(new_state)\n", "                self.replay_memory.add(preprocessed_state, action_index, reward, preprocessed_new_state)\n", "                # Continue the episode from the new state\n", "                preprocessed_state = preprocessed_new_state\n", "\n", "                if step % replay_period == 0: #train with batch from memory\n", "                    def labels(s, actions, rewards, s2, not_terminal):\n", "                        \"\"\"\n", "                        input:\n", "                        s is a list of preprocessed states\n", "                        actions is a NumPy array of action indices (the action taken in s)\n", "                        rewards is a NumPy array of rewards received\n", "                        s2 is a list of preprocessed states (the new states). Only non-terminal states are returned.\n", "                        not_terminal is a NumPy array of booleans indicating which of the states in s were not terminal\n", "                        return\n", "                        y = r + 𝛾 Q_target(s', argmax_a' Q(s', a')), or y = r for terminal transitions\n", "                        \"\"\"\n", "                        y = np.array(rewards, dtype=float)\n", "                        if np.any(not_terminal):\n", "                            # Bootstrap from the new states s2 (non-terminal transitions only)\n", "                            q_target = target_action_value.q_network(s2)\n", "                            if double_dqn:\n", "                                # The online network picks the action, the target network evaluates it\n", "                                best = np.argmax(action_value.q_network(s2), axis=1)\n", "                                bootstrap = q_target[np.arange(len(best)), best]\n", "                            else:\n", "                                bootstrap = np.amax(q_target, axis=1)\n", "                            y[not_terminal] += self.discount_factor * bootstrap\n", "                        return y\n", "\n", "                    def my_criterion(s, actions, rewards, s2, not_terminal):\n", "                        \"\"\"\n", "                        input:\n", "                        the same as in labels\n", "                        return\n", "                        |y - Q(s, a)| = |(r + 𝛾 Q_target(s', argmax_a' Q(s', a'))) - Q(s, a)|\n", "                        \"\"\"\n", "                        Q = action_value.q_network(s)\n", "                        qval = np.array([Q[j, a] for j, a in enumerate(actions)])\n", "                        y = labels(s, actions, rewards, s2, not_terminal)\n", "                        return np.absolute(y - qval)\n", "\n", "                    transitions, sample_weights = self.replay_memory.sample(my_criterion, step / total_steps)\n", "                    s, actions, rewards, s2, not_terminal = transitions\n", "                    targets = labels(s, actions, rewards, s2, not_terminal)\n", "                    action_value.gradient_step(s, actions, targets, sample_weights)\n", "                if step % self.target_network_update_frequency == 0: #update\n", "                    target_action_value.copy_weights_from(action_value)\n", "\n", "                if terminated:\n", "                    break\n", "\n", "            if evaluate is not None and episode % evaluate == 0:\n", "                total_reward = self.evaluate(target_action_value, weight_filename)\n", "                print(\"Total reward: {}\".format(total_reward))\n", "\n", "    def initialize_replay_memory(self):\n", "        \"\"\"Initialize the replay memory using a random policy.\"\"\"\n", "        self.env.reset()\n", "        self.replay_memory.purge()\n", "        state = 
self.preprocess(self.env.state)\n", "        size = min(self.replay_start_size, self.replay_memory.capacity)\n", "        print(\"Initialize replay memory with {} transitions\".format(size))\n", "        for _ in tqdm(range(size)):\n", "            action_index, action = self.policy.sample(return_index=True)\n", "            new_state, reward, terminated, _info = self.env.step(action)\n", "            # Terminal transitions are stored with new_state=None\n", "            new_state = None if terminated else self.preprocess(new_state)\n", "            self.replay_memory.add(state, action_index, reward, new_state)\n", "            if terminated:\n", "                self.env.reset()\n", "                state = self.preprocess(self.env.state)\n", "            else:\n", "                state = new_state\n", "\n", "    def evaluate(self, policy, weight_filename=None):\n", "        state = self.env.reset()\n", "        total_reward = 0\n", "        for _ in tqdm(range(self.env.spec.max_episode_steps)):\n", "            # get action from policy\n", "            _, action = policy(self.preprocess(state))\n", "            state, r, terminal, _ = self.env.step(action)\n", "            total_reward = r + total_reward\n", "            if terminal:\n", "                break\n", "        if self.best_agent is None or total_reward > max(self.rewards):\n", "            self.best_agent = policy.copy()\n", "            if weight_filename is not None:\n", "                self.best_agent.q_network.save(weight_filename + '.best')\n", "        self.rewards.append(total_reward)\n", "        return total_reward" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's create all objects, set parameters, and start training. 
**Make sure the replay memory is not too big for your memory!**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "WARNING:tensorflow:From /home/racs/Documents/ProgPrakSoSe20/ENV/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.\n", "Instructions for updating:\n", "Colocations handled automatically by placer.\n" ] } ], "source": [ "# Get shape of transformed state\n", "s = preprocess(env.reset())\n", "img_shape = s[0].shape\n", "scalar_shape = s[1].shape\n", "\n", "# Create the Q-Network\n", "q_network = QNetwork((img_shape, scalar_shape), action_space)\n", "\n", "policy = EpsilonGreedyPolicy(q_network)\n", "policy.initial_exploration = 1.0 # initial epsilon value\n", "policy.final_exploration = 0.01 # lowest epsilon value\n", "policy.evaluation_exploration = 0.001 # epsilon used during evaluation\n", "policy.final_exploration_step = 500_000 # number of steps over which epsilon is linearly decreased\n", "\n", "# Create the (empty) replay memory\n", "replay_memory = ProportionalPrioritizationReplayMemory( #this isn't defined???\n", " img_shape, scalar_shape,\n", " # ATTENTION: This is most likely too much for a laptop\n", " capacity=800_000, batch_size=32)\n", "# capacity=5_000, batch_size=32)\n", " \n", "dqn = DeepQLearning(env, replay_memory, policy)\n", "dqn.target_network_update_frequency = 5_000\n", "dqn.replay_start_size = 512" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initialize replay memory with 512 transitions\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "4665b151a7e34274938657c79ed2a1cb", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=512), 
HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Episode 1 (0 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9c034a6bd2814678bcf796357b771938", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 2 (517 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a0a51771b9904ade97ad56448d2d7208", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 3 (1283 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5a0c046799b148589c1836133f9405c2", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 4 (2052 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6fac8274190c4eb796b89603ace5c175", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 5 (2669 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9ab97146cde849218fa0a6caed37a70a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { 
"application/vnd.jupyter.widget-view+json": { "model_id": "77f4684589c84733888ff117cbf335db", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -51.56043165467641\n", "Episode 6 (3566 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "57cf99332b554c77b9be93441db1dc8b", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 7 (4314 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9703bf6c9dc846449bc503d3e8724162", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 8 (5002 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "bf1dd0089fe248039b2037f7c5715f5d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 9 (5747 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "4c3adc97fc434ac6940f96eb684d4943", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 10 (6488 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": 
"df1106a24c4e4194ae3469dc56de88e8", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "867bd36324914cb5b77729d93a1ad29a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -47.552447552448356\n", "Episode 11 (7204 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d7f98836d6d74aeb8b430ab4def61ba6", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 12 (7866 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "772889475e3548f7af5122a2d6788851", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 13 (8607 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7a7dcebace4c44c38260e9b44e8cc8a5", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 14 (9325 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "02d7146f8741486e869a775eafdf33ea", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" 
] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 15 (10114 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "1e7b5195775e4900a5d0101bb79a88de", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "73223d395df44115b19db6f7c2e07173", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -30.000000000000085\n", "Episode 16 (10889 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "00f82a774a564e66a8227aa50e72d06d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 17 (11611 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "53f243a221fc4d21a841219472c4f449", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 18 (12142 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5334992dafa04630b5904314eabe8fd7", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 19 (12928 steps so far)\n" ] }, { 
"data": { "application/vnd.jupyter.widget-view+json": { "model_id": "fa1f6fb00252416d801980e030868c0c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 20 (13716 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "aed88c846f144fcaae2c707a4407ebb9", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0ead45e4d4214bd69f799f9bef314894", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -39.82808022922663\n", "Episode 21 (14272 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7a96d334e2fa4a5d9d98a1765d3607f8", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 22 (15063 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "8a7bdff9a00445ca87d2321c9ead7a3e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 23 (15844 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a8faa24d1284451b99922ad7f505b9e9", "version_major": 2, "version_minor": 0 }, 
"text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 24 (16461 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9fb4e1924de74d49858c8168e6fd8489", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 25 (17098 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "1615b3666c414f3fa2fcca8920d5b63a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "417a08b773b44568a6c751f743f34151", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -26.82926829268334\n", "Episode 26 (17766 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "cec8b8b4d3cd487eafe299b27a250fc6", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 27 (18339 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "68a13189b41e44e5a421ff2e78d6d765", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", 
"output_type": "stream", "text": [ "Episode 28 (19128 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0bfe5c1874c2467dbc6f9a70f5043535", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 29 (19880 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6e20d3a9be8e472bb64070bc0fefa40c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 30 (20645 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a2b37de888704663b99b687355e4ab1a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9ca6703f8f5548faaa09625a7086b6aa", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -35.897435897436104\n", "Episode 31 (21439 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "736b15d0cab9461994f8ca254bf290cd", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 32 (22087 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": 
"a6f37aaa26f8429582e3fb2457dd67d6", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 33 (22934 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a0586a4352d34959b2e46dba1c09b9de", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 34 (23768 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "617ba2dbef0f45fab68feb0485ff60ed", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 35 (24449 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2ba9667845dd4373bcd0fddd4a017288", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c79c3da1a8e74a54b99e9104d038f744", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -24.99999999999992\n", "Episode 36 (25228 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ddf6cfb11f6a40028163af14c2b49f8f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), 
HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 37 (25888 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ae6114dbba254352a76f2537164ae077", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 38 (26614 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b03b1df785494376ae89c4ff399c84f1", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 39 (27333 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a3fb412be68c4554ab8044fec4214d76", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 40 (28104 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0723e34db94a4b1cb8b09435abb69184", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "3bcedc69b88044388bc2a47edecd63b0", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -35.58282208588968\n", "Episode 41 (28643 steps so 
far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2cc9c96f3a424a6a806ffe77580d3d60", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 42 (29414 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "21aeca0cf6df481686de4dc76db436a3", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 43 (30124 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ca1694410431436a91cb03fd8befab10", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 44 (30720 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "fb95e616b6e44c7eb7f883412d1f4e68", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 45 (31500 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "15cadbc11cdb4e6fbc39d5f68ae8ff16", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "40629977fc9f4e488f86cda0c5bb4146", "version_major": 2, "version_minor": 0 }, "text/plain": [ 
"HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -22.509225092250823\n", "Episode 46 (32102 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f6a035dcfac044c9a93cde70715f03ad", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 47 (32982 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b8eb328c5b1f4ad4aeb9c36cbbe1294a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 48 (33982 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "26f9b6a62ed145dfa80225f19dfc0218", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 49 (34777 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a0cc98b23e8b4102a7074dab613fdca5", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 50 (35657 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "56c718488c55405aa37cb72d3aa10f5f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), 
HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "90effcbdef414e409eb6e0271f8aada0", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -23.07692307692306\n", "Episode 51 (36118 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f422368d773b4ba489078b42e607c45f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 52 (37118 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "cfc2d83cb8db4ccf84f840302d5acfde", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 53 (37615 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "1ff71bb69a7f4859b74ff99cce5f1664", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 54 (38482 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5e2c39558a234cb7a653ccb0de07f63b", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 55 (39052 steps so 
far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "731a0e12848542929f00740cef3efdaf", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7e6d38c3a9e84043beee8fb98f2e6d29", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -31.59609120521192\n", "Episode 56 (39526 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6ff3c81af4f048efa92c07da11135317", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 57 (40032 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "994ad944c32d40fbb95e35713da28f4f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 58 (40570 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "032b98a6441c4954be8f8266f3a1ae5e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 59 (41150 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a34d99fd409147729e45bbe3e5b687b7", "version_major": 2, "version_minor": 0 
}, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 60 (41965 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2ae1bf90ec3a43c28173b78e273789f7", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "403ceaedf913470f856224c06824823a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -30.232558139534934\n", "Episode 61 (42725 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2aecfb4d1f8749df99f96fe19f33bbd8", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 62 (43509 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "1b54dffd209e47068a945910f4060fa9", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 63 (44261 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6394a20eacc541ec84bdbe3a8e386eb1", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": 
"stdout", "output_type": "stream", "text": [ "Episode 64 (45048 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0a77ff0174ba47bfac5f7e5860c1ad7b", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 65 (45490 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "535afe779189463fbc018660f0f12602", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "45d484148afc433ab8b416fe1ddeee51", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -20.127795527156803\n", "Episode 66 (46490 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a0457d5baef546ca98af0ac522e02fbe", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 67 (47310 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "961ed15d442f42a4b55e3287e1bccbda", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 68 (48129 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": 
"f5a2400bca2148d28609445876d65a8e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 69 (48892 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "e3ee253889ad4ef19bdad6df577a5b4d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 70 (49644 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5bb495c6c0ae46539a8461476bdb91e8", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ab0d0400cbeb4ff0a625cab20445fab8", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -79.66101694915238\n", "Episode 71 (50327 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5b596937e5074a1fbcc95bc79b036037", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 72 (51302 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "fac62b12f8194c838ced7af9dde77311", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), 
HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 73 (51889 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9166a794399047968f82e1b973aa4cdd", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 74 (52559 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "8eadaeb785bd4717a64ab997047e81d6", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 75 (53131 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5b1466749c43436e907ccbc1b0a83115", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0b484844e00844c0be28ae4398ef9aea", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -77.09923664122137\n", "Episode 76 (53666 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "635d203ec401478c85a93047d8ffb9b6", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 77 (54487 steps so 
far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "729ab63698d84cb7830bf5894e5e05d5", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 78 (55151 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9225dd2d355345768e4e301016683cf4", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 79 (55938 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "62327ac52e7d44c5969c78f4662dbd99", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 80 (56626 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0cc1a96fc1f74394a479968a950643e8", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7da59706ebc94cb28297eeef98321cad", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -77.69516728624532\n", "Episode 81 (57307 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6d07928a9e5d494492314625a371e90a", "version_major": 2, "version_minor": 0 
}, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 82 (58307 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b2032f4fc1ae42979e7d87bfcfd876fe", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 83 (59131 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ab9209ff874e49648f0908c5abc1473d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 84 (59894 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ab4ce0928fb545ecaaa100fa02ca0487", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 85 (60894 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "238aa1226e9b40f8ab7c3b3d929a597d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "aa820ead48fa4c029ee1b36814895900", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ 
"Total reward: -80.95238095238071\n", "Episode 86 (61894 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7b11740b871b4888947051a1a51faa3d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 87 (62894 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6155a4d77b5a4c3293942578cf6780bb", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 88 (63894 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c51f0bb36195450cbfa924ee176b09c3", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 89 (64894 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9aa0583a116d4e02974f84eeffbd76a0", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 90 (65694 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "79591e31dd2a4d459bb83e6dc37723cd", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "98802a9778f949b3bb6a14bac5f4307c", 
"version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -83.22147651006665\n", "Episode 91 (66542 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "94d511cea61c414493df0a48f0d35cba", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 92 (67279 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "36eca489c4784c4f90049cabfe87952b", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 93 (68061 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ffa30520d32a4286833c078db69c8c03", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 94 (69061 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d545a74d49874ad984136e2e2ec49f4c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 95 (70061 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f75d0a61734c43729792e774b120f101", "version_major": 2, "version_minor": 0 }, "text/plain": [ 
"HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a3615b84bd1549e4801db5067e70de72", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -79.72972972972956\n", "Episode 96 (70518 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "666c7d25a9294d5eaab1671565f07278", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 97 (70996 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7cfdda185a0c4319b754d77a83010614", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 98 (71503 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "56c754e9caec40b08783b2e2b4e86388", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 99 (71930 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2cdfacf9434b46169f8544e7828cb5f5", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": 
"stream", "text": [ "Episode 100 (72390 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f4e0b864c20e4f4781cee8e7d9d86518", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5a7f86155ef34f29843e20ceb9232e34", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -79.45205479452041\n", "Episode 101 (72826 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2cfdaa7ce5234cb3b47b55c08cc7220d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 102 (73637 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b186ffed589d45629dd26dd912dc52d0", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 103 (74433 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ca3d9124b97e4db0a914adde5d12bb1d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 104 (75433 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": 
"1e18834eb2aa44658aae0eeb99e5c386", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 105 (76231 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d317dd8d381a436dac63c54bd1beafa9", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "def605c96c2d4a9f8f838c0a98c2eb0b", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -33.933933933934014\n", "Episode 106 (77231 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d30a255041a04f1bb7b7c887edeaab1c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 107 (78055 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "eb230fb511d649b98551b3a67efe357b", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 108 (78594 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "1cf47630f29d415caaa892586fdc979c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), 
HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 109 (79594 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0814b631a515479783b82084eca86f34", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 110 (80396 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "bacaedd8275e4e6d917dbe0a1ad08777", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "84d995768c3e4bb2aaf1688fb409dcc7", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: 73.74517374517373\n", "Episode 111 (81396 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "38a1e79fb3ac40bda20b72ee8fc30da2", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 112 (82396 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ec155fcbab50419c8ab1dbbb107e81da", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 113 (83396 steps 
so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5b3dada63dd741c1ab2af1feeb50b102", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 114 (84396 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "bd6bb3312d744c61ac81bfd752b00e60", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 115 (85396 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "bbca8667fa3546c5a0cf0099af2bbfb3", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "3b46bbaf1f334679b2efb628aeef4155", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -77.01149425287356\n", "Episode 116 (86238 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "65a6921a94864ae7ae6ea3a935ab0866", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 117 (87023 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "e0682e0c24984463b9ceec02765a4b06", "version_major": 2, 
"version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 118 (88023 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "e70001d8b61c4810a3c5906a1e248f27", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 119 (88850 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6d1d75896dc649de9c2b5e556e305aed", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 120 (89552 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "66b0898e31ae4a11817e08367dbb2232", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a0c85874242f4f52bc55f2edc6123792", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -79.9331103678928\n", "Episode 121 (90552 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7a8d035187f84d7cb39ab187702237a0", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" 
}, { "name": "stdout", "output_type": "stream", "text": [ "Episode 122 (91552 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "db00c3ea880a4a3eaad14ed3d9698bbc", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 123 (92552 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "1d9a6dcc04c241a5aa726351cc0ba8ad", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 124 (93552 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2fd083c2bbe545b8ac1a5ea7e7fa11ad", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 125 (94552 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d14b783522b444b8bb50d7b40bc660dd", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f408e185e74a4f2a86c004a140dd0a0e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -62.068965517241914\n", "Episode 126 (95552 steps so far)\n" ] }, { "data": { 
"application/vnd.jupyter.widget-view+json": { "model_id": "bfe3718596b54e808cf0c77c3318f0e2", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 127 (96387 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2d2d707d7dc04ae59bf74b2dfaedda4f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 128 (97387 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b2c530c9ed6742bb8cae4ea1a1937d9a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 129 (98387 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "58010fca51fc499cb091d7d42b171ab1", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 130 (99234 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "e3a32eb1cd1841ebb1bb130ecbb7d193", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "045e85743ac94032a5cc26c37d0e7e35", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, 
max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -73.1543624161075\n", "Episode 131 (100234 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ad62d8bdae824b3a83ec8c52c5d890d3", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 132 (100711 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b3bad64ec16a4302a8983a90d912a12c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 133 (101711 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "da6ae4e6f58f4323934c4f4bfb1c9da4", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 134 (102536 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d43551ef06b74b119ec204b07876b563", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 135 (103536 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2914fd3e089d4699ac96defce570eb81", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, 
"output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "3be33f2d749446a081a1689c75c5ad77", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -79.02097902097888\n", "Episode 136 (104536 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "bb0a912716954338a62e7f9093f08ccf", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 137 (105536 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "13ef93fb52a94db4a02b690767b8d55a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 138 (106536 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "bfa8947122224665be8c9015389b6ea5", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 139 (107536 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9a2e51e1123944a0abca1a67a36685b6", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 140 (108381 steps so far)\n" ] }, { "data": { 
"application/vnd.jupyter.widget-view+json": { "model_id": "afcebc6cdef94c3a8b328ecdfd03cb04", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ef1e128c50264e3894b120d2d1efed52", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -79.99999999999979\n", "Episode 141 (109381 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0d1b378dff344816b3bd5409dca41950", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 142 (110381 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "4447776ca1fb404481e5e44530bac555", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 143 (111381 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2cd56ec131a44b8889e8cd110d14ac85", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 144 (112381 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c2a9afb482164235b3a19bdf7a08d6f7", "version_major": 2, "version_minor": 0 }, "text/plain": 
[ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 145 (113381 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9b65474eb71744e7a81dcaaaeee2c901", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7fd791b0a8c14af18f625156adae5cb5", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -80.5194805194803\n", "Episode 146 (114381 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "bef9cd57a2ea4dc9856658c434ce523e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 147 (115381 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "96d07fb01365455981ce51473b710dbb", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 148 (116381 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "79841ca6f2894127b919263e9e0d6acd", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", 
"output_type": "stream", "text": [ "Episode 149 (117381 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "db0d7e0f2728434ea353f4476dd01299", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 150 (118381 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "523a9ba534514b839f28604501c0a06a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "e38d99b8ca434e549b99ac628a4216fe", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -80.13245033112563\n", "Episode 151 (119381 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "cfbaa17ed11849c4ac86eb114c7c1b5d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 152 (120210 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7447b036a8cb44af9d6853d86eb3a183", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 153 (121026 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": 
"4e40c9655a1d42138b9e82a53a5472aa", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 154 (122026 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "81ef3159d1264d3884406401d8aa6c3e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 155 (123026 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f9ac23670a9c4f1288ca273dfb00cc7a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "861a3c95db63467ea07dbfe9945583b0", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -80.64516129032235\n", "Episode 156 (124026 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "14a5d3abf4524eff86779eba689629bf", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 157 (125026 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d4641e6b74d84eddb03e8ea401528696", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), 
HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 158 (126026 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "be9016aef48546e6a252cb22da0468f5", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 159 (127026 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "29927fc5459e47fdbe1de3e668322d2a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 160 (128026 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2d9c387f9f254647be94044258f68559", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "97bb89a1de7e45b28b50f2e5d940e43d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -81.36645962732892\n", "Episode 161 (129026 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "da89a9f685044a02a1f778af1e5e8edf", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 162 (130026 
steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "56b51d6662e74e9d92c1412b09fe76f9", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 163 (131026 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "63f21baed31a4352add981ab319f70b8", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 164 (132026 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "de60730f25cc4c979189f6711e64e62c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 165 (132834 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "8dfacdb9984d4f538c0117a268509a8e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "815bdc4fe5904a4c854f294294c236a4", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Total reward: -75.60975609756105\n", "Episode 166 (133834 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "24de37ae3b324ad6825938c371758a0f", "version_major": 2, 
"version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 167 (134692 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "e0f52f1ebc39494d964029bc660b040a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 168 (135499 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "feae232cd72c4b6db4df6f80c227806f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 169 (136499 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b5edb963281445a583aaa191896267e9", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Episode 170 (137499 steps so far)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ac3c0bd21062463598b69e6b7f4a4224", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "142e544ae5174a018d3d4029579a282f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", 
"output_type": "stream", "text": [
"Total reward: -78.41726618705017\n", "Episode 171 (138499 steps so far)\n", "Episode 172 (139265 steps so far)\n", "Episode 173 (140265 steps so far)\n", "Episode 174 (141265 steps so far)\n", "Episode 175 (142265 steps so far)\n",
"Total reward: -79.38144329896886\n", "Episode 176 (143265 steps so far)\n", "Episode 177 (144265 steps so far)\n", "Episode 178 (145265 steps so far)\n", "Episode 179 (146265 steps so far)\n", "Episode 180 (147265 steps so far)\n",
"Total reward: -79.16666666666653\n", "Episode 181 (148265 steps so far)\n", "Episode 182 (149265 steps so far)\n", "Episode 183 (150265 steps so far)\n", "Episode 184 (151265 steps so far)\n", "Episode 185 (152265 steps so far)\n",
"Total reward: -83.1460674157297\n", "Episode 186 (153265 steps so far)\n", "Episode 187 (154265 steps so far)\n", "Episode 188 (155265 steps so far)\n", "Episode 189 (156265 steps so far)\n", "Episode 190 (157265 steps so far)\n",
"Total reward: -77.94117647058818\n", "Episode 191 (158265 steps so far)\n", "Episode 192 (159265 steps so far)\n", "Episode 193 (160265 steps so far)\n"
] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b2af6cc0188c4a929cf904b416974846", "version_major": 2, "version_minor": 0 }, "text/plain":
[ "HBox(children=(IntProgress(value=0, max=1000), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [
"Episode 194 (161265 steps so far)\n", "Episode 195 (162265 steps so far)\n",
"Total reward: -80.06644518272407\n", "Episode 196 (163265 steps so far)\n", "Episode 197 (164265 steps so far)\n", "Episode 198 (165265 steps so far)\n", "Episode 199 (166265 steps so far)\n", "Episode 200 (167265 steps so far)\n",
"Total reward: -80.26315789473664\n", "Episode 201 (168265 steps so far)\n", "Episode 202 (169265 steps so far)\n", "Episode 203 (170265 steps so far)\n", "Episode 204 (171265 steps so far)\n", "Episode 205 (172265 steps so far)\n",
"Total reward: -77.01149425287355\n", "Episode 206 (173265 steps so far)\n", "Episode 207 (174265 steps so far)\n", "Episode 208 (175265 steps so far)\n", "Episode 209 (176265 steps so far)\n", "Episode 210 (177095 steps so far)\n",
"Total reward: -78.02197802197793\n", "Episode 211 (178095 steps so far)\n", "Episode 212 (179095 steps so far)\n", "Episode 213 (180095 steps so far)\n", "Episode 214 (181095 steps so far)\n", "Episode 215 (182095 steps so far)\n",
"Total reward: -79.79797979797962\n", "Episode 216 (183095 steps so far)\n", "Episode 217 (184095 steps so far)\n", "Episode 218 (185095 steps so far)\n", "Episode 219 (186095 steps so far)\n", "Episode 220 (187095 steps so far)\n",
"Total reward: -79.23875432525938\n", "Episode 221 (188095 steps so far)\n", "Episode 222 (188937 steps so far)\n", "Episode 223 (189937 steps so far)\n", "Episode 224 (190937 steps so far)\n", "Episode 225 (191937 steps so far)\n",
"Total reward: 67.85714285714252\n", "Episode 226 (192937 steps so far)\n", "Episode 227 (193937 steps so far)\n", "Episode 228 (194937 steps so far)\n", "Episode 229 (195937 steps so far)\n", "Episode 230 (196937 steps so far)\n",
"Total reward: -80.06644518272407\n", "Episode 231 (197937 steps so far)\n", "Episode 232 (198937 steps so far)\n", "Episode 233 (199937 steps so far)\n", "Episode 234 (200937 steps so far)\n", "Episode 235 (201526 steps so far)\n",
"Total reward: -78.64768683274012\n", "Episode 236 (202526 steps so far)\n", "Episode 237 (203526 steps so far)\n"
] } ], "source": [ "#dqn.train(episodes=1000, max_steps_per_episode=1000, evaluate=5, weight_filename=\"agent.h5\")\n",
"dqn.train(total_steps=1_000_000, replay_period=512, weight_filename=\"agent.h5\", evaluate=5)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can watch the agent:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def render_policy(env, preprocess, policy):\n", "    \"\"\"Visualize a policy for an environment.\"\"\"\n", "    env.reset()\n", "    while True:\n", "        state = preprocess(env.state)\n", "        # policy(state)[1] is the action to take; env.step(...)[2] is the done flag\n", "        terminal = env.step(policy(state)[1])[2]\n", "        env.render()\n", "        if terminal:\n", "            break" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Optional: record a video of the agent. To watch without recording,\n", "# skip the two rec_env lines and pass \"env\" instead of \"rec_env\" to render_policy.\n", "\n", "rec_env = gym.wrappers.Monitor(env, \"recording-racing\", video_callable=lambda episode_id: True, force=True)\n", "rec_env.reset_video_recorder()\n", "render_policy(rec_env, preprocess, policy)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "env.close()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }