Python code to create a blackjack player that uses reinforcement learning

sarveshshah/AI-Black-Jack-Player

Blackjack using AI

We implement the Q-Star algorithm and design a blackjack game environment to help our AI play the game.

We use very basic Python libraries for the project

#Import Functions
import numpy as np
import random as r

The Q-Table Class

The Q-Table is effectively the brain of the AI. It's a matrix that records, for each scenario presented to the AI, how successful each possible action has been, so that the AI can pick the appropriate action.

The Q-Table has the following data:
State: A combination of Usable Ace [0-1] (discussed below), AI's Hand (sum of the player's cards) [2-21], and Dealer's Up Card [1-10].
Action: For each move the AI can play (Hit or Stay), the table holds its success record as (number of successful attempts, total attempts).

Below is an example of how a Q-Table might look in memory.

[Figure: Q-Table]

We start with our basic class definition for q_table

class q_table:
    def __init__(self):
        self.hitcount= 1
        self.totalgame= 1
        self.ratio=self.hitcount/self.totalgame

Let's define a q-table structure. The qtable stores the hitcount and the total games played, separated by a "_", for the Hit and Stay columns.

There are a total of 484 possible combinations in our states (2 * 11 * 22). Hence we give our numpy array a shape of (484,2).

#The qtable stores the values of the hitcount and the total games played separated by a "_" for Hit and Stay Column  
qtable=['0000000000_0000000000' for i in range(968)]
qtable=np.array(qtable)
qtable=(qtable.reshape(484,2))
print((qtable).shape)

We need to be able to extract the information for a given state, and since we cannot directly index a state string such as (0_1_1) in a numpy array, we map every combination from (0_0_0) to (1_10_21) to a number: its position in our index list. Each string is built as usable ace, dealer's card, player's sum.

Below is the code

index=[]

def generate_index():
    index=[]
    for i in range(2):
        for j in range(11):
            for k in range(22):
                # We use strings to access states
                index_i_j_k = str(i)+"_"+str(j)+"_"+str(k)
                index.append(index_i_j_k)
    return index

index=generate_index()
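
As a minimal sketch of the lookup described above, the snippet below rebuilds the same list of state strings and finds the row for one state. The helper `state_to_row` and its closed-form arithmetic are illustrative assumptions, not part of the original code; they rely only on the loop order shown above (usable ace, then dealer card, then player sum).

```python
def generate_index():
    """Build the list of state strings in the same loop order as above."""
    index = []
    for ace in range(2):                 # usable ace: 0 or 1
        for dealer in range(11):         # dealer's card: 0-10
            for total in range(22):      # player's sum: 0-21
                index.append(f"{ace}_{dealer}_{total}")
    return index


def state_to_row(ace, dealer, total):
    """Hypothetical helper: closed-form row number for a state."""
    return ace * 11 * 22 + dealer * 22 + total


index = generate_index()

# Row lookup by string, the way the Q-table code does it.
row = index.index("0_10_17")
print(row)                       # 237
print(state_to_row(0, 10, 17))   # 237, the same row without scanning the list
print(len(index))                # 484
```

`list.index` scans the list on every call, so the arithmetic form (or a dictionary) would avoid a linear search. The list lookup is kept in this walkthrough because that is what the rest of the code uses.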

Now that we have our Q-table defined, let's write the code to update it as the AI plays the game. As explained above, we look up the numeric index for the given state in the index list and update the values at that row in our numpy array.

We have a row_index variable that essentially holds the 'State' of the Game and a col_index variable which checks which action was taken.

# The update_qtable updates the q_table with Action taken (Hit or stay), Dealer's card and player's card.
def update_qtable(action, dealer_cards, player_cards, usable_ace, result, obj_qtable):
    qtable=obj_qtable.qtable
    
    # Result variable shows whether the given move was a 'Win' or a 'Lose'.
    print(result)
    
    # Here we simply output the state of the game.
    print("function called, Action decided: {} Dealer Cards: {} Player Cards: {} Result: {} "
          .format(action, dealer_cards[0], player_cards, result))
    
    # Based on the action assign the column to be read.
    if action=='Hit':
        col_index=0
    else:
        col_index=1
    
    # We create a temporary game object to access necessary functions.
    tempgame = game()
    print(tempgame.sum_cards(player_cards))
    
    # As defined above, we store the state in row_index variable.
    row_index = obj_qtable.index.index(str(usable_ace)+"_"+str(dealer_cards[0])+"_"+str(tempgame.sum_cards(player_cards)))
    print(row_index)
    
    # We extract the q_value from the table
    q_value=qtable[row_index,col_index]
    
    # We split the elements on an '_' and update the counts based on the result.
    if(result=='Win'):
        counts=(q_value.split("_"))
        counts[0]=int(counts[0])+1
        counts[1]=int(counts[1])+1

    else:
        counts=(q_value.split("_"))
        counts[1]=int(counts[1])+1
    
    # Reformat the updated values
    q_value=str(counts[0])+'_'+str(counts[1])
    
    # We return the updated values back to the table
    qtable[row_index,col_index]=q_value
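
The count update on a single cell can be sketched in isolation. The helper name `update_cell` is hypothetical; it only mirrors the split, increment, and rejoin steps of `update_qtable` for one "wins_total" string.

```python
def update_cell(cell, won):
    """Hypothetical helper: return the cell string after one more attempt."""
    wins, total = (int(part) for part in cell.split("_"))
    if won:
        wins += 1      # a successful attempt counts towards both numbers
    total += 1
    return f"{wins}_{total}"


cell = "0000000000_0000000000"   # initial value used in the table
cell = update_cell(cell, won=True)
print(cell)   # 1_1
cell = update_cell(cell, won=False)
print(cell)   # 1_2
```

Unlike the function above, this sketch converts both counts to integers before rejoining them, so a first loss is stored as `0_1` rather than `0000000000_1`.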

AI Action is the function we use to choose what the AI should do based on the q-values.

Like above, it first extracts the values from q-table and then takes the action based on the percentage of winning.

# The ai_action is made for the AI player to take actions once it has learned sufficient number of times to take further actions
def ai_action(dealer_cards,player_cards,usable_ace,obj_qtable):
    tempgame = game()
    qtable=obj_qtable.qtable
    row_index = obj_qtable.index.index(str(usable_ace)+"_"+str(dealer_cards[0])+"_"+str(tempgame.sum_cards(player_cards)))
    q_value=qtable[row_index,0]

    counts=(q_value.split("_"))
    
    # This conditions states whether it should hit or stay
    if (int(counts[1])==0 or int(counts[0])/int(counts[1]) > 0.5):
        return 'Hit'
    else:
        return 'Stay'
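
The decision rule can be tried on a few cell values without the rest of the game. This is a sketch under the assumption that a cell is a "wins_total" string as described earlier; `choose_action` is a hypothetical name that applies the same threshold as `ai_action`.

```python
def choose_action(hit_cell):
    """Hypothetical helper applying the same threshold as ai_action."""
    wins, total = (int(part) for part in hit_cell.split("_"))
    # Unvisited states default to 'Hit'; otherwise hit only if hitting
    # succeeded more than half of the time.
    if total == 0 or wins / total > 0.5:
        return "Hit"
    return "Stay"


print(choose_action("0000000000_0000000000"))  # Hit (state never seen)
print(choose_action("3_4"))                    # Hit
print(choose_action("1_3"))                    # Stay
print(choose_action("1_2"))                    # Stay (ratio is exactly 0.5)
```

Only the Hit column is consulted by this rule; the Stay column is updated during play but is not read when choosing an action.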

Blackjack Game Environment

Let's start with the Blackjack game environment.

Game Environment

Blackjack has the following elements in its simplest form.
The deck of cards.
Player's Hand.
Dealer's Hand.
We initialize these as attributes of our class.

class game:
    def __init__(self):
        self.dealer_cards = []
        self.player_cards = []
        self.deck = []

The Deck

A deck is simply 52 cards or 'numbers': values from 1 to 10 plus 3 face cards (K, Q, J) in each of the 4 suits.
Since the value of every face card in blackjack is 10, we simulate the deck procedurally using a for loop.

# Random card generator which generates card from 1 to 13, taking the count of face cards as 10
def card_generator(self,deck):  
    for i in range(4):
        for j in range(1,14):
            if(j>=10):
                self.deck.append(10)
            else:
                self.deck.append(j)
    # We shuffle the cards, hence making our deck 'Shuffled' and ready to use.
    r.shuffle(self.deck)
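
A quick way to check the deck composition is to count the values after building it. The stand-alone function `build_deck` below is a hypothetical rewrite of `card_generator` that returns a new list instead of appending to `self.deck`.

```python
import random
from collections import Counter


def build_deck():
    """Hypothetical stand-alone version of card_generator."""
    deck = []
    for _suit in range(4):
        for rank in range(1, 14):
            deck.append(min(rank, 10))   # ranks 10-13 all count as 10
    random.shuffle(deck)
    return deck


deck = build_deck()
counts = Counter(deck)
print(len(deck))    # 52
print(counts[10])   # 16 (the tens plus three face cards in each suit)
print(counts[1])    # 4
```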

Now we define some simple card operations, such as resetting the deck and drawing cards.

# Resets the deck and calls the card generator function
def reset_deck(self,deck):  
    self.card_generator(self.deck)

Now, if there are very few cards left, we risk players being able to count cards. Hence we reset the deck once only a few cards are left.

# Draws card from the deck and also checks the cards in deck, if it goes below 7 it resets the deck
# We simulate a draw with the list.pop() function, which mimics taking the last card out from the deck.
def draw_card(self,deck):  
    if(len(deck)<=7):
        deck = self.reset_deck(self.deck)
    return self.deck.pop()
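
The top-up behaviour can be sketched outside the class. The names `fresh_deck` and the `threshold` parameter are assumptions for illustration; as in the code above, "resetting" adds a new 52-card deck to whatever is left and reshuffles everything.

```python
import random


def fresh_deck():
    """52 card values, with face cards counted as 10."""
    return [min(rank, 10) for _suit in range(4) for rank in range(1, 14)]


def draw_card(deck, threshold=7):
    """Hypothetical stand-alone draw: top up a low deck, then pop a card."""
    if len(deck) <= threshold:
        deck.extend(fresh_deck())
        random.shuffle(deck)
    return deck.pop()


deck = fresh_deck()
random.shuffle(deck)
for _ in range(45):
    draw_card(deck)
print(len(deck))   # 7 cards left, so the next draw tops up the deck first
draw_card(deck)
print(len(deck))   # 58
```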

The deal_cards function deals two cards each to the dealer and the player (the AI).

def deal_cards(self,deck):  # 2 Dealer and 2 player cards are drawn
    self.dealer_cards.append(self.draw_card(self.deck))
    self.dealer_cards.append(self.draw_card(self.deck))
    self.player_cards.append(self.draw_card(self.deck))
    self.player_cards.append(self.draw_card(self.deck))

Usable Ace

Blackjack has a rule where the ace can take a value of 1 or 11. This is purely up to the player, so our AI needs to be able to decide the same. The usable_ace function returns 1 if an ace in the hand can be counted as 11 without going over 21, and 0 if it has to be counted as 1.

def usable_ace(self,player_cards):  # Checks, does this hand have a usable ace?
    if(1 in (self.player_cards) and sum(self.player_cards) + 10 <= 21):
        return 1
    else:
        return 0
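
To see the rule on concrete hands, here is a minimal sketch written as plain functions. It is an assumption-laden variant: unlike the method above, which inspects `self.player_cards`, these functions check whichever hand is passed in, and `hand_total` is a hypothetical name.

```python
def usable_ace(cards):
    """Return 1 if an ace in `cards` can count as 11 without busting."""
    return int(1 in cards and sum(cards) + 10 <= 21)


def hand_total(cards):
    """Hand value, counting one ace as 11 when it is usable."""
    return sum(cards) + 10 if usable_ace(cards) else sum(cards)


print(usable_ace([1, 8]), hand_total([1, 8]))        # 1 19
print(usable_ace([1, 8, 9]), hand_total([1, 8, 9]))  # 0 18
print(usable_ace([10, 7]), hand_total([10, 7]))      # 0 17
```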

Let's look at some other functions essential to the game.
sum_cards(): Returns the sum of the value of the cards, possessed by the entity.
is_bust(): Checks for a Blackjack bust. If sum > 21, it is called Bust.
hit(): Hit occurs when the player asks for a card.
stay(): When player decides to stay with his cards.
dealer_hit(): Simulates the dealer's actions after the player's moves are over.

def sum_cards(self,cards):  # Returns the current hand total
    if self.usable_ace(cards):
        return sum(cards) + 10
    else:
        return sum(cards)
def is_bust(self,cards):  # Checks, if this hand is a bust?
    return self.sum_cards(cards) > 21
def hit(self,player_cards,deck,dealer_cards, qtable):  # Draws a card for the player, then checks if the hand is a bust
    self.player_cards.append(self.draw_card(self.deck))
    if(len(player_cards)<=2):
        pass
    else:
        if (not self.is_bust(self.player_cards)):
            # Q-Table is updated with the action and the result of the game.
            q_table.update_qtable('Hit', self.dealer_cards, self.player_cards[:-1], self.usable_ace(self.player_cards), 'Win', qtable)
        else:
            # Q-Table is updated with the action and the result of the game.
            q_table.update_qtable('Hit', self.dealer_cards, self.player_cards[:-1], self.usable_ace(self.player_cards), 'Lose', qtable)
def stay(self,player_cards,deck,dealer_cards,qtable):  
    # When player stays, the dealer checks its sum and ends the game if it is greater than the player   
    if(self.sum_cards(self.dealer_cards)>self.sum_cards(self.player_cards)):
        q_table.update_qtable('Stay', self.dealer_cards, self.player_cards, self.usable_ace(self.player_cards), 'Lose', qtable)
        print(self.sum_cards(self.dealer_cards),self.sum_cards(self.player_cards))
        
    # If the dealer's sum is less than 22 and less than the player's, the dealer hits
    else:
        # The dealer keeps hitting while its total is below 22 and below the player's total.
        while(self.sum_cards(self.dealer_cards)<22 and self.sum_cards(self.dealer_cards)<self.sum_cards(self.player_cards)):
            self.dealer_hit(self.player_cards,deck,self.dealer_cards)
            print("Dealer Cards",self.dealer_cards,self.player_cards)
            print("Player Cards",self.sum_cards(self.dealer_cards),self.sum_cards(self.player_cards))
        if(self.is_bust(self.dealer_cards)): # If dealer busts, we update in the q table as a win for us
            # Q-Table is updated with the action and the result of the game.
            q_table.update_qtable('Stay', self.dealer_cards, self.player_cards, self.usable_ace(self.player_cards), 'Win', qtable)
        else:
            # Q-Table is updated with the action and the result of the game.
            q_table.update_qtable('Stay', self.dealer_cards, self.player_cards, self.usable_ace(self.player_cards), 'Lose', qtable)
            
# Checks if the dealer hits the card then is it getting bust, if not it hits the card

def dealer_hit(self,player_cards,deck,dealer_cards):  
    if not self.is_bust(self.dealer_cards):
        self.dealer_cards.append(self.draw_card(self.deck))
    else:
        print("Burst! Game over")
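
How a Stay is resolved can be sketched with scripted cards so the outcome is predictable. The helpers `resolve_stay` and `scripted` are hypothetical; the dealer rule (hit while below 22 and below the player's total, win only if the dealer busts) is taken from the `stay()` function above.

```python
def hand_total(cards):
    """Hand value, counting one ace as 11 when that does not bust."""
    total = sum(cards)
    return total + 10 if 1 in cards and total + 10 <= 21 else total


def resolve_stay(player_cards, dealer_cards, draw):
    """Hypothetical helper: play out the dealer after the player stays.

    `draw` is any zero-argument callable that returns the next card.
    """
    player = hand_total(player_cards)
    dealer_cards = list(dealer_cards)   # work on a copy
    while hand_total(dealer_cards) < 22 and hand_total(dealer_cards) < player:
        dealer_cards.append(draw())
    return "Win" if hand_total(dealer_cards) > 21 else "Lose"


def scripted(cards):
    """Return a draw function that hands out `cards` in order."""
    it = iter(cards)
    return lambda: next(it)


print(resolve_stay([10, 9], [10, 6], scripted([10])))  # Win: dealer busts with 26
print(resolve_stay([10, 7], [10, 8], scripted([])))    # Lose: dealer is already ahead
print(resolve_stay([10, 8], [9, 5], scripted([4])))    # Lose: dealer reaches 18, a tie
```

Under this rule a tie is recorded as a loss for the player, because the only winning outcome is a dealer bust.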

Now we initialize the Blackjack game here and use epsilon to train the AI.

class Blackjack:
    epsilon=1 # We use a decrementing rate from 1 to 0 for epsilon and run at least 10000 training episodes so that the player has explored enough conditions to play on its own
    training_episodes=training_episodes_left=10000
    small_decrement = (0.1 * epsilon) / (0.5 * training_episodes) # reduces epsilon slowly
    big_decrement = (0.3 * epsilon) / (0.6 * training_episodes)

    game1=game()
    game1.card_generator(game1.deck)
    game1.deal_cards(game1.deck)


    print(game1.deck)#prints the deck of the game initially
    print("Dealer Cards",game1.dealer_cards)
    print("Player Cards",game1.player_cards)

    qtable = q_table()

    """ while not game1.is_bust(game1.player_cards):
        game1.hit(game1.player_cards,game1.deck,game1.dealer_cards, qtable)
        print("Player Cards",game1.player_cards)
        print("Player Pre hit Cards",game1.player_cards[:-1]) """

    while(epsilon>0): # The AI player keeps exploring the environment until epsilon reaches 0
        game1.player_cards.clear()
        game1.dealer_cards.clear()
        game1.card_generator(game1.deck)
        game1.deal_cards(game1.deck)

        while not game1.is_bust(game1.player_cards):
            game1.hit(game1.player_cards,game1.deck,game1.dealer_cards, qtable)

        if training_episodes_left > 0.7 * training_episodes:#While more than 70% of the training episodes remain, use the small decrement
            epsilon -= small_decrement
        elif training_episodes_left > 0.3 * training_episodes:#While more than 30% of the training episodes remain, use the bigger decrement
            epsilon -= big_decrement
        elif training_episodes_left > 0:
            epsilon -= small_decrement
        else:
            epsilon = 0.0

        training_episodes_left -= 1
        print("Epsilon:",epsilon, "Training Episodes:",training_episodes_left)



    print("-------------------------------------------------------------------------")

    print("Training Q Table")
    print(qtable.qtable)



    for i in range(1000): #For the next 1000 games, the AI player picks each action from the q table, which keeps being updated as it plays
        game1.player_cards.clear()
        game1.dealer_cards.clear()
        game1.card_generator(game1.deck)
        game1.deal_cards(game1.deck)
        while not game1.is_bust(game1.player_cards):
            action=q_table.ai_action(game1.dealer_cards,game1.player_cards,game1.usable_ace(game1.player_cards),qtable)

            if(action=='Hit'):
                game1.hit(game1.player_cards,game1.deck,game1.dealer_cards, qtable)
                print("AI Hits")
            else:
                game1.stay(game1.player_cards,game1.deck,game1.dealer_cards, qtable)
                print("AI Stays")
                break
    
    print("Testing Q Table")
    print(qtable.qtable)  
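
The epsilon schedule can be examined on its own. The sketch below is an assumption-based reading of the training loop: the generator `epsilon_schedule` is hypothetical, and it compares the episodes left against the total number of training episodes, which is what the 70% and 30% comments describe.

```python
def epsilon_schedule(training_episodes=10000, epsilon=1.0):
    """Hypothetical helper: yield epsilon after each training episode."""
    small = (0.1 * epsilon) / (0.5 * training_episodes)
    big = (0.3 * epsilon) / (0.6 * training_episodes)
    left = training_episodes
    while left > 0:
        if left > 0.7 * training_episodes:
            epsilon -= small     # first 30% of the episodes
        elif left > 0.3 * training_episodes:
            epsilon -= big       # middle 40% of the episodes
        else:
            epsilon -= small     # final 30% of the episodes
        left -= 1
        yield epsilon


values = list(epsilon_schedule())
print(len(values))            # 10000
print(round(values[0], 5))    # 0.99998
print(round(values[-1], 2))   # 0.68
```

With these constants the decrements add up to 0.32, so epsilon is still near 0.68 after the last episode; the loop above then sets it to 0.0 once no episodes remain.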

THE ENTIRE CODE

#Import Functions

import numpy as np
import random as r

class q_table:
  def __init__(self):
    self.hitcount= 1
    self.totalgame= 1
    self.ratio=self.hitcount/self.totalgame
  
  index=[]
  
  def generate_index():
    index=[]
    for i in range(2):
      for j in range(11):
        for k in range(22):
          index_i_j_k = str(i)+"_"+str(j)+"_"+str(k)
          index.append(index_i_j_k)
    return index
  
  index=generate_index()
  
  #The qtable stores the values of the hitcount and the total games played separated by a "_" for Hit and Stay Column  
  qtable=['0000000000_0000000000' for i in range(968)]
  qtable=np.array(qtable)
  qtable=(qtable.reshape(484,2))
  print((qtable).shape)
  
  
 
  # The update_qtable updates the q_table with Action taken (Hit or stay), Dealer's card and player's card.
  def update_qtable(action, dealer_cards, player_cards, usable_ace, result, obj_qtable):
    qtable=obj_qtable.qtable
    print(result)
    print("function called, Action decided: {} Dealer Cards: {} Player Cards: {} Result: {} ".format(action, dealer_cards[0], player_cards, result))
    if action=='Hit':
      col_index=0
    else:
      col_index=1
    
    tempgame = game()
    
    print(tempgame.sum_cards(player_cards))
    row_index = obj_qtable.index.index(str(usable_ace)+"_"+str(dealer_cards[0])+"_"+str(tempgame.sum_cards(player_cards)))# Calling tempgame for the Q table in the format (Usable ace (0 or 1), dealers cards and players cards)
    
    print(row_index)
    
    print("Before")
    #print(qtable[row_index,col_index])
    q_value=qtable[row_index,col_index]
    print(q_value)
    
    if(result=='Win'):
      counts=(q_value.split("_"))
      counts[0]=int(counts[0])+1
      counts[1]=int(counts[1])+1

    else:
      counts=(q_value.split("_"))
      counts[1]=int(counts[1])+1
      
    q_value=str(counts[0])+'_'+str(counts[1])
    
    print("After")
    print(q_value)
    qtable[row_index,col_index]=q_value
    

    
      
# The ai_action is made for the AI player to take actions once it has learned sufficient number of times to take further actions
  def ai_action(dealer_cards,player_cards,usable_ace,obj_qtable):
    tempgame = game()
    qtable=obj_qtable.qtable
    row_index = obj_qtable.index.index(str(usable_ace)+"_"+str(dealer_cards[0])+"_"+str(tempgame.sum_cards(player_cards)))
    q_value=qtable[row_index,0]

    counts=(q_value.split("_"))
    
    
    #q_value_stay=qtable[row_index,1]
    # This conditions states whether it should hit or stay
    if (int(counts[1])==0 or int(counts[0])/int(counts[1]) > 0.5):
      return 'Hit'
    else:
      return 'Stay'


class game:
  def __init__(self):
    self.dealer_cards = []
    self.player_cards = []
    self.deck = []
    # self.game_result = win/lose
    #q_table1=q_table()
  
    # Random card generator which generates card from 1 to 13, taking the count of face cards as 10
  def card_generator(self,deck):  
    for i in range(4):
      for j in range(1,14):
        if(j>=10):
          self.deck.append(10)
        else:
          self.deck.append(j)
    r.shuffle(self.deck)
  
  def reset_deck(self,deck):  # Resets the deck and calls the card generator function
    self.card_generator(self.deck)
  
  def draw_card(self,deck):  # draws card from the deck and also checks the cards in deck, if it goes below 7 it resets the deck
    if(len(deck)<=7):
      deck = self.reset_deck(self.deck)
    return self.deck.pop()
  
  def deal_cards(self,deck):  # 2 Dealer and 2 player cards are drawn
    self.dealer_cards.append(self.draw_card(self.deck))
    self.dealer_cards.append(self.draw_card(self.deck))
    self.player_cards.append(self.draw_card(self.deck))
    self.player_cards.append(self.draw_card(self.deck))
    
  def usable_ace(self,player_cards):  # Checks, does this hand have a usable ace?
    if(1 in (self.player_cards) and sum(self.player_cards) + 10 <= 21):
      return 1
    else:
      return 0
  
  def sum_cards(self,cards):  # Returns the current hand total
      if self.usable_ace(cards):
          return sum(cards) + 10
      else:
        return sum(cards)
  
  def is_bust(self,cards):  # Checks, if this hand is a bust?
    return self.sum_cards(cards) > 21
  
  def hit(self,player_cards,deck,dealer_cards, qtable):  # Checks if the hand is a bust and draws a card for the player
    self.player_cards.append(self.draw_card(self.deck))
    if(len(player_cards)<=2):
      tmp=0
    else:
      if (not self.is_bust(self.player_cards)):
        q_table.update_qtable('Hit', self.dealer_cards, self.player_cards[:-1], self.usable_ace(self.player_cards), 'Win', qtable)
      else:
        q_table.update_qtable('Hit', self.dealer_cards, self.player_cards[:-1], self.usable_ace(self.player_cards), 'Lose', qtable)
      
  def dealer_hit(self,player_cards,deck,dealer_cards):  # Checks if the dealer hits the card then is it getting bust, if not it hits the card
    if not self.is_bust(self.dealer_cards):
      self.dealer_cards.append(self.draw_card(self.deck))
    else:
      print("Burst! Game over")
         
  def stay(self,player_cards,deck,dealer_cards,qtable):  # When player stays, the dealer checks its sum and ends the game if it is greater than the player
                                                  # If the dealer's sum is less than 22 and less than the player, it will hit    
    if(self.sum_cards(self.dealer_cards)>self.sum_cards(self.player_cards)):
        q_table.update_qtable('Stay', self.dealer_cards, self.player_cards, self.usable_ace(self.player_cards), 'Lose', qtable)
        print(self.sum_cards(self.dealer_cards),self.sum_cards(self.player_cards))
    else:
      while(self.sum_cards(self.dealer_cards)<22 and self.sum_cards(self.dealer_cards)<self.sum_cards(self.player_cards)): #When the AI player has opted to stay, the dealer hits while its sum is less than 22 and less than the player's sum.
        self.dealer_hit(self.player_cards,deck,self.dealer_cards)
        print("Dealer Cards",self.dealer_cards,self.player_cards)
        print("Player Cards",self.sum_cards(self.dealer_cards),self.sum_cards(self.player_cards))
      if(self.is_bust(self.dealer_cards)): # If dealer busts, we update in the q table as a win for us
        q_table.update_qtable('Stay', self.dealer_cards, self.player_cards, self.usable_ace(self.player_cards), 'Win', qtable)
      else:
        q_table.update_qtable('Stay', self.dealer_cards, self.player_cards, self.usable_ace(self.player_cards), 'Lose', qtable)
            
            
class Blackjack:
  epsilon=1 # We use a decrementing rate from 1 to 0 for epsilon and run at least 10000 training episodes so that the player has explored enough conditions to play on its own
  training_episodes=training_episodes_left=10000
  small_decrement = (0.1 * epsilon) / (0.5 * training_episodes) # reduces epsilon slowly
  big_decrement = (0.3 * epsilon) / (0.6 * training_episodes)
  
  game1=game()
  game1.card_generator(game1.deck)
  game1.deal_cards(game1.deck)
  
  
  print(game1.deck)#prints the deck of the game initially
  print("Dealer Cards",game1.dealer_cards)
  print("Player Cards",game1.player_cards)
  #print("Player Pre hit Cards",game1.player_cards[:-1])

  qtable = q_table()
 
  """ while not game1.is_bust(game1.player_cards):
    game1.hit(game1.player_cards,game1.deck,game1.dealer_cards, qtable)
    print("Player Cards",game1.player_cards)
    print("Player Pre hit Cards",game1.player_cards[:-1]) """

  while(epsilon>0): # The AI player keeps exploring the environment until epsilon reaches 0
    game1.player_cards.clear()
    game1.dealer_cards.clear()
    game1.card_generator(game1.deck)
    game1.deal_cards(game1.deck)

    while not game1.is_bust(game1.player_cards):
      game1.hit(game1.player_cards,game1.deck,game1.dealer_cards, qtable)
    
    if training_episodes_left > 0.7 * training_episodes:#While more than 70% of the training episodes remain, use the small decrement
      epsilon -= small_decrement
    elif training_episodes_left > 0.3 * training_episodes:#While more than 30% of the training episodes remain, use the bigger decrement
      epsilon -= big_decrement
    elif training_episodes_left > 0:
      epsilon -= small_decrement
    else:
      epsilon = 0.0
      
    training_episodes_left -= 1
    print("Epsilon:",epsilon, "Training Episodes:",training_episodes_left)
    
  
  for i in range(100): #Separator lines that divide the training output from the testing output.
    print("-------------------------------------------------------------------------")
    
  print("Training Q Table")
  print(qtable.qtable)
  
  
  
  for i in range(1000): #For the next 1000 games, the AI player picks each action from the q table, which keeps being updated as it plays
    game1.player_cards.clear()
    game1.dealer_cards.clear()
    game1.card_generator(game1.deck)
    game1.deal_cards(game1.deck)
    while not game1.is_bust(game1.player_cards):
      action=q_table.ai_action(game1.dealer_cards,game1.player_cards,game1.usable_ace(game1.player_cards),qtable)
    
      if(action=='Hit'):
        game1.hit(game1.player_cards,game1.deck,game1.dealer_cards, qtable)
        print("AI Hits")
      else:
        game1.stay(game1.player_cards,game1.deck,game1.dealer_cards, qtable)
        print("AI Stays")
        break

  print("Testing Q Table")    
  print(qtable.qtable)
  
(484, 2)
[2, 3, 10, 10, 5, 9, 6, 8, 1, 10, 10, 10, 1, 8, 10, 10, 9, 7, 9, 2, 5, 3, 10, 3, 7, 5, 3, 7, 4, 6, 7, 4, 10, 10, 4, 10, 4, 2, 8, 1, 2, 6, 8, 10, 5, 1, 10, 10]
Dealer Cards [10, 10]
Player Cards [9, 6]
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 10] Result: Lose 
17
61
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.99998 Training Episodes: 9999
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10] Result: Win 
20
42
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10, 1] Result: Lose 
21
43
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.99996 Training Episodes: 9998
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 3] Result: Win 
10
208
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 3, 10] Result: Lose 
20
218
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9999399999999999 Training Episodes: 9997
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 5] Result: Win 
10
98
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 5, 3] Result: Lose 
13
101
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9999199999999999 Training Episodes: 9996
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4] Result: Win 
10
230
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4, 3] Result: Win 
13
233
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4, 3, 4] Result: Win 
17
237
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4, 3, 4, 2] Result: Lose 
19
239
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9998999999999999 Training Episodes: 9995
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 5] Result: Lose 
13
145
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9998799999999999 Training Episodes: 9994
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 8] Result: Win 
18
84
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 8, 3] Result: Lose 
21
87
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9998599999999999 Training Episodes: 9993
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 10] Result: Lose 
13
189
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9998399999999998 Training Episodes: 9992
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 1] Result: Win 
9
229
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 1, 10] Result: Lose 
19
239
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9998199999999998 Training Episodes: 9991
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 2] Result: Win 
10
120
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 2, 3] Result: Win 
13
123
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 2, 3, 1] Result: Win 
14
124
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 2, 3, 1, 7] Result: Lose 
21
131
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9997999999999998 Training Episodes: 9990
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 1] Result: Win 
5
181
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 1, 8] Result: Lose 
13
189
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9997799999999998 Training Episodes: 9989
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 7] Result: Lose 
17
215
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9997599999999998 Training Episodes: 9988
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 5] Result: Win 
15
169
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 5, 2] Result: Lose 
17
171
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9997399999999997 Training Episodes: 9987
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 8] Result: Win 
9
207
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 8, 9] Result: Lose 
18
216
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9997199999999997 Training Episodes: 9986
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 2] Result: Win 
11
231
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 2, 6] Result: Win 
17
237
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 2, 6, 1] Result: Lose 
18
238
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9996999999999997 Training Episodes: 9985
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 9] Result: Win 
16
236
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 9, 2] Result: Win 
18
238
Before
0000000000_1
After
1_2
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 9, 2, 1] Result: Lose 
19
239
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9996799999999997 Training Episodes: 9984
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 1] Result: Win 
10
54
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 1, 10] Result: Win 
20
64
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 1, 10, 1] Result: Lose 
21
65
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9996599999999997 Training Episodes: 9983
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 10] Result: Win 
13
145
Before
0000000000_1
After
1_2
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 10, 7] Result: Lose 
20
152
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9996399999999996 Training Episodes: 9982
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 8] Result: Lose 
18
238
Before
1_2
After
1_3
Epsilon: 0.9996199999999996 Training Episodes: 9981
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 4] Result: Win 
14
58
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 4, 1] Result: Lose 
15
59
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9995999999999996 Training Episodes: 9980
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3] Result: Win 
11
231
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3, 8] Result: Lose 
19
239
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9995799999999996 Training Episodes: 9979
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 7] Result: Win 
8
140
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 7, 10] Result: Lose 
18
150
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9995599999999996 Training Episodes: 9978
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 10] Result: Win 
18
40
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 10, 3] Result: Lose 
21
43
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9995399999999995 Training Episodes: 9977
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 9] Result: Win 
11
231
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 9, 2] Result: Lose 
13
233
Before
1_1
After
1_2
Epsilon: 0.9995199999999995 Training Episodes: 9976
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7] Result: Lose 
17
237
Before
2_2
After
2_3
Epsilon: 0.9994999999999995 Training Episodes: 9975
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 5] Result: Win 
7
227
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 5, 6] Result: Lose 
13
233
Before
1_2
After
1_3
Epsilon: 0.9994799999999995 Training Episodes: 9974
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6] Result: Win 
9
229
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6, 7] Result: Win 
16
236
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6, 7, 1] Result: Lose 
17
237
Before
2_3
After
2_4
Epsilon: 0.9994599999999995 Training Episodes: 9973
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 4] Result: Win 
6
380
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 4, 1] Result: Win 
7
139
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 4, 1, 10] Result: Lose 
17
149
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9994399999999994 Training Episodes: 9972
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 10] Result: Lose 
15
169
Before
1_1
After
1_2
Epsilon: 0.9994199999999994 Training Episodes: 9971
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 8] Result: Win 
14
124
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 8, 1] Result: Lose 
15
125
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9993999999999994 Training Episodes: 9970
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 2] Result: Win 
10
98
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 2, 8] Result: Lose 
18
106
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9993799999999994 Training Episodes: 9969
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 8] Result: Win 
10
230
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 8, 2] Result: Win 
12
232
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 8, 2, 6] Result: Win 
18
238
Before
1_3
After
2_4
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 8, 2, 6, 3] Result: Lose 
21
241
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9993599999999994 Training Episodes: 9968
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [7, 1] Result: Win 
8
360
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [7, 1, 1] Result: Win 
9
119
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [7, 1, 1, 10] Result: Lose 
19
129
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9993399999999993 Training Episodes: 9967
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 4] Result: Win 
14
146
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 4, 5] Result: Lose 
19
151
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9993199999999993 Training Episodes: 9966
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [8, 10] Result: Lose 
18
172
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9992999999999993 Training Episodes: 9965
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 10] Result: Lose 
20
218
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9992799999999993 Training Episodes: 9964
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 2] Result: Win 
12
78
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 2, 7] Result: Lose 
19
85
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9992599999999993 Training Episodes: 9963
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 10] Result: Win 
11
77
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 10, 10] Result: Lose 
21
87
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9992399999999992 Training Episodes: 9962
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 7] Result: Win 
8
360
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 7, 3] Result: Win 
11
121
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 7, 3, 2] Result: Win 
13
123
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 7, 3, 2, 8] Result: Lose 
21
131
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9992199999999992 Training Episodes: 9961
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 6] Result: Lose 
16
214
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9991999999999992 Training Episodes: 9960
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9991799999999992 Training Episodes: 9959
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 10] Result: Lose 
17
61
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9991599999999992 Training Episodes: 9958
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Win 
13
233
Before
1_3
After
2_4
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 1] Result: Win 
14
234
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 1, 1] Result: Win 
15
235
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 1, 1, 5] Result: Win 
20
240
Before
0000000000_1
After
1_2
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 1, 1, 5, 1] Result: Lose 
21
241
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9991399999999991 Training Episodes: 9957
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 10] Result: Win 
12
166
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 10, 5] Result: Lose 
17
171
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9991199999999991 Training Episodes: 9956
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 1] Result: Win 
11
143
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 1, 6] Result: Lose 
17
149
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9990999999999991 Training Episodes: 9955
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10] Result: Lose 
20
42
Before
1_1
After
1_2
Epsilon: 0.9990799999999991 Training Episodes: 9954
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 10] Result: Lose 
19
129
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9990599999999991 Training Episodes: 9953
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 6] Result: Win 
11
99
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 6, 3] Result: Lose 
14
102
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.999039999999999 Training Episodes: 9952
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 6] Result: Lose 
16
126
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.999019999999999 Training Episodes: 9951
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 2] Result: Win 
13
233
Before
2_4
After
3_5
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 2, 3] Result: Lose 
16
236
Before
2_2
After
2_3
Epsilon: 0.998999999999999 Training Episodes: 9950
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 10] Result: Lose 
18
106
Before
0000000000_1
After
0000000000_2
Epsilon: 0.998979999999999 Training Episodes: 9949
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [8, 2] Result: Win 
10
164
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [8, 2, 3] Result: Lose 
13
167
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.998959999999999 Training Episodes: 9948
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 10] Result: Lose 
12
78
Before
1_1
After
1_2
Epsilon: 0.9989399999999989 Training Episodes: 9947
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3] Result: Lose 
13
233
Before
3_5
After
3_6
Epsilon: 0.9989199999999989 Training Episodes: 9946
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 10] Result: Lose 
15
147
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9988999999999989 Training Episodes: 9945
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 10] Result: Lose 
18
238
Before
2_4
After
2_5
Epsilon: 0.9988799999999989 Training Episodes: 9944
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [5, 10] Result: Lose 
15
59
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9988599999999989 Training Episodes: 9943
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 10] Result: Win 
14
102
Before
0000000000_1
After
1_2
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 10, 1] Result: Win 
15
103
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 10, 1, 6] Result: Lose 
21
109
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9988399999999988 Training Episodes: 9942
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 1] Result: Win 
11
77
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 1, 10] Result: Lose 
21
87
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9988199999999988 Training Episodes: 9941
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 1] Result: Win 
5
357
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 1, 1] Result: Win 
6
358
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 1, 1, 5] Result: Win 
11
121
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 1, 1, 5, 10] Result: Lose 
21
131
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9987999999999988 Training Episodes: 9940
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 5] Result: Lose 
15
59
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9987799999999988 Training Episodes: 9939
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 4] Result: Win 
6
226
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 4, 8] Result: Lose 
14
234
Before
1_1
After
1_2
Epsilon: 0.9987599999999988 Training Episodes: 9938
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 3] Result: Win 
13
123
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 3, 4] Result: Lose 
17
127
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9987399999999987 Training Episodes: 9937
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 2] Result: Win 
12
188
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 2, 6] Result: Lose 
18
194
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9987199999999987 Training Episodes: 9936
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
1_2
After
1_3
Epsilon: 0.9986999999999987 Training Episodes: 9935
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 3] Result: Win 
8
228
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 3, 2] Result: Win 
10
230
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 3, 2, 10] Result: Lose 
20
240
Before
1_3
After
1_4
Epsilon: 0.9986799999999987 Training Episodes: 9934
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 9] Result: Lose 
19
195
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9986599999999987 Training Episodes: 9933
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 4] Result: Win 
9
229
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 4, 4] Result: Win 
13
233
Before
3_6
After
4_7
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 4, 4, 4] Result: Lose 
17
237
Before
2_4
After
2_5
Epsilon: 0.9986399999999986 Training Episodes: 9932
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 1] Result: Win 
11
165
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 1, 10] Result: Lose 
21
175
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9986199999999986 Training Episodes: 9931
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 10] Result: Win 
20
86
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 10, 1] Result: Lose 
21
87
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9985999999999986 Training Episodes: 9930
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [6, 10] Result: Lose 
16
170
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9985799999999986 Training Episodes: 9929
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 10] Result: Win 
12
122
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 10, 4] Result: Lose 
16
126
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9985599999999986 Training Episodes: 9928
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 1] Result: Win 
7
469
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 1, 1] Result: Win 
8
228
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 1, 1, 10] Result: Lose 
18
238
Before
2_5
After
2_6
Epsilon: 0.9985399999999985 Training Episodes: 9927
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
1_4
After
1_5
Epsilon: 0.9985199999999985 Training Episodes: 9926
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 7] Result: Win 
8
96
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 7, 9] Result: Lose 
17
105
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9984999999999985 Training Episodes: 9925
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [9, 6] Result: Lose 
15
169
Before
1_2
After
1_3
Epsilon: 0.9984799999999985 Training Episodes: 9924
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 5] Result: Lose 
13
79
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9984599999999985 Training Episodes: 9923
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 8] Result: Lose 
18
216
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9984399999999984 Training Episodes: 9922
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 10] Result: Win 
17
237
Before
2_5
After
3_6
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 10, 4] Result: Lose 
21
241
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9984199999999984 Training Episodes: 9921
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 10] Result: Win 
19
239
Before
0000000000_4
After
1_5
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 10, 2] Result: Lose 
21
241
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9983999999999984 Training Episodes: 9920
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 7] Result: Win 
9
207
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 7, 6] Result: Win 
15
213
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 7, 6, 3] Result: Lose 
18
216
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9983799999999984 Training Episodes: 9919
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 4] Result: Lose 
14
124
Before
2_2
After
2_3
Epsilon: 0.9983599999999984 Training Episodes: 9918
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 2] Result: Lose 
12
34
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9983399999999983 Training Episodes: 9917
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 8] Result: Win 
11
231
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 8, 2] Result: Win 
13
233
Before
4_7
After
5_8
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 8, 2, 7] Result: Lose 
20
240
Before
1_5
After
1_6
Epsilon: 0.9983199999999983 Training Episodes: 9916
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 10] Result: Lose 
18
128
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9982999999999983 Training Episodes: 9915
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 10] Result: Win 
12
122
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 10, 9] Result: Lose 
21
131
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9982799999999983 Training Episodes: 9914
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 4] Result: Win 
14
146
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 4, 1] Result: Lose 
15
147
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9982599999999983 Training Episodes: 9913
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [7, 5] Result: Win 
12
144
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [7, 5, 8] Result: Lose 
20
152
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9982399999999982 Training Episodes: 9912
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 10] Result: Win 
14
234
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 10, 5] Result: Lose 
19
239
Before
1_5
After
1_6
Epsilon: 0.9982199999999982 Training Episodes: 9911
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 10] Result: Win 
11
99
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 10, 10] Result: Lose 
21
109
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9981999999999982 Training Episodes: 9910
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 1] Result: Win 
11
121
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 1, 1] Result: Win 
12
122
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 1, 1, 5] Result: Win 
17
127
Before
0000000000_1
After
1_2
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 1, 1, 5, 4] Result: Lose 
21
131
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9981799999999982 Training Episodes: 9909
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [6, 10] Result: Lose 
16
170
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9981599999999982 Training Episodes: 9908
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 3] Result: Win 
6
226
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 3, 4] Result: Win 
10
230
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 3, 4, 2] Result: Win 
12
232
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 3, 4, 2, 6] Result: Win 
18
238
Before
2_6
After
3_7
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 3, 4, 2, 6, 1] Result: Win 
19
239
Before
1_6
After
2_7
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 3, 4, 2, 6, 1, 2] Result: Lose 
21
241
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9981399999999981 Training Episodes: 9907
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9981199999999981 Training Episodes: 9906
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 3] Result: Win 
13
123
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 3, 7] Result: Win 
20
130
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 3, 7, 1] Result: Lose 
21
131
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9980999999999981 Training Episodes: 9905
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 10] Result: Lose 
20
86
Before
1_1
After
1_2
Epsilon: 0.9980799999999981 Training Episodes: 9904
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 6] Result: Win 
15
125
Before
0000000000_1
After
1_2
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 6, 6] Result: Lose 
21
131
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9980599999999981 Training Episodes: 9903
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 7] Result: Lose 
17
105
Before
0000000000_1
After
0000000000_2
Epsilon: 0.998039999999998 Training Episodes: 9902
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
0000000000_3
After
0000000000_4
Epsilon: 0.998019999999998 Training Episodes: 9901
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 6] Result: Win 
16
214
Before
0000000000_1
After
1_2
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 6, 3] Result: Lose 
19
217
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.997999999999998 Training Episodes: 9900
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 7] Result: Win 
13
145
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 7, 8] Result: Lose 
21
153
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.997979999999998 Training Episodes: 9899
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [8, 3] Result: Win 
11
209
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [8, 3, 10] Result: Lose 
21
219
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.997959999999998 Training Episodes: 9898
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1] Result: Win 
11
231
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1, 1] Result: Win 
12
232
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1, 1, 6] Result: Lose 
18
238
Before
3_7
After
3_8
Epsilon: 0.9979399999999979 Training Episodes: 9897
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 5] Result: Win 
7
161
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 5, 10] Result: Lose 
17
171
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9979199999999979 Training Episodes: 9896
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 7] Result: Win 
9
405
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 7, 1] Result: Win 
10
164
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 7, 1, 4] Result: Win 
14
168
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 7, 1, 4, 3] Result: Lose 
17
171
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9978999999999979 Training Episodes: 9895
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 10] Result: Lose 
20
218
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9978799999999979 Training Episodes: 9894
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 5] Result: Lose 
15
59
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9978599999999979 Training Episodes: 9893
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 2] Result: Win 
11
33
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 2, 3] Result: Win 
14
36
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 2, 3, 5] Result: Lose 
19
41
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9978399999999978 Training Episodes: 9892
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 4] Result: Win 
7
51
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 4, 10] Result: Lose 
17
61
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9978199999999978 Training Episodes: 9891
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 1] Result: Win 
11
99
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 1, 6] Result: Lose 
17
105
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9977999999999978 Training Episodes: 9890
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 3] Result: Win 
5
225
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 3, 3] Result: Win 
8
228
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 3, 3, 10] Result: Lose 
18
238
Before
3_8
After
3_9
Epsilon: 0.9977799999999978 Training Episodes: 9889
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 6] Result: Win 
11
231
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 6, 7] Result: Lose 
18
238
Before
3_9
After
3_10
Epsilon: 0.9977599999999978 Training Episodes: 9888
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 5] Result: Win 
15
125
Before
1_2
After
2_3
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 5, 2] Result: Win 
17
127
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 5, 2, 4] Result: Lose 
21
131
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9977399999999977 Training Episodes: 9887
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 5] Result: Win 
14
80
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 5, 5] Result: Lose 
19
85
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9977199999999977 Training Episodes: 9886
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 6] Result: Win 
8
96
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 6, 6] Result: Win 
14
102
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 6, 6, 1] Result: Lose 
15
103
Before
1_1
After
1_2
Epsilon: 0.9976999999999977 Training Episodes: 9885
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [6, 10] Result: Win 
16
82
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [6, 10, 4] Result: Win 
20
86
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [6, 10, 4, 1] Result: Lose 
21
87
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9976799999999977 Training Episodes: 9884
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 3] Result: Lose 
12
144
Before
1_1
After
1_2
Epsilon: 0.9976599999999977 Training Episodes: 9883
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 9] Result: Win 
11
187
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 9, 10] Result: Lose 
21
197
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9976399999999976 Training Episodes: 9882
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 4] Result: Win 
14
234
Before
2_3
After
3_4
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 4, 2] Result: Win 
16
236
Before
2_3
After
3_4
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 4, 2, 4] Result: Lose 
20
240
Before
1_6
After
1_7
Epsilon: 0.9976199999999976 Training Episodes: 9881
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 4] Result: Lose 
14
234
Before
3_4
After
3_5
Epsilon: 0.9975999999999976 Training Episodes: 9880
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 4] Result: Win 
11
231
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 4, 5] Result: Lose 
16
236
Before
3_4
After
3_5
Epsilon: 0.9975799999999976 Training Episodes: 9879
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 1] Result: Win 
6
380
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 1, 4] Result: Win 
10
142
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 1, 4, 6] Result: Lose 
16
148
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9975599999999976 Training Episodes: 9878
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Win 
13
233
Before
5_8
After
6_9
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 3] Result: Lose 
16
236
Before
3_5
After
3_6
Epsilon: 0.9975399999999975 Training Episodes: 9877
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 9] Result: Lose 
17
127
Before
2_3
After
2_4
Epsilon: 0.9975199999999975 Training Episodes: 9876
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 10] Result: Lose 
14
234
Before
3_5
After
3_6
Epsilon: 0.9974999999999975 Training Episodes: 9875
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 5] Result: Win 
9
97
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 5, 9] Result: Lose 
18
106
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9974799999999975 Training Episodes: 9874
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [7, 8] Result: Lose 
15
103
Before
1_2
After
1_3
Epsilon: 0.9974599999999975 Training Episodes: 9873
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 8] Result: Lose 
18
216
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9974399999999974 Training Episodes: 9872
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 6] Result: Win 
16
126
Before
0000000000_2
After
1_3
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 6, 2] Result: Lose 
18
128
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9974199999999974 Training Episodes: 9871
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 5] Result: Lose 
15
81
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9973999999999974 Training Episodes: 9870
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 3] Result: Win 
9
229
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 3, 7] Result: Lose 
16
236
Before
3_6
After
3_7
Epsilon: 0.9973799999999974 Training Episodes: 9869
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 9] Result: Win 
18
238
Before
3_10
After
4_11
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 9, 3] Result: Lose 
21
241
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9973599999999974 Training Episodes: 9868
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 9] Result: Lose 
19
63
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9973399999999973 Training Episodes: 9867
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 10] Result: Win 
15
169
Before
1_3
After
2_4
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 10, 1] Result: Lose 
16
170
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9973199999999973 Training Episodes: 9866
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [9, 3] Result: Win 
12
188
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [9, 3, 5] Result: Lose 
17
193
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9972999999999973 Training Episodes: 9865
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
1_7
After
1_8
Epsilon: 0.9972799999999973 Training Episodes: 9864
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 2] Result: Win 
3
465
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 2, 1] Result: Win 
4
224
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 2, 1, 9] Result: Win 
13
233
Before
6_9
After
7_10
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 2, 1, 9, 6] Result: Lose 
19
239
Before
2_7
After
2_8
Epsilon: 0.9972599999999973 Training Episodes: 9863
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 9] Result: Win 
10
98
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 9, 6] Result: Lose 
16
104
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9972399999999972 Training Episodes: 9862
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 7] Result: Lose 
17
127
Before
2_4
After
2_5
Epsilon: 0.9972199999999972 Training Episodes: 9861
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 8] Result: Lose 
14
146
Before
2_2
After
2_3
Epsilon: 0.9971999999999972 Training Episodes: 9860
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 7] Result: Lose 
17
215
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9971799999999972 Training Episodes: 9859
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 3] Result: Win 
5
445
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 3, 1] Result: Win 
6
204
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 3, 1, 10] Result: Win 
16
214
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 3, 1, 10, 3] Result: Lose 
19
217
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9971599999999972 Training Episodes: 9858
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [7, 10] Result: Lose 
17
105
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9971399999999971 Training Episodes: 9857
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 3] Result: Lose 
13
101
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9971199999999971 Training Episodes: 9856
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 9] Result: Lose 
19
63
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9970999999999971 Training Episodes: 9855
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Win 
13
233
Before
7_10
After
8_11
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 5] Result: Lose 
18
238
Before
4_11
After
4_12
Epsilon: 0.9970799999999971 Training Episodes: 9854
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 9] Result: Win 
19
63
Before
0000000000_2
After
1_3
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 9, 2] Result: Lose 
21
65
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9970599999999971 Training Episodes: 9853
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 6] Result: Win 
7
73
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 6, 10] Result: Win 
17
83
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 6, 10, 4] Result: Lose 
21
87
Before
0000000000_5
After
0000000000_6
Epsilon: 0.997039999999997 Training Episodes: 9852
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 9] Result: Win 
19
239
Before
2_8
After
3_9
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 9, 1] Result: Lose 
20
240
Before
1_8
After
1_9
Epsilon: 0.997019999999997 Training Episodes: 9851
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [5, 7] Result: Win 
12
188
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [5, 7, 7] Result: Lose 
19
195
Before
0000000000_1
After
0000000000_2
Epsilon: 0.996999999999997 Training Episodes: 9850
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [8, 9] Result: Lose 
17
171
Before
0000000000_4
After
0000000000_5
Epsilon: 0.996979999999997 Training Episodes: 9849
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 10] Result: Win 
11
33
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 10, 10] Result: Lose 
21
43
Before
0000000000_2
After
0000000000_3
Epsilon: 0.996959999999997 Training Episodes: 9848
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 4] Result: Lose 
14
58
Before
1_1
After
1_2
Epsilon: 0.9969399999999969 Training Episodes: 9847
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 9] Result: Win 
13
57
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 9, 6] Result: Lose 
19
63
Before
1_3
After
1_4
Epsilon: 0.9969199999999969 Training Episodes: 9846
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 10] Result: Lose 
20
218
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9968999999999969 Training Episodes: 9845
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 10] Result: Win 
12
34
Before
0000000000_1
After
1_2
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 10, 6] Result: Lose 
18
40
Before
1_1
After
1_2
Epsilon: 0.9968799999999969 Training Episodes: 9844
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
1_9
After
1_10
Epsilon: 0.9968599999999969 Training Episodes: 9843
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [7, 10] Result: Lose 
17
171
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9968399999999968 Training Episodes: 9842
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 9] Result: Win 
10
120
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 9, 9] Result: Win 
19
129
Before
0000000000_2
After
1_3
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 9, 9, 2] Result: Lose 
21
131
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9968199999999968 Training Episodes: 9841
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [7, 10] Result: Lose 
17
105
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9967999999999968 Training Episodes: 9840
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 9] Result: Win 
12
210
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 9, 5] Result: Lose 
17
215
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9967799999999968 Training Episodes: 9839
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 10] Result: Lose 
14
102
Before
2_3
After
2_4
Epsilon: 0.9967599999999968 Training Episodes: 9838
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 8] Result: Lose 
15
235
Before
1_1
After
1_2
Epsilon: 0.9967399999999967 Training Episodes: 9837
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 6] Result: Lose 
15
81
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9967199999999967 Training Episodes: 9836
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 8] Result: Win 
18
106
Before
0000000000_3
After
1_4
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 8, 2] Result: Lose 
20
108
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9966999999999967 Training Episodes: 9835
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 2] Result: Win 
3
157
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 2, 10] Result: Win 
13
167
Before
0000000000_1
After
1_2
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 2, 10, 4] Result: Win 
17
171
Before
0000000000_6
After
1_7
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 2, 10, 4, 1] Result: Lose 
18
172
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9966799999999967 Training Episodes: 9834
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [8, 3] Result: Win 
11
165
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [8, 3, 10] Result: Lose 
21
175
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9966599999999967 Training Episodes: 9833
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5] Result: Win 
15
235
Before
1_2
After
2_3
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5, 2] Result: Win 
17
237
Before
3_6
After
4_7
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5, 2, 3] Result: Lose 
20
240
Before
1_10
After
1_11
Epsilon: 0.9966399999999966 Training Episodes: 9832
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 10] Result: Lose 
16
126
Before
1_3
After
1_4
Epsilon: 0.9966199999999966 Training Episodes: 9831
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [9, 10] Result: Lose 
19
217
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9965999999999966 Training Episodes: 9830
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 6] Result: Lose 
14
58
Before
1_2
After
1_3
Epsilon: 0.9965799999999966 Training Episodes: 9829
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 10] Result: Lose 
17
237
Before
4_7
After
4_8
Epsilon: 0.9965599999999966 Training Episodes: 9828
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 3] Result: Win 
7
227
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 3, 6] Result: Win 
13
233
Before
8_11
After
9_12
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 3, 6, 2] Result: Lose 
15
235
Before
2_3
After
2_4
Epsilon: 0.9965399999999965 Training Episodes: 9827
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 10] Result: Win 
15
235
Before
2_4
After
3_5
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 10, 3] Result: Lose 
18
238
Before
4_12
After
4_13
Epsilon: 0.9965199999999965 Training Episodes: 9826
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 1] Result: Win 
10
120
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 1, 10] Result: Lose 
20
130
Before
1_1
After
1_2
Epsilon: 0.9964999999999965 Training Episodes: 9825
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 9] Result: Lose 
13
79
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9964799999999965 Training Episodes: 9824
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 10] Result: Lose 
16
126
Before
1_4
After
1_5
Epsilon: 0.9964599999999965 Training Episodes: 9823
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 10] Result: Lose 
20
86
Before
2_3
After
2_4
Epsilon: 0.9964399999999964 Training Episodes: 9822
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 10] Result: Lose 
13
101
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9964199999999964 Training Episodes: 9821
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9963999999999964 Training Episodes: 9820
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 3] Result: Win 
6
50
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 3, 4] Result: Win 
10
54
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 3, 4, 10] Result: Lose 
20
64
Before
1_1
After
1_2
Epsilon: 0.9963799999999964 Training Episodes: 9819
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 2] Result: Lose 
12
232
Before
3_3
After
3_4
Epsilon: 0.9963599999999964 Training Episodes: 9818
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 10] Result: Win 
11
165
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 10, 10] Result: Lose 
21
175
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9963399999999963 Training Episodes: 9817
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
1_11
After
1_12
Epsilon: 0.9963199999999963 Training Episodes: 9816
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1] Result: Win 
11
231
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1, 9] Result: Lose 
20
240
Before
1_12
After
1_13
Epsilon: 0.9962999999999963 Training Episodes: 9815
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 10] Result: Lose 
20
86
Before
2_4
After
2_5
Epsilon: 0.9962799999999963 Training Episodes: 9814
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 3] Result: Win 
9
185
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 3, 8] Result: Lose 
17
193
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9962599999999963 Training Episodes: 9813
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 1] Result: Win 
10
76
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 1, 10] Result: Win 
20
86
Before
2_5
After
3_6
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 1, 10, 1] Result: Lose 
21
87
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9962399999999962 Training Episodes: 9812
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
1_13
After
1_14
Epsilon: 0.9962199999999962 Training Episodes: 9811
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 6] Result: Win 
11
165
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 6, 8] Result: Lose 
19
173
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9961999999999962 Training Episodes: 9810
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 9] Result: Lose 
19
239
Before
3_9
After
3_10
Epsilon: 0.9961799999999962 Training Episodes: 9809
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 5] Result: Lose 
15
125
Before
2_3
After
2_4
Epsilon: 0.9961599999999962 Training Episodes: 9808
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 9] Result: Lose 
19
173
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9961399999999961 Training Episodes: 9807
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 9] Result: Lose 
19
217
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9961199999999961 Training Episodes: 9806
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 4] Result: Win 
5
313
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 4, 4] Result: Win 
9
75
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 4, 4, 5] Result: Win 
14
80
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 4, 4, 5, 7] Result: Lose 
21
87
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9960999999999961 Training Episodes: 9805
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
1_14
After
1_15
Epsilon: 0.9960799999999961 Training Episodes: 9804
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7] Result: Lose 
17
237
Before
4_8
After
4_9
Epsilon: 0.9960599999999961 Training Episodes: 9803
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1] Result: Win 
10
230
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1, 10] Result: Lose 
20
240
Before
1_15
After
1_16
Epsilon: 0.996039999999996 Training Episodes: 9802
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 8] Result: Lose 
14
234
Before
3_6
After
3_7
Epsilon: 0.996019999999996 Training Episodes: 9801
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 6] Result: Lose 
16
214
Before
2_3
After
2_4
Epsilon: 0.995999999999996 Training Episodes: 9800
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Lose 
16
236
Before
3_7
After
3_8
Epsilon: 0.995979999999996 Training Episodes: 9799
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [6, 10] Result: Win 
16
60
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [6, 10, 2] Result: Lose 
18
62
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.995959999999996 Training Episodes: 9798
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 7] Result: Win 
11
231
Before
9_9
After
10_10
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 7, 6] Result: Win 
17
237
Before
4_9
After
5_10
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 7, 6, 1] Result: Lose 
18
238
Before
4_13
After
4_14
Epsilon: 0.9959399999999959 Training Episodes: 9797
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Lose 
16
236
Before
3_8
After
3_9
Epsilon: 0.9959199999999959 Training Episodes: 9796
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 5] Result: Win 
9
207
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 5, 10] Result: Lose 
19
217
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9958999999999959 Training Episodes: 9795
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 4] Result: Lose 
14
80
Before
2_2
After
2_3
Epsilon: 0.9958799999999959 Training Episodes: 9794
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 9] Result: Win 
12
232
Before
3_4
After
4_5
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 9, 6] Result: Win 
18
238
Before
4_14
After
5_15
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 9, 6, 1] Result: Lose 
19
239
Before
3_10
After
3_11
Epsilon: 0.9958599999999959 Training Episodes: 9793
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 5] Result: Win 
13
101
Before
0000000000_3
After
1_4
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 5, 7] Result: Lose 
20
108
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9958399999999958 Training Episodes: 9792
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1] Result: Win 
8
470
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1, 3] Result: Win 
11
231
Before
10_10
After
11_11
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1, 3, 4] Result: Win 
15
235
Before
3_5
After
4_6
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1, 3, 4, 5] Result: Lose 
20
240
Before
1_16
After
1_17
Epsilon: 0.9958199999999958 Training Episodes: 9791
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 7] Result: Win 
8
404
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 7, 1] Result: Win 
9
163
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 7, 1, 4] Result: Win 
13
167
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 7, 1, 4, 8] Result: Lose 
21
175
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9957999999999958 Training Episodes: 9790
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 1] Result: Win 
3
267
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 1, 3] Result: Win 
6
28
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 1, 3, 8] Result: Lose 
14
36
Before
1_1
After
1_2
Epsilon: 0.9957799999999958 Training Episodes: 9789
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
1_17
After
1_18
Epsilon: 0.9957599999999958 Training Episodes: 9788
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 4] Result: Win 
9
31
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 4, 10] Result: Lose 
19
41
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9957399999999957 Training Episodes: 9787
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 2] Result: Win 
8
118
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 2, 10] Result: Win 
18
128
Before
0000000000_2
After
1_3
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 2, 10, 3] Result: Lose 
21
131
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9957199999999957 Training Episodes: 9786
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [5, 4] Result: Win 
9
185
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [5, 4, 9] Result: Lose 
18
194
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9956999999999957 Training Episodes: 9785
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 5] Result: Win 
8
30
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 5, 2] Result: Win 
10
32
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 5, 2, 3] Result: Lose 
13
35
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9956799999999957 Training Episodes: 9784
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Win 
16
236
Before
3_9
After
4_10
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6, 4] Result: Lose 
20
240
Before
1_18
After
1_19
Epsilon: 0.9956599999999957 Training Episodes: 9783
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 5] Result: Win 
9
163
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 5, 10] Result: Lose 
19
173
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9956399999999956 Training Episodes: 9782
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 7] Result: Lose 
12
100
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9956199999999956 Training Episodes: 9781
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 6] Result: Win 
11
231
Before
11_11
After
12_12
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 6, 3] Result: Win 
14
234
Before
3_7
After
4_8
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 6, 3, 2] Result: Lose 
16
236
Before
4_10
After
4_11
Epsilon: 0.9955999999999956 Training Episodes: 9780
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7] Result: Lose 
17
237
Before
5_10
After
5_11
Epsilon: 0.9955799999999956 Training Episodes: 9779
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4] Result: Win 
10
230
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4, 4] Result: Win 
14
234
Before
4_8
After
5_9
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4, 4, 7] Result: Lose 
21
241
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9955599999999956 Training Episodes: 9778
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 10] Result: Win 
14
190
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 10, 5] Result: Lose 
19
195
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9955399999999955 Training Episodes: 9777
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 4] Result: Win 
7
227
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 4, 9] Result: Win 
16
236
Before
4_11
After
5_12
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 4, 9, 2] Result: Lose 
18
238
Before
5_15
After
5_16
Epsilon: 0.9955199999999955 Training Episodes: 9776
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 4] Result: Win 
9
361
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 4, 1] Result: Win 
10
120
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 4, 1, 10] Result: Lose 
20
130
Before
1_2
After
1_3
Epsilon: 0.9954999999999955 Training Episodes: 9775
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 5] Result: Lose 
13
233
Before
9_12
After
9_13
Epsilon: 0.9954799999999955 Training Episodes: 9774
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 3] Result: Win 
13
189
Before
0000000000_2
After
1_3
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 3, 1] Result: Lose 
14
190
Before
1_1
After
1_2
Epsilon: 0.9954599999999955 Training Episodes: 9773
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 5] Result: Win 
14
234
Before
5_9
After
6_10
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 5, 7] Result: Lose 
21
241
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9954399999999954 Training Episodes: 9772
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 1] Result: Win 
4
136
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 1, 10] Result: Lose 
14
146
Before
2_3
After
2_4
Epsilon: 0.9954199999999954 Training Episodes: 9771
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 4] Result: Win 
8
184
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 4, 7] Result: Lose 
15
191
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9953999999999954 Training Episodes: 9770
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 3] Result: Win 
7
117
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 3, 9] Result: Win 
16
126
Before
1_5
After
2_6
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 3, 9, 5] Result: Lose 
21
131
Before
0000000000_10
After
0000000000_11
Epsilon: 0.9953799999999954 Training Episodes: 9769
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 10] Result: Lose 
20
196
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9953599999999954 Training Episodes: 9768
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 9] Result: Lose 
19
239
Before
3_11
After
3_12
Epsilon: 0.9953399999999953 Training Episodes: 9767
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 10] Result: Lose 
18
106
Before
1_4
After
1_5
Epsilon: 0.9953199999999953 Training Episodes: 9766
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 7] Result: Win 
17
127
Before
2_5
After
3_6
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 7, 2] Result: Win 
19
129
Before
1_3
After
2_4
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 7, 2, 1] Result: Lose 
20
130
Before
1_3
After
1_4
Epsilon: 0.9952999999999953 Training Episodes: 9765
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 6] Result: Win 
8
74
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 6, 7] Result: Lose 
15
81
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9952799999999953 Training Episodes: 9764
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 8] Result: Lose 
17
83
Before
1_1
After
1_2
Epsilon: 0.9952599999999953 Training Episodes: 9763
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 6] Result: Win 
16
126
Before
2_6
After
3_7
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 6, 1] Result: Lose 
17
127
Before
3_6
After
3_7
Epsilon: 0.9952399999999952 Training Episodes: 9762
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 8] Result: Win 
10
120
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 8, 8] Result: Lose 
18
128
Before
1_3
After
1_4
Epsilon: 0.9952199999999952 Training Episodes: 9761
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 10] Result: Lose 
20
64
Before
1_2
After
1_3
Epsilon: 0.9951999999999952 Training Episodes: 9760
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [5, 4] Result: Win 
9
53
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [5, 4, 2] Result: Win 
11
55
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [5, 4, 2, 5] Result: Lose 
16
60
Before
1_1
After
1_2
Epsilon: 0.9951799999999952 Training Episodes: 9759
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 6] Result: Lose 
15
147
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9951599999999952 Training Episodes: 9758
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 3] Result: Lose 
12
232
Before
4_5
After
4_6
Epsilon: 0.9951399999999951 Training Episodes: 9757
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 1] Result: Win 
3
157
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 1, 10] Result: Win 
13
167
Before
2_3
After
3_4
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 1, 10, 2] Result: Lose 
15
169
Before
2_4
After
2_5
Epsilon: 0.9951199999999951 Training Episodes: 9756
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 1] Result: Win 
9
317
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 1, 2] Result: Win 
11
77
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 1, 2, 10] Result: Lose 
21
87
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9950999999999951 Training Episodes: 9755
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 10] Result: Lose 
18
40
Before
1_2
After
1_3
Epsilon: 0.9950799999999951 Training Episodes: 9754
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 10] Result: Win 
11
55
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 10, 1] Result: Win 
12
56
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 10, 1, 3] Result: Lose 
15
59
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9950599999999951 Training Episodes: 9753
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 4] Result: Win 
8
52
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 4, 5] Result: Lose 
13
57
Before
1_1
After
1_2
Epsilon: 0.995039999999995 Training Episodes: 9752
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 2] Result: Win 
12
210
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 2, 7] Result: Win 
19
217
Before
0000000000_5
After
1_6
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 2, 7, 2] Result: Lose 
21
219
Before
0000000000_1
After
0000000000_2
Epsilon: 0.995019999999995 Training Episodes: 9751
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10] Result: Win 
12
232
Before
4_6
After
5_7
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10, 7] Result: Lose 
19
239
Before
3_12
After
3_13
Epsilon: 0.994999999999995 Training Episodes: 9750
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 10] Result: Win 
16
126
Before
3_7
After
4_8
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 10, 2] Result: Win 
18
128
Before
1_4
After
2_5
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 10, 2, 1] Result: Lose 
19
129
Before
2_4
After
2_5
Epsilon: 0.994979999999995 Training Episodes: 9749
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 10] Result: Lose 
13
167
Before
3_4
After
3_5
Epsilon: 0.994959999999995 Training Episodes: 9748
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 8] Result: Lose 
18
238
Before
5_16
After
5_17
Epsilon: 0.9949399999999949 Training Episodes: 9747
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 8] Result: Win 
11
143
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 8, 9] Result: Lose 
20
152
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9949199999999949 Training Episodes: 9746
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 8] Result: Lose 
18
238
Before
5_17
After
5_18
Epsilon: 0.9948999999999949 Training Episodes: 9745
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 10] Result: Lose 
20
218
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9948799999999949 Training Episodes: 9744
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 8] Result: Lose 
12
232
Before
5_7
After
5_8
Epsilon: 0.9948599999999949 Training Episodes: 9743
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 5] Result: Lose 
12
210
Before
2_2
After
2_3
Epsilon: 0.9948399999999948 Training Episodes: 9742
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 7] Result: Win 
8
118
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 7, 10] Result: Lose 
18
128
Before
2_5
After
2_6
Epsilon: 0.9948199999999948 Training Episodes: 9741
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6] Result: Win 
9
229
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6, 3] Result: Lose 
12
232
Before
5_8
After
5_9
Epsilon: 0.9947999999999948 Training Episodes: 9740
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 7] Result: Lose 
14
234
Before
6_10
After
6_11
Epsilon: 0.9947799999999948 Training Episodes: 9739
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 8] Result: Win 
14
102
Before
2_4
After
3_5
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 8, 2] Result: Lose 
16
104
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9947599999999948 Training Episodes: 9738
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 5] Result: Lose 
15
103
Before
1_3
After
1_4
Epsilon: 0.9947399999999947 Training Episodes: 9737
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 6] Result: Lose 
12
232
Before
5_9
After
5_10
Epsilon: 0.9947199999999947 Training Episodes: 9736
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 10] Result: Lose 
20
218
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9946999999999947 Training Episodes: 9735
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 5] Result: Win 
10
230
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 5, 8] Result: Lose 
18
238
Before
5_18
After
5_19
Epsilon: 0.9946799999999947 Training Episodes: 9734
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5] Result: Lose 
15
235
Before
4_6
After
4_7
Epsilon: 0.9946599999999947 Training Episodes: 9733
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 5] Result: Win 
13
233
Before
9_13
After
10_14
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 5, 7] Result: Lose 
20
240
Before
1_19
After
1_20
Epsilon: 0.9946399999999946 Training Episodes: 9732
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 10] Result: Lose 
20
108
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9946199999999946 Training Episodes: 9731
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 9] Result: Lose 
19
107
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9945999999999946 Training Episodes: 9730
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 2] Result: Win 
3
113
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 2, 10] Result: Lose 
13
123
Before
4_4
After
4_5
Epsilon: 0.9945799999999946 Training Episodes: 9729
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 3] Result: Win 
5
115
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 3, 2] Result: Win 
7
117
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 3, 2, 10] Result: Lose 
17
127
Before
3_7
After
3_8
Epsilon: 0.9945599999999946 Training Episodes: 9728
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 8] Result: Win 
18
106
Before
1_5
After
2_6
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 8, 3] Result: Lose 
21
109
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9945399999999945 Training Episodes: 9727
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 3] Result: Lose 
12
122
Before
3_3
After
3_4
Epsilon: 0.9945199999999945 Training Episodes: 9726
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [9, 4] Result: Lose 
13
189
Before
1_3
After
1_4
Epsilon: 0.9944999999999945 Training Episodes: 9725
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 8] Result: Win 
15
81
Before
0000000000_3
After
1_4
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 8, 5] Result: Win 
20
86
Before
3_6
After
4_7
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 8, 5, 1] Result: Lose 
21
87
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9944799999999945 Training Episodes: 9724
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 8] Result: Win 
10
120
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 8, 10] Result: Lose 
20
130
Before
1_4
After
1_5
Epsilon: 0.9944599999999945 Training Episodes: 9723
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 1] Result: Win 
11
121
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 1, 10] Result: Lose 
21
131
Before
0000000000_11
After
0000000000_12
Epsilon: 0.9944399999999944 Training Episodes: 9722
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 10] Result: Win 
11
77
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 10, 1] Result: Lose 
12
78
Before
1_2
After
1_3
Epsilon: 0.9944199999999944 Training Episodes: 9721
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Lose 
16
236
Before
5_12
After
5_13
Epsilon: 0.9943999999999944 Training Episodes: 9720
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 4] Result: Win 
11
231
Before
12_12
After
13_13
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 4, 1] Result: Lose 
12
232
Before
5_10
After
5_11
Epsilon: 0.9943799999999944 Training Episodes: 9719
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
13_13
After
14_14
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 3] Result: Lose 
14
234
Before
6_11
After
6_12
Epsilon: 0.9943599999999944 Training Episodes: 9718
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 10] Result: Lose 
17
83
Before
1_2
After
1_3
Epsilon: 0.9943399999999943 Training Episodes: 9717
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Lose 
16
236
Before
5_13
After
5_14
Epsilon: 0.9943199999999943 Training Episodes: 9716
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 10] Result: Lose 
18
238
Before
5_19
After
5_20
Epsilon: 0.9942999999999943 Training Episodes: 9715
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 10] Result: Lose 
16
214
Before
2_4
After
2_5
Epsilon: 0.9942799999999943 Training Episodes: 9714
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 2] Result: Win 
4
224
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 2, 10] Result: Lose 
14
234
Before
6_12
After
6_13
Epsilon: 0.9942599999999943 Training Episodes: 9713
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 5] Result: Win 
11
187
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 5, 5] Result: Lose 
16
192
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9942399999999942 Training Episodes: 9712
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 9] Result: Lose 
18
128
Before
2_6
After
2_7
Epsilon: 0.9942199999999942 Training Episodes: 9711
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6] Result: Win 
13
233
Before
10_14
After
11_15
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6, 5] Result: Lose 
18
238
Before
5_20
After
5_21
Epsilon: 0.9941999999999942 Training Episodes: 9710
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 7] Result: Win 
11
165
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 7, 3] Result: Lose 
14
168
Before
1_1
After
1_2
Epsilon: 0.9941799999999942 Training Episodes: 9709
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 8] Result: Win 
9
163
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 8, 8] Result: Win 
17
171
Before
1_7
After
2_8
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 8, 8, 4] Result: Lose 
21
175
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9941599999999942 Training Episodes: 9708
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 8] Result: Win 
9
361
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 8, 2] Result: Win 
11
121
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 8, 2, 10] Result: Lose 
21
131
Before
0000000000_12
After
0000000000_13
Epsilon: 0.9941399999999941 Training Episodes: 9707
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 4] Result: Win 
14
234
Before
6_13
After
7_14
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 4, 5] Result: Lose 
19
239
Before
3_13
After
3_14
Epsilon: 0.9941199999999941 Training Episodes: 9706
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 1] Result: Win 
5
291
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 1, 4] Result: Win 
9
53
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 1, 4, 10] Result: Lose 
19
63
Before
1_4
After
1_5
Epsilon: 0.9940999999999941 Training Episodes: 9705
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 3] Result: Win 
5
203
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 3, 10] Result: Win 
15
213
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 3, 10, 2] Result: Win 
17
215
Before
0000000000_3
After
1_4
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 3, 10, 2, 1] Result: Win 
18
216
Before
0000000000_4
After
1_5
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 3, 10, 2, 1, 2] Result: Lose 
20
218
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9940799999999941 Training Episodes: 9704
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 5] Result: Lose 
15
147
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9940599999999941 Training Episodes: 9703
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 5] Result: Win 
15
147
Before
0000000000_4
After
1_5
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 5, 3] Result: Lose 
18
150
Before
0000000000_1
After
0000000000_2
Epsilon: 0.994039999999994 Training Episodes: 9702
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 6] Result: Win 
14
80
Before
2_3
After
3_4
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 6, 3] Result: Lose 
17
83
Before
1_3
After
1_4
Epsilon: 0.994019999999994 Training Episodes: 9701
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 10] Result: Lose 
13
211
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.993999999999994 Training Episodes: 9700
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 2] Result: Win 
9
31
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 2, 10] Result: Lose 
19
41
Before
0000000000_2
After
0000000000_3
Epsilon: 0.993979999999994 Training Episodes: 9699
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 6] Result: Win 
9
31
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 6, 6] Result: Lose 
15
37
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.993959999999994 Training Episodes: 9698
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 7] Result: Lose 
13
233
Before
11_15
After
11_16
Epsilon: 0.9939399999999939 Training Episodes: 9697
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Lose 
16
236
Before
5_14
After
5_15
Epsilon: 0.9939199999999939 Training Episodes: 9696
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 5] Result: Win 
8
228
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 5, 9] Result: Lose 
17
237
Before
5_11
After
5_12
Epsilon: 0.9938999999999939 Training Episodes: 9695
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 6] Result: Win 
13
57
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 6, 2] Result: Lose 
15
59
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9938799999999939 Training Episodes: 9694
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 6] Result: Win 
10
54
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 6, 10] Result: Lose 
20
64
Before
1_3
After
1_4
Epsilon: 0.9938599999999939 Training Episodes: 9693
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 10] Result: Lose 
20
196
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9938399999999938 Training Episodes: 9692
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6] Result: Win 
13
233
Before
11_16
After
12_17
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6, 8] Result: Lose 
21
241
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9938199999999938 Training Episodes: 9691
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 6] Result: Lose 
16
170
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9937999999999938 Training Episodes: 9690
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 1] Result: Win 
11
187
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 1, 1] Result: Win 
12
188
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 1, 1, 9] Result: Lose 
21
197
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9937799999999938 Training Episodes: 9689
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 9] Result: Win 
10
32
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 9, 4] Result: Lose 
14
36
Before
1_2
After
1_3
Epsilon: 0.9937599999999938 Training Episodes: 9688
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 7] Result: Win 
17
215
Before
1_4
After
2_5
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 7, 4] Result: Lose 
21
219
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9937399999999937 Training Episodes: 9687
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 10] Result: Win 
17
61
Before
0000000000_3
After
1_4
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 10, 3] Result: Lose 
20
64
Before
1_4
After
1_5
Epsilon: 0.9937199999999937 Training Episodes: 9686
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 2] Result: Win 
7
205
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 2, 2] Result: Win 
9
207
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 2, 2, 10] Result: Lose 
19
217
Before
1_6
After
1_7
Epsilon: 0.9936999999999937 Training Episodes: 9685
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 4] Result: Lose 
13
101
Before
1_4
After
1_5
Epsilon: 0.9936799999999937 Training Episodes: 9684
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
14_14
After
15_15
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 10] Result: Lose 
21
241
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9936599999999937 Training Episodes: 9683
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [9, 10] Result: Lose 
19
173
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9936399999999936 Training Episodes: 9682
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 9] Result: Lose 
14
36
Before
1_3
After
1_4
Epsilon: 0.9936199999999936 Training Episodes: 9681
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10] Result: Lose 
12
232
Before
5_11
After
5_12
Epsilon: 0.9935999999999936 Training Episodes: 9680
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 3] Result: Win 
10
230
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 3, 2] Result: Lose 
12
232
Before
5_12
After
5_13
Epsilon: 0.9935799999999936 Training Episodes: 9679
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 10] Result: Lose 
17
39
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9935599999999936 Training Episodes: 9678
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 8] Result: Lose 
18
238
Before
5_21
After
5_22
Epsilon: 0.9935399999999935 Training Episodes: 9677
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 2] Result: Win 
7
227
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 2, 5] Result: Win 
12
232
Before
5_13
After
6_14
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 2, 5, 6] Result: Lose 
18
238
Before
5_22
After
5_23
Epsilon: 0.9935199999999935 Training Episodes: 9676
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 10] Result: Win 
12
100
Before
0000000000_1
After
1_2
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 10, 5] Result: Win 
17
105
Before
0000000000_5
After
1_6
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 10, 5, 2] Result: Lose 
19
107
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9934999999999935 Training Episodes: 9675
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 10] Result: Lose 
17
237
Before
5_12
After
5_13
Epsilon: 0.9934799999999935 Training Episodes: 9674
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 8] Result: Win 
18
106
Before
2_6
After
3_7
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 8, 1] Result: Lose 
19
107
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9934599999999935 Training Episodes: 9673
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [9, 9] Result: Lose 
18
216
Before
1_5
After
1_6
Epsilon: 0.9934399999999934 Training Episodes: 9672
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 1] Result: Win 
10
32
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 1, 10] Result: Lose 
20
42
Before
1_2
After
1_3
Epsilon: 0.9934199999999934 Training Episodes: 9671
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
1_20
After
1_21
Epsilon: 0.9933999999999934 Training Episodes: 9670
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Lose 
13
233
Before
12_17
After
12_18
Epsilon: 0.9933799999999934 Training Episodes: 9669
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 10] Result: Lose 
20
86
Before
4_7
After
4_8
Epsilon: 0.9933599999999934 Training Episodes: 9668
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 2] Result: Win 
10
98
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 2, 8] Result: Lose 
18
106
Before
3_7
After
3_8
Epsilon: 0.9933399999999933 Training Episodes: 9667
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 4] Result: Win 
11
33
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 4, 8] Result: Lose 
19
41
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9933199999999933 Training Episodes: 9666
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 8] Result: Win 
17
39
Before
0000000000_1
After
1_2
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 8, 3] Result: Lose 
20
42
Before
1_3
After
1_4
Epsilon: 0.9932999999999933 Training Episodes: 9665
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 10] Result: Lose 
18
238
Before
5_23
After
5_24
Epsilon: 0.9932799999999933 Training Episodes: 9664
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 9] Result: Win 
13
167
Before
3_5
After
4_6
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 9, 2] Result: Lose 
15
169
Before
2_5
After
2_6
Epsilon: 0.9932599999999933 Training Episodes: 9663
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 10] Result: Lose 
20
218
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9932399999999932 Training Episodes: 9662
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Lose 
16
236
Before
5_15
After
5_16
Epsilon: 0.9932199999999932 Training Episodes: 9661
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6] Result: Win 
9
229
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6, 10] Result: Win 
19
239
Before
3_14
After
4_15
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6, 10, 2] Result: Lose 
21
241
Before
0000000000_10
After
0000000000_11
Epsilon: 0.9931999999999932 Training Episodes: 9660
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 10] Result: Lose 
20
64
Before
1_5
After
1_6
Epsilon: 0.9931799999999932 Training Episodes: 9659
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 10] Result: Lose 
19
107
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9931599999999932 Training Episodes: 9658
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 10] Result: Win 
19
107
Before
0000000000_4
After
1_5
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 10, 2] Result: Lose 
21
109
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9931399999999931 Training Episodes: 9657
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 5] Result: Win 
12
34
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 5, 9] Result: Lose 
21
43
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9931199999999931 Training Episodes: 9656
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 9] Result: Win 
16
236
Before
5_16
After
6_17
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 9, 1] Result: Lose 
17
237
Before
5_13
After
5_14
Epsilon: 0.9930999999999931 Training Episodes: 9655
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 10] Result: Lose 
17
39
Before
1_2
After
1_3
Epsilon: 0.9930799999999931 Training Episodes: 9654
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 1] Result: Win 
10
32
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 1, 3] Result: Win 
13
35
Before
0000000000_1
After
1_2
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 1, 3, 5] Result: Lose 
18
40
Before
1_3
After
1_4
Epsilon: 0.9930599999999931 Training Episodes: 9653
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 10] Result: Win 
14
124
Before
2_3
After
3_4
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 10, 3] Result: Lose 
17
127
Before
3_8
After
3_9
Epsilon: 0.993039999999993 Training Episodes: 9652
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 6] Result: Lose 
16
82
Before
1_1
After
1_2
Epsilon: 0.993019999999993 Training Episodes: 9651
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [5, 10] Result: Win 
15
81
Before
1_4
After
2_5
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [5, 10, 5] Result: Lose 
20
86
Before
4_8
After
4_9
Epsilon: 0.992999999999993 Training Episodes: 9650
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 7] Result: Lose 
17
127
Before
3_9
After
3_10
Epsilon: 0.992979999999993 Training Episodes: 9649
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4] Result: Win 
8
228
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4, 9] Result: Lose 
17
237
Before
5_14
After
5_15
Epsilon: 0.992959999999993 Training Episodes: 9648
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 10] Result: Lose 
12
122
Before
3_4
After
3_5
Epsilon: 0.9929399999999929 Training Episodes: 9647
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 8] Result: Lose 
18
216
Before
1_6
After
1_7
Epsilon: 0.9929199999999929 Training Episodes: 9646
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 2] Result: Lose 
12
144
Before
1_2
After
1_3
Epsilon: 0.9928999999999929 Training Episodes: 9645
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Win 
16
236
Before
6_17
After
7_18
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6, 5] Result: Lose 
21
241
Before
0000000000_11
After
0000000000_12
Epsilon: 0.9928799999999929 Training Episodes: 9644
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 9] Result: Win 
13
233
Before
12_18
After
13_19
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 9, 6] Result: Lose 
19
239
Before
4_15
After
4_16
Epsilon: 0.9928599999999929 Training Episodes: 9643
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
1_21
After
1_22
Epsilon: 0.9928399999999928 Training Episodes: 9642
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 3] Result: Win 
6
226
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 3, 10] Result: Win 
16
236
Before
7_18
After
8_19
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 3, 10, 5] Result: Lose 
21
241
Before
0000000000_12
After
0000000000_13
Epsilon: 0.9928199999999928 Training Episodes: 9641
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 4] Result: Lose 
14
168
Before
1_2
After
1_3
Epsilon: 0.9927999999999928 Training Episodes: 9640
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5] Result: Win 
15
235
Before
4_7
After
5_8
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5, 2] Result: Lose 
17
237
Before
5_15
After
5_16
Epsilon: 0.9927799999999928 Training Episodes: 9639
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10] Result: Win 
11
143
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10, 4] Result: Lose 
15
147
Before
1_5
After
1_6
Epsilon: 0.9927599999999928 Training Episodes: 9638
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 10] Result: Lose 
19
85
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9927399999999927 Training Episodes: 9637
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Win 
16
236
Before
8_19
After
9_20
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6, 3] Result: Lose 
19
239
Before
4_16
After
4_17
Epsilon: 0.9927199999999927 Training Episodes: 9636
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 4] Result: Win 
14
80
Before
3_4
After
4_5
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 4, 7] Result: Lose 
21
87
Before
0000000000_10
After
0000000000_11
Epsilon: 0.9926999999999927 Training Episodes: 9635
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [5, 2] Result: Win 
7
183
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [5, 2, 10] Result: Lose 
17
193
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9926799999999927 Training Episodes: 9634
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 4] Result: Win 
14
146
Before
2_4
After
3_5
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 4, 7] Result: Lose 
21
153
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9926599999999927 Training Episodes: 9633
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Lose 
16
236
Before
9_20
After
9_21
Epsilon: 0.9926399999999926 Training Episodes: 9632
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 10] Result: Lose 
14
212
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9926199999999926 Training Episodes: 9631
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 9] Result: Win 
10
230
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 9, 10] Result: Lose 
20
240
Before
1_22
After
1_23
Epsilon: 0.9925999999999926 Training Episodes: 9630
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 3] Result: Win 
11
55
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 3, 2] Result: Lose 
13
57
Before
2_3
After
2_4
Epsilon: 0.9925799999999926 Training Episodes: 9629
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 2] Result: Win 
6
50
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 2, 4] Result: Win 
10
54
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 2, 4, 5] Result: Win 
15
59
Before
0000000000_6
After
1_7
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 2, 4, 5, 1] Result: Win 
16
60
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 2, 4, 5, 1, 4] Result: Lose 
20
64
Before
1_6
After
1_7
Epsilon: 0.9925599999999926 Training Episodes: 9628
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 2] Result: Win 
8
228
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 2, 7] Result: Lose 
15
235
Before
5_8
After
5_9
Epsilon: 0.9925399999999925 Training Episodes: 9627
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 5] Result: Win 
14
234
Before
7_14
After
8_15
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 5, 4] Result: Lose 
18
238
Before
5_24
After
5_25
Epsilon: 0.9925199999999925 Training Episodes: 9626
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 5] Result: Win 
14
234
Before
8_15
After
9_16
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 5, 2] Result: Win 
16
236
Before
9_21
After
10_22
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 5, 2, 1] Result: Win 
17
237
Before
5_16
After
6_17
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 5, 2, 1, 4] Result: Lose 
21
241
Before
0000000000_13
After
0000000000_14
Epsilon: 0.9924999999999925 Training Episodes: 9625
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 9] Result: Win 
11
231
Before
15_15
After
16_16
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 9, 5] Result: Lose 
16
236
Before
10_22
After
10_23
Epsilon: 0.9924799999999925 Training Episodes: 9624
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [9, 7] Result: Lose 
16
214
Before
2_5
After
2_6
Epsilon: 0.9924599999999925 Training Episodes: 9623
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 1] Result: Win 
9
471
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 1, 1] Result: Win 
10
230
Before
9_9
After
10_10
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 1, 1, 10] Result: Win 
20
240
Before
1_23
After
2_24
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 1, 1, 10, 1] Result: Lose 
21
241
Before
0000000000_14
After
0000000000_15
Epsilon: 0.9924399999999924 Training Episodes: 9622
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 5] Result: Win 
9
75
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 5, 9] Result: Lose 
18
84
Before
1_1
After
1_2
Epsilon: 0.9924199999999924 Training Episodes: 9621
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1] Result: Win 
11
231
Before
16_16
After
17_17
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1, 8] Result: Lose 
19
239
Before
4_17
After
4_18
Epsilon: 0.9923999999999924 Training Episodes: 9620
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Win 
13
233
Before
13_19
After
14_20
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 8] Result: Lose 
21
241
Before
0000000000_15
After
0000000000_16
Epsilon: 0.9923799999999924 Training Episodes: 9619
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 6] Result: Win 
8
74
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 6, 4] Result: Win 
12
78
Before
1_3
After
2_4
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 6, 4, 2] Result: Lose 
14
80
Before
4_5
After
4_6
Epsilon: 0.9923599999999924 Training Episodes: 9618
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 10] Result: Lose 
20
108
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9923399999999923 Training Episodes: 9617
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 10] Result: Win 
14
234
Before
9_16
After
10_17
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 10, 1] Result: Lose 
15
235
Before
5_9
After
5_10
Epsilon: 0.9923199999999923 Training Episodes: 9616
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 10] Result: Lose 
19
239
Before
4_18
After
4_19
Epsilon: 0.9922999999999923 Training Episodes: 9615
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [5, 3] Result: Win 
8
316
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [5, 3, 1] Result: Win 
9
75
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [5, 3, 1, 10] Result: Lose 
19
85
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9922799999999923 Training Episodes: 9614
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 8] Result: Lose 
18
238
Before
5_25
After
5_26
Epsilon: 0.9922599999999923 Training Episodes: 9613
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 5] Result: Win 
15
169
Before
2_6
After
3_7
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 5, 1] Result: Lose 
16
170
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9922399999999922 Training Episodes: 9612
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 6] Result: Win 
8
96
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 6, 5] Result: Win 
13
101
Before
1_5
After
2_6
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 6, 5, 3] Result: Lose 
16
104
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9922199999999922 Training Episodes: 9611
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [9, 10] Result: Lose 
19
217
Before
1_7
After
1_8
Epsilon: 0.9921999999999922 Training Episodes: 9610
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 4] Result: Lose 
14
80
Before
4_6
After
4_7
Epsilon: 0.9921799999999922 Training Episodes: 9609
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_24
After
2_25
Epsilon: 0.9921599999999922 Training Episodes: 9608
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 6] Result: Win 
11
209
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 6, 1] Result: Lose 
12
210
Before
2_3
After
2_4
Epsilon: 0.9921399999999921 Training Episodes: 9607
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 9] Result: Lose 
17
61
Before
1_4
After
1_5
Epsilon: 0.9921199999999921 Training Episodes: 9606
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 10] Result: Lose 
16
104
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9920999999999921 Training Episodes: 9605
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3] Result: Win 
13
233
Before
14_20
After
15_21
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3, 2] Result: Win 
15
235
Before
5_10
After
6_11
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3, 2, 1] Result: Lose 
16
236
Before
10_23
After
10_24
Epsilon: 0.9920799999999921 Training Episodes: 9604
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 8] Result: Lose 
18
84
Before
1_2
After
1_3
Epsilon: 0.9920599999999921 Training Episodes: 9603
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [8, 3] Result: Win 
11
187
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [8, 3, 8] Result: Lose 
19
195
Before
0000000000_3
After
0000000000_4
Epsilon: 0.992039999999992 Training Episodes: 9602
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 8] Result: Lose 
18
128
Before
2_7
After
2_8
Epsilon: 0.992019999999992 Training Episodes: 9601
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 8] Result: Lose 
16
82
Before
1_2
After
1_3
Epsilon: 0.991999999999992 Training Episodes: 9600
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 10] Result: Lose 
15
125
Before
2_4
After
2_5
Epsilon: 0.991979999999992 Training Episodes: 9599
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 7] Result: Win 
10
230
Before
10_10
After
11_11
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 7, 9] Result: Lose 
19
239
Before
4_19
After
4_20
Epsilon: 0.991959999999992 Training Episodes: 9598
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 7] Result: Win 
10
230
Before
11_11
After
12_12
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 7, 5] Result: Lose 
15
235
Before
6_11
After
6_12
Epsilon: 0.9919399999999919 Training Episodes: 9597
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6] Result: Win 
9
229
Before
6_6
After
7_7
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6, 3] Result: Win 
12
232
Before
6_14
After
7_15
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6, 3, 9] Result: Lose 
21
241
Before
0000000000_16
After
0000000000_17
Epsilon: 0.9919199999999919 Training Episodes: 9596
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 1] Result: Win 
11
33
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 1, 5] Result: Lose 
16
38
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9918999999999919 Training Episodes: 9595
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 1] Result: Win 
8
206
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 1, 9] Result: Lose 
17
215
Before
2_5
After
2_6
Epsilon: 0.9918799999999919 Training Episodes: 9594
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 5] Result: Win 
12
56
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 5, 4] Result: Lose 
16
60
Before
2_3
After
2_4
Epsilon: 0.9918599999999919 Training Episodes: 9593
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 9] Result: Lose 
18
150
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9918399999999918 Training Episodes: 9592
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 2] Result: Win 
5
49
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 2, 3] Result: Win 
8
52
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 2, 3, 10] Result: Win 
18
62
Before
0000000000_1
After
1_2
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 2, 3, 10, 2] Result: Lose 
20
64
Before
1_7
After
1_8
Epsilon: 0.9918199999999918 Training Episodes: 9591
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 8] Result: Lose 
18
216
Before
1_7
After
1_8
Epsilon: 0.9917999999999918 Training Episodes: 9590
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [6, 6] Result: Win 
12
56
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [6, 6, 1] Result: Lose 
13
57
Before
2_4
After
2_5
Epsilon: 0.9917799999999918 Training Episodes: 9589
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 10] Result: Win 
16
192
Before
0000000000_1
After
1_2
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 10, 1] Result: Win 
17
193
Before
0000000000_3
After
1_4
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 10, 1, 1] Result: Lose 
18
194
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9917599999999918 Training Episodes: 9588
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_25
After
2_26
Epsilon: 0.9917399999999917 Training Episodes: 9587
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 8] Result: Lose 
15
213
Before
2_2
After
2_3
Epsilon: 0.9917199999999917 Training Episodes: 9586
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 10] Result: Lose 
19
63
Before
1_5
After
1_6
Epsilon: 0.9916999999999917 Training Episodes: 9585
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 6] Result: Lose 
15
37
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9916799999999917 Training Episodes: 9584
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 7] Result: Win 
11
187
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 7, 7] Result: Lose 
18
194
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9916599999999917 Training Episodes: 9583
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 10] Result: Win 
15
103
Before
1_4
After
2_5
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 10, 4] Result: Win 
19
107
Before
1_5
After
2_6
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 10, 4, 2] Result: Lose 
21
109
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9916399999999916 Training Episodes: 9582
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1] Result: Win 
11
231
Before
17_17
After
18_18
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1, 10] Result: Lose 
21
241
Before
0000000000_17
After
0000000000_18
Epsilon: 0.9916199999999916 Training Episodes: 9581
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [4, 10] Result: Win 
14
36
Before
1_4
After
2_5
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [4, 10, 4] Result: Lose 
18
40
Before
1_4
After
1_5
Epsilon: 0.9915999999999916 Training Episodes: 9580
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 8] Result: Lose 
18
128
Before
2_8
After
2_9
Epsilon: 0.9915799999999916 Training Episodes: 9579
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 10] Result: Lose 
16
148
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9915599999999916 Training Episodes: 9578
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_26
After
2_27
Epsilon: 0.9915399999999915 Training Episodes: 9577
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9915199999999915 Training Episodes: 9576
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 2] Result: Win 
9
31
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 2, 5] Result: Lose 
14
36
Before
2_5
After
2_6
Epsilon: 0.9914999999999915 Training Episodes: 9575
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 1] Result: Win 
11
77
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 1, 2] Result: Lose 
13
79
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9914799999999915 Training Episodes: 9574
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 6] Result: Win 
16
126
Before
4_8
After
5_9
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 6, 2] Result: Lose 
18
128
Before
2_9
After
2_10
Epsilon: 0.9914599999999915 Training Episodes: 9573
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_27
After
2_28
Epsilon: 0.9914399999999914 Training Episodes: 9572
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 10] Result: Win 
16
104
Before
0000000000_4
After
1_5
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 10, 1] Result: Lose 
17
105
Before
1_6
After
1_7
Epsilon: 0.9914199999999914 Training Episodes: 9571
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 1] Result: Win 
2
332
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 1, 1] Result: Win 
3
91
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 1, 1, 10] Result: Lose 
13
101
Before
2_6
After
2_7
Epsilon: 0.9913999999999914 Training Episodes: 9570
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 8] Result: Win 
13
101
Before
2_7
After
3_8
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 8, 1] Result: Lose 
14
102
Before
3_5
After
3_6
Epsilon: 0.9913799999999914 Training Episodes: 9569
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3] Result: Win 
11
231
Before
18_18
After
19_19
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3, 10] Result: Lose 
21
241
Before
0000000000_18
After
0000000000_19
Epsilon: 0.9913599999999914 Training Episodes: 9568
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 8] Result: Lose 
16
82
Before
1_3
After
1_4
Epsilon: 0.9913399999999913 Training Episodes: 9567
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 10] Result: Win 
11
99
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 10, 5] Result: Win 
16
104
Before
1_5
After
2_6
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 10, 5, 2] Result: Lose 
18
106
Before
3_8
After
3_9
Epsilon: 0.9913199999999913 Training Episodes: 9566
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10] Result: Lose 
20
42
Before
1_4
After
1_5
Epsilon: 0.9912999999999913 Training Episodes: 9565
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 2] Result: Win 
8
184
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 2, 10] Result: Lose 
18
194
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9912799999999913 Training Episodes: 9564
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 8] Result: Lose 
16
82
Before
1_4
After
1_5
Epsilon: 0.9912599999999913 Training Episodes: 9563
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 4] Result: Win 
11
55
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 4, 5] Result: Win 
16
60
Before
2_4
After
3_5
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 4, 5, 2] Result: Lose 
18
62
Before
1_2
After
1_3
Epsilon: 0.9912399999999912 Training Episodes: 9562
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [6, 1] Result: Win 
7
161
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [6, 1, 10] Result: Lose 
17
171
Before
2_8
After
2_9
Epsilon: 0.9912199999999912 Training Episodes: 9561
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 8] Result: Win 
16
60
Before
3_5
After
4_6
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 8, 1] Result: Win 
17
61
Before
1_5
After
2_6
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 8, 1, 3] Result: Lose 
20
64
Before
1_8
After
1_9
Epsilon: 0.9911999999999912 Training Episodes: 9560
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 10] Result: Lose 
14
80
Before
4_7
After
4_8
Epsilon: 0.9911799999999912 Training Episodes: 9559
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 1] Result: Win 
9
229
Before
7_7
After
8_8
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 1, 5] Result: Win 
14
234
Before
10_17
After
11_18
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 1, 5, 5] Result: Lose 
19
239
Before
4_20
After
4_21
Epsilon: 0.9911599999999912 Training Episodes: 9558
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 1] Result: Win 
3
223
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 1, 10] Result: Win 
13
233
Before
15_21
After
16_22
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 1, 10, 4] Result: Lose 
17
237
Before
6_17
After
6_18
Epsilon: 0.9911399999999911 Training Episodes: 9557
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 1] Result: Win 
11
143
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 1, 10] Result: Lose 
21
153
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9911199999999911 Training Episodes: 9556
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 5] Result: Win 
9
207
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 5, 6] Result: Lose 
15
213
Before
2_3
After
2_4
Epsilon: 0.9910999999999911 Training Episodes: 9555
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
19_19
After
20_20
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 10] Result: Lose 
21
241
Before
0000000000_19
After
0000000000_20
Epsilon: 0.9910799999999911 Training Episodes: 9554
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 5] Result: Win 
10
120
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 5, 5] Result: Lose 
15
125
Before
2_5
After
2_6
Epsilon: 0.9910599999999911 Training Episodes: 9553
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 10] Result: Lose 
19
151
Before
0000000000_1
After
0000000000_2
Epsilon: 0.991039999999991 Training Episodes: 9552
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 9] Result: Win 
11
143
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 9, 5] Result: Win 
16
148
Before
0000000000_2
After
1_3
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 9, 5, 4] Result: Lose 
20
152
Before
0000000000_7
After
0000000000_8
Epsilon: 0.991019999999991 Training Episodes: 9551
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 9] Result: Lose 
16
236
Before
10_24
After
10_25
Epsilon: 0.990999999999991 Training Episodes: 9550
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6] Result: Win 
9
229
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 6, 5] Result: Lose 
14
234
Before
11_18
After
11_19
Epsilon: 0.990979999999991 Training Episodes: 9549
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 10] Result: Lose 
20
196
Before
0000000000_2
After
0000000000_3
Epsilon: 0.990959999999991 Training Episodes: 9548
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 10] Result: Win 
14
102
Before
3_6
After
4_7
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 10, 2] Result: Lose 
16
104
Before
2_6
After
2_7
Epsilon: 0.9909399999999909 Training Episodes: 9547
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 10] Result: Lose 
13
211
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9909199999999909 Training Episodes: 9546
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 5] Result: Lose 
15
147
Before
1_6
After
1_7
Epsilon: 0.9908999999999909 Training Episodes: 9545
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1] Result: Win 
8
228
Before
6_6
After
7_7
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1, 5] Result: Win 
13
233
Before
16_22
After
17_23
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1, 5, 4] Result: Lose 
17
237
Before
6_18
After
6_19
Epsilon: 0.9908799999999909 Training Episodes: 9544
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [6, 2] Result: Win 
8
74
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [6, 2, 4] Result: Win 
12
78
Before
2_4
After
3_5
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [6, 2, 4, 6] Result: Lose 
18
84
Before
1_3
After
1_4
Epsilon: 0.9908599999999909 Training Episodes: 9543
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 5] Result: Win 
8
96
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 5, 10] Result: Lose 
18
106
Before
3_9
After
3_10
Epsilon: 0.9908399999999908 Training Episodes: 9542
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [7, 10] Result: Lose 
17
193
Before
1_4
After
1_5
Epsilon: 0.9908199999999908 Training Episodes: 9541
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Win 
16
236
Before
10_25
After
11_26
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6, 3] Result: Lose 
19
239
Before
4_21
After
4_22
Epsilon: 0.9907999999999908 Training Episodes: 9540
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 2] Result: Win 
12
78
Before
3_5
After
4_6
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 2, 3] Result: Lose 
15
81
Before
2_5
After
2_6
Epsilon: 0.9907799999999908 Training Episodes: 9539
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 9] Result: Win 
17
61
Before
2_6
After
3_7
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 9, 3] Result: Lose 
20
64
Before
1_9
After
1_10
Epsilon: 0.9907599999999908 Training Episodes: 9538
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 9] Result: Lose 
19
85
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9907399999999907 Training Episodes: 9537
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 10] Result: Lose 
18
238
Before
5_26
After
5_27
Epsilon: 0.9907199999999907 Training Episodes: 9536
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 2] Result: Win 
8
228
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 2, 10] Result: Lose 
18
238
Before
5_27
After
5_28
Epsilon: 0.9906999999999907 Training Episodes: 9535
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [4, 3] Result: Win 
7
29
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [4, 3, 7] Result: Lose 
14
36
Before
2_6
After
2_7
Epsilon: 0.9906799999999907 Training Episodes: 9534
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 5] Result: Win 
9
229
Before
9_9
After
10_10
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 5, 10] Result: Lose 
19
239
Before
4_22
After
4_23
Epsilon: 0.9906599999999907 Training Episodes: 9533
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 4] Result: Win 
7
73
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 4, 7] Result: Lose 
14
80
Before
4_8
After
4_9
Epsilon: 0.9906399999999906 Training Episodes: 9532
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 7] Result: Lose 
16
104
Before
2_7
After
2_8
Epsilon: 0.9906199999999906 Training Episodes: 9531
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 10] Result: Lose 
15
213
Before
2_4
After
2_5
Epsilon: 0.9905999999999906 Training Episodes: 9530
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [7, 2] Result: Win 
9
141
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [7, 2, 8] Result: Lose 
17
149
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9905799999999906 Training Episodes: 9529
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 4] Result: Win 
5
225
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 4, 9] Result: Lose 
14
234
Before
11_19
After
11_20
Epsilon: 0.9905599999999906 Training Episodes: 9528
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 8] Result: Lose 
18
216
Before
1_8
After
1_9
Epsilon: 0.9905399999999905 Training Episodes: 9527
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1] Result: Win 
8
228
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1, 10] Result: Lose 
18
238
Before
5_28
After
5_29
Epsilon: 0.9905199999999905 Training Episodes: 9526
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [7, 10] Result: Win 
17
105
Before
1_7
After
2_8
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [7, 10, 3] Result: Lose 
20
108
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9904999999999905 Training Episodes: 9525
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 4] Result: Win 
14
124
Before
3_4
After
4_5
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 4, 1] Result: Lose 
15
125
Before
2_6
After
2_7
Epsilon: 0.9904799999999905 Training Episodes: 9524
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5] Result: Win 
15
235
Before
6_12
After
7_13
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5, 3] Result: Lose 
18
238
Before
5_29
After
5_30
Epsilon: 0.9904599999999905 Training Episodes: 9523
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 10] Result: Win 
13
167
Before
4_6
After
5_7
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 10, 7] Result: Lose 
20
174
Before
0000000000_0000000000
After
0000000000_1
Epsilon: 0.9904399999999904 Training Episodes: 9522
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 10] Result: Lose 
17
61
Before
3_7
After
3_8
Epsilon: 0.9904199999999904 Training Episodes: 9521
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 7] Result: Win 
10
54
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 7, 8] Result: Lose 
18
62
Before
1_3
After
1_4
Epsilon: 0.9903999999999904 Training Episodes: 9520
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [7, 10] Result: Lose 
17
149
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9903799999999904 Training Episodes: 9519
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10] Result: Lose 
20
42
Before
1_5
After
1_6
Epsilon: 0.9903599999999904 Training Episodes: 9518
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 3] Result: Win 
6
94
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 3, 6] Result: Win 
12
100
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 3, 6, 9] Result: Lose 
21
109
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9903399999999903 Training Episodes: 9517
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 7] Result: Win 
9
229
Before
10_10
After
11_11
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 7, 9] Result: Win 
18
238
Before
5_30
After
6_31
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 7, 9, 3] Result: Lose 
21
241
Before
0000000000_20
After
0000000000_21
Epsilon: 0.9903199999999903 Training Episodes: 9516
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3] Result: Win 
11
231
Before
20_20
After
21_21
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3, 7] Result: Lose 
18
238
Before
6_31
After
6_32
Epsilon: 0.9902999999999903 Training Episodes: 9515
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 5] Result: Win 
10
230
Before
12_12
After
13_13
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 5, 6] Result: Lose 
16
236
Before
11_26
After
11_27
Epsilon: 0.9902799999999903 Training Episodes: 9514
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 9] Result: Win 
12
144
Before
1_3
After
2_4
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 9, 4] Result: Lose 
16
148
Before
1_3
After
1_4
Epsilon: 0.9902599999999903 Training Episodes: 9513
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 4] Result: Win 
9
229
Before
11_11
After
12_12
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 4, 3] Result: Win 
12
232
Before
7_15
After
8_16
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 4, 3, 7] Result: Lose 
19
239
Before
4_23
After
4_24
Epsilon: 0.9902399999999902 Training Episodes: 9512
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 10] Result: Lose 
16
38
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9902199999999902 Training Episodes: 9511
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 7] Result: Win 
11
143
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 7, 10] Result: Lose 
21
153
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9901999999999902 Training Episodes: 9510
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
21_21
After
22_22
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 2] Result: Lose 
13
233
Before
17_23
After
17_24
Epsilon: 0.9901799999999902 Training Episodes: 9509
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [6, 10] Result: Win 
16
60
Before
4_6
After
5_7
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [6, 10, 4] Result: Lose 
20
64
Before
1_10
After
1_11
Epsilon: 0.9901599999999902 Training Episodes: 9508
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 1] Result: Win 
4
422
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 1, 4] Result: Win 
8
426
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 1, 4, 3] Result: Win 
11
187
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 1, 4, 3, 3] Result: Win 
14
190
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 1, 4, 3, 3, 5] Result: Lose 
19
195
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9901399999999901 Training Episodes: 9507
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 4] Result: Win 
11
231
Before
22_22
After
23_23
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 4, 4] Result: Lose 
15
235
Before
7_13
After
7_14
Epsilon: 0.9901199999999901 Training Episodes: 9506
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Lose 
13
233
Before
17_24
After
17_25
Epsilon: 0.9900999999999901 Training Episodes: 9505
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 10] Result: Win 
12
122
Before
3_5
After
4_6
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 10, 9] Result: Lose 
21
131
Before
0000000000_13
After
0000000000_14
Epsilon: 0.9900799999999901 Training Episodes: 9504
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 8] Result: Win 
14
146
Before
3_5
After
4_6
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 8, 1] Result: Win 
15
147
Before
1_7
After
2_8
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 8, 1, 3] Result: Lose 
18
150
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9900599999999901 Training Episodes: 9503
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 5] Result: Win 
7
227
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 5, 10] Result: Lose 
17
237
Before
6_19
After
6_20
Epsilon: 0.99003999999999 Training Episodes: 9502
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 10] Result: Lose 
20
218
Before
0000000000_8
After
0000000000_9
Epsilon: 0.99001999999999 Training Episodes: 9501
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
0000000000_8
After
0000000000_9
Epsilon: 0.98999999999999 Training Episodes: 9500
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 5] Result: Lose 
12
56
Before
3_3
After
3_4
Epsilon: 0.98997999999999 Training Episodes: 9499
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 8] Result: Win 
10
186
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 8, 8] Result: Win 
18
194
Before
0000000000_5
After
1_6
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 8, 8, 1] Result: Lose 
19
195
Before
0000000000_5
After
0000000000_6
Epsilon: 0.98995999999999 Training Episodes: 9498
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [6, 9] Result: Lose 
15
169
Before
3_7
After
3_8
Epsilon: 0.9899399999999899 Training Episodes: 9497
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 3] Result: Win 
7
139
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 3, 5] Result: Lose 
12
144
Before
2_4
After
2_5
Epsilon: 0.9899199999999899 Training Episodes: 9496
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 10] Result: Win 
11
77
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 10, 10] Result: Lose 
21
87
Before
0000000000_11
After
0000000000_12
Epsilon: 0.9898999999999899 Training Episodes: 9495
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [7, 6] Result: Win 
13
123
Before
4_5
After
5_6
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [7, 6, 7] Result: Lose 
20
130
Before
1_5
After
1_6
Epsilon: 0.9898799999999899 Training Episodes: 9494
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 9] Result: Win 
15
147
Before
2_8
After
3_9
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 9, 2] Result: Win 
17
149
Before
0000000000_4
After
1_5
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 9, 2, 3] Result: Win 
20
152
Before
0000000000_9
After
1_10
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 9, 2, 3, 1] Result: Lose 
21
153
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9898599999999899 Training Episodes: 9493
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 10] Result: Lose 
13
145
Before
2_3
After
2_4
Epsilon: 0.9898399999999898 Training Episodes: 9492
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 6] Result: Win 
14
36
Before
2_7
After
3_8
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 6, 7] Result: Lose 
21
43
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9898199999999898 Training Episodes: 9491
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 10] Result: Win 
18
238
Before
6_32
After
7_33
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 10, 3] Result: Lose 
21
241
Before
0000000000_21
After
0000000000_22
Epsilon: 0.9897999999999898 Training Episodes: 9490
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 9] Result: Lose 
19
85
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9897799999999898 Training Episodes: 9489
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 9] Result: Lose 
12
56
Before
3_4
After
3_5
Epsilon: 0.9897599999999898 Training Episodes: 9488
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 3] Result: Win 
8
228
Before
9_9
After
10_10
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 3, 10] Result: Lose 
18
238
Before
7_33
After
7_34
Epsilon: 0.9897399999999897 Training Episodes: 9487
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 5] Result: Win 
15
169
Before
3_8
After
4_9
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 5, 5] Result: Lose 
20
174
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9897199999999897 Training Episodes: 9486
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [5, 10] Result: Lose 
15
191
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9896999999999897 Training Episodes: 9485
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 8] Result: Win 
11
165
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 8, 8] Result: Lose 
19
173
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9896799999999897 Training Episodes: 9484
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 9] Result: Win 
11
231
Before
23_23
After
24_24
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 9, 8] Result: Lose 
19
239
Before
4_24
After
4_25
Epsilon: 0.9896599999999897 Training Episodes: 9483
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [5, 10] Result: Lose 
15
81
Before
2_6
After
2_7
Epsilon: 0.9896399999999896 Training Episodes: 9482
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 2] Result: Win 
10
98
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 2, 9] Result: Lose 
19
107
Before
2_6
After
2_7
Epsilon: 0.9896199999999896 Training Episodes: 9481
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 5] Result: Win 
15
125
Before
2_7
After
3_8
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 5, 2] Result: Win 
17
127
Before
3_10
After
4_11
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 5, 2, 4] Result: Lose 
21
131
Before
0000000000_14
After
0000000000_15
Epsilon: 0.9895999999999896 Training Episodes: 9480
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 10] Result: Lose 
15
235
Before
7_14
After
7_15
Epsilon: 0.9895799999999896 Training Episodes: 9479
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 2] Result: Win 
3
311
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 2, 8] Result: Win 
11
77
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 2, 8, 3] Result: Lose 
14
80
Before
4_9
After
4_10
Epsilon: 0.9895599999999896 Training Episodes: 9478
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 3] Result: Win 
5
159
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 3, 5] Result: Win 
10
406
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 3, 5, 1] Result: Win 
11
165
Before
6_6
After
7_7
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 3, 5, 1, 6] Result: Win 
17
171
Before
2_9
After
3_10
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 3, 5, 1, 6, 4] Result: Lose 
21
175
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9895399999999895 Training Episodes: 9477
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 1] Result: Win 
5
335
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 1, 5] Result: Win 
10
98
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 1, 5, 10] Result: Lose 
20
108
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9895199999999895 Training Episodes: 9476
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
24_24
After
25_25
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 10] Result: Lose 
21
241
Before
0000000000_22
After
0000000000_23
Epsilon: 0.9894999999999895 Training Episodes: 9475
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 10] Result: Win 
16
236
Before
11_27
After
12_28
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 10, 5] Result: Lose 
21
241
Before
0000000000_23
After
0000000000_24
Epsilon: 0.9894799999999895 Training Episodes: 9474
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 9] Result: Win 
12
78
Before
4_6
After
5_7
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 9, 2] Result: Win 
14
80
Before
4_10
After
5_11
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 9, 2, 2] Result: Lose 
16
82
Before
1_5
After
1_6
Epsilon: 0.9894599999999895 Training Episodes: 9473
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
1_10
After
1_11
Epsilon: 0.9894399999999894 Training Episodes: 9472
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
1_11
After
1_12
Epsilon: 0.9894199999999894 Training Episodes: 9471
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 2] Result: Win 
5
225
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 2, 10] Result: Lose 
15
235
Before
7_15
After
7_16
Epsilon: 0.9893999999999894 Training Episodes: 9470
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 6] Result: Lose 
16
38
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9893799999999894 Training Episodes: 9469
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_28
After
2_29
Epsilon: 0.9893599999999894 Training Episodes: 9468
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 2] Result: Win 
8
118
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 2, 5] Result: Win 
13
123
Before
5_6
After
6_7
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 2, 5, 1] Result: Win 
14
124
Before
4_5
After
5_6
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 2, 5, 1, 1] Result: Lose 
15
125
Before
3_8
After
3_9
Epsilon: 0.9893399999999893 Training Episodes: 9467
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 3] Result: Win 
11
99
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 3, 2] Result: Lose 
13
101
Before
3_8
After
3_9
Epsilon: 0.9893199999999893 Training Episodes: 9466
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 3] Result: Win 
6
94
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 3, 10] Result: Win 
16
104
Before
2_8
After
3_9
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 3, 10, 5] Result: Lose 
21
109
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9892999999999893 Training Episodes: 9465
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 2] Result: Win 
8
228
Before
10_10
After
11_11
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 2, 10] Result: Lose 
18
238
Before
7_34
After
7_35
Epsilon: 0.9892799999999893 Training Episodes: 9464
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 10] Result: Lose 
20
196
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9892599999999893 Training Episodes: 9463
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 2] Result: Win 
10
120
Before
7_7
After
8_8
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 2, 2] Result: Win 
12
122
Before
4_6
After
5_7
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 2, 2, 5] Result: Lose 
17
127
Before
4_11
After
4_12
Epsilon: 0.9892399999999892 Training Episodes: 9462
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3] Result: Win 
13
233
Before
17_25
After
18_26
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3, 2] Result: Win 
15
235
Before
7_16
After
8_17
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3, 2, 5] Result: Lose 
20
240
Before
2_29
After
2_30
Epsilon: 0.9892199999999892 Training Episodes: 9461
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7] Result: Lose 
17
237
Before
6_20
After
6_21
Epsilon: 0.9891999999999892 Training Episodes: 9460
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Win 
13
233
Before
18_26
After
19_27
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 7] Result: Lose 
20
240
Before
2_30
After
2_31
Epsilon: 0.9891799999999892 Training Episodes: 9459
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 5] Result: Lose 
15
103
Before
2_5
After
2_6
Epsilon: 0.9891599999999892 Training Episodes: 9458
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 8] Result: Lose 
18
172
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9891399999999891 Training Episodes: 9457
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 1] Result: Win 
11
99
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 1, 4] Result: Lose 
15
103
Before
2_6
After
2_7
Epsilon: 0.9891199999999891 Training Episodes: 9456
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 10] Result: Win 
11
209
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 10, 5] Result: Lose 
16
214
Before
2_6
After
2_7
Epsilon: 0.9890999999999891 Training Episodes: 9455
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Lose 
13
233
Before
19_27
After
19_28
Epsilon: 0.9890799999999891 Training Episodes: 9454
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10] Result: Lose 
20
42
Before
1_6
After
1_7
Epsilon: 0.9890599999999891 Training Episodes: 9453
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 10] Result: Lose 
20
86
Before
4_9
After
4_10
Epsilon: 0.989039999999989 Training Episodes: 9452
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 6] Result: Win 
15
103
Before
2_7
After
3_8
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 6, 3] Result: Lose 
18
106
Before
3_10
After
3_11
Epsilon: 0.989019999999989 Training Episodes: 9451
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 7] Result: Win 
14
36
Before
3_8
After
4_9
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 7, 7] Result: Lose 
21
43
Before
0000000000_5
After
0000000000_6
Epsilon: 0.988999999999989 Training Episodes: 9450
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 3] Result: Win 
5
225
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 3, 7] Result: Win 
12
232
Before
8_16
After
9_17
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 3, 7, 2] Result: Win 
14
234
Before
11_20
After
12_21
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 3, 7, 2, 2] Result: Lose 
16
236
Before
12_28
After
12_29
Epsilon: 0.988979999999989 Training Episodes: 9449
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 5] Result: Win 
15
213
Before
2_5
After
3_6
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 5, 5] Result: Lose 
20
218
Before
0000000000_9
After
0000000000_10
Epsilon: 0.988959999999989 Training Episodes: 9448
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 8] Result: Win 
9
229
Before
12_12
After
13_13
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 8, 10] Result: Lose 
19
239
Before
4_25
After
4_26
Epsilon: 0.9889399999999889 Training Episodes: 9447
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 10] Result: Lose 
20
196
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9889199999999889 Training Episodes: 9446
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 3] Result: Win 
6
50
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 3, 10] Result: Win 
16
60
Before
5_7
After
6_8
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 3, 10, 4] Result: Lose 
20
64
Before
1_11
After
1_12
Epsilon: 0.9888999999999889 Training Episodes: 9445
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10] Result: Lose 
20
42
Before
1_7
After
1_8
Epsilon: 0.9888799999999889 Training Episodes: 9444
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7] Result: Win 
12
232
Before
9_17
After
10_18
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7, 2] Result: Win 
14
234
Before
12_21
After
13_22
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7, 2, 2] Result: Win 
16
236
Before
12_29
After
13_30
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7, 2, 2, 2] Result: Lose 
18
238
Before
7_35
After
7_36
Epsilon: 0.9888599999999889 Training Episodes: 9443
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 5] Result: Win 
8
96
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 5, 2] Result: Win 
10
98
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 5, 2, 10] Result: Lose 
20
108
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9888399999999888 Training Episodes: 9442
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 2] Result: Lose 
12
144
Before
2_5
After
2_6
Epsilon: 0.9888199999999888 Training Episodes: 9441
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 2] Result: Win 
12
122
Before
5_7
After
6_8
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 2, 2] Result: Lose 
14
124
Before
5_6
After
5_7
Epsilon: 0.9887999999999888 Training Episodes: 9440
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 1] Result: Win 
6
226
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 1, 10] Result: Lose 
16
236
Before
13_30
After
13_31
Epsilon: 0.9887799999999888 Training Episodes: 9439
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 2] Result: Lose 
12
232
Before
10_18
After
10_19
Epsilon: 0.9887599999999888 Training Episodes: 9438
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 4] Result: Win 
11
231
Before
25_25
After
26_26
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 4, 10] Result: Lose 
21
241
Before
0000000000_24
After
0000000000_25
Epsilon: 0.9887399999999887 Training Episodes: 9437
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 6] Result: Win 
12
232
Before
10_19
After
11_20
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 6, 7] Result: Win 
19
239
Before
4_26
After
5_27
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 6, 7, 1] Result: Lose 
20
240
Before
2_31
After
2_32
Epsilon: 0.9887199999999887 Training Episodes: 9436
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [8, 7] Result: Win 
15
191
Before
0000000000_2
After
1_3
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [8, 7, 4] Result: Lose 
19
195
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9886999999999887 Training Episodes: 9435
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [7, 8] Result: Lose 
15
103
Before
3_8
After
3_9
Epsilon: 0.9886799999999887 Training Episodes: 9434
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 5] Result: Win 
8
140
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 5, 10] Result: Lose 
18
150
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9886599999999887 Training Episodes: 9433
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 8] Result: Lose 
17
237
Before
6_21
After
6_22
Epsilon: 0.9886399999999886 Training Episodes: 9432
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [7, 10] Result: Lose 
17
149
Before
1_5
After
1_6
Epsilon: 0.9886199999999886 Training Episodes: 9431
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7] Result: Win 
12
232
Before
11_20
After
12_21
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7, 6] Result: Lose 
18
238
Before
7_36
After
7_37
Epsilon: 0.9885999999999886 Training Episodes: 9430
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_32
After
2_33
Epsilon: 0.9885799999999886 Training Episodes: 9429
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 5] Result: Win 
11
121
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 5, 10] Result: Lose 
21
131
Before
0000000000_15
After
0000000000_16
Epsilon: 0.9885599999999886 Training Episodes: 9428
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 2] Result: Lose 
12
188
Before
4_4
After
4_5
Epsilon: 0.9885399999999885 Training Episodes: 9427
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 6] Result: Win 
14
146
Before
4_6
After
5_7
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 6, 5] Result: Lose 
19
151
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9885199999999885 Training Episodes: 9426
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 8] Result: Win 
14
234
Before
13_22
After
14_23
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 8, 3] Result: Lose 
17
237
Before
6_22
After
6_23
Epsilon: 0.9884999999999885 Training Episodes: 9425
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6] Result: Win 
13
233
Before
19_28
After
20_29
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6, 3] Result: Lose 
16
236
Before
13_31
After
13_32
Epsilon: 0.9884799999999885 Training Episodes: 9424
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [7, 10] Result: Lose 
17
171
Before
3_10
After
3_11
Epsilon: 0.9884599999999885 Training Episodes: 9423
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 2] Result: Win 
8
448
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 2, 1] Result: Win 
9
207
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 2, 1, 4] Result: Win 
13
211
Before
0000000000_2
After
1_3
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 2, 1, 4, 3] Result: Lose 
16
214
Before
2_7
After
2_8
Epsilon: 0.9884399999999884 Training Episodes: 9422
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 4] Result: Win 
5
423
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 4, 3] Result: Win 
8
184
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 4, 3, 5] Result: Lose 
13
189
Before
1_4
After
1_5
Epsilon: 0.9884199999999884 Training Episodes: 9421
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
26_26
After
27_27
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 9] Result: Lose 
20
240
Before
2_33
After
2_34
Epsilon: 0.9883999999999884 Training Episodes: 9420
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 10] Result: Win 
13
211
Before
1_3
After
2_4
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 10, 8] Result: Lose 
21
219
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9883799999999884 Training Episodes: 9419
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 10] Result: Win 
19
239
Before
5_27
After
6_28
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 10, 2] Result: Lose 
21
241
Before
0000000000_25
After
0000000000_26
Epsilon: 0.9883599999999884 Training Episodes: 9418
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 9] Result: Lose 
19
239
Before
6_28
After
6_29
Epsilon: 0.9883399999999883 Training Episodes: 9417
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 4] Result: Win 
8
140
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 4, 9] Result: Lose 
17
149
Before
1_6
After
1_7
Epsilon: 0.9883199999999883 Training Episodes: 9416
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 10] Result: Lose 
19
239
Before
6_29
After
6_30
Epsilon: 0.9882999999999883 Training Episodes: 9415
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_34
After
2_35
Epsilon: 0.9882799999999883 Training Episodes: 9414
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [9, 9] Result: Lose 
18
194
Before
1_6
After
1_7
Epsilon: 0.9882599999999883 Training Episodes: 9413
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 4] Result: Win 
12
232
Before
12_21
After
13_22
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 4, 9] Result: Lose 
21
241
Before
0000000000_26
After
0000000000_27
Epsilon: 0.9882399999999882 Training Episodes: 9412
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10] Result: Win 
12
232
Before
13_22
After
14_23
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10, 8] Result: Lose 
20
240
Before
2_35
After
2_36
Epsilon: 0.9882199999999882 Training Episodes: 9411
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 5] Result: Win 
9
75
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 5, 9] Result: Lose 
18
84
Before
1_4
After
1_5
Epsilon: 0.9881999999999882 Training Episodes: 9410
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 1] Result: Win 
11
33
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 1, 1] Result: Lose 
12
34
Before
2_3
After
2_4
Epsilon: 0.9881799999999882 Training Episodes: 9409
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 2] Result: Win 
5
181
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 2, 7] Result: Win 
12
188
Before
4_5
After
5_6
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 2, 7, 5] Result: Lose 
17
193
Before
1_5
After
1_6
Epsilon: 0.9881599999999882 Training Episodes: 9408
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
1_12
After
1_13
Epsilon: 0.9881399999999881 Training Episodes: 9407
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 10] Result: Lose 
13
101
Before
3_9
After
3_10
Epsilon: 0.9881199999999881 Training Episodes: 9406
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 10] Result: Lose 
20
174
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9880999999999881 Training Episodes: 9405
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 9] Result: Lose 
13
189
Before
1_5
After
1_6
Epsilon: 0.9880799999999881 Training Episodes: 9404
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [8, 10] Result: Lose 
18
216
Before
1_9
After
1_10
Epsilon: 0.9880599999999881 Training Episodes: 9403
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6] Result: Win 
13
233
Before
20_29
After
21_30
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6, 1] Result: Win 
14
234
Before
14_23
After
15_24
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6, 1, 7] Result: Lose 
21
241
Before
0000000000_27
After
0000000000_28
Epsilon: 0.988039999999988 Training Episodes: 9402
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 2] Result: Win 
5
27
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 2, 10] Result: Win 
15
37
Before
0000000000_2
After
1_3
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 2, 10, 4] Result: Win 
19
41
Before
0000000000_4
After
1_5
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 2, 10, 4, 2] Result: Lose 
21
43
Before
0000000000_6
After
0000000000_7
Epsilon: 0.988019999999988 Training Episodes: 9401
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10] Result: Win 
11
143
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10, 10] Result: Lose 
21
153
Before
0000000000_5
After
0000000000_6
Epsilon: 0.987999999999988 Training Episodes: 9400
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 4] Result: Win 
9
163
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 4, 3] Result: Lose 
12
166
Before
1_1
After
1_2
Epsilon: 0.987979999999988 Training Episodes: 9399
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_36
After
2_37
Epsilon: 0.987959999999988 Training Episodes: 9398
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_37
After
2_38
Epsilon: 0.9879399999999879 Training Episodes: 9397
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 9] Result: Win 
12
232
Before
14_23
After
15_24
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 9, 7] Result: Win 
19
239
Before
6_30
After
7_31
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 9, 7, 2] Result: Lose 
21
241
Before
0000000000_28
After
0000000000_29
Epsilon: 0.9879199999999879 Training Episodes: 9396
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 2] Result: Win 
11
55
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 2, 2] Result: Win 
13
57
Before
2_5
After
3_6
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 2, 2, 2] Result: Win 
15
59
Before
1_7
After
2_8
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 2, 2, 2, 5] Result: Lose 
20
64
Before
1_12
After
1_13
Epsilon: 0.9878999999999879 Training Episodes: 9395
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_38
After
2_39
Epsilon: 0.9878799999999879 Training Episodes: 9394
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [6, 10] Result: Lose 
16
170
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9878599999999879 Training Episodes: 9393
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 10] Result: Win 
12
188
Before
5_6
After
6_7
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 10, 4] Result: Win 
16
192
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 10, 4, 5] Result: Lose 
21
197
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9878399999999878 Training Episodes: 9392
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 10] Result: Lose 
17
215
Before
2_6
After
2_7
Epsilon: 0.9878199999999878 Training Episodes: 9391
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
1_13
After
1_14
Epsilon: 0.9877999999999878 Training Episodes: 9390
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 10] Result: Lose 
20
130
Before
1_6
After
1_7
Epsilon: 0.9877799999999878 Training Episodes: 9389
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 2] Result: Win 
12
144
Before
2_6
After
3_7
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 2, 8] Result: Lose 
20
152
Before
1_14
After
1_15
Epsilon: 0.9877599999999878 Training Episodes: 9388
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 10] Result: Win 
11
187
Before
6_6
After
7_7
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 10, 1] Result: Win 
12
188
Before
6_7
After
7_8
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 10, 1, 8] Result: Lose 
20
196
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9877399999999877 Training Episodes: 9387
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 1] Result: Win 
10
54
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 1, 10] Result: Lose 
20
64
Before
1_13
After
1_14
Epsilon: 0.9877199999999877 Training Episodes: 9386
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 2] Result: Win 
12
122
Before
6_8
After
7_9
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 2, 6] Result: Lose 
18
128
Before
2_10
After
2_11
Epsilon: 0.9876999999999877 Training Episodes: 9385
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 3] Result: Win 
6
50
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 3, 4] Result: Win 
10
54
Before
6_6
After
7_7
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 3, 4, 4] Result: Win 
14
58
Before
1_3
After
2_4
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 3, 4, 4, 1] Result: Lose 
15
59
Before
2_8
After
2_9
Epsilon: 0.9876799999999877 Training Episodes: 9384
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 7] Result: Win 
13
145
Before
2_4
After
3_5
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 7, 3] Result: Lose 
16
148
Before
1_4
After
1_5
Epsilon: 0.9876599999999877 Training Episodes: 9383
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 3] Result: Win 
4
92
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 3, 10] Result: Win 
14
102
Before
4_7
After
5_8
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 3, 10, 3] Result: Lose 
17
105
Before
2_8
After
2_9
Epsilon: 0.9876399999999876 Training Episodes: 9382
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 9] Result: Lose 
19
129
Before
2_5
After
2_6
Epsilon: 0.9876199999999876 Training Episodes: 9381
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [6, 10] Result: Lose 
16
60
Before
6_8
After
6_9
Epsilon: 0.9875999999999876 Training Episodes: 9380
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 3] Result: Win 
6
116
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 3, 9] Result: Win 
15
125
Before
3_9
After
4_10
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 3, 9, 6] Result: Lose 
21
131
Before
0000000000_16
After
0000000000_17
Epsilon: 0.9875799999999876 Training Episodes: 9379
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 10] Result: Win 
11
165
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 10, 10] Result: Lose 
21
175
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9875599999999876 Training Episodes: 9378
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 10] Result: Win 
15
169
Before
4_9
After
5_10
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 10, 3] Result: Win 
18
172
Before
0000000000_3
After
1_4
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 10, 3, 2] Result: Lose 
20
174
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9875399999999875 Training Episodes: 9377
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 6] Result: Win 
12
210
Before
2_4
After
3_5
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 6, 9] Result: Lose 
21
219
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9875199999999875 Training Episodes: 9376
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 8] Result: Lose 
18
62
Before
1_4
After
1_5
Epsilon: 0.9874999999999875 Training Episodes: 9375
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 3] Result: Win 
6
204
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 3, 10] Result: Win 
16
214
Before
2_8
After
3_9
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 3, 10, 1] Result: Lose 
17
215
Before
2_7
After
2_8
Epsilon: 0.9874799999999875 Training Episodes: 9374
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 2] Result: Lose 
12
56
Before
3_5
After
3_6
Epsilon: 0.9874599999999875 Training Episodes: 9373
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 10] Result: Lose 
15
235
Before
8_17
After
8_18
Epsilon: 0.9874399999999874 Training Episodes: 9372
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 8] Result: Lose 
18
40
Before
1_5
After
1_6
Epsilon: 0.9874199999999874 Training Episodes: 9371
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 2] Result: Win 
9
229
Before
13_13
After
14_14
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 2, 6] Result: Win 
15
235
Before
8_18
After
9_19
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 2, 6, 6] Result: Lose 
21
241
Before
0000000000_29
After
0000000000_30
Epsilon: 0.9873999999999874 Training Episodes: 9370
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 8] Result: Win 
10
76
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 8, 10] Result: Lose 
20
86
Before
4_10
After
4_11
Epsilon: 0.9873799999999874 Training Episodes: 9369
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 10] Result: Win 
16
148
Before
1_5
After
2_6
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 10, 4] Result: Lose 
20
152
Before
1_15
After
1_16
Epsilon: 0.9873599999999874 Training Episodes: 9368
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 2] Result: Win 
9
229
Before
14_14
After
15_15
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 2, 10] Result: Lose 
19
239
Before
7_31
After
7_32
Epsilon: 0.9873399999999873 Training Episodes: 9367
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 10] Result: Win 
14
234
Before
15_24
After
16_25
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 10, 5] Result: Win 
19
239
Before
7_32
After
8_33
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 10, 5, 2] Result: Lose 
21
241
Before
0000000000_30
After
0000000000_31
Epsilon: 0.9873199999999873 Training Episodes: 9366
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 6] Result: Win 
8
140
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 6, 10] Result: Lose 
18
150
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9872999999999873 Training Episodes: 9365
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [8, 7] Result: Lose 
15
213
Before
3_6
After
3_7
Epsilon: 0.9872799999999873 Training Episodes: 9364
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 9] Result: Lose 
19
107
Before
2_7
After
2_8
Epsilon: 0.9872599999999873 Training Episodes: 9363
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Lose 
16
236
Before
13_32
After
13_33
Epsilon: 0.9872399999999872 Training Episodes: 9362
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 10] Result: Win 
12
166
Before
1_2
After
2_3
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 10, 9] Result: Lose 
21
175
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9872199999999872 Training Episodes: 9361
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 6] Result: Win 
14
124
Before
5_7
After
6_8
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 6, 5] Result: Lose 
19
129
Before
2_6
After
2_7
Epsilon: 0.9871999999999872 Training Episodes: 9360
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2] Result: Win 
6
226
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2, 2] Result: Win 
8
228
Before
11_11
After
12_12
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2, 2, 10] Result: Win 
18
238
Before
7_37
After
8_38
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2, 2, 10, 3] Result: Lose 
21
241
Before
0000000000_31
After
0000000000_32
Epsilon: 0.9871799999999872 Training Episodes: 9359
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 9] Result: Lose 
15
37
Before
1_3
After
1_4
Epsilon: 0.9871599999999872 Training Episodes: 9358
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Win 
13
233
Before
21_30
After
22_31
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 6] Result: Win 
19
239
Before
8_33
After
9_34
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 6, 1] Result: Lose 
20
240
Before
2_39
After
2_40
Epsilon: 0.9871399999999871 Training Episodes: 9357
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 3] Result: Win 
6
204
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 3, 4] Result: Win 
10
208
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 3, 4, 8] Result: Lose 
18
216
Before
1_10
After
1_11
Epsilon: 0.9871199999999871 Training Episodes: 9356
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 10] Result: Win 
14
190
Before
2_3
After
3_4
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 10, 3] Result: Win 
17
193
Before
1_6
After
2_7
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 10, 3, 1] Result: Lose 
18
194
Before
1_7
After
1_8
Epsilon: 0.9870999999999871 Training Episodes: 9355
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 10] Result: Win 
12
100
Before
2_3
After
3_4
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 10, 4] Result: Win 
16
104
Before
3_9
After
4_10
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 10, 4, 1] Result: Lose 
17
105
Before
2_9
After
2_10
Epsilon: 0.9870799999999871 Training Episodes: 9354
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 4] Result: Win 
7
227
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 4, 3] Result: Win 
10
230
Before
13_13
After
14_14
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 4, 3, 9] Result: Lose 
19
239
Before
9_34
After
9_35
Epsilon: 0.9870599999999871 Training Episodes: 9353
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 10] Result: Win 
12
78
Before
5_7
After
6_8
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 10, 6] Result: Lose 
18
84
Before
1_5
After
1_6
Epsilon: 0.987039999999987 Training Episodes: 9352
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 2] Result: Win 
11
33
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 2, 5] Result: Lose 
16
38
Before
0000000000_3
After
0000000000_4
Epsilon: 0.987019999999987 Training Episodes: 9351
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 9] Result: Win 
19
239
Before
9_35
After
10_36
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 9, 1] Result: Lose 
20
240
Before
2_40
After
2_41
Epsilon: 0.986999999999987 Training Episodes: 9350
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 8] Result: Win 
18
194
Before
1_8
After
2_9
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 8, 2] Result: Lose 
20
196
Before
0000000000_6
After
0000000000_7
Epsilon: 0.986979999999987 Training Episodes: 9349
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 7] Result: Lose 
16
126
Before
5_9
After
5_10
Epsilon: 0.986959999999987 Training Episodes: 9348
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 1] Result: Win 
10
98
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 1, 9] Result: Lose 
19
107
Before
2_8
After
2_9
Epsilon: 0.9869399999999869 Training Episodes: 9347
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_41
After
2_42
Epsilon: 0.9869199999999869 Training Episodes: 9346
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 10] Result: Lose 
16
104
Before
4_10
After
4_11
Epsilon: 0.9868999999999869 Training Episodes: 9345
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
1_16
After
1_17
Epsilon: 0.9868799999999869 Training Episodes: 9344
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3] Result: Win 
11
231
Before
27_27
After
28_28
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3, 10] Result: Lose 
21
241
Before
0000000000_32
After
0000000000_33
Epsilon: 0.9868599999999869 Training Episodes: 9343
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 2] Result: Win 
6
160
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 2, 7] Result: Lose 
13
167
Before
5_7
After
5_8
Epsilon: 0.9868399999999868 Training Episodes: 9342
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 4] Result: Win 
9
229
Before
15_15
After
16_16
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 4, 5] Result: Win 
14
234
Before
16_25
After
17_26
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 4, 5, 5] Result: Win 
19
239
Before
10_36
After
11_37
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 4, 5, 5, 2] Result: Lose 
21
241
Before
0000000000_33
After
0000000000_34
Epsilon: 0.9868199999999868 Training Episodes: 9341
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 4] Result: Lose 
14
168
Before
1_3
After
1_4
Epsilon: 0.9867999999999868 Training Episodes: 9340
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [7, 10] Result: Lose 
17
127
Before
4_12
After
4_13
Epsilon: 0.9867799999999868 Training Episodes: 9339
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 3] Result: Win 
13
79
Before
0000000000_3
After
1_4
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 3, 7] Result: Lose 
20
86
Before
4_11
After
4_12
Epsilon: 0.9867599999999868 Training Episodes: 9338
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4] Result: Win 
8
228
Before
12_12
After
13_13
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4, 10] Result: Win 
18
238
Before
8_38
After
9_39
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4, 10, 2] Result: Lose 
20
240
Before
2_42
After
2_43
Epsilon: 0.9867399999999867 Training Episodes: 9337
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 10] Result: Lose 
19
129
Before
2_7
After
2_8
Epsilon: 0.9867199999999867 Training Episodes: 9336
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10] Result: Win 
12
232
Before
15_24
After
16_25
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10, 6] Result: Lose 
18
238
Before
9_39
After
9_40
Epsilon: 0.9866999999999867 Training Episodes: 9335
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 1] Result: Win 
2
442
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 1, 2] Result: Win 
4
444
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 1, 2, 6] Result: Win 
10
208
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 1, 2, 6, 4] Result: Lose 
14
212
Before
0000000000_1
After
0000000000_2
Epsilon: 0.9866799999999867 Training Episodes: 9334
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7] Result: Lose 
17
237
Before
6_23
After
6_24
Epsilon: 0.9866599999999867 Training Episodes: 9333
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 2] Result: Win 
12
166
Before
2_3
After
3_4
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 2, 4] Result: Lose 
16
170
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9866399999999866 Training Episodes: 9332
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 10] Result: Lose 
16
214
Before
3_9
After
3_10
Epsilon: 0.9866199999999866 Training Episodes: 9331
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_43
After
2_44
Epsilon: 0.9865999999999866 Training Episodes: 9330
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 10] Result: Win 
19
239
Before
11_37
After
12_38
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 10, 1] Result: Lose 
20
240
Before
2_44
After
2_45
Epsilon: 0.9865799999999866 Training Episodes: 9329
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 5] Result: Win 
7
73
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 5, 5] Result: Win 
12
78
Before
6_8
After
7_9
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 5, 5, 8] Result: Win 
20
86
Before
4_12
After
5_13
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 5, 5, 8, 1] Result: Lose 
21
87
Before
0000000000_12
After
0000000000_13
Epsilon: 0.9865599999999866 Training Episodes: 9328
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4] Result: Win 
10
230
Before
14_14
After
15_15
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4, 10] Result: Lose 
20
240
Before
2_45
After
2_46
Epsilon: 0.9865399999999865 Training Episodes: 9327
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 10] Result: Win 
20
174
Before
0000000000_4
After
1_5
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 10, 1] Result: Lose 
21
175
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9865199999999865 Training Episodes: 9326
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 10] Result: Win 
14
124
Before
6_8
After
7_9
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 10, 6] Result: Lose 
20
130
Before
1_7
After
1_8
Epsilon: 0.9864999999999865 Training Episodes: 9325
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 5] Result: Win 
10
230
Before
15_15
After
16_16
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 5, 10] Result: Lose 
20
240
Before
2_46
After
2_47
Epsilon: 0.9864799999999865 Training Episodes: 9324
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10] Result: Lose 
20
42
Before
1_8
After
1_9
Epsilon: 0.9864599999999865 Training Episodes: 9323
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 1] Result: Win 
7
117
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 1, 5] Result: Win 
12
122
Before
7_9
After
8_10
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 1, 5, 6] Result: Lose 
18
128
Before
2_11
After
2_12
Epsilon: 0.9864399999999864 Training Episodes: 9322
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
1_17
After
1_18
Epsilon: 0.9864199999999864 Training Episodes: 9321
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 3] Result: Win 
4
180
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 3, 9] Result: Lose 
13
189
Before
1_6
After
1_7
Epsilon: 0.9863999999999864 Training Episodes: 9320
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_47
After
2_48
Epsilon: 0.9863799999999864 Training Episodes: 9319
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 1] Result: Win 
9
141
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 1, 10] Result: Lose 
19
151
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9863599999999864 Training Episodes: 9318
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [4, 6] Result: Win 
10
32
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [4, 6, 8] Result: Win 
18
40
Before
1_6
After
2_7
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [4, 6, 8, 1] Result: Lose 
19
41
Before
1_5
After
1_6
Epsilon: 0.9863399999999863 Training Episodes: 9317
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 9] Result: Lose 
18
238
Before
9_40
After
9_41
Epsilon: 0.9863199999999863 Training Episodes: 9316
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 3] Result: Lose 
13
123
Before
6_7
After
6_8
Epsilon: 0.9862999999999863 Training Episodes: 9315
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 1] Result: Win 
11
209
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 1, 5] Result: Win 
16
214
Before
3_10
After
4_11
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 1, 5, 1] Result: Lose 
17
215
Before
2_8
After
2_9
Epsilon: 0.9862799999999863 Training Episodes: 9314
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [7, 4] Result: Win 
11
99
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [7, 4, 3] Result: Lose 
14
102
Before
5_8
After
5_9
Epsilon: 0.9862599999999863 Training Episodes: 9313
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 6] Result: Lose 
16
104
Before
4_11
After
4_12
Epsilon: 0.9862399999999862 Training Episodes: 9312
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 8] Result: Lose 
14
102
Before
5_9
After
5_10
Epsilon: 0.9862199999999862 Training Episodes: 9311
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 9] Result: Lose 
12
232
Before
16_25
After
16_26
Epsilon: 0.9861999999999862 Training Episodes: 9310
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 10] Result: Win 
17
61
Before
3_8
After
4_9
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 10, 2] Result: Lose 
19
63
Before
1_6
After
1_7
Epsilon: 0.9861799999999862 Training Episodes: 9309
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 3] Result: Win 
9
97
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 3, 5] Result: Win 
14
102
Before
5_10
After
6_11
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 3, 5, 1] Result: Lose 
15
103
Before
3_9
After
3_10
Epsilon: 0.9861599999999862 Training Episodes: 9308
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1] Result: Win 
10
230
Before
16_16
After
17_17
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1, 3] Result: Lose 
13
233
Before
22_31
After
22_32
Epsilon: 0.9861399999999861 Training Episodes: 9307
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 5] Result: Lose 
15
59
Before
2_9
After
2_10
Epsilon: 0.9861199999999861 Training Episodes: 9306
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [5, 10] Result: Lose 
15
59
Before
2_10
After
2_11
Epsilon: 0.9860999999999861 Training Episodes: 9305
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 10] Result: Lose 
18
106
Before
3_11
After
3_12
Epsilon: 0.9860799999999861 Training Episodes: 9304
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 7] Result: Win 
14
234
Before
17_26
After
18_27
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 7, 4] Result: Lose 
18
238
Before
9_41
After
9_42
Epsilon: 0.9860599999999861 Training Episodes: 9303
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 8] Result: Lose 
18
106
Before
3_12
After
3_13
Epsilon: 0.986039999999986 Training Episodes: 9302
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [8, 2] Result: Win 
10
406
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [8, 2, 1] Result: Win 
11
165
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [8, 2, 1, 9] Result: Lose 
20
174
Before
1_5
After
1_6
Epsilon: 0.986019999999986 Training Episodes: 9301
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 10] Result: Lose 
16
214
Before
4_11
After
4_12
Epsilon: 0.985999999999986 Training Episodes: 9300
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 3] Result: Win 
11
33
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 3, 10] Result: Lose 
21
43
Before
0000000000_7
After
0000000000_8
Epsilon: 0.985979999999986 Training Episodes: 9299
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5] Result: Win 
15
235
Before
9_19
After
10_20
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5, 6] Result: Lose 
21
241
Before
0000000000_34
After
0000000000_35
Epsilon: 0.985959999999986 Training Episodes: 9298
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 4] Result: Lose 
14
36
Before
4_9
After
4_10
Epsilon: 0.9859399999999859 Training Episodes: 9297
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 10] Result: Lose 
18
238
Before
9_42
After
9_43
Epsilon: 0.9859199999999859 Training Episodes: 9296
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 5] Result: Lose 
13
233
Before
22_32
After
22_33
Epsilon: 0.9858999999999859 Training Episodes: 9295
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_48
After
2_49
Epsilon: 0.9858799999999859 Training Episodes: 9294
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 2] Result: Win 
8
228
Before
13_13
After
14_14
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 2, 6] Result: Lose 
14
234
Before
18_27
After
18_28
Epsilon: 0.9858599999999859 Training Episodes: 9293
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3] Result: Lose 
13
233
Before
22_33
After
22_34
Epsilon: 0.9858399999999858 Training Episodes: 9292
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
28_28
After
29_29
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 10] Result: Lose 
21
241
Before
0000000000_35
After
0000000000_36
Epsilon: 0.9858199999999858 Training Episodes: 9291
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 6] Result: Win 
14
36
Before
4_10
After
5_11
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 6, 4] Result: Win 
18
40
Before
2_7
After
3_8
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 6, 4, 2] Result: Lose 
20
42
Before
1_9
After
1_10
Epsilon: 0.9857999999999858 Training Episodes: 9290
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 8] Result: Win 
15
81
Before
2_7
After
3_8
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 8, 2] Result: Win 
17
83
Before
1_4
After
2_5
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 8, 2, 3] Result: Lose 
20
86
Before
5_13
After
5_14
Epsilon: 0.9857799999999858 Training Episodes: 9289
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 7] Result: Win 
13
233
Before
22_34
After
23_35
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 7, 4] Result: Lose 
17
237
Before
6_24
After
6_25
Epsilon: 0.9857599999999858 Training Episodes: 9288
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Win 
16
236
Before
13_33
After
14_34
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6, 1] Result: Lose 
17
237
Before
6_25
After
6_26
Epsilon: 0.9857399999999857 Training Episodes: 9287
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3] Result: Lose 
13
233
Before
23_35
After
23_36
Epsilon: 0.9857199999999857 Training Episodes: 9286
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 8] Result: Win 
9
471
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 8, 2] Result: Win 
11
231
Before
29_29
After
30_30
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 8, 2, 6] Result: Lose 
17
237
Before
6_26
After
6_27
Epsilon: 0.9856999999999857 Training Episodes: 9285
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 1] Result: Win 
5
401
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 1, 1] Result: Win 
6
160
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 1, 1, 10] Result: Lose 
16
170
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9856799999999857 Training Episodes: 9284
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 1] Result: Win 
8
206
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 1, 10] Result: Lose 
18
216
Before
1_11
After
1_12
Epsilon: 0.9856599999999857 Training Episodes: 9283
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 10] Result: Lose 
19
41
Before
1_6
After
1_7
Epsilon: 0.9856399999999856 Training Episodes: 9282
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 10] Result: Win 
11
165
Before
9_9
After
10_10
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 10, 10] Result: Lose 
21
175
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9856199999999856 Training Episodes: 9281
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 6] Result: Win 
16
148
Before
2_6
After
3_7
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 6, 2] Result: Lose 
18
150
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9855999999999856 Training Episodes: 9280
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 4] Result: Win 
7
95
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 4, 10] Result: Lose 
17
105
Before
2_10
After
2_11
Epsilon: 0.9855799999999856 Training Episodes: 9279
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 7] Result: Win 
17
127
Before
4_13
After
5_14
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 7, 3] Result: Lose 
20
130
Before
1_8
After
1_9
Epsilon: 0.9855599999999856 Training Episodes: 9278
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 10] Result: Lose 
19
151
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9855399999999855 Training Episodes: 9277
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 10] Result: Win 
11
187
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 10, 10] Result: Lose 
21
197
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9855199999999855 Training Episodes: 9276
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_49
After
2_50
Epsilon: 0.9854999999999855 Training Episodes: 9275
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 7] Result: Win 
14
80
Before
5_11
After
6_12
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 7, 2] Result: Lose 
16
82
Before
1_6
After
1_7
Epsilon: 0.9854799999999855 Training Episodes: 9274
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [6, 10] Result: Lose 
16
82
Before
1_7
After
1_8
Epsilon: 0.9854599999999855 Training Episodes: 9273
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1] Result: Win 
11
231
Before
30_30
After
31_31
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1, 9] Result: Lose 
20
240
Before
2_50
After
2_51
Epsilon: 0.9854399999999854 Training Episodes: 9272
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 10] Result: Win 
14
124
Before
7_9
After
8_10
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 10, 5] Result: Lose 
19
129
Before
2_8
After
2_9
Epsilon: 0.9854199999999854 Training Episodes: 9271
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 2] Result: Win 
7
161
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 2, 2] Result: Win 
9
163
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 2, 2, 6] Result: Lose 
15
169
Before
5_10
After
5_11
Epsilon: 0.9853999999999854 Training Episodes: 9270
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 8] Result: Lose 
18
216
Before
1_12
After
1_13
Epsilon: 0.9853799999999854 Training Episodes: 9269
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 10] Result: Lose 
15
235
Before
10_20
After
10_21
Epsilon: 0.9853599999999854 Training Episodes: 9268
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 2] Result: Win 
11
55
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 2, 10] Result: Lose 
21
65
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9853399999999853 Training Episodes: 9267
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 8] Result: Win 
12
232
Before
16_26
After
17_27
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 8, 4] Result: Lose 
16
236
Before
14_34
After
14_35
Epsilon: 0.9853199999999853 Training Episodes: 9266
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 8] Result: Win 
17
61
Before
4_9
After
5_10
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 8, 2] Result: Win 
19
63
Before
1_7
After
2_8
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [9, 8, 2, 2] Result: Lose 
21
65
Before
0000000000_3
After
0000000000_4
Epsilon: 0.9852999999999853 Training Episodes: 9265
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [5, 9] Result: Win 
14
190
Before
3_4
After
4_5
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [5, 9, 3] Result: Lose 
17
193
Before
2_7
After
2_8
Epsilon: 0.9852799999999853 Training Episodes: 9264
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 5] Result: Win 
15
103
Before
3_10
After
4_11
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 5, 2] Result: Win 
17
105
Before
2_11
After
3_12
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 5, 2, 4] Result: Lose 
21
109
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9852599999999853 Training Episodes: 9263
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 10] Result: Win 
13
35
Before
1_2
After
2_3
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 10, 5] Result: Win 
18
40
Before
3_8
After
4_9
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 10, 5, 1] Result: Lose 
19
41
Before
1_7
After
1_8
Epsilon: 0.9852399999999852 Training Episodes: 9262
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 5] Result: Win 
15
37
Before
1_4
After
2_5
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 5, 1] Result: Lose 
16
38
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9852199999999852 Training Episodes: 9261
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 10] Result: Lose 
14
234
Before
18_28
After
18_29
Epsilon: 0.9851999999999852 Training Episodes: 9260
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 8] Result: Win 
13
233
Before
23_36
After
24_37
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 8, 6] Result: Lose 
19
239
Before
12_38
After
12_39
Epsilon: 0.9851799999999852 Training Episodes: 9259
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [9, 9] Result: Lose 
18
172
Before
1_4
After
1_5
Epsilon: 0.9851599999999852 Training Episodes: 9258
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 2] Result: Win 
6
182
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 2, 2] Result: Win 
8
184
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 2, 2, 10] Result: Lose 
18
194
Before
2_9
After
2_10
Epsilon: 0.9851399999999851 Training Episodes: 9257
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 10] Result: Lose 
18
238
Before
9_43
After
9_44
Epsilon: 0.9851199999999851 Training Episodes: 9256
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 3] Result: Win 
4
70
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 3, 10] Result: Lose 
14
80
Before
6_12
After
6_13
Epsilon: 0.9850999999999851 Training Episodes: 9255
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 10] Result: Lose 
16
126
Before
5_10
After
5_11
Epsilon: 0.9850799999999851 Training Episodes: 9254
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 4] Result: Win 
9
119
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 4, 10] Result: Lose 
19
129
Before
2_9
After
2_10
Epsilon: 0.9850599999999851 Training Episodes: 9253
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 6] Result: Win 
16
148
Before
3_7
After
4_8
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 6, 2] Result: Lose 
18
150
Before
0000000000_7
After
0000000000_8
Epsilon: 0.985039999999985 Training Episodes: 9252
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 3] Result: Win 
5
181
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 3, 10] Result: Win 
15
191
Before
1_3
After
2_4
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 3, 10, 3] Result: Lose 
18
194
Before
2_10
After
2_11
Epsilon: 0.985019999999985 Training Episodes: 9251
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 8] Result: Win 
15
37
Before
2_5
After
3_6
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 8, 5] Result: Lose 
20
42
Before
1_10
After
1_11
Epsilon: 0.984999999999985 Training Episodes: 9250
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 6] Result: Win 
11
231
Before
31_31
After
32_32
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 6, 3] Result: Lose 
14
234
Before
18_29
After
18_30
Epsilon: 0.984979999999985 Training Episodes: 9249
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [9, 10] Result: Lose 
19
195
Before
0000000000_7
After
0000000000_8
Epsilon: 0.984959999999985 Training Episodes: 9248
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [7, 4] Result: Win 
11
99
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [7, 4, 2] Result: Lose 
13
101
Before
3_10
After
3_11
Epsilon: 0.9849399999999849 Training Episodes: 9247
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 7] Result: Win 
9
97
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 7, 4] Result: Lose 
13
101
Before
3_11
After
3_12
Epsilon: 0.9849199999999849 Training Episodes: 9246
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 8] Result: Win 
14
234
Before
18_30
After
19_31
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 8, 7] Result: Lose 
21
241
Before
0000000000_36
After
0000000000_37
Epsilon: 0.9848999999999849 Training Episodes: 9245
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 6] Result: Win 
9
97
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 6, 9] Result: Lose 
18
106
Before
3_13
After
3_14
Epsilon: 0.9848799999999849 Training Episodes: 9244
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
32_32
After
33_33
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 10] Result: Lose 
21
241
Before
0000000000_37
After
0000000000_38
Epsilon: 0.9848599999999849 Training Episodes: 9243
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 7] Result: Lose 
17
105
Before
3_12
After
3_13
Epsilon: 0.9848399999999848 Training Episodes: 9242
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 5] Result: Win 
10
142
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 5, 9] Result: Win 
19
151
Before
0000000000_5
After
1_6
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 5, 9, 1] Result: Lose 
20
152
Before
1_18
After
1_19
Epsilon: 0.9848199999999848 Training Episodes: 9241
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7] Result: Win 
12
232
Before
17_27
After
18_28
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7, 2] Result: Win 
14
234
Before
19_31
After
20_32
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7, 2, 3] Result: Win 
17
237
Before
6_27
After
7_28
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7, 2, 3, 4] Result: Lose 
21
241
Before
0000000000_38
After
0000000000_39
Epsilon: 0.9847999999999848 Training Episodes: 9240
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 7] Result: Lose 
14
234
Before
20_32
After
20_33
Epsilon: 0.9847799999999848 Training Episodes: 9239
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 10] Result: Lose 
18
62
Before
1_5
After
1_6
Epsilon: 0.9847599999999848 Training Episodes: 9238
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 8] Result: Win 
11
187
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 8, 10] Result: Lose 
21
197
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9847399999999847 Training Episodes: 9237
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 4] Result: Win 
8
206
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 4, 3] Result: Win 
11
209
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 4, 3, 8] Result: Lose 
19
217
Before
1_8
After
1_9
Epsilon: 0.9847199999999847 Training Episodes: 9236
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 4] Result: Win 
14
36
Before
5_11
After
6_12
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 4, 6] Result: Lose 
20
42
Before
1_11
After
1_12
Epsilon: 0.9846999999999847 Training Episodes: 9235
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [8, 10] Result: Lose 
18
172
Before
1_5
After
1_6
Epsilon: 0.9846799999999847 Training Episodes: 9234
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_51
After
2_52
Epsilon: 0.9846599999999847 Training Episodes: 9233
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 10] Result: Lose 
19
239
Before
12_39
After
12_40
Epsilon: 0.9846399999999846 Training Episodes: 9232
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 1] Result: Win 
7
29
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 1, 7] Result: Win 
14
36
Before
6_12
After
7_13
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 1, 7, 1] Result: Win 
15
37
Before
3_6
After
4_7
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 1, 7, 1, 1] Result: Win 
16
38
Before
0000000000_5
After
1_6
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 1, 7, 1, 1, 3] Result: Lose 
19
41
Before
1_8
After
1_9
Epsilon: 0.9846199999999846 Training Episodes: 9231
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 8] Result: Lose 
16
38
Before
1_6
After
1_7
Epsilon: 0.9845999999999846 Training Episodes: 9230
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7] Result: Win 
17
237
Before
7_28
After
8_29
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7, 4] Result: Lose 
21
241
Before
0000000000_39
After
0000000000_40
Epsilon: 0.9845799999999846 Training Episodes: 9229
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 10] Result: Lose 
19
239
Before
12_40
After
12_41
Epsilon: 0.9845599999999846 Training Episodes: 9228
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 5] Result: Win 
7
205
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 5, 10] Result: Lose 
17
215
Before
2_9
After
2_10
Epsilon: 0.9845399999999845 Training Episodes: 9227
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 1] Result: Win 
5
93
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 1, 8] Result: Win 
13
101
Before
3_12
After
4_13
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 1, 8, 3] Result: Win 
16
104
Before
4_12
After
5_13
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 1, 8, 3, 3] Result: Lose 
19
107
Before
2_9
After
2_10
Epsilon: 0.9845199999999845 Training Episodes: 9226
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 10] Result: Lose 
20
130
Before
1_9
After
1_10
Epsilon: 0.9844999999999845 Training Episodes: 9225
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 5] Result: Win 
12
232
Before
18_28
After
19_29
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 5, 4] Result: Lose 
16
236
Before
14_35
After
14_36
Epsilon: 0.9844799999999845 Training Episodes: 9224
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 3] Result: Win 
13
211
Before
2_4
After
3_5
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 3, 2] Result: Lose 
15
213
Before
3_7
After
3_8
Epsilon: 0.9844599999999845 Training Episodes: 9223
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 5] Result: Win 
6
424
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 5, 2] Result: Win 
8
184
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 5, 2, 10] Result: Lose 
18
194
Before
2_11
After
2_12
Epsilon: 0.9844399999999844 Training Episodes: 9222
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 7] Result: Lose 
14
58
Before
2_4
After
2_5
Epsilon: 0.9844199999999844 Training Episodes: 9221
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 1] Result: Win 
8
272
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 1, 2] Result: Win 
10
32
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 1, 2, 7] Result: Lose 
17
39
Before
1_3
After
1_4
Epsilon: 0.9843999999999844 Training Episodes: 9220
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 10] Result: Lose 
16
236
Before
14_36
After
14_37
Epsilon: 0.9843799999999844 Training Episodes: 9219
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 10] Result: Lose 
16
38
Before
1_7
After
1_8
Epsilon: 0.9843599999999844 Training Episodes: 9218
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [9, 4] Result: Win 
13
167
Before
5_8
After
6_9
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [9, 4, 5] Result: Lose 
18
172
Before
1_6
After
1_7
Epsilon: 0.9843399999999843 Training Episodes: 9217
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 10] Result: Lose 
18
40
Before
4_9
After
4_10
Epsilon: 0.9843199999999843 Training Episodes: 9216
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 10] Result: Lose 
18
150
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9842999999999843 Training Episodes: 9215
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 4] Result: Lose 
12
232
Before
19_29
After
19_30
Epsilon: 0.9842799999999843 Training Episodes: 9214
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 8] Result: Lose 
16
82
Before
1_8
After
1_9
Epsilon: 0.9842599999999843 Training Episodes: 9213
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 10] Result: Lose 
20
64
Before
1_14
After
1_15
Epsilon: 0.9842399999999842 Training Episodes: 9212
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 10] Result: Lose 
14
58
Before
2_5
After
2_6
Epsilon: 0.9842199999999842 Training Episodes: 9211
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 5] Result: Win 
10
142
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 5, 2] Result: Win 
12
144
Before
3_7
After
4_8
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 5, 2, 6] Result: Lose 
18
150
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9841999999999842 Training Episodes: 9210
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 7] Result: Win 
17
39
Before
1_4
After
2_5
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 7, 4] Result: Lose 
21
43
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9841799999999842 Training Episodes: 9209
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 9] Result: Win 
16
236
Before
14_37
After
15_38
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 9, 5] Result: Lose 
21
241
Before
0000000000_40
After
0000000000_41
Epsilon: 0.9841599999999842 Training Episodes: 9208
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10] Result: Lose 
20
42
Before
1_12
After
1_13
Epsilon: 0.9841399999999841 Training Episodes: 9207
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 8] Result: Lose 
12
122
Before
8_10
After
8_11
Epsilon: 0.9841199999999841 Training Episodes: 9206
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10] Result: Win 
11
143
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10, 6] Result: Lose 
17
149
Before
1_7
After
1_8
Epsilon: 0.9840999999999841 Training Episodes: 9205
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5] Result: Lose 
15
235
Before
10_21
After
10_22
Epsilon: 0.9840799999999841 Training Episodes: 9204
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 4] Result: Win 
6
28
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 4, 2] Result: Win 
8
272
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 4, 2, 1] Result: Win 
9
31
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 4, 2, 1, 9] Result: Lose 
18
40
Before
4_10
After
4_11
Epsilon: 0.9840599999999841 Training Episodes: 9203
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3] Result: Win 
13
233
Before
24_37
After
25_38
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3, 3] Result: Lose 
16
236
Before
15_38
After
15_39
Epsilon: 0.984039999999984 Training Episodes: 9202
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_52
After
2_53
Epsilon: 0.984019999999984 Training Episodes: 9201
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 7] Result: Win 
10
32
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 7, 10] Result: Lose 
20
42
Before
1_13
After
1_14
Epsilon: 0.983999999999984 Training Episodes: 9200
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 4] Result: Lose 
14
234
Before
20_33
After
20_34
Epsilon: 0.983979999999984 Training Episodes: 9199
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [5, 10] Result: Lose 
15
59
Before
2_11
After
2_12
Epsilon: 0.983959999999984 Training Episodes: 9198
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 4] Result: Win 
14
146
Before
5_7
After
6_8
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 4, 5] Result: Lose 
19
151
Before
1_6
After
1_7
Epsilon: 0.9839399999999839 Training Episodes: 9197
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
33_33
After
34_34
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 7] Result: Lose 
18
238
Before
9_44
After
9_45
Epsilon: 0.9839199999999839 Training Episodes: 9196
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 1] Result: Win 
5
291
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 1, 6] Result: Win 
11
55
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 1, 6, 3] Result: Lose 
14
58
Before
2_6
After
2_7
Epsilon: 0.9838999999999839 Training Episodes: 9195
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 3] Result: Win 
12
232
Before
19_30
After
20_31
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 3, 5] Result: Lose 
17
237
Before
8_29
After
8_30
Epsilon: 0.9838799999999839 Training Episodes: 9194
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 7] Result: Lose 
16
236
Before
15_39
After
15_40
Epsilon: 0.9838599999999839 Training Episodes: 9193
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 4] Result: Win 
10
208
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 4, 10] Result: Win 
20
218
Before
0000000000_10
After
1_11
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 4, 10, 1] Result: Lose 
21
219
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9838399999999838 Training Episodes: 9192
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Lose 
13
233
Before
25_38
After
25_39
Epsilon: 0.9838199999999838 Training Episodes: 9191
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 7] Result: Lose 
14
80
Before
6_13
After
6_14
Epsilon: 0.9837999999999838 Training Episodes: 9190
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 10] Result: Lose 
16
236
Before
15_40
After
15_41
Epsilon: 0.9837799999999838 Training Episodes: 9189
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 10] Result: Win 
20
108
Before
0000000000_7
After
1_8
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 10, 1] Result: Lose 
21
109
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9837599999999838 Training Episodes: 9188
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 7] Result: Win 
15
125
Before
4_10
After
5_11
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 7, 1] Result: Win 
16
126
Before
5_11
After
6_12
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 7, 1, 4] Result: Lose 
20
130
Before
1_10
After
1_11
Epsilon: 0.9837399999999837 Training Episodes: 9187
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 6] Result: Win 
9
185
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 6, 8] Result: Win 
17
193
Before
2_8
After
3_9
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 6, 8, 1] Result: Lose 
18
194
Before
2_12
After
2_13
Epsilon: 0.9837199999999837 Training Episodes: 9186
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 3] Result: Win 
12
122
Before
8_11
After
9_12
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 3, 8] Result: Lose 
20
130
Before
1_11
After
1_12
Epsilon: 0.9836999999999837 Training Episodes: 9185
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 5] Result: Lose 
15
125
Before
5_11
After
5_12
Epsilon: 0.9836799999999837 Training Episodes: 9184
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [7, 5] Result: Win 
12
188
Before
7_8
After
8_9
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [7, 5, 5] Result: Lose 
17
193
Before
3_9
After
3_10
Epsilon: 0.9836599999999837 Training Episodes: 9183
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_53
After
2_54
Epsilon: 0.9836399999999836 Training Episodes: 9182
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 8] Result: Win 
9
383
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 8, 1] Result: Win 
10
142
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 8, 1, 3] Result: Win 
13
145
Before
3_5
After
4_6
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 8, 1, 3, 5] Result: Win 
18
150
Before
0000000000_10
After
1_11
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 8, 1, 3, 5, 3] Result: Lose 
21
153
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9836199999999836 Training Episodes: 9181
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 6] Result: Win 
16
126
Before
6_12
After
7_13
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 6, 2] Result: Lose 
18
128
Before
2_12
After
2_13
Epsilon: 0.9835999999999836 Training Episodes: 9180
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 2] Result: Win 
7
117
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 2, 10] Result: Lose 
17
127
Before
5_14
After
5_15
Epsilon: 0.9835799999999836 Training Episodes: 9179
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 10] Result: Lose 
13
211
Before
3_5
After
3_6
Epsilon: 0.9835599999999836 Training Episodes: 9178
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
2_54
After
2_55
Epsilon: 0.9835399999999835 Training Episodes: 9177
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 2] Result: Lose 
12
166
Before
3_4
After
3_5
Epsilon: 0.9835199999999835 Training Episodes: 9176
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [8, 10] Result: Win 
18
194
Before
2_13
After
3_14
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [8, 10, 1] Result: Lose 
19
195
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9834999999999835 Training Episodes: 9175
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 4] Result: Win 
9
141
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 4, 6] Result: Lose 
15
147
Before
3_9
After
3_10
Epsilon: 0.9834799999999835 Training Episodes: 9174
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1] Result: Win 
11
231
Before
34_34
After
35_35
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1, 5] Result: Lose 
16
236
Before
15_41
After
15_42
Epsilon: 0.9834599999999835 Training Episodes: 9173
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 6] Result: Win 
16
170
Before
0000000000_8
After
1_9
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 6, 1] Result: Lose 
17
171
Before
3_11
After
3_12
Epsilon: 0.9834399999999834 Training Episodes: 9172
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 5] Result: Win 
11
99
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 5, 8] Result: Lose 
19
107
Before
2_10
After
2_11
Epsilon: 0.9834199999999834 Training Episodes: 9171
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 10] Result: Win 
11
121
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 10, 6] Result: Lose 
17
127
Before
5_15
After
5_16
Epsilon: 0.9833999999999834 Training Episodes: 9170
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 4] Result: Lose 
14
190
Before
4_5
After
4_6
Epsilon: 0.9833799999999834 Training Episodes: 9169
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 10] Result: Lose 
19
85
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9833599999999834 Training Episodes: 9168
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1] Result: Win 
8
228
Before
14_14
After
15_15
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1, 6] Result: Lose 
14
234
Before
20_34
After
20_35
Epsilon: 0.9833399999999833 Training Episodes: 9167
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1] Result: Win 
10
230
Before
17_17
After
18_18
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1, 5] Result: Win 
15
235
Before
10_22
After
11_23
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1, 5, 2] Result: Lose 
17
237
Before
8_30
After
8_31
Epsilon: 0.9833199999999833 Training Episodes: 9166
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 4] Result: Win 
11
55
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 4, 10] Result: Lose 
21
65
Before
0000000000_4
After
0000000000_5
Epsilon: 0.9832999999999833 Training Episodes: 9165
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 10] Result: Win 
11
165
Before
10_10
After
11_11
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 10, 6] Result: Lose 
17
171
Before
3_12
After
3_13
Epsilon: 0.9832799999999833 Training Episodes: 9164
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 5] Result: Win 
14
234
Before
20_35
After
21_36
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 5, 6] Result: Lose 
20
240
Before
2_55
After
2_56
Epsilon: 0.9832599999999833 Training Episodes: 9163
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [5, 10] Result: Lose 
15
191
Before
2_4
After
2_5
Epsilon: 0.9832399999999832 Training Episodes: 9162
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [4, 8] Result: Lose 
12
34
Before
2_4
After
2_5
Epsilon: 0.9832199999999832 Training Episodes: 9161
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 2] Result: Win 
12
122
Before
9_12
After
10_13
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 2, 3] Result: Lose 
15
125
Before
5_12
After
5_13
Epsilon: 0.9831999999999832 Training Episodes: 9160
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 5] Result: Win 
8
184
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 5, 10] Result: Lose 
18
194
Before
3_14
After
3_15
Epsilon: 0.9831799999999832 Training Episodes: 9159
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1] Result: Win 
11
231
Before
35_35
After
36_36
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1, 7] Result: Lose 
18
238
Before
9_45
After
9_46
Epsilon: 0.9831599999999832 Training Episodes: 9158
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 7] Result: Lose 
17
215
Before
2_10
After
2_11
Epsilon: 0.9831399999999831 Training Episodes: 9157
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 9] Result: Win 
16
82
Before
1_9
After
2_10
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 9, 3] Result: Lose 
19
85
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9831199999999831 Training Episodes: 9156
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 3] Result: Win 
12
232
Before
20_31
After
21_32
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 3, 1] Result: Win 
13
233
Before
25_39
After
26_40
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 3, 1, 7] Result: Win 
20
240
Before
2_56
After
3_57
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 3, 1, 7, 1] Result: Lose 
21
241
Before
0000000000_41
After
0000000000_42
Epsilon: 0.9830999999999831 Training Episodes: 9155
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Win 
13
233
Before
26_40
After
27_41
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 7] Result: Lose 
20
240
Before
3_57
After
3_58
Epsilon: 0.9830799999999831 Training Episodes: 9154
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Win 
16
236
Before
15_42
After
16_43
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6, 1] Result: Lose 
17
237
Before
8_31
After
8_32
Epsilon: 0.9830599999999831 Training Episodes: 9153
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 10] Result: Lose 
18
128
Before
2_13
After
2_14
Epsilon: 0.983039999999983 Training Episodes: 9152
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 7] Result: Win 
16
38
Before
1_8
After
2_9
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 7, 3] Result: Lose 
19
41
Before
1_9
After
1_10
Epsilon: 0.983019999999983 Training Episodes: 9151
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 8] Result: Win 
9
31
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 8, 3] Result: Win 
12
34
Before
2_5
After
3_6
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 8, 3, 7] Result: Win 
19
41
Before
1_10
After
2_11
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 8, 3, 7, 1] Result: Lose 
20
42
Before
1_14
After
1_15
Epsilon: 0.982999999999983 Training Episodes: 9150
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [8, 3] Result: Win 
11
187
Before
9_9
After
10_10
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [8, 3, 8] Result: Win 
19
195
Before
0000000000_9
After
1_10
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [8, 3, 8, 2] Result: Lose 
21
197
Before
0000000000_5
After
0000000000_6
Epsilon: 0.982979999999983 Training Episodes: 9149
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
1_19
After
1_20
Epsilon: 0.982959999999983 Training Episodes: 9148
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 2] Result: Lose 
12
232
Before
21_32
After
21_33
Epsilon: 0.9829399999999829 Training Episodes: 9147
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10] Result: Win 
11
143
Before
8_8
After
9_9
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10, 3] Result: Win 
14
146
Before
6_8
After
7_9
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10, 3, 6] Result: Lose 
20
152
Before
1_20
After
1_21
Epsilon: 0.9829199999999829 Training Episodes: 9146
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
3_58
After
3_59
Epsilon: 0.9828999999999829 Training Episodes: 9145
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 5] Result: Win 
12
78
Before
7_9
After
8_10
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 5, 4] Result: Lose 
16
82
Before
2_10
After
2_11
Epsilon: 0.9828799999999829 Training Episodes: 9144
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 9] Result: Lose 
19
41
Before
2_11
After
2_12
Epsilon: 0.9828599999999829 Training Episodes: 9143
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 5] Result: Win 
6
72
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 5, 10] Result: Win 
16
82
Before
2_11
After
3_12
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 5, 10, 3] Result: Lose 
19
85
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9828399999999828 Training Episodes: 9142
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 1] Result: Win 
10
120
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 1, 8] Result: Lose 
18
128
Before
2_14
After
2_15
Epsilon: 0.9828199999999828 Training Episodes: 9141
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
36_36
After
37_37
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 3] Result: Lose 
14
234
Before
21_36
After
21_37
Epsilon: 0.9827999999999828 Training Episodes: 9140
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3] Result: Win 
13
233
Before
27_41
After
28_42
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3, 3] Result: Lose 
16
236
Before
16_43
After
16_44
Epsilon: 0.9827799999999828 Training Episodes: 9139
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [7, 1] Result: Win 
8
140
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [7, 1, 7] Result: Win 
15
147
Before
3_10
After
4_11
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [7, 1, 7, 6] Result: Lose 
21
153
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9827599999999828 Training Episodes: 9138
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 10] Result: Win 
13
79
Before
1_4
After
2_5
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 10, 6] Result: Lose 
19
85
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9827399999999827 Training Episodes: 9137
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [7, 7] Result: Lose 
14
168
Before
1_4
After
1_5
Epsilon: 0.9827199999999827 Training Episodes: 9136
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 2] Result: Win 
8
228
Before
15_15
After
16_16
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 2, 6] Result: Lose 
14
234
Before
21_37
After
21_38
Epsilon: 0.9826999999999827 Training Episodes: 9135
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [5, 3] Result: Win 
8
52
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [5, 3, 3] Result: Win 
11
55
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [5, 3, 3, 10] Result: Lose 
21
65
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9826799999999827 Training Episodes: 9134
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 10] Result: Win 
14
168
Before
1_5
After
2_6
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 10, 1] Result: Win 
15
169
Before
5_11
After
6_12
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 10, 1, 3] Result: Lose 
18
172
Before
1_7
After
1_8
Epsilon: 0.9826599999999827 Training Episodes: 9133
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 4] Result: Lose 
14
102
Before
6_11
After
6_12
Epsilon: 0.9826399999999826 Training Episodes: 9132
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 6] Result: Win 
8
228
Before
16_16
After
17_17
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 6, 2] Result: Win 
10
472
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 6, 2, 1] Result: Win 
11
231
Before
37_37
After
38_38
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 6, 2, 1, 7] Result: Lose 
18
238
Before
9_46
After
9_47
Epsilon: 0.9826199999999826 Training Episodes: 9131
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 5] Result: Win 
12
56
Before
3_6
After
4_7
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 5, 7] Result: Lose 
19
63
Before
2_8
After
2_9
Epsilon: 0.9825999999999826 Training Episodes: 9130
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [5, 6] Result: Win 
11
77
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [5, 6, 6] Result: Lose 
17
83
Before
2_5
After
2_6
Epsilon: 0.9825799999999826 Training Episodes: 9129
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 9] Result: Win 
12
210
Before
3_5
After
4_6
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 9, 3] Result: Win 
15
213
Before
3_8
After
4_9
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 9, 3, 3] Result: Win 
18
216
Before
1_13
After
2_14
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 9, 3, 3, 3] Result: Lose 
21
219
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9825599999999826 Training Episodes: 9128
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 1] Result: Win 
5
203
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 1, 10] Result: Lose 
15
213
Before
4_9
After
4_10
Epsilon: 0.9825399999999825 Training Episodes: 9127
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 10] Result: Lose 
12
34
Before
3_6
After
3_7
Epsilon: 0.9825199999999825 Training Episodes: 9126
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 6] Result: Win 
10
142
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 6, 10] Result: Lose 
20
152
Before
1_21
After
1_22
Epsilon: 0.9824999999999825 Training Episodes: 9125
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 10] Result: Lose 
16
38
Before
2_9
After
2_10
Epsilon: 0.9824799999999825 Training Episodes: 9124
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 5] Result: Lose 
12
232
Before
21_33
After
21_34
Epsilon: 0.9824599999999825 Training Episodes: 9123
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Win 
13
233
Before
28_42
After
29_43
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 5] Result: Lose 
18
238
Before
9_47
After
9_48
Epsilon: 0.9824399999999824 Training Episodes: 9122
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 5] Result: Win 
7
227
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 5, 10] Result: Lose 
17
237
Before
8_32
After
8_33
Epsilon: 0.9824199999999824 Training Episodes: 9121
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 10] Result: Lose 
19
129
Before
2_10
After
2_11
Epsilon: 0.9823999999999824 Training Episodes: 9120
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 8] Result: Win 
11
165
Before
11_11
After
12_12
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 8, 4] Result: Win 
15
169
Before
6_12
After
7_13
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 8, 4, 4] Result: Lose 
19
173
Before
0000000000_5
After
0000000000_6
Epsilon: 0.9823799999999824 Training Episodes: 9119
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 9] Result: Win 
11
99
Before
9_9
After
10_10
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 9, 9] Result: Lose 
20
108
Before
1_8
After
1_9
Epsilon: 0.9823599999999824 Training Episodes: 9118
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 1] Result: Win 
7
227
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 1, 10] Result: Lose 
17
237
Before
8_33
After
8_34
Epsilon: 0.9823399999999823 Training Episodes: 9117
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 3] Result: Win 
7
73
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 3, 5] Result: Win 
12
78
Before
8_10
After
9_11
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 3, 5, 4] Result: Win 
16
82
Before
3_12
After
4_13
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 3, 5, 4, 1] Result: Lose 
17
83
Before
2_6
After
2_7
Epsilon: 0.9823199999999823 Training Episodes: 9116
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 10] Result: Lose 
20
108
Before
1_9
After
1_10
Epsilon: 0.9822999999999823 Training Episodes: 9115
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 7] Result: Win 
9
471
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 7, 1] Result: Win 
10
230
Before
18_18
After
19_19
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 7, 1, 8] Result: Win 
18
238
Before
9_48
After
10_49
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 7, 1, 8, 2] Result: Lose 
20
240
Before
3_59
After
3_60
Epsilon: 0.9822799999999823 Training Episodes: 9114
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 10] Result: Lose 
15
169
Before
7_13
After
7_14
Epsilon: 0.9822599999999823 Training Episodes: 9113
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 10] Result: Lose 
16
236
Before
16_44
After
16_45
Epsilon: 0.9822399999999822 Training Episodes: 9112
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 2] Result: Win 
12
188
Before
8_9
After
9_10
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 2, 5] Result: Lose 
17
193
Before
3_10
After
3_11
Epsilon: 0.9822199999999822 Training Episodes: 9111
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 10] Result: Lose 
20
130
Before
1_12
After
1_13
Epsilon: 0.9821999999999822 Training Episodes: 9110
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 4] Result: Lose 
14
146
Before
7_9
After
7_10
Epsilon: 0.9821799999999822 Training Episodes: 9109
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 10] Result: Lose 
16
236
Before
16_45
After
16_46
Epsilon: 0.9821599999999822 Training Episodes: 9108
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 5] Result: Lose 
15
81
Before
3_8
After
3_9
Epsilon: 0.9821399999999821 Training Episodes: 9107
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
3_60
After
3_61
Epsilon: 0.9821199999999821 Training Episodes: 9106
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [8, 10] Result: Lose 
18
172
Before
1_8
After
1_9
Epsilon: 0.9820999999999821 Training Episodes: 9105
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 1] Result: Win 
11
99
Before
10_10
After
11_11
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 1, 7] Result: Lose 
18
106
Before
3_14
After
3_15
Epsilon: 0.9820799999999821 Training Episodes: 9104
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 2] Result: Lose 
12
232
Before
21_34
After
21_35
Epsilon: 0.9820599999999821 Training Episodes: 9103
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 4] Result: Win 
5
401
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 4, 1] Result: Win 
6
160
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 4, 1, 10] Result: Win 
16
170
Before
1_9
After
2_10
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 4, 1, 10, 1] Result: Win 
17
171
Before
3_13
After
4_14
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 4, 1, 10, 1, 2] Result: Lose 
19
173
Before
0000000000_6
After
0000000000_7
Epsilon: 0.982039999999982 Training Episodes: 9102
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [5, 2] Result: Win 
7
73
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [5, 2, 10] Result: Lose 
17
83
Before
2_7
After
2_8
Epsilon: 0.982019999999982 Training Episodes: 9101
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 2] Result: Win 
5
159
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 2, 10] Result: Lose 
15
169
Before
7_14
After
7_15
Epsilon: 0.981999999999982 Training Episodes: 9100
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 5] Result: Win 
9
97
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 5, 4] Result: Lose 
13
101
Before
4_13
After
4_14
Epsilon: 0.981979999999982 Training Episodes: 9099
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [7, 4] Result: Win 
11
143
Before
9_9
After
10_10
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [7, 4, 10] Result: Lose 
21
153
Before
0000000000_8
After
0000000000_9
Epsilon: 0.981959999999982 Training Episodes: 9098
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 10] Result: Win 
20
108
Before
1_10
After
2_11
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 10, 1] Result: Lose 
21
109
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9819399999999819 Training Episodes: 9097
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10] Result: Lose 
20
42
Before
1_15
After
1_16
Epsilon: 0.9819199999999819 Training Episodes: 9096
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 4] Result: Win 
5
335
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 4, 3] Result: Win 
8
96
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 4, 3, 10] Result: Lose 
18
106
Before
3_15
After
3_16
Epsilon: 0.9818999999999819 Training Episodes: 9095
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 1] Result: Win 
3
399
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 1, 3] Result: Win 
6
160
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 1, 3, 10] Result: Win 
16
170
Before
2_10
After
3_11
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 1, 3, 10, 3] Result: Lose 
19
173
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9818799999999819 Training Episodes: 9094
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 10] Result: Lose 
20
218
Before
1_11
After
1_12
Epsilon: 0.9818599999999819 Training Episodes: 9093
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 1] Result: Win 
10
384
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 1, 1] Result: Win 
11
143
Before
10_10
After
11_11
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 1, 1, 10] Result: Lose 
21
153
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9818399999999818 Training Episodes: 9092
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 5] Result: Win 
13
233
Before
29_43
After
30_44
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 5, 2] Result: Win 
15
235
Before
11_23
After
12_24
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 5, 2, 5] Result: Lose 
20
240
Before
3_61
After
3_62
Epsilon: 0.9818199999999818 Training Episodes: 9091
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [7, 10] Result: Lose 
17
171
Before
4_14
After
4_15
Epsilon: 0.9817999999999818 Training Episodes: 9090
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
3_62
After
3_63
Epsilon: 0.9817799999999818 Training Episodes: 9089
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 2] Result: Win 
12
100
Before
3_4
After
4_5
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 2, 9] Result: Lose 
21
109
Before
0000000000_10
After
0000000000_11
Epsilon: 0.9817599999999818 Training Episodes: 9088
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 10] Result: Lose 
20
174
Before
1_6
After
1_7
Epsilon: 0.9817399999999817 Training Episodes: 9087
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 10] Result: Lose 
18
238
Before
10_49
After
10_50
Epsilon: 0.9817199999999817 Training Episodes: 9086
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [6, 3] Result: Win 
9
53
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [6, 3, 10] Result: Lose 
19
63
Before
2_9
After
2_10
Epsilon: 0.9816999999999817 Training Episodes: 9085
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 3] Result: Win 
4
202
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 3, 10] Result: Lose 
14
212
Before
0000000000_2
After
0000000000_3
Epsilon: 0.9816799999999817 Training Episodes: 9084
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 1] Result: Win 
6
446
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 1, 4] Result: Win 
10
208
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 1, 4, 4] Result: Win 
14
212
Before
0000000000_3
After
1_4
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 1, 4, 4, 3] Result: Lose 
17
215
Before
2_11
After
2_12
Epsilon: 0.9816599999999817 Training Episodes: 9083
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 10] Result: Lose 
20
174
Before
1_7
After
1_8
Epsilon: 0.9816399999999816 Training Episodes: 9082
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 9] Result: Lose 
19
239
Before
12_41
After
12_42
Epsilon: 0.9816199999999816 Training Episodes: 9081
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 10] Result: Lose 
14
58
Before
2_7
After
2_8
Epsilon: 0.9815999999999816 Training Episodes: 9080
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 10] Result: Lose 
16
236
Before
16_46
After
16_47
Epsilon: 0.9815799999999816 Training Episodes: 9079
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 8] Result: Lose 
17
83
Before
2_8
After
2_9
Epsilon: 0.9815599999999816 Training Episodes: 9078
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 2] Result: Win 
7
227
Before
8_8
After
9_9
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 2, 7] Result: Win 
14
234
Before
21_38
After
22_39
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 2, 7, 3] Result: Win 
17
237
Before
8_34
After
9_35
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 2, 7, 3, 3] Result: Lose 
20
240
Before
3_63
After
3_64
Epsilon: 0.9815399999999815 Training Episodes: 9077
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 7] Result: Win 
9
185
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 7, 2] Result: Win 
11
187
Before
10_10
After
11_11
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 7, 2, 10] Result: Lose 
21
197
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9815199999999815 Training Episodes: 9076
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [8, 10] Result: Lose 
18
194
Before
3_15
After
3_16
Epsilon: 0.9814999999999815 Training Episodes: 9075
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 7] Result: Lose 
13
211
Before
3_6
After
3_7
Epsilon: 0.9814799999999815 Training Episodes: 9074
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 3] Result: Win 
7
95
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 3, 6] Result: Win 
13
101
Before
4_14
After
5_15
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 3, 6, 3] Result: Win 
16
104
Before
5_13
After
6_14
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 3, 6, 3, 5] Result: Lose 
21
109
Before
0000000000_11
After
0000000000_12
Epsilon: 0.9814599999999815 Training Episodes: 9073
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 10] Result: Win 
15
37
Before
4_7
After
5_8
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 10, 5] Result: Win 
20
42
Before
1_16
After
2_17
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 10, 5, 1] Result: Lose 
21
43
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9814399999999814 Training Episodes: 9072
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 2] Result: Win 
11
231
Before
38_38
After
39_39
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 2, 6] Result: Lose 
17
237
Before
9_35
After
9_36
Epsilon: 0.9814199999999814 Training Episodes: 9071
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [5, 7] Result: Lose 
12
56
Before
4_7
After
4_8
Epsilon: 0.9813999999999814 Training Episodes: 9070
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 9] Result: Lose 
19
85
Before
0000000000_10
After
0000000000_11
Epsilon: 0.9813799999999814 Training Episodes: 9069
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7] Result: Lose 
12
232
Before
21_35
After
21_36
Epsilon: 0.9813599999999814 Training Episodes: 9068
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 7] Result: Lose 
16
236
Before
16_47
After
16_48
Epsilon: 0.9813399999999813 Training Episodes: 9067
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 10] Result: Win 
11
33
Before
7_7
After
8_8
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 10, 2] Result: Win 
13
35
Before
2_3
After
3_4
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 10, 2, 3] Result: Lose 
16
38
Before
2_10
After
2_11
Epsilon: 0.9813199999999813 Training Episodes: 9066
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 5] Result: Win 
8
228
Before
17_17
After
18_18
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 5, 10] Result: Lose 
18
238
Before
10_50
After
10_51
Epsilon: 0.9812999999999813 Training Episodes: 9065
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4] Result: Win 
10
230
Before
19_19
After
20_20
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4, 7] Result: Lose 
17
237
Before
9_36
After
9_37
Epsilon: 0.9812799999999813 Training Episodes: 9064
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 3] Result: Win 
7
227
Before
9_9
After
10_10
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 3, 7] Result: Lose 
14
234
Before
22_39
After
22_40
Epsilon: 0.9812599999999813 Training Episodes: 9063
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 10] Result: Lose 
16
38
Before
2_11
After
2_12
Epsilon: 0.9812399999999812 Training Episodes: 9062
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 2] Result: Lose 
12
232
Before
21_36
After
21_37
Epsilon: 0.9812199999999812 Training Episodes: 9061
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 5] Result: Win 
15
169
Before
7_15
After
8_16
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 5, 3] Result: Lose 
18
172
Before
1_9
After
1_10
Epsilon: 0.9811999999999812 Training Episodes: 9060
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [7, 10] Result: Lose 
17
171
Before
4_15
After
4_16
Epsilon: 0.9811799999999812 Training Episodes: 9059
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 10] Result: Win 
14
102
Before
6_12
After
7_13
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 10, 6] Result: Win 
20
108
Before
2_11
After
3_12
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 10, 6, 1] Result: Lose 
21
109
Before
0000000000_12
After
0000000000_13
Epsilon: 0.9811599999999812 Training Episodes: 9058
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
3_64
After
3_65
Epsilon: 0.9811399999999811 Training Episodes: 9057
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4] Result: Win 
8
470
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4, 1] Result: Win 
9
471
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4, 1, 2] Result: Win 
11
231
Before
39_39
After
40_40
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4, 1, 2, 8] Result: Lose 
19
239
Before
12_42
After
12_43
Epsilon: 0.9811199999999811 Training Episodes: 9056
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 10] Result: Lose 
16
236
Before
16_48
After
16_49
Epsilon: 0.9810999999999811 Training Episodes: 9055
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 2] Result: Win 
8
184
Before
6_6
After
7_7
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 2, 10] Result: Win 
18
194
Before
3_16
After
4_17
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 2, 10, 3] Result: Lose 
21
197
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9810799999999811 Training Episodes: 9054
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 7] Result: Win 
13
211
Before
3_7
After
4_8
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 7, 3] Result: Lose 
16
214
Before
4_12
After
4_13
Epsilon: 0.9810599999999811 Training Episodes: 9053
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 6] Result: Win 
14
146
Before
7_10
After
8_11
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 6, 5] Result: Lose 
19
151
Before
1_7
After
1_8
Epsilon: 0.981039999999981 Training Episodes: 9052
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 9] Result: Lose 
19
85
Before
0000000000_11
After
0000000000_12
Epsilon: 0.981019999999981 Training Episodes: 9051
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 10] Result: Win 
15
125
Before
5_13
After
6_14
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 10, 5] Result: Lose 
20
130
Before
1_13
After
1_14
Epsilon: 0.980999999999981 Training Episodes: 9050
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 10] Result: Lose 
20
86
Before
5_14
After
5_15
Epsilon: 0.980979999999981 Training Episodes: 9049
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 4] Result: Win 
13
79
Before
2_5
After
3_6
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 4, 6] Result: Win 
19
85
Before
0000000000_12
After
1_13
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [9, 4, 6, 2] Result: Lose 
21
87
Before
0000000000_13
After
0000000000_14
Epsilon: 0.980959999999981 Training Episodes: 9048
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 9] Result: Win 
12
78
Before
9_11
After
10_12
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 9, 3] Result: Lose 
15
81
Before
3_9
After
3_10
Epsilon: 0.9809399999999809 Training Episodes: 9047
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 4] Result: Win 
14
80
Before
6_14
After
7_15
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 4, 6] Result: Lose 
20
86
Before
5_15
After
5_16
Epsilon: 0.9809199999999809 Training Episodes: 9046
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 3] Result: Win 
13
35
Before
3_4
After
4_5
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 3, 4] Result: Lose 
17
39
Before
2_5
After
2_6
Epsilon: 0.9808999999999809 Training Episodes: 9045
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 2] Result: Win 
4
92
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 2, 9] Result: Lose 
13
101
Before
5_15
After
5_16
Epsilon: 0.9808799999999809 Training Episodes: 9044
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [9, 10] Result: Lose 
19
173
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9808599999999809 Training Episodes: 9043
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 9] Result: Win 
10
186
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 9, 2] Result: Win 
12
188
Before
9_10
After
10_11
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 9, 2, 3] Result: Win 
15
191
Before
2_5
After
3_6
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 9, 2, 3, 6] Result: Lose 
21
197
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9808399999999808 Training Episodes: 9042
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 4] Result: Win 
6
28
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 4, 8] Result: Lose 
14
36
Before
7_13
After
7_14
Epsilon: 0.9808199999999808 Training Episodes: 9041
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 8] Result: Lose 
12
144
Before
4_8
After
4_9
Epsilon: 0.9807999999999808 Training Episodes: 9040
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 7] Result: Lose 
17
171
Before
4_16
After
4_17
Epsilon: 0.9807799999999808 Training Episodes: 9039
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Win 
13
233
Before
30_44
After
31_45
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 8] Result: Lose 
21
241
Before
0000000000_42
After
0000000000_43
Epsilon: 0.9807599999999808 Training Episodes: 9038
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 5] Result: Win 
15
169
Before
8_16
After
9_17
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 5, 6] Result: Lose 
21
175
Before
0000000000_10
After
0000000000_11
Epsilon: 0.9807399999999807 Training Episodes: 9037
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 4] Result: Lose 
13
233
Before
31_45
After
31_46
Epsilon: 0.9807199999999807 Training Episodes: 9036
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 5] Result: Win 
8
228
Before
18_18
After
19_19
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 5, 10] Result: Lose 
18
238
Before
10_51
After
10_52
Epsilon: 0.9806999999999807 Training Episodes: 9035
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [2, 9] Result: Win 
11
55
Before
9_9
After
10_10
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [2, 9, 10] Result: Lose 
21
65
Before
0000000000_6
After
0000000000_7
Epsilon: 0.9806799999999807 Training Episodes: 9034
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 4] Result: Win 
14
234
Before
22_40
After
23_41
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 4, 3] Result: Lose 
17
237
Before
9_37
After
9_38
Epsilon: 0.9806599999999807 Training Episodes: 9033
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 6] Result: Lose 
16
126
Before
7_13
After
7_14
Epsilon: 0.9806399999999806 Training Episodes: 9032
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [7, 4] Result: Win 
11
121
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [7, 4, 10] Result: Lose 
21
131
Before
0000000000_17
After
0000000000_18
Epsilon: 0.9806199999999806 Training Episodes: 9031
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [2, 9] Result: Win 
11
55
Before
10_10
After
11_11
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [2, 9, 6] Result: Lose 
17
61
Before
5_10
After
5_11
Epsilon: 0.9805999999999806 Training Episodes: 9030
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [9, 8] Result: Win 
17
193
Before
3_11
After
4_12
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [9, 8, 3] Result: Lose 
20
196
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9805799999999806 Training Episodes: 9029
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [9, 10] Result: Lose 
19
195
Before
1_10
After
1_11
Epsilon: 0.9805599999999806 Training Episodes: 9028
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Lose 
13
233
Before
31_46
After
31_47
Epsilon: 0.9805399999999805 Training Episodes: 9027
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 9] Result: Lose 
17
83
Before
2_9
After
2_10
Epsilon: 0.9805199999999805 Training Episodes: 9026
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 7] Result: Win 
15
37
Before
5_8
After
6_9
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 7, 3] Result: Lose 
18
40
Before
4_11
After
4_12
Epsilon: 0.9804999999999805 Training Episodes: 9025
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 9] Result: Win 
16
236
Before
16_49
After
17_50
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 9, 4] Result: Win 
20
240
Before
3_65
After
4_66
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 9, 4, 1] Result: Lose 
21
241
Before
0000000000_43
After
0000000000_44
Epsilon: 0.9804799999999805 Training Episodes: 9024
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Win 
16
236
Before
17_50
After
18_51
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6, 1] Result: Lose 
17
237
Before
9_38
After
9_39
Epsilon: 0.9804599999999805 Training Episodes: 9023
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 1] Result: Win 
5
291
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 1, 4] Result: Win 
9
53
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 1, 4, 10] Result: Lose 
19
63
Before
2_10
After
2_11
Epsilon: 0.9804399999999804 Training Episodes: 9022
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [7, 10] Result: Win 
17
193
Before
4_12
After
5_13
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [7, 10, 3] Result: Lose 
20
196
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9804199999999804 Training Episodes: 9021
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
4_66
After
4_67
Epsilon: 0.9803999999999804 Training Episodes: 9020
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [5, 8] Result: Lose 
13
189
Before
1_7
After
1_8
Epsilon: 0.9803799999999804 Training Episodes: 9019
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 10] Result: Lose 
18
84
Before
1_6
After
1_7
Epsilon: 0.9803599999999804 Training Episodes: 9018
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7] Result: Lose 
17
237
Before
9_39
After
9_40
Epsilon: 0.9803399999999803 Training Episodes: 9017
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 6] Result: Win 
9
97
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 6, 6] Result: Win 
15
103
Before
4_11
After
5_12
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 6, 6, 1] Result: Win 
16
104
Before
6_14
After
7_15
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 6, 6, 1, 5] Result: Lose 
21
109
Before
0000000000_13
After
0000000000_14
Epsilon: 0.9803199999999803 Training Episodes: 9016
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 1] Result: Win 
3
267
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 1, 8] Result: Win 
11
33
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 1, 8, 7] Result: Lose 
18
40
Before
4_12
After
4_13
Epsilon: 0.9802999999999803 Training Episodes: 9015
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 10] Result: Lose 
15
169
Before
9_17
After
9_18
Epsilon: 0.9802799999999803 Training Episodes: 9014
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 6] Result: Win 
8
206
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 6, 10] Result: Lose 
18
216
Before
2_14
After
2_15
Epsilon: 0.9802599999999803 Training Episodes: 9013
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 9] Result: Lose 
15
235
Before
12_24
After
12_25
Epsilon: 0.9802399999999802 Training Episodes: 9012
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10] Result: Win 
12
232
Before
21_37
After
22_38
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10, 8] Result: Lose 
20
240
Before
4_67
After
4_68
Epsilon: 0.9802199999999802 Training Episodes: 9011
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 1] Result: Win 
3
465
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 1, 7] Result: Win 
10
230
Before
20_20
After
21_21
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 1, 7, 3] Result: Lose 
13
233
Before
31_47
After
31_48
Epsilon: 0.9801999999999802 Training Episodes: 9010
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Lose 
13
233
Before
31_48
After
31_49
Epsilon: 0.9801799999999802 Training Episodes: 9009
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10] Result: Win 
11
143
Before
11_11
After
12_12
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10, 3] Result: Win 
14
146
Before
8_11
After
9_12
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10, 3, 7] Result: Lose 
21
153
Before
0000000000_10
After
0000000000_11
Epsilon: 0.9801599999999802 Training Episodes: 9008
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
4_68
After
4_69
Epsilon: 0.9801399999999801 Training Episodes: 9007
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 8] Result: Lose 
18
238
Before
10_52
After
10_53
Epsilon: 0.9801199999999801 Training Episodes: 9006
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 10] Result: Lose 
16
236
Before
18_51
After
18_52
Epsilon: 0.9800999999999801 Training Episodes: 9005
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Win 
13
233
Before
31_49
After
32_50
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 5] Result: Lose 
18
238
Before
10_53
After
10_54
Epsilon: 0.9800799999999801 Training Episodes: 9004
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 9] Result: Lose 
19
217
Before
1_9
After
1_10
Epsilon: 0.9800599999999801 Training Episodes: 9003
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 4] Result: Lose 
14
190
Before
4_6
After
4_7
Epsilon: 0.98003999999998 Training Episodes: 9002
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 9] Result: Lose 
19
239
Before
12_43
After
12_44
Epsilon: 0.98001999999998 Training Episodes: 9001
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 4] Result: Win 
6
226
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 4, 10] Result: Win 
16
236
Before
18_52
After
19_53
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 4, 10, 3] Result: Lose 
19
239
Before
12_44
After
12_45
Epsilon: 0.97999999999998 Training Episodes: 9000
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 1] Result: Win 
9
119
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 1, 8] Result: Lose 
17
127
Before
5_16
After
5_17
Epsilon: 0.97997999999998 Training Episodes: 8999
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 10] Result: Win 
11
121
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 10, 1] Result: Lose 
12
122
Before
10_13
After
10_14
Epsilon: 0.97995999999998 Training Episodes: 8998
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 5] Result: Lose 
12
232
Before
22_38
After
22_39
Epsilon: 0.9799399999999799 Training Episodes: 8997
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 10] Result: Lose 
20
64
Before
1_15
After
1_16
Epsilon: 0.9799199999999799 Training Episodes: 8996
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
4_69
After
4_70
Epsilon: 0.9798999999999799 Training Episodes: 8995
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 2] Result: Win 
5
137
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 2, 5] Result: Win 
10
142
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 2, 5, 3] Result: Lose 
13
145
Before
4_6
After
4_7
Epsilon: 0.9798799999999799 Training Episodes: 8994
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
4_70
After
4_71
Epsilon: 0.9798599999999799 Training Episodes: 8993
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 4] Result: Win 
6
160
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 4, 10] Result: Lose 
16
170
Before
3_11
After
3_12
Epsilon: 0.9798399999999798 Training Episodes: 8992
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 10] Result: Win 
13
35
Before
4_5
After
5_6
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [3, 10, 2] Result: Lose 
15
37
Before
6_9
After
6_10
Epsilon: 0.9798199999999798 Training Episodes: 8991
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 1] Result: Win 
10
32
Before
7_7
After
8_8
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 1, 6] Result: Win 
16
38
Before
2_12
After
3_13
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 1, 6, 5] Result: Lose 
21
43
Before
0000000000_10
After
0000000000_11
Epsilon: 0.9797999999999798 Training Episodes: 8990
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 10] Result: Lose 
13
57
Before
3_6
After
3_7
Epsilon: 0.9797799999999798 Training Episodes: 8989
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 1] Result: Win 
3
399
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 1, 8] Result: Win 
11
165
Before
12_12
After
13_13
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 1, 8, 7] Result: Lose 
18
172
Before
1_10
After
1_11
Epsilon: 0.9797599999999798 Training Episodes: 8988
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 10] Result: Win 
11
209
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 10, 10] Result: Lose 
21
219
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9797399999999797 Training Episodes: 8987
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 3] Result: Win 
7
227
Before
10_10
After
11_11
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 3, 7] Result: Lose 
14
234
Before
23_41
After
23_42
Epsilon: 0.9797199999999797 Training Episodes: 8986
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
4_71
After
4_72
Epsilon: 0.9796999999999797 Training Episodes: 8985
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 10] Result: Lose 
18
128
Before
2_15
After
2_16
Epsilon: 0.9796799999999797 Training Episodes: 8984
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 9] Result: Lose 
19
63
Before
2_11
After
2_12
Epsilon: 0.9796599999999797 Training Episodes: 8983
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 1] Result: Win 
3
333
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 1, 5] Result: Win 
8
96
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 1, 5, 10] Result: Lose 
18
106
Before
3_16
After
3_17
Epsilon: 0.9796399999999796 Training Episodes: 8982
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 10] Result: Win 
20
86
Before
5_16
After
6_17
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 10, 1] Result: Lose 
21
87
Before
0000000000_14
After
0000000000_15
Epsilon: 0.9796199999999796 Training Episodes: 8981
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
4_72
After
4_73
Epsilon: 0.9795999999999796 Training Episodes: 8980
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 10] Result: Win 
14
146
Before
9_12
After
10_13
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 10, 4] Result: Lose 
18
150
Before
1_11
After
1_12
Epsilon: 0.9795799999999796 Training Episodes: 8979
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 10] Result: Lose 
20
174
Before
1_8
After
1_9
Epsilon: 0.9795599999999796 Training Episodes: 8978
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 9] Result: Win 
14
234
Before
23_42
After
24_43
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 9, 2] Result: Lose 
16
236
Before
19_53
After
19_54
Epsilon: 0.9795399999999795 Training Episodes: 8977
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 10] Result: Win 
13
123
Before
6_8
After
7_9
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 10, 2] Result: Win 
15
125
Before
6_14
After
7_15
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 10, 2, 1] Result: Lose 
16
126
Before
7_14
After
7_15
Epsilon: 0.9795199999999795 Training Episodes: 8976
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 2] Result: Win 
11
231
Before
40_40
After
41_41
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 2, 2] Result: Lose 
13
233
Before
32_50
After
32_51
Epsilon: 0.9794999999999795 Training Episodes: 8975
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [9, 10] Result: Lose 
19
217
Before
1_10
After
1_11
Epsilon: 0.9794799999999795 Training Episodes: 8974
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 2] Result: Win 
10
472
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 2, 1] Result: Win 
11
231
Before
41_41
After
42_42
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 2, 1, 10] Result: Lose 
21
241
Before
0000000000_44
After
0000000000_45
Epsilon: 0.9794599999999795 Training Episodes: 8973
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 5] Result: Win 
12
210
Before
4_6
After
5_7
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 5, 5] Result: Lose 
17
215
Before
2_12
After
2_13
Epsilon: 0.9794399999999794 Training Episodes: 8972
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5] Result: Win 
15
235
Before
12_25
After
13_26
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5, 4] Result: Lose 
19
239
Before
12_45
After
12_46
Epsilon: 0.9794199999999794 Training Episodes: 8971
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 7] Result: Lose 
17
61
Before
5_11
After
5_12
Epsilon: 0.9793999999999794 Training Episodes: 8970
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 10] Result: Lose 
20
152
Before
1_22
After
1_23
Epsilon: 0.9793799999999794 Training Episodes: 8969
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 6] Result: Win 
7
161
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 6, 10] Result: Lose 
17
171
Before
4_17
After
4_18
Epsilon: 0.9793599999999794 Training Episodes: 8968
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 4] Result: Lose 
14
234
Before
24_43
After
24_44
Epsilon: 0.9793399999999793 Training Episodes: 8967
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10] Result: Lose 
12
232
Before
22_39
After
22_40
Epsilon: 0.9793199999999793 Training Episodes: 8966
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 8] Result: Win 
9
53
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 8, 9] Result: Lose 
18
62
Before
1_6
After
1_7
Epsilon: 0.9792999999999793 Training Episodes: 8965
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 7] Result: Win 
13
145
Before
4_7
After
5_8
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 7, 2] Result: Win 
15
147
Before
4_11
After
5_12
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 7, 2, 4] Result: Lose 
19
151
Before
1_8
After
1_9
Epsilon: 0.9792799999999793 Training Episodes: 8964
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 1] Result: Win 
3
25
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 1, 10] Result: Lose 
13
35
Before
5_6
After
5_7
Epsilon: 0.9792599999999793 Training Episodes: 8963
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 6] Result: Win 
7
227
Before
11_11
After
12_12
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 6, 10] Result: Lose 
17
237
Before
9_40
After
9_41
Epsilon: 0.9792399999999792 Training Episodes: 8962
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 8] Result: Win 
14
124
Before
8_10
After
9_11
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 8, 7] Result: Lose 
21
131
Before
0000000000_18
After
0000000000_19
Epsilon: 0.9792199999999792 Training Episodes: 8961
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 4] Result: Win 
10
98
Before
8_8
After
9_9
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 4, 2] Result: Win 
12
100
Before
4_5
After
5_6
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 4, 2, 8] Result: Lose 
20
108
Before
3_12
After
3_13
Epsilon: 0.9791999999999792 Training Episodes: 8960
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 2] Result: Win 
3
267
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 2, 6] Result: Win 
9
31
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 2, 6, 10] Result: Lose 
19
41
Before
2_12
After
2_13
Epsilon: 0.9791799999999792 Training Episodes: 8959
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 6] Result: Lose 
15
235
Before
13_26
After
13_27
Epsilon: 0.9791599999999792 Training Episodes: 8958
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 2] Result: Win 
10
230
Before
21_21
After
22_22
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 2, 4] Result: Win 
14
234
Before
24_44
After
25_45
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 2, 4, 6] Result: Win 
20
240
Before
4_73
After
5_74
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 2, 4, 6, 1] Result: Lose 
21
241
Before
0000000000_45
After
0000000000_46
Epsilon: 0.9791399999999791 Training Episodes: 8957
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [5, 8] Result: Lose 
13
79
Before
3_6
After
3_7
Epsilon: 0.9791199999999791 Training Episodes: 8956
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 10] Result: Lose 
16
148
Before
4_8
After
4_9
Epsilon: 0.9790999999999791 Training Episodes: 8955
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Lose 
16
236
Before
19_54
After
19_55
Epsilon: 0.9790799999999791 Training Episodes: 8954
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 1] Result: Win 
4
466
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 1, 7] Result: Win 
11
231
Before
42_42
After
43_43
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 1, 7, 10] Result: Lose 
21
241
Before
0000000000_46
After
0000000000_47
Epsilon: 0.9790599999999791 Training Episodes: 8953
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 8] Result: Lose 
18
128
Before
2_16
After
2_17
Epsilon: 0.979039999999979 Training Episodes: 8952
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 4] Result: Win 
11
209
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 4, 5] Result: Lose 
16
214
Before
4_13
After
4_14
Epsilon: 0.979019999999979 Training Episodes: 8951
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 10] Result: Win 
11
121
Before
9_9
After
10_10
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 10, 2] Result: Win 
13
123
Before
7_9
After
8_10
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [1, 10, 2, 1] Result: Lose 
14
124
Before
9_11
After
9_12
Epsilon: 0.978999999999979 Training Episodes: 8950
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10] Result: Win 
11
143
Before
12_12
After
13_13
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10, 10] Result: Lose 
21
153
Before
0000000000_11
After
0000000000_12
Epsilon: 0.978979999999979 Training Episodes: 8949
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 5] Result: Win 
12
56
Before
4_8
After
5_9
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 5, 1] Result: Win 
13
57
Before
3_7
After
4_8
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 5, 1, 7] Result: Lose 
20
64
Before
1_16
After
1_17
Epsilon: 0.978959999999979 Training Episodes: 8948
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 10] Result: Lose 
20
108
Before
3_13
After
3_14
Epsilon: 0.9789399999999789 Training Episodes: 8947
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 8] Result: Lose 
18
238
Before
10_54
After
10_55
Epsilon: 0.9789199999999789 Training Episodes: 8946
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 8] Result: Win 
13
35
Before
5_7
After
6_8
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 8, 4] Result: Lose 
17
39
Before
2_6
After
2_7
Epsilon: 0.9788999999999789 Training Episodes: 8945
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 10] Result: Lose 
15
235
Before
13_27
After
13_28
Epsilon: 0.9788799999999789 Training Episodes: 8944
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 1] Result: Win 
4
378
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 1, 3] Result: Win 
7
381
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 1, 3, 2] Result: Win 
9
141
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [3, 1, 3, 2, 4] Result: Lose 
13
145
Before
5_8
After
5_9
Epsilon: 0.9788599999999789 Training Episodes: 8943
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 8] Result: Lose 
17
237
Before
9_41
After
9_42
Epsilon: 0.9788399999999788 Training Episodes: 8942
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [9, 6] Result: Lose 
15
213
Before
4_10
After
4_11
Epsilon: 0.9788199999999788 Training Episodes: 8941
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 4] Result: Win 
8
52
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 4, 8] Result: Win 
16
60
Before
6_9
After
7_10
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 4, 8, 5] Result: Lose 
21
65
Before
0000000000_7
After
0000000000_8
Epsilon: 0.9787999999999788 Training Episodes: 8940
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 2] Result: Win 
9
229
Before
16_16
After
17_17
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 2, 8] Result: Lose 
17
237
Before
9_42
After
9_43
Epsilon: 0.9787799999999788 Training Episodes: 8939
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 10] Result: Lose 
20
218
Before
1_12
After
1_13
Epsilon: 0.9787599999999788 Training Episodes: 8938
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 9] Result: Lose 
19
107
Before
2_11
After
2_12
Epsilon: 0.9787399999999787 Training Episodes: 8937
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 4] Result: Win 
12
78
Before
10_12
After
11_13
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 4, 3] Result: Win 
15
81
Before
3_10
After
4_11
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [8, 4, 3, 2] Result: Lose 
17
83
Before
2_10
After
2_11
Epsilon: 0.9787199999999787 Training Episodes: 8936
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [7, 3] Result: Win 
10
120
Before
9_9
After
10_10
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [7, 3, 7] Result: Lose 
17
127
Before
5_17
After
5_18
Epsilon: 0.9786999999999787 Training Episodes: 8935
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 3] Result: Win 
9
185
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 3, 7] Result: Win 
16
192
Before
2_3
After
3_4
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 3, 7, 1] Result: Win 
17
193
Before
5_13
After
6_14
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [6, 3, 7, 1, 3] Result: Lose 
20
196
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9786799999999787 Training Episodes: 8934
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 8] Result: Lose 
16
148
Before
4_9
After
4_10
Epsilon: 0.9786599999999787 Training Episodes: 8933
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 10] Result: Win 
14
234
Before
25_45
After
26_46
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 10, 2] Result: Lose 
16
236
Before
19_55
After
19_56
Epsilon: 0.9786399999999786 Training Episodes: 8932
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 1] Result: Win 
3
465
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 1, 5] Result: Win 
8
228
Before
19_19
After
20_20
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 1, 5, 4] Result: Win 
12
232
Before
22_40
After
23_41
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 1, 5, 4, 1] Result: Win 
13
233
Before
32_51
After
33_52
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 1, 5, 4, 1, 6] Result: Lose 
19
239
Before
12_46
After
12_47
Epsilon: 0.9786199999999786 Training Episodes: 8931
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 9] Result: Lose 
19
41
Before
2_13
After
2_14
Epsilon: 0.9785999999999786 Training Episodes: 8930
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 3] Result: Lose 
13
189
Before
1_8
After
1_9
Epsilon: 0.9785799999999786 Training Episodes: 8929
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 7] Result: Win 
8
74
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 7, 10] Result: Lose 
18
84
Before
1_7
After
1_8
Epsilon: 0.9785599999999786 Training Episodes: 8928
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 10] Result: Lose 
13
189
Before
1_9
After
1_10
Epsilon: 0.9785399999999785 Training Episodes: 8927
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 10] Result: Lose 
18
238
Before
10_55
After
10_56
Epsilon: 0.9785199999999785 Training Episodes: 8926
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 7] Result: Win 
9
141
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 7, 5] Result: Lose 
14
146
Before
10_13
After
10_14
Epsilon: 0.9784999999999785 Training Episodes: 8925
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 1] Result: Win 
11
165
Before
13_13
After
14_14
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 1, 10] Result: Lose 
21
175
Before
0000000000_11
After
0000000000_12
Epsilon: 0.9784799999999785 Training Episodes: 8924
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 6] Result: Win 
11
33
Before
9_9
After
10_10
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 6, 10] Result: Lose 
21
43
Before
0000000000_11
After
0000000000_12
Epsilon: 0.9784599999999785 Training Episodes: 8923
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 4] Result: Win 
7
161
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 4, 2] Result: Win 
9
163
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 4, 2, 3] Result: Win 
12
166
Before
3_5
After
4_6
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 4, 2, 3, 2] Result: Win 
14
168
Before
2_6
After
3_7
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 4, 2, 3, 2, 3] Result: Lose 
17
171
Before
4_18
After
4_19
Epsilon: 0.9784399999999784 Training Episodes: 8922
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 10] Result: Lose 
20
196
Before
0000000000_10
After
0000000000_11
Epsilon: 0.9784199999999784 Training Episodes: 8921
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 1] Result: Win 
2
200
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 1, 10] Result: Win 
12
210
Before
5_7
After
6_8
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 1, 10, 3] Result: Lose 
15
213
Before
4_11
After
4_12
Epsilon: 0.9783999999999784 Training Episodes: 8920
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10] Result: Win 
12
232
Before
23_41
After
24_42
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10, 6] Result: Win 
18
238
Before
10_56
After
11_57
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10, 6, 2] Result: Lose 
20
240
Before
5_74
After
5_75
Epsilon: 0.9783799999999784 Training Episodes: 8919
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 10] Result: Win 
12
78
Before
11_13
After
12_14
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 10, 3] Result: Lose 
15
81
Before
4_11
After
4_12
Epsilon: 0.9783599999999784 Training Episodes: 8918
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 8] Result: Win 
15
59
Before
2_12
After
3_13
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 8, 5] Result: Lose 
20
64
Before
1_17
After
1_18
Epsilon: 0.9783399999999783 Training Episodes: 8917
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 8] Result: Win 
10
98
Before
9_9
After
10_10
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 8, 9] Result: Lose 
19
107
Before
2_12
After
2_13
Epsilon: 0.9783199999999783 Training Episodes: 8916
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2] Result: Win 
6
226
Before
6_6
After
7_7
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2, 7] Result: Win 
13
233
Before
33_52
After
34_53
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2, 7, 3] Result: Lose 
16
236
Before
19_56
After
19_57
Epsilon: 0.9782999999999783 Training Episodes: 8915
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 10] Result: Win 
12
100
Before
5_6
After
6_7
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 10, 7] Result: Lose 
19
107
Before
2_13
After
2_14
Epsilon: 0.9782799999999783 Training Episodes: 8914
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 6] Result: Win 
7
183
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 6, 10] Result: Lose 
17
193
Before
6_14
After
6_15
Epsilon: 0.9782599999999783 Training Episodes: 8913
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 10] Result: Lose 
19
129
Before
2_11
After
2_12
Epsilon: 0.9782399999999782 Training Episodes: 8912
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 1] Result: Win 
7
381
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 1, 3] Result: Win 
10
142
Before
6_6
After
7_7
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 1, 3, 5] Result: Win 
15
147
Before
5_12
After
6_13
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 1, 3, 5, 1] Result: Win 
16
148
Before
4_10
After
5_11
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 1, 3, 5, 1, 2] Result: Win 
18
150
Before
1_12
After
2_13
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 1, 3, 5, 1, 2, 3] Result: Lose 
21
153
Before
0000000000_12
After
0000000000_13
Epsilon: 0.9782199999999782 Training Episodes: 8911
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 5] Result: Win 
13
101
Before
5_16
After
6_17
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 5, 8] Result: Lose 
21
109
Before
0000000000_14
After
0000000000_15
Epsilon: 0.9781999999999782 Training Episodes: 8910
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Lose 
16
236
Before
19_57
After
19_58
Epsilon: 0.9781799999999782 Training Episodes: 8909
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 2] Result: Win 
12
122
Before
10_14
After
11_15
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 2, 5] Result: Lose 
17
127
Before
5_18
After
5_19
Epsilon: 0.9781599999999782 Training Episodes: 8908
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 9] Result: Win 
10
98
Before
10_10
After
11_11
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [1, 9, 10] Result: Lose 
20
108
Before
3_14
After
3_15
Epsilon: 0.9781399999999781 Training Episodes: 8907
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 4] Result: Win 
14
102
Before
7_13
After
8_14
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 4, 6] Result: Lose 
20
108
Before
3_15
After
3_16
Epsilon: 0.9781199999999781 Training Episodes: 8906
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 5] Result: Win 
11
231
Before
43_43
After
44_44
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 5, 2] Result: Lose 
13
233
Before
34_53
After
34_54
Epsilon: 0.9780999999999781 Training Episodes: 8905
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 7] Result: Win 
9
31
Before
7_7
After
8_8
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 7, 7] Result: Win 
16
38
Before
3_13
After
4_14
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 7, 7, 1] Result: Win 
17
39
Before
2_7
After
3_8
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 7, 7, 1, 1] Result: Lose 
18
40
Before
4_13
After
4_14
Epsilon: 0.9780799999999781 Training Episodes: 8904
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 4] Result: Win 
10
32
Before
8_8
After
9_9
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 4, 2] Result: Win 
12
34
Before
3_7
After
4_8
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 4, 2, 4] Result: Lose 
16
38
Before
4_14
After
4_15
Epsilon: 0.9780599999999781 Training Episodes: 8903
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 4] Result: Win 
5
71
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 4, 8] Result: Win 
13
79
Before
3_7
After
4_8
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 4, 8, 2] Result: Lose 
15
81
Before
4_12
After
4_13
Epsilon: 0.978039999999978 Training Episodes: 8902
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 2] Result: Win 
3
465
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 2, 1] Result: Win 
4
466
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 2, 1, 5] Result: Win 
9
471
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 2, 1, 5, 2] Result: Win 
11
231
Before
44_44
After
45_45
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 2, 1, 5, 2, 3] Result: Lose 
14
234
Before
26_46
After
26_47
Epsilon: 0.978019999999978 Training Episodes: 8901
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 6] Result: Win 
7
227
Before
12_12
After
13_13
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 6, 6] Result: Lose 
13
233
Before
34_54
After
34_55
Epsilon: 0.977999999999978 Training Episodes: 8900
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 2] Result: Lose 
12
144
Before
4_9
After
4_10
Epsilon: 0.977979999999978 Training Episodes: 8899
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 7] Result: Win 
14
234
Before
26_47
After
27_48
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 7, 1] Result: Lose 
15
235
Before
13_28
After
13_29
Epsilon: 0.977959999999978 Training Episodes: 8898
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 4] Result: Win 
14
102
Before
8_14
After
9_15
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 4, 1] Result: Win 
15
103
Before
5_12
After
6_13
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 4, 1, 3] Result: Lose 
18
106
Before
3_17
After
3_18
Epsilon: 0.9779399999999779 Training Episodes: 8897
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 8] Result: Win 
18
216
Before
2_15
After
3_16
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 8, 3] Result: Lose 
21
219
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9779199999999779 Training Episodes: 8896
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4] Result: Win 
8
470
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4, 1] Result: Win 
9
229
Before
17_17
After
18_18
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4, 1, 5] Result: Win 
14
234
Before
27_48
After
28_49
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4, 1, 5, 3] Result: Win 
17
237
Before
9_43
After
10_44
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 4, 1, 5, 3, 1] Result: Lose 
18
238
Before
11_57
After
11_58
Epsilon: 0.9778999999999779 Training Episodes: 8895
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 5] Result: Win 
9
75
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 5, 10] Result: Lose 
19
85
Before
1_13
After
1_14
Epsilon: 0.9778799999999779 Training Episodes: 8894
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 10] Result: Lose 
14
190
Before
4_7
After
4_8
Epsilon: 0.9778599999999779 Training Episodes: 8893
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
5_75
After
5_76
Epsilon: 0.9778399999999778 Training Episodes: 8892
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 8] Result: Win 
18
40
Before
4_14
After
5_15
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 8, 3] Result: Lose 
21
43
Before
0000000000_12
After
0000000000_13
Epsilon: 0.9778199999999778 Training Episodes: 8891
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [8, 6] Result: Lose 
14
212
Before
1_4
After
1_5
Epsilon: 0.9777999999999778 Training Episodes: 8890
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 10] Result: Lose 
20
64
Before
1_18
After
1_19
Epsilon: 0.9777799999999778 Training Episodes: 8889
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 7] Result: Win 
9
163
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 7, 10] Result: Lose 
19
173
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9777599999999778 Training Episodes: 8888
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 1] Result: Win 
6
402
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 1, 2] Result: Win 
8
162
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 1, 2, 9] Result: Win 
17
171
Before
4_19
After
5_20
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 1, 2, 9, 3] Result: Lose 
20
174
Before
1_9
After
1_10
Epsilon: 0.9777399999999777 Training Episodes: 8887
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6] Result: Win 
13
233
Before
34_55
After
35_56
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6, 4] Result: Lose 
17
237
Before
10_44
After
10_45
Epsilon: 0.9777199999999777 Training Episodes: 8886
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 10] Result: Win 
11
187
Before
11_11
After
12_12
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 10, 8] Result: Lose 
19
195
Before
1_11
After
1_12
Epsilon: 0.9776999999999777 Training Episodes: 8885
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 3] Result: Win 
6
72
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 3, 5] Result: Win 
11
77
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 3, 5, 10] Result: Lose 
21
87
Before
0000000000_15
After
0000000000_16
Epsilon: 0.9776799999999777 Training Episodes: 8884
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 3] Result: Win 
4
378
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 3, 3] Result: Win 
7
139
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 3, 3, 8] Result: Win 
15
147
Before
6_13
After
7_14
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 3, 3, 8, 6] Result: Lose 
21
153
Before
0000000000_13
After
0000000000_14
Epsilon: 0.9776599999999777 Training Episodes: 8883
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 2] Result: Win 
12
188
Before
10_11
After
11_12
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 2, 8] Result: Lose 
20
196
Before
0000000000_11
After
0000000000_12
Epsilon: 0.9776399999999776 Training Episodes: 8882
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [8, 4] Result: Lose 
12
210
Before
6_8
After
6_9
Epsilon: 0.9776199999999776 Training Episodes: 8881
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 10] Result: Lose 
20
64
Before
1_19
After
1_20
Epsilon: 0.9775999999999776 Training Episodes: 8880
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 1] Result: Win 
7
95
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 1, 7] Result: Lose 
14
102
Before
9_15
After
9_16
Epsilon: 0.9775799999999776 Training Episodes: 8879
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3] Result: Win 
11
231
Before
45_45
After
46_46
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3, 6] Result: Win 
17
237
Before
10_45
After
11_46
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3, 6, 3] Result: Win 
20
240
Before
5_76
After
6_77
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3, 6, 3, 1] Result: Lose 
21
241
Before
0000000000_47
After
0000000000_48
Epsilon: 0.9775599999999776 Training Episodes: 8878
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 7] Result: Win 
11
231
Before
46_46
After
47_47
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 7, 6] Result: Lose 
17
237
Before
11_46
After
11_47
Epsilon: 0.9775399999999775 Training Episodes: 8877
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 7] Result: Lose 
17
193
Before
6_15
After
6_16
Epsilon: 0.9775199999999775 Training Episodes: 8876
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 10] Result: Lose 
18
62
Before
1_7
After
1_8
Epsilon: 0.9774999999999775 Training Episodes: 8875
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 10] Result: Win 
16
214
Before
4_14
After
5_15
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 10, 2] Result: Win 
18
216
Before
3_16
After
4_17
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 10, 2, 1] Result: Lose 
19
217
Before
1_11
After
1_12
Epsilon: 0.9774799999999775 Training Episodes: 8874
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 2] Result: Win 
9
229
Before
18_18
After
19_19
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 2, 10] Result: Win 
19
239
Before
12_47
After
13_48
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 2, 10, 1] Result: Win 
20
240
Before
6_77
After
7_78
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 2, 10, 1, 1] Result: Lose 
21
241
Before
0000000000_48
After
0000000000_49
Epsilon: 0.9774599999999775 Training Episodes: 8873
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 4] Result: Win 
12
144
Before
4_10
After
5_11
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 4, 7] Result: Lose 
19
151
Before
1_9
After
1_10
Epsilon: 0.9774399999999774 Training Episodes: 8872
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 6] Result: Win 
8
228
Before
20_20
After
21_21
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 6, 10] Result: Lose 
18
238
Before
11_58
After
11_59
Epsilon: 0.9774199999999774 Training Episodes: 8871
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 10] Result: Lose 
20
64
Before
1_20
After
1_21
Epsilon: 0.9773999999999774 Training Episodes: 8870
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 8] Result: Lose 
18
238
Before
11_59
After
11_60
Epsilon: 0.9773799999999774 Training Episodes: 8869
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Win 
13
233
Before
35_56
After
36_57
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10, 7] Result: Lose 
20
240
Before
7_78
After
7_79
Epsilon: 0.9773599999999774 Training Episodes: 8868
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 3] Result: Lose 
12
34
Before
4_8
After
4_9
Epsilon: 0.9773399999999773 Training Episodes: 8867
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 7] Result: Lose 
17
193
Before
6_16
After
6_17
Epsilon: 0.9773199999999773 Training Episodes: 8866
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 10] Result: Lose 
20
218
Before
1_13
After
1_14
Epsilon: 0.9772999999999773 Training Episodes: 8865
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 9] Result: Win 
10
472
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 9, 1] Result: Win 
11
231
Before
47_47
After
48_48
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 9, 1, 10] Result: Lose 
21
241
Before
0000000000_49
After
0000000000_50
Epsilon: 0.9772799999999773 Training Episodes: 8864
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [7, 5] Result: Lose 
12
144
Before
5_11
After
5_12
Epsilon: 0.9772599999999773 Training Episodes: 8863
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 9] Result: Lose 
19
239
Before
13_48
After
13_49
Epsilon: 0.9772399999999772 Training Episodes: 8862
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 2] Result: Win 
12
144
Before
5_12
After
6_13
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 2, 8] Result: Lose 
20
152
Before
1_23
After
1_24
Epsilon: 0.9772199999999772 Training Episodes: 8861
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 9] Result: Lose 
19
41
Before
2_14
After
2_15
Epsilon: 0.9771999999999772 Training Episodes: 8860
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 6] Result: Win 
10
142
Before
7_7
After
8_8
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 6, 8] Result: Win 
18
150
Before
2_13
After
3_14
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [4, 6, 8, 3] Result: Lose 
21
153
Before
0000000000_14
After
0000000000_15
Epsilon: 0.9771799999999772 Training Episodes: 8859
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 7] Result: Win 
9
207
Before
6_6
After
7_7
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 7, 10] Result: Win 
19
217
Before
1_12
After
2_13
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 7, 10, 1] Result: Lose 
20
218
Before
1_14
After
1_15
Epsilon: 0.9771599999999772 Training Episodes: 8858
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2] Result: Win 
6
226
Before
7_7
After
8_8
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2, 8] Result: Win 
14
234
Before
28_49
After
29_50
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2, 8, 2] Result: Lose 
16
236
Before
19_58
After
19_59
Epsilon: 0.9771399999999771 Training Episodes: 8857
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 1] Result: Win 
5
159
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 1, 10] Result: Lose 
15
169
Before
9_18
After
9_19
Epsilon: 0.9771199999999771 Training Episodes: 8856
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 8] Result: Lose 
13
145
Before
5_9
After
5_10
Epsilon: 0.9770999999999771 Training Episodes: 8855
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 10] Result: Win 
17
215
Before
2_13
After
3_14
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 10, 4] Result: Lose 
21
219
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9770799999999771 Training Episodes: 8854
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 5] Result: Lose 
15
37
Before
6_10
After
6_11
Epsilon: 0.9770599999999771 Training Episodes: 8853
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 10] Result: Lose 
14
102
Before
9_16
After
9_17
Epsilon: 0.977039999999977 Training Episodes: 8852
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 9] Result: Win 
11
165
Before
14_14
After
15_15
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [2, 9, 10] Result: Lose 
21
175
Before
0000000000_12
After
0000000000_13
Epsilon: 0.977019999999977 Training Episodes: 8851
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 5] Result: Lose 
15
191
Before
3_6
After
3_7
Epsilon: 0.976999999999977 Training Episodes: 8850
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [4, 10] Result: Lose 
14
168
Before
3_7
After
3_8
Epsilon: 0.976979999999977 Training Episodes: 8849
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
7_79
After
7_80
Epsilon: 0.976959999999977 Training Episodes: 8848
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 10] Result: Lose 
20
130
Before
1_14
After
1_15
Epsilon: 0.9769399999999769 Training Episodes: 8847
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 9] Result: Win 
12
56
Before
5_9
After
6_10
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 9, 5] Result: Lose 
17
61
Before
5_12
After
5_13
Epsilon: 0.9769199999999769 Training Episodes: 8846
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 10] Result: Lose 
17
61
Before
5_13
After
5_14
Epsilon: 0.9768999999999769 Training Episodes: 8845
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 9] Result: Lose 
15
147
Before
7_14
After
7_15
Epsilon: 0.9768799999999769 Training Episodes: 8844
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 10] Result: Win 
12
144
Before
6_13
After
7_14
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 10, 7] Result: Lose 
19
151
Before
1_10
After
1_11
Epsilon: 0.9768599999999769 Training Episodes: 8843
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 10] Result: Lose 
20
108
Before
3_16
After
3_17
Epsilon: 0.9768399999999768 Training Episodes: 8842
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 6] Result: Lose 
16
126
Before
7_15
After
7_16
Epsilon: 0.9768199999999768 Training Episodes: 8841
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 7] Result: Win 
11
187
Before
12_12
After
13_13
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 7, 2] Result: Win 
13
189
Before
1_10
After
2_11
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [4, 7, 2, 4] Result: Lose 
17
193
Before
6_17
After
6_18
Epsilon: 0.9767999999999768 Training Episodes: 8840
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 3] Result: Win 
6
226
Before
8_8
After
9_9
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 3, 5] Result: Win 
11
231
Before
48_48
After
49_49
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 3, 5, 4] Result: Lose 
15
235
Before
13_29
After
13_30
Epsilon: 0.9767799999999768 Training Episodes: 8839
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 7] Result: Win 
8
470
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 7, 3] Result: Win 
11
231
Before
49_49
After
50_50
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 7, 3, 10] Result: Lose 
21
241
Before
0000000000_50
After
0000000000_51
Epsilon: 0.9767599999999768 Training Episodes: 8838
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 1] Result: Win 
7
227
Before
13_13
After
14_14
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 1, 9] Result: Lose 
16
236
Before
19_59
After
19_60
Epsilon: 0.9767399999999767 Training Episodes: 8837
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6] Result: Win 
13
233
Before
36_57
After
37_58
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 6, 7] Result: Lose 
20
240
Before
7_80
After
7_81
Epsilon: 0.9767199999999767 Training Episodes: 8836
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 2] Result: Win 
8
30
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 2, 3] Result: Win 
11
33
Before
10_10
After
11_11
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 2, 3, 10] Result: Lose 
21
43
Before
0000000000_13
After
0000000000_14
Epsilon: 0.9766999999999767 Training Episodes: 8835
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 4] Result: Win 
10
450
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 4, 1] Result: Win 
11
209
Before
7_7
After
8_8
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 4, 1, 5] Result: Win 
16
214
Before
5_15
After
6_16
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [6, 4, 1, 5, 1] Result: Lose 
17
215
Before
3_14
After
3_15
Epsilon: 0.9766799999999767 Training Episodes: 8834
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10] Result: Lose 
12
232
Before
24_42
After
24_43
Epsilon: 0.9766599999999767 Training Episodes: 8833
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 6] Result: Win 
8
96
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 6, 10] Result: Lose 
18
106
Before
3_18
After
3_19
Epsilon: 0.9766399999999766 Training Episodes: 8832
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 7] Result: Lose 
17
83
Before
2_11
After
2_12
Epsilon: 0.9766199999999766 Training Episodes: 8831
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [9, 7] Result: Lose 
16
170
Before
3_12
After
3_13
Epsilon: 0.9765999999999766 Training Episodes: 8830
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 10] Result: Lose 
17
83
Before
2_12
After
2_13
Epsilon: 0.9765799999999766 Training Episodes: 8829
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [4, 4] Result: Win 
8
30
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [4, 4, 6] Result: Lose 
14
36
Before
7_14
After
7_15
Epsilon: 0.9765599999999766 Training Episodes: 8828
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 8] Result: Win 
10
230
Before
22_22
After
23_23
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 8, 7] Result: Win 
17
237
Before
11_47
After
12_48
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 8, 7, 2] Result: Lose 
19
239
Before
13_49
After
13_50
Epsilon: 0.9765399999999765 Training Episodes: 8827
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 5] Result: Win 
15
125
Before
7_15
After
8_16
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 5, 1] Result: Lose 
16
126
Before
7_16
After
7_17
Epsilon: 0.9765199999999765 Training Episodes: 8826
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
50_50
After
51_51
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 10] Result: Lose 
21
241
Before
0000000000_51
After
0000000000_52
Epsilon: 0.9764999999999765 Training Episodes: 8825
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 7] Result: Lose 
17
127
Before
5_19
After
5_20
Epsilon: 0.9764799999999765 Training Episodes: 8824
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 1] Result: Win 
11
55
Before
11_11
After
12_12
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 1, 6] Result: Lose 
17
61
Before
5_14
After
5_15
Epsilon: 0.9764599999999765 Training Episodes: 8823
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 1] Result: Win 
6
468
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 1, 5] Result: Win 
11
231
Before
51_51
After
52_52
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 1, 5, 7] Result: Lose 
18
238
Before
11_60
After
11_61
Epsilon: 0.9764399999999764 Training Episodes: 8822
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 4] Result: Win 
7
117
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 4, 6] Result: Win 
13
123
Before
8_10
After
9_11
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 4, 6, 5] Result: Lose 
18
128
Before
2_17
After
2_18
Epsilon: 0.9764199999999764 Training Episodes: 8821
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 2] Result: Win 
6
116
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [4, 2, 7] Result: Lose 
13
123
Before
9_11
After
9_12
Epsilon: 0.9763999999999764 Training Episodes: 8820
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10] Result: Win 
12
232
Before
24_43
After
25_44
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10, 3] Result: Lose 
15
235
Before
13_30
After
13_31
Epsilon: 0.9763799999999764 Training Episodes: 8819
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3] Result: Win 
13
233
Before
37_58
After
38_59
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3, 2] Result: Lose 
15
235
Before
13_31
After
13_32
Epsilon: 0.9763599999999764 Training Episodes: 8818
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 5] Result: Lose 
15
235
Before
13_32
After
13_33
Epsilon: 0.9763399999999763 Training Episodes: 8817
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 8] Result: Lose 
14
234
Before
29_50
After
29_51
Epsilon: 0.9763199999999763 Training Episodes: 8816
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 10] Result: Lose 
20
130
Before
1_15
After
1_16
Epsilon: 0.9762999999999763 Training Episodes: 8815
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 4] Result: Win 
5
225
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 4, 7] Result: Lose 
12
232
Before
25_44
After
25_45
Epsilon: 0.9762799999999763 Training Episodes: 8814
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 2] Result: Win 
11
143
Before
13_13
After
14_14
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 2, 8] Result: Lose 
19
151
Before
1_11
After
1_12
Epsilon: 0.9762599999999763 Training Episodes: 8813
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 10] Result: Lose 
20
196
Before
0000000000_12
After
0000000000_13
Epsilon: 0.9762399999999762 Training Episodes: 8812
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 5] Result: Lose 
15
59
Before
3_13
After
3_14
Epsilon: 0.9762199999999762 Training Episodes: 8811
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [7, 4] Result: Win 
11
99
Before
11_11
After
12_12
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [7, 4, 10] Result: Lose 
21
109
Before
0000000000_15
After
0000000000_16
Epsilon: 0.9761999999999762 Training Episodes: 8810
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3] Result: Win 
11
231
Before
52_52
After
53_53
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3, 4] Result: Win 
15
235
Before
13_33
After
14_34
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3, 4, 6] Result: Lose 
21
241
Before
0000000000_52
After
0000000000_53
Epsilon: 0.9761799999999762 Training Episodes: 8809
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 7] Result: Win 
10
76
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [3, 7, 10] Result: Lose 
20
86
Before
6_17
After
6_18
Epsilon: 0.9761599999999762 Training Episodes: 8808
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 4] Result: Win 
6
204
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 4, 9] Result: Win 
15
213
Before
4_12
After
5_13
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [2, 4, 9, 5] Result: Lose 
20
218
Before
1_15
After
1_16
Epsilon: 0.9761399999999761 Training Episodes: 8807
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 4] Result: Lose 
14
58
Before
2_8
After
2_9
Epsilon: 0.9761199999999761 Training Episodes: 8806
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 7] Result: Lose 
17
193
Before
6_18
After
6_19
Epsilon: 0.9760999999999761 Training Episodes: 8805
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 10] Result: Lose 
19
107
Before
2_14
After
2_15
Epsilon: 0.9760799999999761 Training Episodes: 8804
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 9] Result: Win 
14
102
Before
9_17
After
10_18
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 9, 5] Result: Lose 
19
107
Before
2_15
After
2_16
Epsilon: 0.9760599999999761 Training Episodes: 8803
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 10] Result: Lose 
13
189
Before
2_11
After
2_12
Epsilon: 0.976039999999976 Training Episodes: 8802
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 10] Result: Win 
18
150
Before
3_14
After
4_15
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [8, 10, 2] Result: Lose 
20
152
Before
1_24
After
1_25
Epsilon: 0.976019999999976 Training Episodes: 8801
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 10] Result: Lose 
17
237
Before
12_48
After
12_49
Epsilon: 0.975999999999976 Training Episodes: 8800
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3] Result: Win 
13
233
Before
38_59
After
39_60
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3, 6] Result: Win 
19
239
Before
13_50
After
14_51
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3, 6, 1] Result: Lose 
20
240
Before
7_81
After
7_82
Epsilon: 0.975979999999976 Training Episodes: 8799
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 3] Result: Win 
6
94
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 3, 9] Result: Win 
15
103
Before
6_13
After
7_14
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [3, 3, 9, 2] Result: Lose 
17
105
Before
3_13
After
3_14
Epsilon: 0.975959999999976 Training Episodes: 8798
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 4] Result: Win 
7
117
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 4, 3] Result: Win 
10
120
Before
10_10
After
11_11
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 4, 3, 7] Result: Win 
17
127
Before
5_20
After
6_21
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 4, 3, 7, 4] Result: Lose 
21
131
Before
0000000000_19
After
0000000000_20
Epsilon: 0.9759399999999759 Training Episodes: 8797
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 4] Result: Win 
11
33
Before
11_11
After
12_12
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [7, 4, 7] Result: Lose 
18
40
Before
5_15
After
5_16
Epsilon: 0.9759199999999759 Training Episodes: 8796
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 10] Result: Lose 
20
130
Before
1_16
After
1_17
Epsilon: 0.9758999999999759 Training Episodes: 8795
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 10] Result: Win 
11
209
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [1, 10, 10] Result: Lose 
21
219
Before
0000000000_10
After
0000000000_11
Epsilon: 0.9758799999999759 Training Episodes: 8794
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 5] Result: Win 
10
208
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 5, 2] Result: Win 
12
210
Before
6_9
After
7_10
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 5, 2, 2] Result: Lose 
14
212
Before
1_5
After
1_6
Epsilon: 0.9758599999999759 Training Episodes: 8793
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 10] Result: Win 
15
37
Before
6_11
After
7_12
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 10, 6] Result: Lose 
21
43
Before
0000000000_14
After
0000000000_15
Epsilon: 0.9758399999999758 Training Episodes: 8792
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 10] Result: Lose 
20
86
Before
6_18
After
6_19
Epsilon: 0.9758199999999758 Training Episodes: 8791
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 9] Result: Lose 
13
101
Before
6_17
After
6_18
Epsilon: 0.9757999999999758 Training Episodes: 8790
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 3] Result: Win 
12
232
Before
25_45
After
26_46
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 3, 9] Result: Lose 
21
241
Before
0000000000_53
After
0000000000_54
Epsilon: 0.9757799999999758 Training Episodes: 8789
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 6] Result: Win 
9
207
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [3, 6, 3] Result: Lose 
12
210
Before
7_10
After
7_11
Epsilon: 0.9757599999999758 Training Episodes: 8788
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 10] Result: Lose 
20
218
Before
1_16
After
1_17
Epsilon: 0.9757399999999757 Training Episodes: 8787
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 10] Result: Lose 
19
151
Before
1_12
After
1_13
Epsilon: 0.9757199999999757 Training Episodes: 8786
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 10] Result: Win 
20
196
Before
0000000000_13
After
1_14
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 10, 1] Result: Lose 
21
197
Before
0000000000_9
After
0000000000_10
Epsilon: 0.9756999999999757 Training Episodes: 8785
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 10] Result: Win 
13
167
Before
6_9
After
7_10
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 10, 1] Result: Win 
14
168
Before
3_8
After
4_9
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 10, 1, 5] Result: Win 
19
173
Before
0000000000_10
After
1_11
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [3, 10, 1, 5, 1] Result: Lose 
20
174
Before
1_10
After
1_11
Epsilon: 0.9756799999999757 Training Episodes: 8784
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [10, 10] Result: Lose 
20
64
Before
1_21
After
1_22
Epsilon: 0.9756599999999757 Training Episodes: 8783
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 10] Result: Lose 
19
239
Before
14_51
After
14_52
Epsilon: 0.9756399999999756 Training Episodes: 8782
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [9, 4] Result: Win 
13
211
Before
4_8
After
5_9
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [9, 4, 4] Result: Win 
17
215
Before
3_15
After
4_16
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [9, 4, 4, 3] Result: Lose 
20
218
Before
1_17
After
1_18
Epsilon: 0.9756199999999756 Training Episodes: 8781
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10] Result: Lose 
20
42
Before
2_17
After
2_18
Epsilon: 0.9755999999999756 Training Episodes: 8780
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [8, 4] Result: Lose 
12
210
Before
7_11
After
7_12
Epsilon: 0.9755799999999756 Training Episodes: 8779
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 7] Result: Win 
8
426
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 7, 1] Result: Win 
9
185
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [1, 7, 1, 4] Result: Lose 
13
189
Before
2_12
After
2_13
Epsilon: 0.9755599999999756 Training Episodes: 8778
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 9] Result: Win 
14
102
Before
10_18
After
11_19
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 9, 3] Result: Win 
17
105
Before
3_14
After
4_15
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [5, 9, 3, 4] Result: Lose 
21
109
Before
0000000000_16
After
0000000000_17
Epsilon: 0.9755399999999755 Training Episodes: 8777
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [9, 10] Result: Win 
19
173
Before
1_11
After
2_12
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [9, 10, 1] Result: Lose 
20
174
Before
1_11
After
1_12
Epsilon: 0.9755199999999755 Training Episodes: 8776
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 6] Result: Win 
10
76
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 6, 3] Result: Lose 
13
79
Before
4_8
After
4_9
Epsilon: 0.9754999999999755 Training Episodes: 8775
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 10] Result: Win 
15
235
Before
14_34
After
15_35
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 10, 6] Result: Lose 
21
241
Before
0000000000_54
After
0000000000_55
Epsilon: 0.9754799999999755 Training Episodes: 8774
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 4] Result: Win 
5
137
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 4, 7] Result: Win 
12
144
Before
7_14
After
8_15
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 4, 7, 3] Result: Win 
15
147
Before
7_15
After
8_16
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 4, 7, 3, 3] Result: Lose 
18
150
Before
4_15
After
4_16
Epsilon: 0.9754599999999755 Training Episodes: 8773
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 10] Result: Lose 
19
239
Before
14_52
After
14_53
Epsilon: 0.9754399999999754 Training Episodes: 8772
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [6, 8] Result: Lose 
14
102
Before
11_19
After
11_20
Epsilon: 0.9754199999999754 Training Episodes: 8771
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 3] Result: Win 
8
228
Before
21_21
After
22_22
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 3, 10] Result: Lose 
18
238
Before
11_61
After
11_62
Epsilon: 0.9753999999999754 Training Episodes: 8770
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [9, 10] Result: Lose 
19
173
Before
2_12
After
2_13
Epsilon: 0.9753799999999754 Training Episodes: 8769
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 5] Result: Win 
15
213
Before
5_13
After
6_14
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 5, 4] Result: Lose 
19
217
Before
2_13
After
2_14
Epsilon: 0.9753599999999754 Training Episodes: 8768
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 9] Result: Lose 
19
239
Before
14_53
After
14_54
Epsilon: 0.9753399999999753 Training Episodes: 8767
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 10] Result: Lose 
14
58
Before
2_9
After
2_10
Epsilon: 0.9753199999999753 Training Episodes: 8766
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 10] Result: Win 
11
55
Before
12_12
After
13_13
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 10, 10] Result: Lose 
21
65
Before
0000000000_8
After
0000000000_9
Epsilon: 0.9752999999999753 Training Episodes: 8765
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10] Result: Lose 
12
232
Before
26_46
After
26_47
Epsilon: 0.9752799999999753 Training Episodes: 8764
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
7_82
After
7_83
Epsilon: 0.9752599999999753 Training Episodes: 8763
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 6] Result: Win 
14
58
Before
2_10
After
3_11
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [8, 6, 1] Result: Lose 
15
59
Before
3_14
After
3_15
Epsilon: 0.9752399999999752 Training Episodes: 8762
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 1] Result: Win 
11
33
Before
12_12
After
13_13
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 1, 6] Result: Win 
17
39
Before
3_8
After
4_9
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 1, 6, 1] Result: Win 
18
40
Before
5_16
After
6_17
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 1, 6, 1, 2] Result: Lose 
20
42
Before
2_18
After
2_19
Epsilon: 0.9752199999999752 Training Episodes: 8761
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 2] Result: Win 
12
166
Before
4_6
After
5_7
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 2, 9] Result: Lose 
21
175
Before
0000000000_13
After
0000000000_14
Epsilon: 0.9751999999999752 Training Episodes: 8760
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 10] Result: Lose 
20
130
Before
1_17
After
1_18
Epsilon: 0.9751799999999752 Training Episodes: 8759
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 7] Result: Lose 
16
104
Before
7_15
After
7_16
Epsilon: 0.9751599999999752 Training Episodes: 8758
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 10] Result: Win 
14
212
Before
1_6
After
2_7
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 10, 7] Result: Lose 
21
219
Before
0000000000_11
After
0000000000_12
Epsilon: 0.9751399999999751 Training Episodes: 8757
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 9] Result: Lose 
16
60
Before
7_10
After
7_11
Epsilon: 0.9751199999999751 Training Episodes: 8756
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 8] Result: Win 
9
53
Before
5_5
After
6_6
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 8, 4] Result: Win 
13
57
Before
4_8
After
5_9
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 8, 4, 5] Result: Lose 
18
62
Before
1_8
After
1_9
Epsilon: 0.9750999999999751 Training Episodes: 8755
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Lose 
13
233
Before
39_60
After
39_61
Epsilon: 0.9750799999999751 Training Episodes: 8754
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [9, 3] Result: Lose 
12
34
Before
4_9
After
4_10
Epsilon: 0.9750599999999751 Training Episodes: 8753
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
7_83
After
7_84
Epsilon: 0.975039999999975 Training Episodes: 8752
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 3] Result: Win 
4
290
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 3, 2] Result: Win 
6
50
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [1, 3, 2, 10] Result: Lose 
16
60
Before
7_11
After
7_12
Epsilon: 0.975019999999975 Training Episodes: 8751
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 1] Result: Win 
5
225
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 1, 9] Result: Lose 
14
234
Before
29_51
After
29_52
Epsilon: 0.974999999999975 Training Episodes: 8750
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [8, 10] Result: Lose 
18
106
Before
3_19
After
3_20
Epsilon: 0.974979999999975 Training Episodes: 8749
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 5] Result: Win 
15
81
Before
4_13
After
5_14
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 5, 4] Result: Lose 
19
85
Before
1_14
After
1_15
Epsilon: 0.974959999999975 Training Episodes: 8748
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [10, 5] Result: Lose 
15
213
Before
6_14
After
6_15
Epsilon: 0.9749399999999749 Training Episodes: 8747
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 6] Result: Lose 
16
104
Before
7_16
After
7_17
Epsilon: 0.9749199999999749 Training Episodes: 8746
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1] Result: Win 
11
231
Before
53_53
After
54_54
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 1, 7] Result: Lose 
18
238
Before
11_62
After
11_63
Epsilon: 0.9748999999999749 Training Episodes: 8745
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 6] Result: Win 
9
119
Before
3_3
After
4_4
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 6, 7] Result: Lose 
16
126
Before
7_17
After
7_18
Epsilon: 0.9748799999999749 Training Episodes: 8744
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 10] Result: Lose 
20
196
Before
1_14
After
1_15
Epsilon: 0.9748599999999749 Training Episodes: 8743
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [6, 10] Result: Lose 
16
170
Before
3_13
After
3_14
Epsilon: 0.9748399999999748 Training Episodes: 8742
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 8] Result: Win 
18
172
Before
1_11
After
2_12
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 8, 2] Result: Lose 
20
174
Before
1_12
After
1_13
Epsilon: 0.9748199999999748 Training Episodes: 8741
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
7_84
After
7_85
Epsilon: 0.9747999999999748 Training Episodes: 8740
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 2] Result: Win 
12
34
Before
4_10
After
5_11
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 2, 5] Result: Lose 
17
39
Before
4_9
After
4_10
Epsilon: 0.9747799999999748 Training Episodes: 8739
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [5, 8] Result: Lose 
13
57
Before
5_9
After
5_10
Epsilon: 0.9747599999999748 Training Episodes: 8738
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4] Result: Win 
10
472
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4, 1] Result: Win 
11
231
Before
54_54
After
55_55
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 4, 1, 6] Result: Lose 
17
237
Before
12_49
After
12_50
Epsilon: 0.9747399999999747 Training Episodes: 8737
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 7] Result: Win 
11
77
Before
9_9
After
10_10
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 7, 5] Result: Lose 
16
82
Before
4_13
After
4_14
Epsilon: 0.9747199999999747 Training Episodes: 8736
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 3] Result: Win 
13
35
Before
6_8
After
7_9
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 3, 4] Result: Lose 
17
39
Before
4_10
After
4_11
Epsilon: 0.9746999999999747 Training Episodes: 8735
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [5, 8] Result: Lose 
13
79
Before
4_9
After
4_10
Epsilon: 0.9746799999999747 Training Episodes: 8734
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 9] Result: Lose 
12
232
Before
26_47
After
26_48
Epsilon: 0.9746599999999747 Training Episodes: 8733
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 9] Result: Lose 
15
37
Before
7_12
After
7_13
Epsilon: 0.9746399999999746 Training Episodes: 8732
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [9, 10] Result: Lose 
19
173
Before
2_13
After
2_14
Epsilon: 0.9746199999999746 Training Episodes: 8731
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10] Result: Win 
11
231
Before
55_55
After
56_56
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [1, 10, 5] Result: Lose 
16
236
Before
19_60
After
19_61
Epsilon: 0.9745999999999746 Training Episodes: 8730
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 2] Result: Win 
5
225
Before
6_6
After
7_7
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 2, 10] Result: Win 
15
235
Before
15_35
After
16_36
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 2, 10, 3] Result: Lose 
18
238
Before
11_63
After
11_64
Epsilon: 0.9745799999999746 Training Episodes: 8729
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 9] Result: Lose 
19
107
Before
2_16
After
2_17
Epsilon: 0.9745599999999746 Training Episodes: 8728
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 2] Result: Win 
3
267
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 2, 1] Result: Win 
4
268
Before
0000000000_0000000000
After
1_1
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 2, 1, 7] Result: Win 
11
33
Before
13_13
After
14_14
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [1, 2, 1, 7, 10] Result: Lose 
21
43
Before
0000000000_15
After
0000000000_16
Epsilon: 0.9745399999999745 Training Episodes: 8727
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [4, 10] Result: Lose 
14
58
Before
3_11
After
3_12
Epsilon: 0.9745199999999745 Training Episodes: 8726
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 6] Result: Lose 
16
236
Before
19_61
After
19_62
Epsilon: 0.9744999999999745 Training Episodes: 8725
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10] Result: Win 
11
143
Before
14_14
After
15_15
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [1, 10, 10] Result: Lose 
21
153
Before
0000000000_15
After
0000000000_16
Epsilon: 0.9744799999999745 Training Episodes: 8724
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 3] Result: Win 
11
121
Before
10_10
After
11_11
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [8, 3, 10] Result: Lose 
21
131
Before
0000000000_20
After
0000000000_21
Epsilon: 0.9744599999999745 Training Episodes: 8723
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 6] Result: Win 
12
34
Before
5_11
After
6_12
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 6, 8] Result: Lose 
20
42
Before
2_19
After
2_20
Epsilon: 0.9744399999999744 Training Episodes: 8722
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 1] Result: Win 
10
142
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [9, 1, 10] Result: Lose 
20
152
Before
1_25
After
1_26
Epsilon: 0.9744199999999744 Training Episodes: 8721
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 8] Result: Win 
11
231
Before
56_56
After
57_57
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 8, 8] Result: Lose 
19
239
Before
14_54
After
14_55
Epsilon: 0.9743999999999744 Training Episodes: 8720
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
7_85
After
7_86
Epsilon: 0.9743799999999744 Training Episodes: 8719
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 2] Result: Win 
7
227
Before
14_14
After
15_15
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 2, 10] Result: Lose 
17
237
Before
12_50
After
12_51
Epsilon: 0.9743599999999744 Training Episodes: 8718
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 3] Result: Lose 
13
233
Before
39_61
After
39_62
Epsilon: 0.9743399999999743 Training Episodes: 8717
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [6, 4] Result: Win 
10
164
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [6, 4, 10] Result: Lose 
20
174
Before
1_13
After
1_14
Epsilon: 0.9743199999999743 Training Episodes: 8716
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 2] Result: Win 
12
34
Before
6_12
After
7_13
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 2, 7] Result: Lose 
19
41
Before
2_15
After
2_16
Epsilon: 0.9742999999999743 Training Episodes: 8715
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 3] Result: Win 
5
115
Before
1_1
After
2_2
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 3, 5] Result: Win 
10
120
Before
11_11
After
12_12
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 3, 5, 10] Result: Lose 
20
130
Before
1_18
After
1_19
Epsilon: 0.9742799999999743 Training Episodes: 8714
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 8] Result: Win 
11
55
Before
13_13
After
14_14
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 8, 5] Result: Lose 
16
60
Before
7_12
After
7_13
Epsilon: 0.9742599999999743 Training Episodes: 8713
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 6] Result: Win 
16
170
Before
3_14
After
4_15
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 6, 5] Result: Lose 
21
175
Before
0000000000_14
After
0000000000_15
Epsilon: 0.9742399999999742 Training Episodes: 8712
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 5] Result: Win 
11
121
Before
11_11
After
12_12
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [6, 5, 8] Result: Lose 
19
129
Before
2_12
After
2_13
Epsilon: 0.9742199999999742 Training Episodes: 8711
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [7, 7] Result: Win 
14
124
Before
9_12
After
10_13
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [7, 7, 7] Result: Lose 
21
131
Before
0000000000_21
After
0000000000_22
Epsilon: 0.9741999999999742 Training Episodes: 8710
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
7_86
After
7_87
Epsilon: 0.9741799999999742 Training Episodes: 8709
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 2] Result: Win 
6
204
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [4, 2, 10] Result: Lose 
16
214
Before
6_16
After
6_17
Epsilon: 0.9741599999999742 Training Episodes: 8708
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 4] Result: Win 
7
117
Before
6_6
After
7_7
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 4, 8] Result: Lose 
15
125
Before
8_16
After
8_17
Epsilon: 0.9741399999999741 Training Episodes: 8707
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 10] Result: Win 
11
77
Before
10_10
After
11_11
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [1, 10, 10] Result: Lose 
21
87
Before
0000000000_16
After
0000000000_17
Epsilon: 0.9741199999999741 Training Episodes: 8706
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7] Result: Win 
12
232
Before
26_48
After
27_49
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7, 6] Result: Win 
18
238
Before
11_64
After
12_65
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 7, 6, 3] Result: Lose 
21
241
Before
0000000000_55
After
0000000000_56
Epsilon: 0.9740999999999741 Training Episodes: 8705
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 7] Result: Win 
15
235
Before
16_36
After
17_37
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 7, 5] Result: Lose 
20
240
Before
7_87
After
7_88
Epsilon: 0.9740799999999741 Training Episodes: 8704
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 3] Result: Win 
5
27
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 3, 10] Result: Lose 
15
37
Before
7_13
After
7_14
Epsilon: 0.9740599999999741 Training Episodes: 8703
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 4] Result: Win 
9
163
Before
7_7
After
8_8
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [5, 4, 10] Result: Lose 
19
173
Before
2_14
After
2_15
Epsilon: 0.974039999999974 Training Episodes: 8702
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 6] Result: Win 
13
57
Before
5_10
After
6_11
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [7, 6, 2] Result: Lose 
15
59
Before
3_15
After
3_16
Epsilon: 0.974019999999974 Training Episodes: 8701
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [4, 6] Result: Win 
10
32
Before
9_9
After
10_10
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [4, 6, 3] Result: Lose 
13
35
Before
7_9
After
7_10
Epsilon: 0.973999999999974 Training Episodes: 8700
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 4] Result: Win 
9
31
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [5, 4, 7] Result: Lose 
16
38
Before
4_15
After
4_16
Epsilon: 0.973979999999974 Training Episodes: 8699
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [8, 10] Result: Win 
18
194
Before
4_17
After
5_18
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [8, 10, 1] Result: Lose 
19
195
Before
1_12
After
1_13
Epsilon: 0.973959999999974 Training Episodes: 8698
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 6] Result: Win 
11
231
Before
57_57
After
58_58
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 6, 5] Result: Win 
16
236
Before
19_62
After
20_63
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [5, 6, 5, 2] Result: Lose 
18
238
Before
12_65
After
12_66
Epsilon: 0.9739399999999739 Training Episodes: 8697
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 5] Result: Win 
13
233
Before
39_62
After
40_63
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 5, 8] Result: Lose 
21
241
Before
0000000000_56
After
0000000000_57
Epsilon: 0.9739199999999739 Training Episodes: 8696
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [10, 7] Result: Lose 
17
149
Before
1_8
After
1_9
Epsilon: 0.9738999999999739 Training Episodes: 8695
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 8] Result: Lose 
18
106
Before
3_20
After
3_21
Epsilon: 0.9738799999999739 Training Episodes: 8694
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 9] Result: Win 
13
233
Before
40_63
After
41_64
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 9, 7] Result: Lose 
20
240
Before
7_88
After
7_89
Epsilon: 0.9738599999999739 Training Episodes: 8693
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 10] Result: Lose 
20
196
Before
1_15
After
1_16
Epsilon: 0.9738399999999738 Training Episodes: 8692
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [9, 5] Result: Lose 
14
212
Before
2_7
After
2_8
Epsilon: 0.9738199999999738 Training Episodes: 8691
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 1] Result: Win 
5
93
Before
1_1
After
2_2
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [4, 1, 10] Result: Lose 
15
103
Before
7_14
After
7_15
Epsilon: 0.9737999999999738 Training Episodes: 8690
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 5] Result: Lose 
15
169
Before
9_19
After
9_20
Epsilon: 0.9737799999999738 Training Episodes: 8689
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 10] Result: Win 
14
80
Before
7_15
After
8_16
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 10, 5] Result: Lose 
19
85
Before
1_15
After
1_16
Epsilon: 0.9737599999999738 Training Episodes: 8688
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 5] Result: Win 
15
191
Before
3_7
After
4_8
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 5, 5] Result: Lose 
20
196
Before
1_16
After
1_17
Epsilon: 0.9737399999999737 Training Episodes: 8687
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [7, 2] Result: Win 
9
163
Before
8_8
After
9_9
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [7, 2, 3] Result: Lose 
12
166
Before
5_7
After
5_8
Epsilon: 0.9737199999999737 Training Episodes: 8686
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 1] Result: Win 
11
99
Before
12_12
After
13_13
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 1, 5] Result: Lose 
16
104
Before
7_17
After
7_18
Epsilon: 0.9736999999999737 Training Episodes: 8685
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 8] Result: Lose 
18
128
Before
2_18
After
2_19
Epsilon: 0.9736799999999737 Training Episodes: 8684
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1] Result: Win 
8
470
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1, 2] Result: Win 
10
230
Before
23_23
After
24_24
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 1, 2, 10] Result: Lose 
20
240
Before
7_89
After
7_90
Epsilon: 0.9736599999999737 Training Episodes: 8683
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 7] Result: Lose 
15
235
Before
17_37
After
17_38
Epsilon: 0.9736399999999736 Training Episodes: 8682
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 2] Result: Win 
4
92
Before
2_2
After
3_3
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 2, 7] Result: Win 
11
99
Before
13_13
After
14_14
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [2, 2, 7, 5] Result: Lose 
16
104
Before
7_18
After
7_19
Epsilon: 0.9736199999999736 Training Episodes: 8681
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3] Result: Win 
11
231
Before
58_58
After
59_59
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 3, 5] Result: Lose 
16
236
Before
20_63
After
20_64
Epsilon: 0.9735999999999736 Training Episodes: 8680
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 7] Result: Win 
16
104
Before
7_19
After
8_20
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [9, 7, 5] Result: Lose 
21
109
Before
0000000000_17
After
0000000000_18
Epsilon: 0.9735799999999736 Training Episodes: 8679
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [9, 10] Result: Lose 
19
129
Before
2_13
After
2_14
Epsilon: 0.9735599999999736 Training Episodes: 8678
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 10] Result: Win 
17
237
Before
12_51
After
13_52
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [7, 10, 4] Result: Lose 
21
241
Before
0000000000_57
After
0000000000_58
Epsilon: 0.9735399999999735 Training Episodes: 8677
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [9, 10] Result: Lose 
19
217
Before
2_14
After
2_15
Epsilon: 0.9735199999999735 Training Episodes: 8676
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1] Result: Win 
10
472
Before
4_4
After
5_5
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1, 1] Result: Win 
11
231
Before
59_59
After
60_60
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1, 1, 6] Result: Lose 
17
237
Before
13_52
After
13_53
Epsilon: 0.9734999999999735 Training Episodes: 8675
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 10] Result: Win 
17
83
Before
2_13
After
3_14
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [7, 10, 1] Result: Lose 
18
84
Before
1_8
After
1_9
Epsilon: 0.9734799999999735 Training Episodes: 8674
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10] Result: Win 
20
42
Before
2_20
After
3_21
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [10, 10, 1] Result: Lose 
21
43
Before
0000000000_16
After
0000000000_17
Epsilon: 0.9734599999999735 Training Episodes: 8673
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 5] Result: Win 
6
160
Before
5_5
After
6_6
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [1, 5, 10] Result: Lose 
16
170
Before
4_15
After
4_16
Epsilon: 0.9734399999999734 Training Episodes: 8672
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 6] Result: Win 
8
228
Before
22_22
After
23_23
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 6, 4] Result: Lose 
12
232
Before
27_49
After
27_50
Epsilon: 0.9734199999999734 Training Episodes: 8671
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 2] Result: Win 
5
181
Before
3_3
After
4_4
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 2, 6] Result: Win 
11
187
Before
13_13
After
14_14
Win
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 2, 6, 3] Result: Win 
14
190
Before
4_8
After
5_9
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [3, 2, 6, 3, 1] Result: Lose 
15
191
Before
4_8
After
4_9
Epsilon: 0.9733999999999734 Training Episodes: 8670
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7] Result: Win 
17
237
Before
13_53
After
14_54
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7, 3] Result: Lose 
20
240
Before
7_90
After
7_91
Epsilon: 0.9733799999999734 Training Episodes: 8669
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 9] Result: Win 
11
77
Before
11_11
After
12_12
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 9, 3] Result: Win 
14
80
Before
8_16
After
9_17
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [2, 9, 3, 6] Result: Lose 
20
86
Before
6_19
After
6_20
Epsilon: 0.9733599999999734 Training Episodes: 8668
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10] Result: Win 
12
232
Before
27_50
After
28_51
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 10, 9] Result: Lose 
21
241
Before
0000000000_58
After
0000000000_59
Epsilon: 0.9733399999999733 Training Episodes: 8667
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 9] Result: Win 
11
231
Before
60_60
After
61_61
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [2, 9, 9] Result: Lose 
20
240
Before
7_91
After
7_92
Epsilon: 0.9733199999999733 Training Episodes: 8666
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2] Result: Win 
6
226
Before
9_9
After
10_10
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2, 6] Result: Win 
12
232
Before
28_51
After
29_52
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2, 6, 9] Result: Lose 
21
241
Before
0000000000_59
After
0000000000_60
Epsilon: 0.9732999999999733 Training Episodes: 8665
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2] Result: Win 
6
226
Before
10_10
After
11_11
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2, 10] Result: Win 
16
236
Before
20_64
After
21_65
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 2, 10, 5] Result: Lose 
21
241
Before
0000000000_60
After
0000000000_61
Epsilon: 0.9732799999999733 Training Episodes: 8664
Win
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 8] Result: Win 
14
36
Before
7_15
After
8_16
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [6, 8, 7] Result: Lose 
21
43
Before
0000000000_17
After
0000000000_18
Epsilon: 0.9732599999999733 Training Episodes: 8663
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 9] Result: Win 
13
233
Before
41_64
After
42_65
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [4, 9, 3] Result: Lose 
16
236
Before
21_65
After
21_66
Epsilon: 0.9732399999999732 Training Episodes: 8662
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 8] Result: Win 
11
121
Before
12_12
After
13_13
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 8, 9] Result: Lose 
20
130
Before
1_19
After
1_20
Epsilon: 0.9732199999999732 Training Episodes: 8661
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 10] Result: Lose 
16
236
Before
21_66
After
21_67
Epsilon: 0.9731999999999732 Training Episodes: 8660
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7] Result: Lose 
17
237
Before
14_54
After
14_55
Epsilon: 0.9731799999999732 Training Episodes: 8659
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [2, 10] Result: Lose 
12
56
Before
6_10
After
6_11
Epsilon: 0.9731599999999732 Training Episodes: 8658
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 4] Result: Win 
14
124
Before
10_13
After
11_14
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [10, 4, 3] Result: Lose 
17
127
Before
6_21
After
6_22
Epsilon: 0.9731399999999731 Training Episodes: 8657
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 10] Result: Win 
16
236
Before
21_67
After
22_68
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [6, 10, 1] Result: Lose 
17
237
Before
14_55
After
14_56
Epsilon: 0.9731199999999731 Training Episodes: 8656
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 3] Result: Win 
6
116
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [3, 3, 10] Result: Lose 
16
126
Before
7_18
After
7_19
Epsilon: 0.9730999999999731 Training Episodes: 8655
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [10, 10] Result: Lose 
20
86
Before
6_20
After
6_21
Epsilon: 0.9730799999999731 Training Episodes: 8654
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 3] Result: Win 
5
115
Before
2_2
After
3_3
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [2, 3, 10] Result: Lose 
15
125
Before
8_17
After
8_18
Epsilon: 0.9730599999999731 Training Episodes: 8653
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [5, 10] Result: Lose 
15
213
Before
6_15
After
6_16
Epsilon: 0.973039999999973 Training Episodes: 8652
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 10] Result: Lose 
20
240
Before
7_92
After
7_93
Epsilon: 0.973019999999973 Training Episodes: 8651
Win
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 9] Result: Win 
16
214
Before
6_17
After
7_18
Lose
function called, Action decided: Hit Dealer Cards: 9 Player Cards: [7, 9, 5] Result: Lose 
21
219
Before
0000000000_12
After
0000000000_13
Epsilon: 0.972999999999973 Training Episodes: 8650
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [2, 10] Result: Lose 
12
188
Before
11_12
After
11_13
Epsilon: 0.972979999999973 Training Episodes: 8649
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [8, 10] Result: Lose 
18
40
Before
6_17
After
6_18
Epsilon: 0.972959999999973 Training Episodes: 8648
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 10] Result: Win 
15
147
Before
8_16
After
9_17
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [5, 10, 2] Result: Lose 
17
149
Before
1_9
After
1_10
Epsilon: 0.9729399999999729 Training Episodes: 8647
Win
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 6] Result: Win 
10
76
Before
4_4
After
5_5
Lose
function called, Action decided: Hit Dealer Cards: 3 Player Cards: [4, 6, 9] Result: Lose 
19
85
Before
1_16
After
1_17
Epsilon: 0.9729199999999729 Training Episodes: 8646
Win
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 10] Result: Win 
20
174
Before
1_14
After
2_15
Lose
function called, Action decided: Hit Dealer Cards: 7 Player Cards: [10, 10, 1] Result: Lose 
21
175
Before
0000000000_15
After
0000000000_16
Epsilon: 0.9728999999999729 Training Episodes: 8645
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [3, 10] Result: Lose 
13
233
Before
42_65
After
42_66
Epsilon: 0.9728799999999729 Training Episodes: 8644
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1] Result: Win 
10
230
Before
24_24
After
25_25
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1, 9] Result: Win 
19
239
Before
14_55
After
15_56
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1, 9, 1] Result: Win 
20
240
Before
7_93
After
8_94
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [9, 1, 9, 1, 1] Result: Lose 
21
241
Before
0000000000_61
After
0000000000_62
Epsilon: 0.9728599999999729 Training Episodes: 8643
Win
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 9] Result: Win 
19
107
Before
2_17
After
3_18
Lose
function called, Action decided: Hit Dealer Cards: 4 Player Cards: [10, 9, 1] Result: Lose 
20
108
Before
3_17
After
3_18
Epsilon: 0.9728399999999728 Training Episodes: 8642
Lose
function called, Action decided: Hit Dealer Cards: 1 Player Cards: [2, 10] Result: Lose 
12
34
Before
7_13
After
7_14
Epsilon: 0.9728199999999728 Training Episodes: 8641
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [2, 10] Result: Lose 
12
144
Before
8_15
After
8_16
Epsilon: 0.9727999999999728 Training Episodes: 8640
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 6] Result: Win 
11
121
Before
13_13
After
14_14
Win
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 6, 1] Result: Win 
12
122
Before
11_15
After
12_16
Lose
function called, Action decided: Hit Dealer Cards: 5 Player Cards: [5, 6, 1, 9] Result: Lose 
21
131
Before
0000000000_22
After
0000000000_23
Epsilon: 0.9727799999999728 Training Episodes: 8639
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 10] Result: Lose 
20
196
Before
1_17
After
1_18
Epsilon: 0.9727599999999728 Training Episodes: 8638
Win
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 10] Result: Win 
16
148
Before
5_11
After
6_12
Lose
function called, Action decided: Hit Dealer Cards: 6 Player Cards: [6, 10, 4] Result: Lose 
20
152
Before
1_26
After
1_27
Epsilon: 0.9727399999999727 Training Episodes: 8637
Win
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 1] Result: Win 
4
48
Before
0000000000_0000000000
After
1_1
Lose
function called, Action decided: Hit Dealer Cards: 2 Player Cards: [3, 1, 10] Result: Lose 
14
58
Before
3_12
After
3_13
Epsilon: 0.9727199999999727 Training Episodes: 8636
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7] Result: Lose 
17
237
Before
14_56
After
14_57
Epsilon: 0.9726999999999727 Training Episodes: 8635
Lose
function called, Action decided: Hit Dealer Cards: 8 Player Cards: [10, 4] Result: Lose 
14
190
Before
5_9
After
5_10
Epsilon: 0.9726799999999727 Training Episodes: 8634
Win
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7] Result: Win 
17
237
Before
14_57
After
15_58
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [10, 7, 4] Result: Lose 
21
241
Before
0000000000_62
After
0000000000_63
Epsilon: 0.9726599999999727 Training Episodes: 8633
Lose
function called, Action decided: Hit Dealer Cards: 10 Player Cards: [8, 7] Result: Lose 
15
235
Before
17_38
After
17_39
Epsilon: 0.9726399999999726 Training Episodes: 8632
Wi
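
Each step in the log above prints the hand total, a table row number, and the Q-Table cell before and after the update. The sketch below reproduces the pattern those printed values seem to follow. It is inferred from the log, not taken from the project code: the names `update_cell` and `row_index` are hypothetical, and the row formula has only been checked against log entries without a usable ace.

```python
def update_cell(cell, won):
    """Return the 'wins_total' cell string after one more attempt.

    Assumption (read off the log): a win increments both counters,
    a loss increments only the total, and an untouched counter keeps
    its original zero-padded text.
    """
    wins, total = cell.split("_")
    if won:
        wins = str(int(wins) + 1)
    total = str(int(total) + 1)
    return f"{wins}_{total}"


def row_index(hand_total, dealer_card):
    """Hypothetical row lookup for states without a usable ace.

    Assumption: 22 rows per dealer up card, which matches log pairs
    such as hand 12 vs dealer 10 -> 232 and hand 14 vs dealer 1 -> 36.
    """
    return dealer_card * 22 + hand_total


if __name__ == "__main__":
    print(update_cell("27_50", True))           # win: both counters move
    print(update_cell("0000000000_58", False))  # loss: only the total moves
    print(row_index(12, 10))
```

Keeping the counters as a single string mirrors the "_"-separated cell format described earlier; a pair of integer arrays would be a simpler design if that format is not required.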
