EpsilonGreedyRunner(bandit_returns:List[float], epsilon:float=0.2, batch_size:int=10000, batches:int=10, simulations:int=100)
Class used to run simulations of epsilon-greedy tests.
Attributes:
bandit_returns: List of average returns per bandit.
epsilon: Fraction of examples used for exploration, between 0 and 1 (0.2 means 20%).
batch_size: Number of examples per batch.
batches: Number of batches.
simulations: Number of simulations.
Methods:
init_bandits: Prepares everything for a new simulation.
run: Runs the simulations and tracks performance.
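Example:
The following is a minimal standalone sketch of what one batched epsilon-greedy simulation might look like. It is not the implementation of EpsilonGreedyRunner. The function name simulate_epsilon_greedy, the seed parameter, the Bernoulli reward model, and the choice to refresh estimates only between batches are all assumptions made for illustration. The remaining parameter names mirror the attributes above.

```python
import random
from typing import List


def simulate_epsilon_greedy(
    bandit_returns: List[float],
    epsilon: float = 0.2,
    batch_size: int = 10000,
    batches: int = 10,
    seed: int = 0,  # hypothetical parameter, added for reproducibility
) -> List[int]:
    """Run one batched epsilon-greedy simulation; return pulls per bandit."""
    rng = random.Random(seed)
    n = len(bandit_returns)
    pulls = [0] * n      # number of examples sent to each bandit
    rewards = [0.0] * n  # total reward observed per bandit

    for _ in range(batches):
        # Assumption: estimates are frozen during a batch and refreshed after it.
        means = [rewards[i] / pulls[i] if pulls[i] else 0.0 for i in range(n)]
        best = max(range(n), key=lambda i: means[i])
        for _ in range(batch_size):
            # Explore with probability epsilon, otherwise exploit the current best.
            arm = rng.randrange(n) if rng.random() < epsilon else best
            # Assumption: Bernoulli rewards with success rate bandit_returns[arm].
            reward = 1.0 if rng.random() < bandit_returns[arm] else 0.0
            pulls[arm] += 1
            rewards[arm] += reward
    return pulls


# Usage: two bandits with 1% and 1.5% average returns.
pulls = simulate_epsilon_greedy([0.01, 0.015], epsilon=0.2, batch_size=1000, batches=10)
print(pulls)
```

Repeating this loop once per simulation and aggregating the results would correspond to the simulations attribute.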