Pricing Policy Evaluation and Comparison with Temporal Difference Learning
There are two main kinds of problems in Reinforcement Learning (RL), evaluation of a policy aka the prediction problem and finding the best policy aka the control problem. A policy dictates the action to be taken given [...]

