Reinforcement Learning

April 20, 2016

Planning is the activity of taking a model (a transition function together with rewards) and computing a policy from it.
Reinforcement learning is learning in which the learner takes in transitions (source state, action, destination state, reward) and identifies the optimal policy, i.e. the policy that yields the maximum long-term reward.
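
To make the contrast concrete, here is a minimal sketch, not a definitive implementation. The two-state MDP, the discount factor GAMMA, the learning rate alpha, and all function names are assumptions invented for illustration. Value iteration stands in for planning, and tabular Q-learning stands in for reinforcement learning; neither is named in the notes above.

```python
import random

# Hypothetical two-state MDP, invented for illustration.
# MODEL[state][action] = (next_state, reward); transitions are deterministic.
MODEL = {
    "A": {"stay": ("A", 0.0), "go": ("B", 1.0)},
    "B": {"stay": ("B", 2.0), "go": ("A", 0.0)},
}
GAMMA = 0.9  # discount factor (assumed)


def greedy(q):
    """Turn state-action values into a policy by picking the best action."""
    return {s: max(q[s], key=q[s].get) for s in q}


def plan(model, sweeps=200):
    """Planning: the model is given, so apply Bellman backups to it directly."""
    q = {s: {a: 0.0 for a in model[s]} for s in model}
    for _ in range(sweeps):
        for s in model:
            for a, (s2, r) in model[s].items():
                q[s][a] = r + GAMMA * max(q[s2].values())
    return greedy(q)


def learn(transitions, actions, alpha=0.1):
    """Reinforcement learning: only (s, a, s', r) samples are seen (Q-learning)."""
    q = {s: {a: 0.0 for a in acts} for s, acts in actions.items()}
    for s, a, s2, r in transitions:
        target = r + GAMMA * max(q[s2].values())
        q[s][a] += alpha * (target - q[s][a])
    return greedy(q)


def sample_transitions(model, n, seed=0):
    """Generate experience by picking states and actions uniformly at random."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        s = rng.choice(sorted(model))
        a = rng.choice(sorted(model[s]))
        s2, r = model[s][a]
        out.append((s, a, s2, r))
    return out


planned = plan(MODEL)
learned = learn(sample_transitions(MODEL, 5000), {s: list(MODEL[s]) for s in MODEL})
print(planned, learned)
```

The planner reads MODEL directly, while the learner only ever sees the sampled transitions. On this small example both should arrive at the same policy.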

State to action - Policy search: the algorithm tries to find the policy itself –> direct use but indirect learning
State to utility - Value-function-based approaches learn the value of each state (or state-action pair); the policy is then derived by choosing the highest-valued action.
Transition function/model - Model-based approaches predict next states and rewards –> indirect use but direct learning
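
The chain from model to utilities to policy can be sketched as follows. This is an illustration under stated assumptions: the logged transitions, the utilities in `values`, GAMMA, and the helper names are all hypothetical. It shows why a model is easy to learn directly from data (simple counting) but needs extra steps before it can be used to act.

```python
from collections import Counter, defaultdict

GAMMA = 0.9  # discount factor (assumed)

# Hypothetical logged experience as (state, action, next_state, reward) tuples.
transitions = [
    ("A", "go", "B", 1.0),
    ("A", "stay", "A", 0.0),
    ("B", "stay", "B", 2.0),
    ("B", "go", "A", 0.0),
    ("A", "go", "B", 1.0),
]


def estimate_model(transitions):
    """Model-based learning: count outcomes to estimate T(s, a, s') and R(s, a)."""
    counts = defaultdict(Counter)
    reward_sum = Counter()
    for s, a, s2, r in transitions:
        counts[(s, a)][s2] += 1
        reward_sum[(s, a)] += r
    model = {}
    for sa, nexts in counts.items():
        n = sum(nexts.values())
        probs = {s2: c / n for s2, c in nexts.items()}
        model[sa] = (probs, reward_sum[sa] / n)
    return model


def q_from_model(model, values):
    """One-step lookahead: combine a model with state utilities to get Q(s, a)."""
    return {
        sa: r + GAMMA * sum(p * values[s2] for s2, p in probs.items())
        for sa, (probs, r) in model.items()
    }


def policy_from_q(q):
    """Pick the highest-valued action in each state."""
    best = {}
    for (s, a), v in q.items():
        if s not in best or v > q[(s, best[s])]:
            best[s] = a
    return best


model = estimate_model(transitions)
values = {"A": 19.0, "B": 20.0}  # assumed utilities, e.g. from a value-based learner
policy = policy_from_q(q_from_model(model, values))
print(policy)
```

A policy-search method would skip the first two steps and adjust the state-to-action mapping itself, which is what makes it direct to use but harder to learn.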