Semester7

Notes from courses taken/attended in semester 7 of college

Lecture 29

Q-learning

come up with a reward, then learn a policy
take a sequence of decisions
so that the total reward, as estimated by the Q function, is maximized
the reward depends on the game
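
A minimal sketch of tabular Q-learning, to make the points above concrete. The game here is an assumption, not something from the lecture: a hypothetical 1-D corridor where the agent starts at state 0, can move left or right, and gets reward 1 only on reaching the last state. The names `q_learning` and `greedy_policy` and all parameter values are illustrative.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical corridor game.

    States are 0..n_states-1, actions are 0 (left) and 1 (right).
    The reward (an assumption of this sketch) is 1 on reaching the
    last state and 0 otherwise.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # epsilon-greedy: explore sometimes, and break ties randomly
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # the goal is terminal, so it contributes no future value
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q

def greedy_policy(q):
    """Pick the action with the larger Q value in each state."""
    return [0 if row[0] > row[1] else 1 for row in q]

q_table = q_learning()
policy = greedy_policy(q_table)
```

With this reward, the greedy policy for the non-terminal states should end up as "always move right". A different game would mean a different reward and hence a different learned policy.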

Inverse RL

learn the reward by observing an agent that is doing things the right way
once the reward has been recovered, a policy can be derived from it

Supervised Learning

the data is each student's (height, weight, age) tuple vs. their time to run 100 m
given a new student, classify how good they are
this is supervised
classification
regression
the goal is to learn a function that does classification or regression
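
A rough sketch of the student example, using k-nearest neighbours as one possible way to learn such a function (the lecture did not name a method). The training data below is made up for illustration, and `predict_time`, `classify`, and the 14-second threshold are hypothetical choices.

```python
import math

# Made-up training data (an assumption, not real measurements):
# (height_cm, weight_kg, age_years) -> time to run 100 m in seconds
TRAIN = [
    ((170.0, 60.0, 20.0), 13.0),
    ((180.0, 75.0, 21.0), 12.5),
    ((165.0, 80.0, 22.0), 16.0),
    ((175.0, 68.0, 19.0), 12.8),
    ((160.0, 85.0, 23.0), 17.0),
    ((185.0, 72.0, 20.0), 12.0),
]

def predict_time(student, k=3):
    """Regression: average the 100 m times of the k nearest students."""
    nearest = sorted(TRAIN, key=lambda item: math.dist(item[0], student))[:k]
    return sum(time for _, time in nearest) / k

def classify(student, threshold=14.0, k=3):
    """Classification: turn the predicted time into a label.

    The threshold is an arbitrary assumption for this sketch.
    """
    return "fast" if predict_time(student, k) < threshold else "slow"

new_student = (178.0, 70.0, 20.0)
predicted = predict_time(new_student)
label = classify(new_student)
```

Regression returns a number (the predicted time), while classification returns a category. Both are learned from the same labelled examples.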