CS-500: Multiagent Reinforcement Learning

Rutgers University
Spring 2007
Michael L. Littman, Enrique Munoz de Cote


Time: Fridays 2pm - 3:30pm
Place: CoRE B (CoRE 305)

Description

We'll be meeting to read and discuss papers on multiagent reinforcement learning (MARL). Specifically, we'll take a single-agent perspective on what policy an agent should learn when the environment it interacts with is composed of many other learning agents. Assuming self-interested agents and focusing on self-play, we will cover three types of papers. The first type deals with agents that explicitly learn equilibria, the second deals with agents that learn a best response to the joint actions of their adversaries (teammates), and the third deals with agents that learn policies that satisfy multiple other criteria (not just equilibria or best response).

Schedule

1/26/07: M. Littman, 1994
2/02/07: Hu and Wellman, 1998, M. Littman, 2001
2/09/07: Shoham and Powers, 2003
2/16/07: Claus and Boutilier, 1998, Littman and Stone, 2001
2/23/07: Michael's talk
3/02/07: Weinberg and Rosenschein, 2004
3/09/07: Zinkevich et al., 2005 (presented by Pavel)
3/16/07: **SPRING BREAK**
3/23/07: Bowling and Veloso, 2001 (presented by John)
3/30/07: Banerjee and Peng, 2003 (presented by Rhonda)
4/06/07: Powers and Shoham, 2004 (presented by Chris)
4/13/07: Crandall and Goodrich, 2005 (presented by Mangesh)
4/20/07: Munoz de Cote et al., 2006 (presented by Robert) && Powers et al., 2006 (presented by Monica)
4/27/07: Greenwald et al.,  (presented by Ali) && Bowling, 2004 (presented by Bert)
5/04/07: Greenwald et al., 2002 (presented by Marwan)

Paper Bank

1. Equilibrium learners
(a) M. Littman, 1994
(b) Hu and Wellman, 1998
(c) M. Littman, 2001
(d) Greenwald et al., 2002
2. Best response learners
(a) [Uther and Veloso, 2003], Claus and Boutilier, 1998
(b) Littman and Stone, 2001 (this is built on an asymmetric setting, not self-play)
(c) Weinberg and Rosenschein, 2004
(d) Zinkevich et al., 2005
3. Multiple criteria learners (security and convergence)
(a) Bowling and Veloso, 2001, [Veloso and Bowling, 2002], Banerjee and Peng, 2003
(b) Bowling, 2004
(c) Shoham and Powers, 2003
(d) Crandall and Goodrich, 2005
(e) Munoz de Cote et al., 2006, Powers et al., 2006

Please contact Enrique Munoz de Cote (jemc AT ecs.soton.ac.uk) with any questions.