Modular Reinforcement Learning (MRL)
Typical multiple-goal agent formulations decompose an agent into sub-agents with possibly different state spaces (to represent their different concerns) but a shared action space (to represent that they are part of a single agent executing a single action at a time). Previous work in multiple-goal RL has arbitrated among the preferences of the sub-agents and selected one sub-agent's preferred action as the "winner" (Sprague and Ballard, 2003). While this approach is intuitively appealing, we have shown that ideal arbitration satisfying a few reasonable requirements (universality, unanimity, independence of irrelevant attributes, scale invariance, and non-dictatorship) is impossible in general (Bhat et al., 2006). We are currently developing a meta-learning algorithm that arbitrates among sub-agent action preferences. Our work focuses on relaxing the non-dictatorship property of ideal arbitrators and on providing the arbitrator itself with a reward signal. In addition to making arbitration possible, the arbitrator function and its reward signal could encode the agent's overall preferences, leaving sub-agents to be coded "selfishly," each with its own local reward signal and state abstraction and with no knowledge of any other subgoals the agent might have. This subgoal independence would facilitate transfer and modularity, allowing a subgoal coded for one agent to be reused in another.
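To make the arrangement concrete, here is a minimal sketch of one way such an agent could be wired together. It is not the algorithm under development: the class names (SubAgent, Arbitrator), the choose function, the use of tabular Q-learning at both levels, and all parameter values are assumptions made for illustration. Each sub-agent learns from its own abstracted state and local reward, and the arbitrator learns, from a separate reward signal, which sub-agent to obey in each state, so that in any given state one sub-agent acts as the "dictator."

```python
import random
from collections import defaultdict


class SubAgent:
    """Hypothetical 'selfish' module: own state abstraction, own local reward."""

    def __init__(self, actions, abstract, alpha=0.1, gamma=0.9):
        self.actions = list(actions)   # action space shared by all modules
        self.abstract = abstract       # maps the full state to this module's view
        self.alpha, self.gamma = alpha, gamma
        self.q = defaultdict(float)    # (abstract state, action) -> value

    def preferred_action(self, state):
        s = self.abstract(state)
        return max(self.actions, key=lambda a: self.q[(s, a)])

    def update(self, state, action, local_reward, next_state):
        # Off-policy Q-learning update, so the module can learn even when
        # the executed action was chosen by a different module.
        s, s2 = self.abstract(state), self.abstract(next_state)
        best_next = max(self.q[(s2, a)] for a in self.actions)
        target = local_reward + self.gamma * best_next
        self.q[(s, action)] += self.alpha * (target - self.q[(s, action)])


class Arbitrator:
    """Hypothetical arbitrator: learns which module to obey in each state."""

    def __init__(self, n_modules, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n = n_modules
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)    # (state, module index) -> value
        self.rng = random.Random(seed)

    def choose_module(self, state):
        if self.rng.random() < self.epsilon:   # occasional exploration
            return self.rng.randrange(self.n)
        return max(range(self.n), key=lambda i: self.q[(state, i)])

    def update(self, state, module, arbiter_reward, next_state):
        # The arbitrator's reward encodes the agent's overall preferences.
        best_next = max(self.q[(next_state, i)] for i in range(self.n))
        target = arbiter_reward + self.gamma * best_next
        self.q[(state, module)] += self.alpha * (target - self.q[(state, module)])


def choose(state, modules, arbitrator):
    """Pick a winning module and return (winner index, its preferred action)."""
    winner = arbitrator.choose_module(state)
    return winner, modules[winner].preferred_action(state)
```

In a full agent loop, the chosen action would be executed once, every sub-agent would then call update with its own local reward, and the arbitrator would call update with its separate reward. Because a sub-agent refers only to its own abstraction and reward, the same SubAgent instance could in principle be dropped into a different agent with a different arbitrator.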
Copyright © 2008 by Chris Simpkins
<simpkins at cc dot gatech dot edu>