A multi-agent cooperative reinforcement learning model using a hierarchy of consultants, tutors and workers

This paper proposes an algorithm for cooperative policy construction for independent learners, named Q-learning with aggregation (QA-learning). The algorithm is based on a distributed hierarchical learning model and utilises three specialisations of agents: workers, tutors and consultants.

Keywords: Reinforcement learning, Q-learning, Multi-agent system, Distributed system, Markov decision process, Factored Markov decision process