A multi-agent cooperative reinforcement learning model using a hierarchy of consultants, tutors and workers

This paper proposes Q-learning with aggregation (QA-learning), an algorithm for cooperative policy construction by independent learners. The algorithm is based on a distributed hierarchical learning model and utilises three specialisations of agents: workers, tutors and consultants.