Transfer in reinforcement learning is a novel research area that focuses on the development of methods to transfer knowledge from a set of source tasks to a target task. Whenever the tasks are similar, the transferred knowledge can be used by a learning algorithm to solve the target task and significantly improve its performance (e.g., by reducing the number of samples needed to achieve a nearly optimal performance). In this chapter we provide a formalization of the general transfer problem, we identify the main settings which have been investigated so far, and we review the most important approaches to transfer in reinforcement learning.
- This paper presents a formal framework for transfer in reinforcement learning, differentiating algorithmic approaches by the kind of knowledge transferred: instances, representations, or parameters.
- Goal: “identify the characteristics shared by the different approaches of transfer in RL and classify them into large families”
- Proposes a taxonomy along three dimensions: the setting, the transferred knowledge, and the objective.
  - Transfer from source task to target task with a fixed domain
  - Transfer across tasks with a fixed domain
  - Transfer across tasks with different domains (INTERESTING)
Re: transfer across tasks with different domains:
Most of the transfer approaches in this case consider the source-target scenario and focus on how to define a mapping between the source state-action variables and the target variables so as to obtain an effective transfer of knowledge.
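A minimal sketch of such an inter-task mapping, assuming a hypothetical pair of tasks (a 2D source with (position, velocity) and a 3D target that adds a lateral variable the source never modeled). All task details, variable names, and the action correspondence are illustrative assumptions, not from the chapter:

```python
# Hypothetical inter-task mapping for transfer across different domains.
# Source task state: (position, velocity); target task state adds a
# lateral variable. The mapping lets a source policy act in the target.

def map_target_to_source_state(target_state):
    """Project a target state onto the source state variables."""
    position, velocity, lateral = target_state
    return (position, velocity)  # drop the variable the source lacks

def map_target_to_source_action(target_action):
    """Collapse extra target actions onto the closest source action
    (assumed correspondence, chosen by hand for illustration)."""
    action_mapping = {
        "left": "left",
        "right": "right",
        "up": "right",
        "down": "left",
    }
    return action_mapping[target_action]

def transferred_policy(source_policy, target_state):
    """Reuse a source policy in the target task via the state mapping."""
    source_state = map_target_to_source_state(target_state)
    return source_policy(source_state)
```

In practice the hard part is choosing (or learning) these mappings; here they are hand-coded, which is the simplest case the survey's source-target scenario covers.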
Re: Categories of transfer
Instance transfer – reuse of samples collected from [different] source tasks
Representation transfer – abstraction of some representation of the task/solution
Parameter transfer – parameters define the init and behavior of the algorithm. Parameters in the target task may be initialized based on information from the source task[s].
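Instance transfer is the most concrete of the three categories to illustrate. Below is a hedged sketch (function name, selection rule, and mixing fraction are all my own assumptions, not from the paper) of pooling source-task transitions (s, a, r, s') into the target task's training batch:

```python
import random

# Hypothetical instance-transfer sketch: transitions collected in
# source tasks are mixed into the target task's training batch, on
# the assumption that the tasks share similar dynamics and rewards.

def build_transfer_batch(target_samples, source_samples_per_task,
                         transfer_fraction=0.5, seed=0):
    """Mix target samples with a fraction of each source task's samples.

    target_samples: list of (s, a, r, s') tuples from the target task.
    source_samples_per_task: one list of such tuples per source task.
    transfer_fraction: illustrative fixed rule; real algorithms weight
    or filter source samples by estimated task similarity instead.
    """
    rng = random.Random(seed)
    batch = list(target_samples)
    for samples in source_samples_per_task:
        k = int(transfer_fraction * len(samples))
        batch.extend(rng.sample(samples, k))
    rng.shuffle(batch)
    return batch
```

The payoff is sample efficiency: the target learner trains on more data than it collected itself, which is exactly the "fewer samples to near-optimal performance" benefit described in the abstract.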
Re: Transfer across tasks with a fixed state-action space
Although not all the approaches reviewed in the next section explicitly define a distribution Ω, they all rely on the implicit assumption that the tasks involved in the transfer problem share some characteristics in the dynamics and reward function, and that by observing a number of source tasks, the transfer algorithm is able to generalize well across all the tasks in M.
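The assumption can be made concrete with a toy sketch. Here Ω is a hypothetical distribution over goal positions in a family M of chain MDPs that share everything else; observing source tasks lets a transfer algorithm estimate that shared regularity. The task parameterization and function names are invented for illustration:

```python
import random

# Toy model of a task distribution Omega: every task in the family M
# shares the same chain dynamics and differs only in its goal state.

def sample_task(rng):
    """Draw a task from Omega: fixed structure, randomly placed goal."""
    goal = rng.randint(3, 7)  # shared structure, varying goal position
    return {"n_states": 10, "goal": goal}

def estimate_goal_prior(n_source_tasks=1000, seed=0):
    """Observe source tasks drawn from Omega and estimate where goals
    tend to lie -- the kind of cross-task regularity a transfer
    algorithm exploits to generalize to new tasks in M."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_source_tasks):
        g = sample_task(rng)["goal"]
        counts[g] = counts.get(g, 0) + 1
    return {g: c / n_source_tasks for g, c in sorted(counts.items())}
```

If the target task is also drawn from Ω, knowledge distilled from the source tasks (here, the goal prior) is informative for it; if the tasks do not actually share structure, the same mechanism produces negative transfer.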