This document outlines my development plan. The plan is organized around two objectives:
- TrAE – Develop a framework to learn an intertask mapping function, similar to the reinforcement transfer learning framework of [Ammar], based on building a common feature subspace between tasks/domains. Rather than a three-way Restricted Boltzmann Machine trained via an energy function, use a denoising autoencoder trained on reconstruction error.
- Extend the use of this framework towards cross-modal transfer using a custom set of tasks.
Towards Objective 1
Source Learning Agents and Basic Tasks
- Set up the Mountain Car and Cart Pole tasks, and
- Set up an agent that can learn these tasks. This agent will provide training and transfer samples for the TrAE (Transfer AutoEncoder).
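As a concrete starting point, the sample-collection step might look like the following minimal sketch. It implements the classic Mountain Car dynamics directly (to stay self-contained; in practice a library environment such as Gymnasium would be used), and a random policy stands in for the trained source agent. All function names here are illustrative.

```python
import math
import random

def mountain_car_step(pos, vel, action):
    """One step of the classic Mountain Car dynamics (Sutton & Barto).
    action is 0 (reverse), 1 (coast), or 2 (forward)."""
    vel += (action - 1) * 0.001 + math.cos(3 * pos) * (-0.0025)
    vel = max(-0.07, min(0.07, vel))
    pos += vel
    pos = max(-1.2, min(0.6, pos))
    if pos == -1.2:
        vel = 0.0          # inelastic collision with the left wall
    done = pos >= 0.5      # goal reached at the right hilltop
    return pos, vel, done

def collect_samples(n_episodes=10, max_steps=200, seed=0):
    """Roll out a policy (random here, a learned agent in practice)
    and record (s, a, s') transitions for later TrAE training."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_episodes):
        pos, vel = rng.uniform(-0.6, -0.4), 0.0
        for _ in range(max_steps):
            a = rng.randrange(3)
            new_pos, new_vel, done = mountain_car_step(pos, vel, a)
            samples.append(((pos, vel), a, (new_pos, new_vel)))
            pos, vel = new_pos, new_vel
            if done:
                break
    return samples

samples = collect_samples()
```

The same loop, with a different `step` function, would produce the Cart Pole samples.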
Intertask Mapping – TrAE – Transfer Agent
- Set up the pipeline so that samples from the source learning agents can be consumed by the training phase of a three-way or relational autoencoder.
- Develop, or adapt an existing, relational denoising autoencoder.
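To make the "training on reconstruction error" idea concrete, here is a minimal sketch of a single-hidden-layer denoising autoencoder with tied weights, trained by plain gradient descent on toy data. The data shape and hyperparameters are illustrative assumptions, not the actual TrAE configuration; a relational (three-way) variant would condition the hidden layer on paired source/target inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data; in the real pipeline these would be state samples.
X = rng.normal(size=(256, 4))

n_in, n_hid = X.shape[1], 3
W = rng.normal(scale=0.1, size=(n_in, n_hid))
b, c = np.zeros(n_hid), np.zeros(n_in)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reconstruct(X_in):
    return sigmoid(X_in @ W + b) @ W.T + c

loss_before = float(np.mean((reconstruct(X) - X) ** 2))

lr, noise = 0.1, 0.1
for epoch in range(200):
    X_noisy = X + rng.normal(scale=noise, size=X.shape)  # corrupt input
    H = sigmoid(X_noisy @ W + b)         # encode corrupted input
    X_rec = H @ W.T + c                  # decode with tied weights
    err = X_rec - X                      # error against the CLEAN input
    # Backprop through both uses of the tied weight matrix W.
    dH = (err @ W) * H * (1 - H)
    gW = X_noisy.T @ dH + err.T @ H
    W -= lr * gW / len(X)
    b -= lr * dH.mean(axis=0)
    c -= lr * err.mean(axis=0)

loss = float(np.mean((reconstruct(X) - X) ** 2))
```

The key denoising property is that `err` is measured against the clean input while the encoder only sees the corrupted one, so minimizing reconstruction error forces the hidden layer to capture structure rather than copy noise.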
Transfer – Target Learning Agent
- Set up an RL agent that uses an instance-based algorithm (Least Squares Policy Iteration and Fitted-Q Iteration have been used).
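The structure of such an instance-based learner can be sketched with Fitted-Q Iteration on a toy chain MDP. The environment and the per-(s, a) averaging "regressor" are illustrative simplifications (the averaging is exact here only because the toy problem is discrete and deterministic; a real agent would use a function approximator over continuous states).

```python
import random
import numpy as np

# Toy chain MDP: states 0..4, reward 1 for reaching terminal state 4.
N_STATES, N_ACTIONS, GAMMA = 5, 2, 0.9

def step(s, a):
    s2 = min(max(s + (1 if a == 1 else -1), 0), N_STATES - 1)
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r

# Gather a batch of random transitions -- the "instances".
rng = random.Random(0)
batch = []
for _ in range(2000):
    s, a = rng.randrange(N_STATES - 1), rng.randrange(N_ACTIONS)
    s2, r = step(s, a)
    batch.append((s, a, r, s2))

# Fitted-Q Iteration: repeatedly regress Q onto bootstrapped targets.
Q = np.zeros((N_STATES, N_ACTIONS))
for _ in range(50):
    targets = np.zeros_like(Q)
    counts = np.zeros_like(Q)
    for s, a, r, s2 in batch:
        bootstrap = 0.0 if s2 == N_STATES - 1 else GAMMA * Q[s2].max()
        targets[s, a] += r + bootstrap
        counts[s, a] += 1
    Q = np.where(counts > 0, targets / np.maximum(counts, 1), Q)

policy = Q.argmax(axis=1)  # greedy policy: move right along the chain
```

Because FQI learns from a fixed batch of (s, a, r, s') tuples, transferred source instances can simply be appended to `batch`, which is what makes instance-based learners a natural target for this framework.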
Objective 1 provides the basic framework for completing Objective 2: the mechanisms for RL and the framework for learning an intertask mapping function.
Towards Objective 2
- PUSH Task – Agent must learn to push a block to a goal
- PUSH STICKS Task – Agent must learn to use sticks as extensions to push a block to a goal
- HERD Task – Agent (shepherd) must learn to use commands to direct another agent (herding dog) to herd other agents (sheep)
Other objectives or shortcomings that would be nice to address
- Include the reward component in instance samples.
- Methods of transfer (instance, parameter, representation) must match the target learning agent. It would be nice to find some way to generalize the transfer.
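For the instance-transfer case, both points above can be sketched together: each source instance is mapped into the target task's state and action spaces while the reward component is carried along with it. The linear state map and action lookup below are hypothetical placeholders for whatever mapping the TrAE would actually learn.

```python
import numpy as np

# Hypothetical intertask mapping; a fixed linear map and action lookup
# stand in for the function the TrAE would learn.
A_state = np.array([[1.0, 0.0],
                    [0.0, 0.5]])      # illustrative state-space map
action_map = {0: 0, 1: 1, 2: 1}      # illustrative action-space map

def transfer_instance(s, a, r, s2):
    """Map one source (s, a, r, s') tuple into the target task's spaces,
    keeping the reward component with the instance."""
    return (A_state @ s, action_map[a], r, A_state @ s2)

# One Mountain Car-style source instance, mapped for the target agent.
source = (np.array([0.4, -0.02]), 2, -1.0, np.array([0.39, -0.01]))
target = transfer_instance(*source)
```

Generalizing beyond this would mean emitting parameters or representations instead of tuples, depending on what the target learner consumes.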