Development Plan


This page outlines my development plan, which is organized around two objectives:

  1. TrAE – Develop a framework to learn an intertask mapping function similar to the Reinforcement Transfer Learning framework [Ammar] based on building common feature subspaces between tasks/domains.  Rather than using a three-way Restricted Boltzmann Machine where training is based on an energy function, use a denoising autoencoder where training is based on reconstruction error.
  2. Extend the use of this framework towards cross-modal transfer using a custom set of tasks.

Towards Objective 1

Source Learning Agents and Basic Tasks

  1. Set up the Mountain Car and Cart Pole tasks.
  2. Set up an agent that can learn these tasks.  This agent will provide training and transfer samples for the TrAE (Transfer AutoEncoder).
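
As a rough illustration of what a source task and its samples might look like, here is a minimal sketch of Mountain Car with a random-policy sample collector. The constants follow the classic textbook formulation of the task as I understand it and are an assumption, not something fixed by this plan. The `(state, action, next_state)` sample layout and the names `step` and `collect_samples` are also hypothetical.

```python
import math
import random

# Assumed constants from the classic Mountain Car formulation.
MIN_POS, MAX_POS = -1.2, 0.5
MAX_SPEED = 0.07
GOAL_POS = 0.5


def step(state, action):
    """Advance the car one time step.

    state is (position, velocity); action is 0 (left), 1 (coast) or 2 (right).
    Returns (next_state, reward, done).
    """
    pos, vel = state
    vel += 0.001 * (action - 1) - 0.0025 * math.cos(3 * pos)
    vel = max(-MAX_SPEED, min(MAX_SPEED, vel))
    pos = max(MIN_POS, min(MAX_POS, pos + vel))
    if pos == MIN_POS and vel < 0:
        vel = 0.0  # the car stops when it hits the left wall
    done = pos >= GOAL_POS
    return (pos, vel), -1.0, done


def collect_samples(n_steps, seed=0):
    """Collect (state, action, next_state) triples under a random policy."""
    rng = random.Random(seed)
    samples = []
    state = (rng.uniform(-0.6, -0.4), 0.0)
    for _ in range(n_steps):
        action = rng.randrange(3)
        next_state, _reward, done = step(state, action)
        samples.append((state, action, next_state))
        # Restart from a fresh start state once the goal is reached.
        state = (rng.uniform(-0.6, -0.4), 0.0) if done else next_state
    return samples
```

A learning agent would replace the random policy here. The random policy only stands in so that the sample format is concrete.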

Intertask Mapping – TrAE – Transfer Agent

  1. Set up the pipeline so that samples from the learning agents can be consumed by the training phase of a three-way or relational autoencoder.
  2. Develop a relational denoising autoencoder, or adapt an existing one.
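
To make "training based on reconstruction error" concrete, here is a minimal sketch of a plain denoising autoencoder in pure Python. It is not the relational or three-way variant the plan calls for: it has one sigmoid hidden layer, a linear output, masking noise, and is trained by stochastic gradient descent on squared reconstruction error. The class name, layer sizes, learning rate and noise level are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class DenoisingAutoencoder:
    """Single-hidden-layer denoising autoencoder (illustrative, not relational)."""

    def __init__(self, n_in, n_hidden, seed=0):
        self.rng = random.Random(seed)
        r = self.rng
        self.W1 = [[r.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [[r.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def decode(self, h):
        return [sum(w * hi for w, hi in zip(row, h)) + b
                for row, b in zip(self.W2, self.b2)]

    def train_step(self, x, lr=0.1, noise=0.2):
        """One SGD step: corrupt x, reconstruct, compare against the CLEAN x."""
        x_tilde = [0.0 if self.rng.random() < noise else xi for xi in x]
        h = self.encode(x_tilde)
        y = self.decode(h)
        err = [yi - xi for yi, xi in zip(y, x)]  # gradient of 0.5 * sum((y - x)^2)
        # Back-propagate to the hidden pre-activations before touching W2.
        dh = [sum(err[i] * self.W2[i][j] for i in range(len(x))) * h[j] * (1.0 - h[j])
              for j in range(len(h))]
        for i in range(len(x)):
            for j in range(len(h)):
                self.W2[i][j] -= lr * err[i] * h[j]
            self.b2[i] -= lr * err[i]
        for j in range(len(h)):
            for k in range(len(x)):
                self.W1[j][k] -= lr * dh[j] * x_tilde[k]
            self.b1[j] -= lr * dh[j]


def mean_reconstruction_error(ae, data):
    """Average squared reconstruction error on uncorrupted inputs."""
    total = 0.0
    for x in data:
        y = ae.decode(ae.encode(x))
        total += 0.5 * sum((yi - xi) ** 2 for yi, xi in zip(y, x))
    return total / len(data)


def train(ae, data, epochs=200):
    for _ in range(epochs):
        for x in data:
            ae.train_step(x)
```

In the TrAE setting the input vector would presumably be built from source and target task samples rather than the single vector `x` used here. How those are paired is exactly what the relational variant has to address.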

Transfer – Target Learning Agent

  1. Set up an RL agent that uses an instance-based algorithm (Least Squares Policy Iteration and Fitted-Q Iteration have both been used).
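
For reference, the core loop of Fitted-Q Iteration can be sketched as below. This is a simplified illustration under stated assumptions: the "fit" step averages targets per (state bin, action) cell instead of training a real regressor, the batch layout `(state, action, reward, next_state, done)` is hypothetical, and `chain_batch` is a toy five-state chain invented only to exercise the loop.

```python
def fitted_q_iteration(batch, n_actions, gamma=0.9, n_iters=50, key=lambda s: s):
    """Batch FQI with a per-cell averaging regressor (a stand-in for a real one).

    batch: list of (state, action, reward, next_state, done) tuples.
    key:   maps a state to a hashable bin; identity works for discrete states.
    Returns a dict mapping (bin, action) to the estimated Q-value.
    """
    q = {}
    for _ in range(n_iters):
        sums, counts = {}, {}
        for s, a, r, s2, done in batch:
            # Regression target: r + gamma * max_a' Q_k(s', a').
            if done:
                target = r
            else:
                target = r + gamma * max(q.get((key(s2), b), 0.0) for b in range(n_actions))
            cell = (key(s), a)
            sums[cell] = sums.get(cell, 0.0) + target
            counts[cell] = counts.get(cell, 0) + 1
        # "Fit" step: Q_{k+1} is the mean target in each cell.
        q = {cell: sums[cell] / counts[cell] for cell in sums}
    return q


def chain_batch(n_states=5):
    """Every transition of a toy chain: action 0 moves left, action 1 moves right.

    Reaching the last state gives reward 1 and ends the episode.
    """
    batch = []
    for s in range(n_states - 1):
        for a in (0, 1):
            s2 = max(0, s - 1) if a == 0 else s + 1
            done = s2 == n_states - 1
            batch.append((s, a, 1.0 if done else 0.0, s2, done))
    return batch
```

Because the algorithm only needs a batch of transitions, samples mapped from the source task can simply be added to `batch`, which is what makes instance-based methods a natural fit for instance transfer.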

Objective 1 provides the basic framework for completing Objective 2: the mechanisms for RL and the means of learning an intertask mapping function.

Towards Objective 2

  1. PUSH Task – Agent must learn to push a block to a goal
  2. PUSH STICKS – Agent must learn to use sticks as extensions to push a block to a goal
  3. HERD Task – Agent (shepherd) must learn to use commands to direct another agent (herding dog) to herd other agents (sheep)
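
The tasks above are not yet specified in detail. Purely as an illustration, the PUSH task could be prototyped as a small grid world like the sketch below. Everything in it is an assumption: the grid size, the start positions, the reward of -1 per step, the rule that the block cannot be pushed through a wall, and the names `PushTask`, `MOVES`, `step` and `observe`.

```python
from dataclasses import dataclass

# Hypothetical action set: each action moves the agent one cell.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


@dataclass
class PushTask:
    """Toy grid-world version of PUSH: walk into the block to push it."""
    width: int = 5
    height: int = 5
    agent: tuple = (0, 2)
    block: tuple = (2, 2)
    goal: tuple = (4, 2)

    def _inside(self, cell):
        x, y = cell
        return 0 <= x < self.width and 0 <= y < self.height

    def observe(self):
        """State vector: agent x, agent y, block x, block y."""
        return (*self.agent, *self.block)

    def step(self, action):
        """Apply an action; returns (observation, reward, done)."""
        dx, dy = MOVES[action]
        target = (self.agent[0] + dx, self.agent[1] + dy)
        if not self._inside(target):
            return self.observe(), -1.0, False  # bumped into a wall
        if target == self.block:
            pushed = (self.block[0] + dx, self.block[1] + dy)
            if not self._inside(pushed):
                return self.observe(), -1.0, False  # block is against a wall
            self.block = pushed
        self.agent = target
        done = self.block == self.goal
        return self.observe(), (0.0 if done else -1.0), done
```

PUSH STICKS and HERD would need extra state (stick positions, or the dog and sheep positions), which is where the change in state representation between tasks comes from.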

Other objectives, or shortcomings that it would be nice to address:

  1. Include the reward component in instance samples.
  2. Methods of transfer (instance, parameter, representation) must match the target learning agent.  It would be nice to find some way to generalize the transfer.

