Draft of Analysis of Relational Autoencoder

Dissertation, Updates
From the code used in the paper "Gradient-based learning of higher-order image features" by Roland Memisevic, I've diagrammed the structure of the relational autoencoder.  Note that the input takes the form of corrupted samples from two different sources (X and Y).  These are mapped to a hidden layer via a 3rd-order tensor (W, decomposed into the factor matrices wxf, wyf, and whf_in).  On the other side of the hidden layer, the hidden activations are split according to the activations of the inputs and their weights.  The actual output corresponding to an input is the dot product of the multiplicative activation of the hidden units and the input factors with the transpose of the weights for that input.  The form of the reconstructed output depends on the type of output needed (i.e., binary or real).  [1] R. Memisevic, “Gradient-based learning of higher-order image features,”…
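To make the data flow concrete, here's a minimal numpy sketch of the factored forward and reconstruction passes, using the factor-matrix names from the diagram (wxf, wyf, whf_in). The output-side weight name whf_out and all shapes are illustrative assumptions on my part, not Memisevic's exact implementation:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Assumed shapes: x (nx,), y (ny,), wxf (nx, nf), wyf (ny, nf),
# whf_in and whf_out (nh, nf). whf_out is a hypothetical name
# for the output-side hidden weights.
def hidden(x, y, wxf, wyf, whf_in):
    fx = x @ wxf                           # project corrupted X onto the factors
    fy = y @ wyf                           # project corrupted Y onto the factors
    return sigmoid((fx * fy) @ whf_in.T)   # multiplicative interaction -> hidden units

def reconstruct_y(x, h, wxf, whf_out, wyf):
    fx = x @ wxf                           # factor activation of the X input
    fh = h @ whf_out                       # factor activation of the hidden layer
    return (fx * fh) @ wyf.T               # real-valued output; wrap in sigmoid if binary
```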
Notes from Lazaric – Transfer in Reinforcement Learning

Annotations, Dissertation
See more bibliography here.

Abstract: "Transfer in reinforcement learning is a novel research area that focuses on the development of methods to transfer knowledge from a set of source tasks to a target task. Whenever the tasks are similar, the transferred knowledge can be used by a learning algorithm to solve the target task and significantly improve its performance (e.g., by reducing the number of samples needed to achieve a nearly optimal performance). In this chapter we provide a formalization of the general transfer problem, we identify the main settings which have been investigated so far, and we review the most important approaches to transfer in reinforcement learning."

Finds: This paper presents a formal framework for transfer in reinforcement learning, differentiating the algorithmic approaches by the kind of knowledge transferred: instances, representation, or parameters. Goal: "identify the characteristics…

Development Plan

Dissertation, Updates
This post generally outlines my development plan.  The plan is composed around two objectives:

1. TrAE - Develop a framework to learn an intertask mapping function, similar to the Reinforcement Transfer Learning framework [Ammar], based on building common feature subspaces between tasks/domains.  Rather than using a three-way Restricted Boltzmann Machine, where training is based on an energy function, use a denoising autoencoder, where training is based on reconstruction error (see the sketch below).
2. Extend the use of this framework towards cross-modal transfer using a custom set of tasks.

Towards Objective 1:

Source Learning Agents and Basic Tasks - Set up the Mountain Car and Cart Pole tasks, and set up an agent that can learn them.  This agent will provide training and transfer samples for the TrAE (Transfer AutoEncoder).

Intertask Mapping - TrAE - Transfer Agent - Setup the…
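For reference on the reconstruction-error objective in Objective 1, here's a minimal numpy sketch of one denoising-autoencoder training step. The corruption level, layer sizes, tied weights, and plain SGD update are illustrative assumptions, not the planned TrAE design:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

nx, nh, lr, noise = 8, 4, 0.1, 0.3           # illustrative sizes (assumptions)
W = rng.normal(0, 0.1, size=(nx, nh))        # tied weights: encoder W, decoder W.T
bh, bx = np.zeros(nh), np.zeros(nx)

def dae_step(x):
    """One SGD step: corrupt -> encode -> decode -> minimize reconstruction error."""
    global W, bh, bx
    x_tilde = x * (rng.random(nx) > noise)   # masking corruption of the input
    h = sigmoid(x_tilde @ W + bh)            # encode the corrupted input
    x_hat = sigmoid(h @ W.T + bx)            # decode back to input space
    err = x_hat - x                          # compare against the CLEAN input
    d_out = err * x_hat * (1 - x_hat)        # backprop through the output sigmoid
    d_h = (d_out @ W) * h * (1 - h)          # backprop into the hidden layer
    W -= lr * (np.outer(x_tilde, d_h) + np.outer(d_out, h))
    bx -= lr * d_out
    bh -= lr * d_h
    return 0.5 * np.sum(err ** 2)            # squared reconstruction error
```

Unlike the three-way RBM's energy-based training, everything here is driven by the gradient of that reconstruction error.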
Modality for Reinforcement Learning

Definitions, Dissertation, Research
In this post, I define modality within the context of a reinforcement learning agent.  As a neurophysiological concept, sensory modalities are fairly intuitive: hearing and vision are a typical example of two different sensory modalities.  Motor modalities are a little less intuitive, but not difficult to understand; a motor modality may simply be any particular pattern of activated motor neurons during a movement.  But when talking about this in terms of a reinforcement learning agent, we have to use the appropriate formalism, and for a reinforcement learning agent that formalism is the Markov Decision Process.  In an MDP, the analogues for sensors and motors lie within the definitions of states and actions.  States and actions are typically defined as vectors, and the components of those vectors are called variables or features.  I will use the…
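As a small illustration of that definition (with hypothetical feature names, not drawn from any particular task), a modality can be encoded as a named subset of a state vector's feature indices:

```python
import numpy as np

# Hypothetical 5-feature state vector
state = np.array([0.4, -1.2, 0.9, 0.0, 1.0])

# Modalities as named subsets of the state-feature indices
modalities = {
    "position": [0, 1],   # e.g., spatial sensors
    "velocity": [2, 3],   # e.g., rate sensors
    "contact":  [4],      # e.g., a touch-like sensor
}

def read_modality(state, name):
    """Return only the state features belonging to one sensory modality."""
    return state[modalities[name]]

print(read_modality(state, "velocity"))  # -> [0.9 0.]
```

The same idea applies on the motor side, with action-vector components grouped into motor modalities.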

Thoughts on Modality in RL

Dissertation, Updates
I've been thinking about what modality means in terms of a reinforcement learning agent.  I had initially thought about subsets of state features and action features, but this doesn't cover changes in kinematic structure or transition functions.  Perhaps that is okay: if I consider this from a purely neurophysiological standpoint, then the analogues are straightforward.
Status of code writing for Dissertation

Dissertation, Updates
Current Status

2015-06-24: I've got the Gated AutoEncoder (3-way) script currently running on their example data (http://www.cs.toronto.edu/~rfm/code/rae/index.html).  I need to get familiar with this code.

2015-06-21, 22:49 hrs: I found a Java package from Brown called BURLAP that will work with my current setup (i.e., the RL-Glue arrangement).  It has Fitted Value Iteration as well as Least Squares Policy Iteration (a sketch of LSPI's core solve appears below).  LSPI was used by Ammar in his research on learning an autonomous intertask mapping function using a three-way RBM; that paper is foundationally a jump-off point.  My research will partly be in the use of a 3-way denoising autoencoder to build a common feature subspace between two tasks, which will play a major role as a shared feature node in a hierarchical structure.  I have also found a 3-way denoising auto-encoder…
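Since LSPI is central to Ammar's approach, here's a minimal numpy sketch of the LSTDQ solve at the heart of LSPI. The sample format, feature map, and ridge term are illustrative assumptions; in practice this would come from BURLAP rather than hand-rolled code:

```python
import numpy as np

def lstdq(samples, phi, policy, gamma=0.99, reg=1e-6):
    """Fit weights w so that Q(s, a) ~= phi(s, a) @ w.

    samples: list of (s, a, r, s_next) transitions
    phi:     feature map, phi(s, a) -> np.ndarray of length k
    policy:  greedy policy, s -> action, under the current Q estimate
    """
    k = len(phi(samples[0][0], samples[0][1]))
    A = reg * np.eye(k)                        # small ridge term keeps A invertible
    b = np.zeros(k)
    for s, a, r, s_next in samples:
        f = phi(s, a)
        f_next = phi(s_next, policy(s_next))   # features of the on-policy successor
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    return np.linalg.solve(A, b)               # weights of the approximate Q-function
```

LSPI proper alternates this solve with greedy policy improvement until the weights stop changing.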