Notes from Ammar/Mocanu – Automatically Mapped Transfer Between Reinforcement Learning Tasks via Three-Way Restricted Boltzmann Machines


Annotations
Citation
H. Ammar and D. Mocanu, "Automatically Mapped Transfer Between Reinforcement Learning Tasks via Three-Way Restricted Boltzmann Machines," Mach. Learn. …, 2013.

Abstract
Existing reinforcement learning approaches are often hampered by learning tabula rasa. Transfer for reinforcement learning tackles this problem by enabling the reuse of previously learned results, but may require an inter-task mapping to encode how the previously learned task and the new task are related. This paper presents an autonomous framework for learning inter-task mappings based on an adaptation of restricted Boltzmann machines. Both a full model and a computationally efficient factored model are introduced and shown to be effective in multiple transfer learning scenarios.

Quotes & Notes
Re: Random or not — "Unfortunately, learning in this model cannot be done with normal CD. The main reason is that if…"
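For context on the model the quote is about: a factored three-way RBM ties an input layer x, an output layer y, and a hidden layer h together through a shared set of factors. A sketch of the energy function in the style of Memisevic-type factored models (my notation — the weight symbols and bias terms are illustrative, not copied from the paper):

```latex
E(\mathbf{x},\mathbf{y},\mathbf{h}) \;=\;
  -\sum_{f}\Big(\sum_i x_i\, w^{x}_{if}\Big)
           \Big(\sum_j y_j\, w^{y}_{jf}\Big)
           \Big(\sum_k h_k\, w^{h}_{kf}\Big)
  \;-\;\sum_j b_j y_j \;-\;\sum_k c_k h_k
```

Because x and y are both visible layers coupled multiplicatively through the factors, the conditional structure differs from a standard RBM, which is the kind of obstacle to plain CD the truncated quote alludes to.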
Random or Not?


Updates
One of the basic questions that needs to be answered about using the autoencoder architecture to learn a mapping function between two domains is one of randomness: what model is the autoencoder actually learning? Do I have to pair correlated SARS samples together as input, or can I, as with a probabilistic model such as Ammar's TrRBM, introduce pairs randomly?
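To make the two options concrete, here is a minimal sketch (hypothetical helper code, not from any of the cited papers) of the two ways training pairs could be assembled from source-task and target-task SARS samples:

```python
import random

def correlated_pairs(source_samples, target_samples):
    """Pair the i-th source SARS tuple with the i-th target tuple,
    assuming the two trajectories were collected in correspondence."""
    return list(zip(source_samples, target_samples))

def random_pairs(source_samples, target_samples, n_pairs, seed=0):
    """Pair samples uniformly at random, the way a probabilistic model
    like the TrRBM is claimed to tolerate."""
    rng = random.Random(seed)
    return [(rng.choice(source_samples), rng.choice(target_samples))
            for _ in range(n_pairs)]
```

The open question is whether a deterministic autoencoder trained on `random_pairs` output learns anything like the inter-task map it would learn from `correlated_pairs`.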
Notes from Memisevic – Gradient-based learning of higher-order image features


Annotations, Dissertation, Research
Citation
Memisevic, Roland. "Gradient-Based Learning of Higher-Order Image Features." Proceedings of the IEEE International Conference on Computer Vision (November 2011): 1591–1598. doi:10.1109/ICCV.2011.6126419.

Abstract
Recent work on unsupervised feature learning has shown that learning on polynomial expansions of input patches, such as on pair-wise products of pixel intensities, can improve the performance of feature learners and extend their applicability to spatio-temporal problems, such as human action recognition or learning of image transformations. Learning of such higher-order features, however, has been much more difficult than standard dictionary learning, because of the high dimensionality and because standard learning criteria are not applicable. Here, we show how one can cast the problem of learning higher-order features as the problem of learning a parametric family of manifolds. This allows us to apply a variant…
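As a rough picture of the model class, a factored gated autoencoder relates two inputs x and y through multiplicative factor interactions: mapping units respond to products of factor activations, and y can be reconstructed from x plus the inferred mapping. The numpy sketch below is my own rendering of that architecture with made-up dimensions and weight names (U, V, W); it illustrates the forward pass only, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_y, n_fac, n_map = 16, 16, 32, 8  # illustrative sizes

# factor weights for each input side, and mapping-unit weights
U = rng.normal(scale=0.1, size=(n_fac, n_x))   # x-side factors
V = rng.normal(scale=0.1, size=(n_fac, n_y))   # y-side factors
W = rng.normal(scale=0.1, size=(n_map, n_fac)) # mapping units

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def infer_mapping(x, y):
    """Mapping units see element-wise products of factor activations."""
    return sigmoid(W @ ((U @ x) * (V @ y)))

def reconstruct_y(x, m):
    """Reconstruct y from x and the inferred mapping units."""
    return V.T @ ((W.T @ m) * (U @ x))

x = rng.normal(size=n_x)
y = rng.normal(size=n_y)
m = infer_mapping(x, y)     # shape (n_map,)
y_hat = reconstruct_y(x, m) # shape (n_y,)
```

Training would minimize reconstruction error of y (and symmetrically x) by gradient descent on U, V, W, which is the "gradient-based learning" the title refers to.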
Status of code writing for Dissertation


Dissertation, Updates
Current Status

2015-06-24: I've got the Gated AutoEncoder (3-way) script currently running on their example data (http://www.cs.toronto.edu/~rfm/code/rae/index.html). I need to get familiar with this code.

2015-06-21, 22:49 hrs: I found a Java package from Brown called BURLAP that will work with my current setup (i.e., the RL-Glue arrangement). It has Fitted Value Iteration as well as Least-Squares Policy Iteration. LSPI was used by Ammar in his research on learning an autonomous inter-task mapping function using a three-way RBM; that paper is foundationally a jumping-off point. My research will partly be in the use of a 3-way denoising autoencoder to build a common feature subspace between two tasks, which will play a major role as a shared feature node in a hierarchical structure. I have also found a 3-way denoising auto-encoder…
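Since LSPI keeps coming up, a one-screen reminder of its core LSTDQ step (a generic sketch of the standard formulation, not BURLAP's API): given features Φ of the sampled (s, a) pairs and Φ′ of the successor pairs under the current policy, the Q-weights solve A w = b with A = Φᵀ(Φ − γΦ′) and b = Φᵀ r.

```python
import numpy as np

def lstdq(phi, phi_next, rewards, gamma=0.95, ridge=1e-6):
    """One LSTDQ solve: weights w such that phi @ w approximates Q
    under the policy that generated phi_next.

    phi, phi_next: (n_samples, n_features) feature matrices
    rewards:       (n_samples,) observed rewards
    ridge:         small regularizer keeping A invertible
    """
    A = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ rewards
    A += ridge * np.eye(A.shape[0])
    return np.linalg.solve(A, b)

# Toy sanity check: single constant feature, constant reward 1,
# self-looping successor, so Q should approach 1 / (1 - gamma).
phi = np.ones((50, 1))
w = lstdq(phi, phi, np.ones(50), gamma=0.9)
```

Full LSPI wraps this in a loop: re-derive the greedy policy from w, recompute phi_next under it, and solve again until the policy stabilizes.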