Status of code writing for Dissertation


Current Status

  • 2015-06-24 : I’ve got the Gated Autoencoder (3-way) script running on its authors’ example data (http://www.cs.toronto.edu/~rfm/code/rae/index.html). I need to get familiar with this code; a minimal sketch of the model appears after this list.
  • 2015-06-21 : 22:49 hrs : I found a Java package from Brown called BURLAP that will work with my current setup (i.e., the RL-Glue arrangement). It has Fitted Value Iteration as well as Least-Squares Policy Iteration. LSPI was used by Ammar in his research on learning an autonomous intertask mapping function using a three-way RBM, and that paper is a foundational jumping-off point. Part of my research will be using a 3-way denoising autoencoder to build a common feature subspace between two tasks; that subspace will play a major role as a shared feature node in a hierarchical structure.
  • I have also found a 3-way denoising autoencoder Python script. It was used to relate pairs of images.
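For my own reference, here is a minimal NumPy sketch of the factored gated (3-way) autoencoder idea. The weight names (wxf, wyf, whf) echo the convention in Memisevic’s code, but the shapes, initialization, and forward pass below are my simplified assumptions, not the actual script:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class FactoredGatedAE:
    """Sketch of a factored gated autoencoder relating inputs x and y."""

    def __init__(self, nx, ny, nfactors, nmap, rng=np.random):
        self.wxf = 0.01 * rng.randn(nx, nfactors)    # x-to-factor weights
        self.wyf = 0.01 * rng.randn(ny, nfactors)    # y-to-factor weights
        self.whf = 0.01 * rng.randn(nmap, nfactors)  # mapping-to-factor weights

    def mappings(self, x, y):
        # The 3-way interaction: factor responses are the elementwise
        # product of the two projected inputs, pooled by the mapping units.
        fx = x.dot(self.wxf)
        fy = y.dot(self.wyf)
        return sigmoid((fx * fy).dot(self.whf.T))

    def reconstruct_y(self, x, h):
        # Given x and inferred mapping units h, reconstruct y; training
        # would minimize the squared error of this reconstruction.
        fx = x.dot(self.wxf)
        fh = h.dot(self.whf)
        return (fx * fh).dot(self.wyf.T)
```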

History

  • 2015-05-24 : 21:52 hrs : I was able to get the Sarsa(lambda) agent with the Fourier basis up and running. It appears to solve the Mountain Car task within a few episodes, which is consistent with the graph on the wiki page for the agent. (A sketch of the features is below this list.)
  • 2015-05-24 : 21:30 hrs : Using code from RL-Library / RL-Glue, I’m able to successfully capture any number of <s,a,s’,r> quadruples from the Mountain Car and Cart Pole environments.
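As a reminder of how those Fourier-basis features are built (following Konidaris et al.’s construction): the state bounds below are the standard Mountain Car ones, while the basis order and the note on Sarsa(lambda) are illustrative assumptions rather than the exact settings of the library agent:

```python
import itertools
import numpy as np

POS_RANGE = (-1.2, 0.6)    # standard Mountain Car position bounds
VEL_RANGE = (-0.07, 0.07)  # standard Mountain Car velocity bounds
ORDER = 3                  # basis order (assumed for illustration)

# All integer coefficient vectors c with entries in {0, ..., ORDER}.
COEFFS = np.array(list(itertools.product(range(ORDER + 1), repeat=2)))

def fourier_features(pos, vel):
    # Scale each state variable to [0, 1], then phi_i(s) = cos(pi * c_i . s).
    s = np.array([
        (pos - POS_RANGE[0]) / (POS_RANGE[1] - POS_RANGE[0]),
        (vel - VEL_RANGE[0]) / (VEL_RANGE[1] - VEL_RANGE[0]),
    ])
    return np.cos(np.pi * COEFFS.dot(s))

# Q(s, a) is then linear in these features: theta[a].dot(fourier_features(...)),
# with theta updated by the usual Sarsa(lambda) eligibility-trace rule.
```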

Next Steps

  1. Investigate BURLAP and implement an LSPI learning agent.
  2. Be able to save an agent after it has learned a good policy (or save the policy itself) so that I can later retrieve samples under that policy.
  3. Get a learning agent to learn the MC and CP tasks to some performance threshold.
  4. Implement a relational autoencoder and train it on samples from the MC and CP tasks.
  5. Set up a pipeline for Learning -> Sampling -> Transfer Function Learning -> Transfer -> Learning. (A rough skeleton follows this list.)
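Here is a rough Python skeleton of that pipeline, just to pin down the intended data flow; every function below is a placeholder for a component I still have to write:

```python
def learn(task, warm_start=None):
    """Placeholder: train an agent on `task`, optionally seeded with samples."""
    raise NotImplementedError

def sample(task, policy, n=1000):
    """Placeholder: roll out `policy` in `task`, return <s,a,s',r> quadruples."""
    raise NotImplementedError

def learn_intertask_mapping(source_samples, target_samples):
    """Placeholder: fit the 3-way denoising autoencoder between state spaces."""
    raise NotImplementedError

def run_pipeline(source_task, target_task, random_policy):
    source_policy = learn(source_task)                 # Learning
    src = sample(source_task, source_policy)           # Sampling
    tgt = sample(target_task, random_policy)
    mapping = learn_intertask_mapping(src, tgt)        # Transfer Function Learning
    transferred = [mapping.translate(q) for q in src]  # Transfer
    return learn(target_task, warm_start=transferred)  # Learning
```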

Goals

My ultimate goal is to create a working transfer-learning framework from RL in one modality to RL in another. I have some ideas about what “modality” means, and that will necessarily direct the tasks involved in testing, but I haven’t formalized the notion of modality yet.

In short, I want to build an autonomous Cross-modal Reinforcement Transfer Learning agent.

Informational

I’ve been working on getting code set up for doing reinforcement transfer learning. I last worked on being able to retrieve random samples from a task. It could be any task, but the ones I’ve selected are standard tasks.

[Figure: the Mountain Car environment]

The tasks are Mountain Car and Cart Pole (also known as the Inverted Pendulum).

Notes

These are mainly for me so I don’t lose my way.

  • Code for MountainCar can be found in my Dropbox under Code/dissertation/MountainCar. It will output the <s,a,s’,r> quadruple.
  • Code for the SimpleExperiment can be found under Code/dissertation/SimpleExperiment. This is responsible for running the experiment. For random sampling, we tell the other components to initialize and run one episode.
  • Code for the RandomAgent can be found under Code/dissertation/RandomAgent. As the name suggests, the agent simply picks a random action and returns it. (A minimal RL-Glue sketch follows this list.)
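A minimal sketch of such an agent against the RL-Glue Python codec; the hard-coded three actions (Mountain Car’s action set) and the other details are assumptions for illustration, not my actual RandomAgent file:

```python
import random

from rlglue.agent import AgentLoader
from rlglue.agent.Agent import Agent
from rlglue.types import Action

class RandomAgent(Agent):
    """Ignores observations and returns a uniformly random discrete action."""

    def agent_init(self, task_spec):
        # Hard-coded for Mountain Car's three actions; a fuller version
        # would parse the task-spec string instead.
        self.num_actions = 3

    def agent_start(self, observation):
        return self._random_action()

    def agent_step(self, reward, observation):
        return self._random_action()

    def agent_end(self, reward):
        pass

    def agent_cleanup(self):
        pass

    def agent_message(self, message):
        return "RandomAgent does not respond to messages."

    def _random_action(self):
        action = Action()
        action.intArray = [random.randrange(self.num_actions)]
        return action

if __name__ == "__main__":
    # Connects this agent to a running rl_glue core over the network.
    AgentLoader.loadAgent(RandomAgent())
```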
