- 2015-06-24 : I’ve got the Gated AutoEncoder (3-way) script currently running on the authors’ example data. (http://www.cs.toronto.edu/~rfm/code/rae/index.html) I need to get familiar with this code.
- 2015-06-21 : 22:49 hrs : I found a Java package from Brown called BURLAP that will work with my current setup (i.e. the RL-Glue arrangement). It has Fitted Value Iteration as well as Least-Squares Policy Iteration. LSPI was used by Ammar in his research on learning an autonomous intertask mapping function using a three-way RBM. That paper is foundational, a jumping-off point: my research will partly be in the use of a 3-way Denoising Autoencoder to build a common feature subspace between two tasks. That subspace will play a major role as a shared feature node in a hierarchical structure.
- I also found a 3-way denoising auto-encoder Python script. It was used to work with two images.
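To fix the idea of the 3-way model in my head, here is a minimal sketch of a factored gated autoencoder in NumPy. All the dimensions and weight names here are my own placeholders, not the actual rae code from the Toronto page: mapping units are driven by the elementwise product of two factored projections, and one input can be reconstructed from the other plus the mappings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: two 16-dim inputs, 32 factors, 20 mapping units.
n_x, n_y, n_f, n_m = 16, 16, 32, 20

# Factored weight matrices (small random init for the sketch).
Wx = rng.normal(0, 0.1, (n_f, n_x))
Wy = rng.normal(0, 0.1, (n_f, n_y))
Wm = rng.normal(0, 0.1, (n_m, n_f))

def encode(x, y):
    """Mapping units from the multiplicative interaction of x and y."""
    f = (Wx @ x) * (Wy @ y)                   # factor layer: elementwise product
    return 1.0 / (1.0 + np.exp(-(Wm @ f)))    # sigmoid mapping units

def decode_y(x, m):
    """Reconstruct y given x and the mappings (symmetric for x)."""
    return Wy.T @ ((Wx @ x) * (Wm.T @ m))

x = rng.normal(size=n_x)
y = rng.normal(size=n_y)
m = encode(x, y)
y_hat = decode_y(x, m)
print(m.shape, y_hat.shape)  # (20,) (16,)
```

Training (e.g. by denoising reconstruction error, as in the script I found) would sit on top of these two functions.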
- 2015-05-24 : 21:52 hrs : I was able to get the Sarsa(lambda) w/ Fourier series up and running. It appears to solve the Mountain Car task within a few episodes. This is consistent with the graph on the wiki page for the agent.
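For my own reference, the Fourier-basis feature construction underneath that agent looks roughly like this (my own minimal sketch, not the RL Library code; the Mountain Car state bounds are the standard ones): each feature is cos(pi * c . s) for one coefficient vector c, with the state normalized to [0, 1].

```python
import itertools
import numpy as np

def fourier_basis(order, dims):
    """All coefficient vectors c in {0..order}^dims; phi_i(s) = cos(pi * c_i . s)."""
    coeffs = np.array(list(itertools.product(range(order + 1), repeat=dims)))
    def phi(s):  # s must already be normalized to [0, 1]^dims
        return np.cos(np.pi * (coeffs @ s))
    return phi, len(coeffs)

# Mountain Car state: position in [-1.2, 0.6], velocity in [-0.07, 0.07].
lo = np.array([-1.2, -0.07])
hi = np.array([0.6, 0.07])
phi, n = fourier_basis(order=3, dims=2)

s = (np.array([-0.5, 0.0]) - lo) / (hi - lo)  # normalize a sample state
features = phi(s)
print(n, features.shape)  # 16 (16,)
```

Sarsa(lambda) then learns one weight vector per action over these features; the linear value is just a dot product.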
- 2015-05-24 : 21:30 hrs : Using code from RL Library / RL Glue, I’m able to successfully capture any number of <s,a,s’,r> quadruples from the Mountain Car and Cart Pole environments.
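The shape of that capture loop, stripped of RL-Glue plumbing, is simple enough to note down. The environment interface below (reset / actions / step) and the tiny ChainEnv stand-in are hypothetical, just to make the loop runnable; the real components are the RL-Glue Mountain Car and Cart Pole wrappers.

```python
import random

random.seed(0)

class ChainEnv:
    """Toy stand-in environment (hypothetical), only to exercise the loop."""
    def __init__(self, length=5):
        self.length = length
    def reset(self):
        self.s = 0
        return self.s
    def actions(self):
        return [-1, +1]
    def step(self, a):
        self.s = max(0, self.s + a)
        done = self.s >= self.length
        return self.s, -1.0, done

def collect_samples(env, n_episodes, max_steps=1000):
    """Collect <s, a, s', r> quadruples under a random-action policy."""
    samples = []
    for _ in range(n_episodes):
        s = env.reset()
        for _ in range(max_steps):
            a = random.choice(env.actions())
            s2, r, done = env.step(a)
            samples.append((s, a, s2, r))
            s = s2
            if done:
                break
    return samples

data = collect_samples(ChainEnv(), n_episodes=3)
print(len(data), data[0])
```

Swapping the random action choice for a saved policy gives the on-policy sampling I need later in the pipeline.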
- Investigate BURLAP and implement an LSPI learning agent.
- Need to be able to save an agent after it has learned a good policy (or save the policy) so that I can retrieve samples on that policy.
- Get a learning agent to learn in the MC and CP tasks to some threshold.
- Implement a relational auto-encoder and train on samples from MC and CP tasks.
- Set up a pipeline: Learning -> Sampling -> Transfer Function Learning -> Transfer -> Learning
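To make the data flow of that last item concrete, here is a stub sketch of the pipeline. Every function body here is a hypothetical placeholder (each stage is a component from the list above that still has to be written); only the stage order and what each stage consumes/produces are the point.

```python
# Hypothetical placeholder stages, just to make the data flow concrete.
def learn(task, init=None):          # -> policy (here: a plain dict)
    return {"task": task, "init": init}

def sample(task, policy, n=100):     # -> list of <s,a,s',r> quadruples
    return [((task, 0), 0, (task, 1), -1.0)] * n

def learn_transfer_fn(src, tgt):     # -> intertask mapping (the 3-way model's job)
    return {"pairs": (len(src), len(tgt))}

def transfer(policy, mapping):       # -> warm-start knowledge for the target task
    return {"from": policy["task"], "via": mapping}

def pipeline(source_task, target_task):
    src_policy = learn(source_task)                  # Learning (Sarsa(lambda) or LSPI)
    src = sample(source_task, src_policy)            # Sampling in the source
    tgt = sample(target_task, learn(target_task))    # short/random run in the target
    chi = learn_transfer_fn(src, tgt)                # Transfer Function Learning
    seed = transfer(src_policy, chi)                 # Transfer
    return learn(target_task, init=seed)             # Learning in the target, warm-started

result = pipeline("MountainCar", "CartPole")
print(result)
```

Each stub can then be replaced one at a time as the real components (Sarsa agent, sampler, 3-way autoencoder) come online, without changing the pipeline itself.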
My ultimate goal is to create a working transfer learning framework from RL in one modality to RL in another. I have some ideas about what modality means, and that will necessarily direct the tasks involved in testing. But I haven’t solidified the formality of modality yet.
Ultimately, I want to build an autonomous Cross-modal Reinforcement Transfer Learning agent.
I’ve been working on getting code set up for doing reinforcement transfer learning. I last worked on being able to retrieve random samples from a task. It could be any task, but the ones I’ve selected are standard tasks.
These are mainly for me so I don’t lose my way.
- Code for MountainCar can be found in my Dropbox under Code/dissertation/MountainCar. It will output the <s,a,s’,r> quadruple.
- Code for the SimpleExperiment can be found under Code/dissertation/SimpleExperiment. This is responsible for running the experiment. For random sampling, we tell the other components to initialize and run one episode.
- Code for the RandomAgent can be found under Code/dissertation/RandomAgent. As the name suggests, the agent simply picks a random action and returns it.
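The RandomAgent amounts to a few lines. This sketch mirrors the RL-Glue agent call pattern (agent_start / agent_step / agent_end) but is my own simplified version, not the actual Dropbox code; the discrete action set is an assumption.

```python
import random

class RandomAgent:
    """Minimal random agent mirroring the RL-Glue agent call sequence.

    Assumes a discrete action set; ignores rewards and observations entirely.
    """
    def __init__(self, actions):
        self.actions = list(actions)

    def agent_start(self, observation):
        return random.choice(self.actions)

    def agent_step(self, reward, observation):
        # No learning: pure random behavior, regardless of reward.
        return random.choice(self.actions)

    def agent_end(self, reward):
        pass

agent = RandomAgent(actions=[0, 1, 2])  # e.g. Mountain Car's three throttle actions
a = agent.agent_start(observation=None)
print(a in (0, 1, 2))  # True
```

Any learning agent I write later (Sarsa, LSPI) can keep this same interface, so the SimpleExperiment harness doesn’t need to change.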