Blog

Transfer Learning for SDAR models from small datasets

Dissertation, Machine Learning, Research
Developing representational and predictive models for SDAR (structural descriptor - activity relationships) on small datasets is a problem for in-silico modeling of compound efficacies in drug discovery and design. While large sets of toxicity data are available, information about the effect of a compound on a human activity endpoint (e.g. reduction of symptoms) comes from clinical trials data and reports in the market. The number of data points for efficacy is low compared to toxicity due, in part, to the relatively small number of drugs making it to market. The limited number of examples makes it difficult to train robust machine learning models, especially with techniques that traditionally require many observations. Using such techniques, however, is desirable because of potential non-linearities in the relationships. Therefore,…

A Test for Developer Candidates

javascript, mean, nodejs, UXUI, Work
This was developed for testing candidates for a junior developer position with MEAN-stack-like skills.

Introduction
Much of what we do is take lots of information and display it in some meaningful way. Sometimes that information will be from different sources and needs to be filtered, transformed, combined, and so on. For this exercise, we’ll look at your ability to consume information from web services, process it, and display it to a user.

The Domain
The NIH and the National Library of Medicine provide a web service called PUG that we can use for free to pull information about millions of different chemical compounds and substances. The tutorial for it can be found at the following URL: https://pubchem.ncbi.nlm.nih.gov/pug_rest/PUG_REST_Tutorial.html. In short, you can construct a URL to…

Yet Another BBQ Place in Conway

General, Restaurant Reviews
I like BBQ.  I like a sharp smoke flavor.  You can slice it, shred it, make it so tender that it falls off the bone, and I'll eat it up.  Dry-rubbed, sauced-up, spicy, tangy, sweet.  You name it - I like it.  If you do it right, present it right, and give me a good atmosphere to eat it - I'll be sure to sing your praises.  If you don't, then you're likely to get one of these; as it is with Fat Daddy's, the new BBQ restaurant that's opened up in Conway, Arkansas' downtown area.  Fat Daddy's comes to us from Russellville where they've enjoyed some success.  It was on the advice of an extended family member, and resident of that town, that I found myself trying it out.…

Categorizing Job Orders with a Naive Bayes Classifier

datascience, Machine Learning
Meridian Staffing has about 12,000 job orders from 2010 to present and each is assigned zero or more categories such as "Application Developer", "Project Manager", "Network Engineer", etc.  We regularly extract this job information from our Applicant Tracking System (Bullhorn) and load it into our Posse Analytics server for data analysis and reporting. Unfortunately, nearly 50% of these jobs are either not categorized or categorized as "Other Area(s)".  As MSS moves towards being a data-driven organization, categorization will inform activities like capacity and candidate pipeline planning.  As such, having good, clean data becomes  more and more important and we need to mitigate this issue. Naturally, the first line of approach is to address the source of the data.  But, while we may fix import processes and train people to correctly assign categories when entering…

JS : Splitting an array into batches

Work
In order to meet the limits on a REST API call, I needed to split a set of record IDs into batches the size of the call's limit. Since I was doing this in NodeJS, I worked it out functionally with the .reduce() method. [The reduce() method applies a function against an accumulator and each value of the array (from left-to-right) to reduce it to a single value.] In this case, my single value target was an array of arrays. The trick here was to use a little index magic and pass in an array of empty arrays as the initial value parameter, where the number of empty arrays equaled the desired number of batches. The effect is that 1, 3, 5, and 7 go into the first batch and 2, 4, 6, and 8 go into the…
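A minimal sketch of the approach described in this excerpt, not the code from the full post. The function name `splitIntoBatches` and the `limit` parameter are assumptions made for illustration; the "index magic" is taken to be the element's index modulo the number of batches, which matches the 1, 3, 5, 7 / 2, 4, 6, 8 split mentioned above.

```javascript
// Hypothetical helper: split `ids` into the fewest batches such that
// no batch holds more than `limit` items.
function splitIntoBatches(ids, limit) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  // Number of batches needed to stay within the per-call limit.
  const batchCount = Math.ceil(ids.length / limit);
  // Initial value: one empty array per batch.
  const emptyBatches = Array.from({ length: batchCount }, () => []);
  // Deal the items out round-robin: item i lands in batch (i mod batchCount).
  return ids.reduce((batches, id, index) => {
    batches[index % batchCount].push(id);
    return batches;
  }, emptyBatches);
}

console.log(splitIntoBatches([1, 2, 3, 4, 5, 6, 7, 8], 4));
// → [ [ 1, 3, 5, 7 ], [ 2, 4, 6, 8 ] ]
```

Because the items are dealt out round-robin, the batches are interleaved rather than contiguous. That is fine when the order of IDs within a call does not matter; if it does, slicing the array in steps of `limit` would be the simpler choice.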

Dissertation update – October 2015

Dissertation, Updates
Quick review: I have a working version of a relational auto-encoder and have used it to learn a transfer function between two reinforcement learning tasks. As has been done in other research, I've used state-action-state triplets as the training data. My hypothesis is that a relational auto-encoder will build a common feature space for the transition dynamics between the two different reinforcement learning domains. There are two problems that I'm trying to solve. Both deal with the learning algorithm for the interdomain mapping function. The first addresses how the data enters training and the second is in the characteristics of the data once run through the trained model.

Dealing with uncorrelated data
Currently, the relational auto-encoder learns on pairs of triplets presented together by using standard back-propagation. This approach could have…

Two-stage Learning Step for Backpropagation in a Relational Autoencoder

General
Working on gradient descent / backpropagation for a relational autoencoder. I'm not really sure this is needed yet, so I have to build the testing framework for it all. Separately, I'm implementing an RL agent that uses Least Squares Policy Iteration to learn.

Let X and Y be two sets of training examples of sizes N and M respectively
Select $latex x \in X $ and $latex y \in Y $ randomly
Feed $latex (x,y)$ forward
Step 1 - Train X
  Hold w and x constant and minimize cost by treating y' as a parameter
  Calculate error on x side of cost function
  Backprop error
  Update W (and b) only on the X side
Step 2 - Train Y
  Hold w and y constant and minimize cost by treating x' as a…

Preview Problem – Share the beer

General
Recently, I shared a problem on Facebook.  It's typically called the "sharing wine" problem, but my friend, Cyrus, thought it'd be better with beer.  I agree, so I've modified these slightly. It went something like this: We have three containers of different sizes, 30L / 11L / 7L.  The 30L is filled with beer.  Empty exactly half of the 30L using only the 11L and 7L containers. I'll introduce some notation here so that we can talk about the answer.  Let's say that we create a triple (a, b, c) indicating the amount of beer in each container at any one time.  Say that we order these largest to smallest for convenience.  Let a = the 30L container, b = the 11L container, c = the 7L container. So, starting out we have (30,0,0).  Cyrus…
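As a rough sketch of how the (a, b, c) notation can be explored mechanically, the following breadth-first search walks the reachable states from (30,0,0). It is not the solution from the full post. The names `CAPACITY`, `pour`, and `solve` are made up for illustration, and the goal is assumed to be any state where the 30L container holds exactly 15L. Each move pours one container into another until the source is empty or the destination is full.

```javascript
// Capacities ordered as in the post: a = 30L, b = 11L, c = 7L.
const CAPACITY = [30, 11, 7];

// Pour from container `from` into container `to` until the source is empty
// or the destination is full. Returns a new state; the input is not mutated.
function pour(state, from, to) {
  const amount = Math.min(state[from], CAPACITY[to] - state[to]);
  const next = state.slice();
  next[from] -= amount;
  next[to] += amount;
  return next;
}

// Breadth-first search from (30,0,0) to the first state where a === target.
// Returns the list of states visited along a shortest sequence of pours,
// or null if no such state is reachable.
function solve(target = 15) {
  const start = [30, 0, 0];
  // Maps each seen state (as a string key) to the state it was reached from.
  const cameFrom = new Map([[start.join(","), null]]);
  const queue = [start];
  while (queue.length > 0) {
    const state = queue.shift();
    if (state[0] === target) {
      // Walk the predecessor links back to the start to rebuild the path.
      const path = [];
      for (let s = state; s !== null; s = cameFrom.get(s.join(","))) {
        path.unshift(s);
      }
      return path;
    }
    for (let from = 0; from < 3; from++) {
      for (let to = 0; to < 3; to++) {
        if (from === to) continue;
        const next = pour(state, from, to);
        const key = next.join(",");
        if (!cameFrom.has(key)) {
          cameFrom.set(key, state);
          queue.push(next);
        }
      }
    }
  }
  return null;
}

// Print each state as a triple, e.g. (30,0,0).
for (const [a, b, c] of solve()) {
  console.log(`(${a},${b},${c})`);
}
```

Breadth-first search is used here because it returns a sequence with the fewest pours. The state space is tiny, so no further optimization is needed.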
