Brains are good, but not that good.

Machine Learning, Research
What’s wrong with the idea that the brain is an excellent learning machine? In the machine learning and artificial intelligence research literature, we almost invariably see the argument that we need better algorithms that can learn from fewer training examples or that are one-shot learners. Along with that argument typically comes the assertion that humans learn well from few (or single) training examples. The problem is that people forget the neurophysiology of the brain. Notably, neural firing within the brain is not constant. At both the cellular level and the macro/network level of neurons, neuronal activity is continuously oscillating. If we’re looking at connectivity between neurons, we may count each oscillation as a training example. Naturally, the phrase “what fires together, wires together” comes to mind. Say we get a “single five-second…

Transfer Learning for SDAR models from small datasets

Dissertation, Machine Learning, Research
Developing representational and predictive models for SDAR (structural descriptor - activity relationships) on small datasets is a problem for in-silico modeling of compound efficacies in drug discovery and design. While large sets of toxicity data are available, information about a compound's effect on a human activity endpoint (e.g., reduction of symptoms) comes from clinical trial data and reports from the market. The number of data points for efficacy is low relative to toxicity, due in part to the relatively small number of drugs that make it to market. The limited number of examples makes it difficult to train robust machine learning models, especially with techniques that traditionally require many observations. Using such techniques, however, is desirable because of potential non-linearities in the relationships. Therefore,…

Categorizing Job Orders with a Naive Bayes Classifier

datascience, Machine Learning
Meridian Staffing has about 12,000 job orders from 2010 to present, and each is assigned zero or more categories such as "Application Developer", "Project Manager", "Network Engineer", etc. We regularly extract this job information from our Applicant Tracking System (Bullhorn) and load it into our Posse Analytics server for data analysis and reporting. Unfortunately, nearly 50% of these jobs are either not categorized or categorized as "Other Area(s)". As MSS moves towards being a data-driven organization, categorization will inform activities like capacity and candidate pipeline planning. As such, having good, clean data becomes more and more important, and we need to mitigate this issue. Naturally, the first approach is to address the source of the data. But while we may fix import processes and train people to correctly assign categories when entering…