Talks as slide-show pdfs
Scaled machine learning solutions often address problems of volume and velocity, but require weeks of effort to analyze each new dataset. True predictive platforms need to effortlessly handle many datasets, each with its own feature set, sparsity level, and underlying structure. We’ll describe an open-source inference engine and query layer that together constitute a Predictive Database offering a SQL-like interface for Bayesian queries. This talk will cover predictive database workflows, probabilistic models, and our recently open-sourced predictive database engine, Loom.
(2014:11, at SF Bayarea Machine Learning Meetup)
Scaled machine learning solutions often address problems of volume and velocity in B2C applications, but require weeks of effort to analyze each new dataset. In B2B applications, solutions need to scale over thousands of customers each with small-to-medium sized datasets. We propose scaling to thousands of datasets by building a Predictive Database that offers a SQL-like interface for Bayesian queries. This talk will cover predictive database workflows, probabilistic models, and our recently open-sourced predictive database engine, Loom.
(2014:09, at USF Analytics Seminar)
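The abstracts above mention a SQL-like interface for Bayesian queries without showing one; Loom's actual query language is not reproduced here. As a toy illustration of what a "Bayesian query" over a tabular dataset can mean, the sketch below answers the hypothetical query "PREDICT bought GIVEN color = 'red'" with a posterior predictive probability under a uniform (Laplace) prior. The table, column names, and `predict_prob` helper are all invented for illustration.

```python
from collections import Counter

# Toy table: each row is a dict of column values (columns may be missing).
rows = [
    {"color": "red", "bought": True},
    {"color": "red", "bought": True},
    {"color": "blue", "bought": False},
    {"color": "red", "bought": False},
    {"color": "blue", "bought": False},
]

def predict_prob(rows, target, value, given):
    """Posterior predictive P(target == value | given) under a
    uniform (Laplace) prior over the target's observed support."""
    matching = [r for r in rows
                if all(r.get(k) == v for k, v in given.items())
                and target in r]
    support = {r[target] for r in rows if target in r}
    counts = Counter(r[target] for r in matching)
    return (counts[value] + 1) / (len(matching) + len(support))

# The query "PREDICT bought GIVEN color = 'red'" as a function call:
p = predict_prob(rows, "bought", True, {"color": "red"})
# Two of the three matching rows have bought=True, smoothed to 3/5 = 0.6.
```

A real predictive database replaces this row-counting model with learned probabilistic models per column, so queries remain answerable even for value combinations never observed in the data.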
This talk develops the notion of a probabilistic program, with an extended example in multi-target tracking, followed by a generalization to arbitrary recursive programs. This talk essentially covers and links my Master's and Ph.D. research.
(2009:04, at Toyon Research Corporation)
Ph.D. thesis defense.
(2009:03, at Carnegie-Mellon)
What mathematical statements are meaningful? What mathematical statements are true? Conservatively, only proven facts are true: only what can be deduced from assumptions. And assumptions come from science, the logic of what "might be true" and has not yet been falsified. Just as Gödel's first incompleteness theorem limits how much can be proven, an analogous theorem limits what can be scientifically tested, i.e., how much can be meaningful.
In this talk, I will formulate both incompleteness theorems and discuss their relation to mathematics and to science, respectively. I will also suggest some heuristics for guessing what "might be true" (step 1 of the scientific method) and prove that they are all doomed to fail.
(2008:10, at Carnegie-Mellon's Math Grad Student Seminar)
Part 1. Overview of randomness in programming languages. Probability valuations on topologies instead of measures on sigma-algebras. Plotkin's probabilistic powerdomain.
Part 2. Problems with randomness and parallelism. Lattice models and uncertainty. A lattice model with local probability spaces?
(2008:04, at Carnegie-Mellon's Math Logic Seminar)
Part 1:
Slides for online viewing,
or without pauses for printing.
Part 2:
Slides for online viewing,
or without pauses for printing.
References and Background Reading:
Which programs are simple? Which programming languages are simple? We address these questions by considering the dual spaces of programs and languages:
By treating the space of programs semantically, we can build much larger maps than would be possible by treating programs syntactically. By restricting to a well-behaved class of languages (combinatory algebras), we can define a single space of programs across all languages. The space of languages then has very nice structure: it is a smooth manifold. We describe how to "fit" a language to training data in the form of useful or commonly-used programs.
(2007:07, at Stephen Wolfram's 2007 NKS Conference)
Slides and pretty datasets.
Dana Scott observed that in a particular model of lambda-calculus with join, the CCC of closures has a very rich structure, modelling polymorphic types, dependent types, subtyping and power-typing, a universal type, and, most importantly, "atomic" types like unit and bool. We demonstrate that these atoms are also definable in the term fragment of this model, but that they differ from those in the full model.
(2007:03, at Carnegie-Mellon's Math Logic Seminar)