Talks as slide-show pdfs

Markov Chain Monte Carlo with Broken Samplers

Modern MCMC samplers balance approximation error with sampling cost, and can achieve larger data scale with lower total compute cost. I formalize the error-cost trade-off in the framework of POMDPs and advocate the strategy: (1) add a knob that controls accuracy-vs-cost of each sample; (2) find an annealing schedule to tune this knob over time to minimize total error or total variance.

(2016:09, at Rigetti Quantum Computing)

Slides for online viewing.

A Data-Oriented Approach to Program Synthesis

This talk discusses how to leverage a curated knowledge base of facts in a sketching-based approach to program synthesis. Specifically, it discusses the synthesis algorithm implemented in the Pomagma inference engine.

(2016:04, at the Kestrel Institute)

Slides for online viewing.

Veritable: A Predictive Database Engine

Scaled machine learning solutions often address problems of volume and velocity, but require weeks of effort to analyze each new dataset. True predictive platforms need to effortlessly handle many datasets, each with its own feature set, sparsity level, and underlying structure. We’ll describe an open-source inference engine and query layer that together constitute a Predictive Database offering a SQL-like interface for Bayesian queries. This talk will cover predictive database workflows, probabilistic models, and our recently open-sourced predictive database engine, Loom.

(2014:11, at SF Bayarea Machine Learning Meetup)

Slides for online viewing.

Loom: an open-source predictive database engine

Scaled machine learning solutions often address problems of volume and velocity in B2C applications, but require weeks of effort to analyze each new dataset. In B2B applications, solutions need to scale over thousands of customers each with small-to-medium sized datasets. We propose scaling to thousands of datasets by building a Predictive Database that offers a SQL-like interface for Bayesian queries. This talk will cover predictive database workflows, probabilistic models, and our recently open-sourced predictive database engine, Loom.

(2014:09, at USF Analytics Seminar)

Slides for online viewing.

Bayesian Inference with Probabilistic Programs

This talk develops the notion of a probabilistic program, with an extended example in multi-target tracking, followed by a generalization to arbitrary recursive programs. This talk essentially covers and links my Masters and Ph.D. research.

(2009:04, at Toyon Research Corporation)

Slides for online viewing.

Automated Equational Reasoning in Nondeterministic λ-Calculi Modulo Theories H*

Ph.D. thesis defense.

(2009:03, at Carnegie-Mellon)

Slides for online viewing.

Meaning in Mathematics --or-- Belief as Irrefutability

What mathematical statements are meaningful? What mathematical statements are true? Conservatively, only proven facts are true, only what can be deduced from assumptions. And assumptions come from science, the logic of what "might be true" and hasn't yet been falsified. Just as Godel's 1st Incompleteness Theorem limits how much can be proven, an analogous theorem limits what can be scientifically tested, ie how much can be meaningful.

In this talk, I will formulate the incompleteness theorems and discuss their relation to math and resp. science. I will also suggest some heuristics for guessing what "might be true" (step 1 in the scientific method) and prove that they are all doomed to fail.

(2008:10, at Carnegie-Mellon's Math Grad Student Seminar)

Slides for online viewing, or without pauses for printing.

Models of randomness in programming languages

Part 1. Overview of randomness in programming languages. Probability valuations on topologies instead of measures on sigma-algebras. Plotkin's probabilistic powerdomain.

Part 2. Problems with randomness and parallelism. Lattice models and uncertainty. A lattice model with local probability spaces?

(2008:04, at Carnegie-Mellon's Math Logic Seminar)

Part 1: Slides for online viewing, or without pauses for printing.
Part 2: Slides for online viewing, or without pauses for printing.

References and Background Reading:

Mapping the space of Programs, Searching the space of Languages

Which programs are simple? Which programming languages are simple? We address these questions by considering the dual spaces of programs and languages:

By treating the space of programs semantically, we can build much larger maps than would be possible treating programs syntactically. By restricting to a well-behaved class of languages (combinatory algebras), we can define a single space of programs across all languages. The space of languages then has very nice struture: it is a smooth manifold. We describe how to "fit" a language to training data in the form of useful or commonly-used programs.

(2007:07, at Stephen Wolfram's 2007 NKS Conference)

Slides and pretty datasets.

Definable (types-as-)closures in concurrent combinatory algebra

Dana Scott observed that in a particular model of lambda-calculus with join, the CCC of closures has a very rich structure, modelling polymorphic types, dependent types, subtyping and power-typing, a universal type, and --most importantly-- "atomic" types like unit and bool. We demonstrate that also in the term fragment of this model, these atoms are definable, but that they differ from those in the full model.

(2007:03, at Carnegie-Mellon's Math Logic Seminar)

Slides for online viewing, or without pauses for printing.