About Me

I am a PhD-trained data scientist with industry experience developing and deploying predictive models, optimization services, data pipelines, and more. Some of my main technical interests are:

  • Data science workflows on modern, cloud-native infrastructure (multi-tenant clusters orchestrated by, e.g., Kubernetes)
  • Reproducibility and provenance for complex, distributed data pipelines
  • Machine learning interpretability
  • Portability and open standards for neural network architectures
  • Leveraging modern language primitives (e.g., in languages like Go) for machine learning and data science

On the Web:




Example Projects:

  • A next-generation telephony company wanted an efficient pricing service that would predict operating costs dynamically over time. I developed this service (initially in Python, eventually in Go), including a custom probability-based model, and reduced cost-prediction error from 45% to less than 10%.

  • A company developing a scheduling application wanted a recommendation service for their users. I developed this service to integrate with their existing infrastructure (React, RethinkDB, Go, REST), which involved asynchronously processing recommendation requests from all their users with custom machine learning algorithms.

  • A startup wanted immediate visibility into real-time revenue and profit. I built a service that processed data streaming into Redshift and logged aggregates to Logstash. This data was then immediately searchable via Elasticsearch and visualized via Kibana and Dashing.