I am a PhD-trained data scientist/engineer and Go developer with industry experience developing and deploying predictive models, optimization services, data pipelines, and more. I believe that data science and engineering services should be robust, testable, and easily deployable, so that they can be iterated on and improved over time. As such, I love exploring the space where cutting-edge data science techniques meet up-and-coming languages and devops tools (such as Go and Docker).
- Data Scientist and Advocate, Pachyderm, 2016-present
- Freelance Data Scientist/Engineer and Golang Developer, 2015-present (selected client projects below)
- Data Scientist, Telnyx, 2015-2016
- Data Science Mentor, Thinkful, 2015-present
- Technical Specialist, Marshall, Gerstein, and Borun, 2012-2015
- Research Assistant, Purdue University, 2008-2015
- High Performance Computing Intern, NCAR, 2008
- Languages: Go, Python, Julia, MATLAB, Mathematica, LaTeX
- Machine Learning: Clustering, Classification, Anomaly Detection, Regression, Dimensionality Reduction, Recommendation
- Big Data: Spark, ELK, Cassandra, Redshift, S3, Pachyderm
- Databases: Postgres, MySQL, BoltDB, RethinkDB, SQLite, MongoDB, Cassandra, Elasticsearch
- Visualization: Matplotlib, Seaborn, Bokeh, Dashing, Kibana, Domo
- Devops/Misc.: Docker, Jenkins, Ansible, Git, Jupyter, RabbitMQ
A next-gen telephony company wanted an efficient pricing service that would predict operating costs dynamically over time. I developed this service (initially in Python, eventually in Go), including a custom probability-based model, and reduced cost-prediction error from 45% to less than 10%.
A company developing a scheduling application wanted a recommendation service for their users. I developed this service to integrate with their existing infrastructure (React, RethinkDB, Go, REST), which involved asynchronously processing recommendation requests from all of their users with custom machine learning algorithms.
A startup wanted immediate visibility into real-time revenue and profit. I built a service that processed data streaming into Redshift and logged aggregates to Logstash, making the data immediately searchable via Elasticsearch and visualized in Kibana and Dashing.