As an experienced Python programmer, I have contributed
to large operational code bases, built upon medium-sized
open source toolkits, and developed small data processing
toolkits from the ground up. As a data scientist at Shopify,
I use Python daily to process and analyze data.
In the past, I have contributed to
an open source entity resolution toolkit.
I am proficient in building scalable data pipelines
with Apache Spark using the Spark RDD, DataFrame, DStream,
Structured Streaming, and SparkML APIs. I also have experience
debugging high-volume data pipelines in a production environment.
Through this work, I have gained basic familiarity with
related technologies such as Apache Kafka, HDFS, and
Google Cloud Services.
I am proficient in the use of SQL for data analysis and reporting,
in the context of a world-class data warehouse powered by Google
Cloud Services, Apache Spark, and Presto. Despite the language's limitations,
I hold my SQL queries to the same high level of quality, maintainability,
and readability I strive for in traditional programming languages.
I have three years' experience using R's powerful libraries for
data analysis and data visualization. In addition to data management,
data visualization with ggplot2, and classical
statistical inference, I use R for structural social network analysis,
textual topic modelling, and psychometrics.
I have basic proficiency with scikit-learn,
having used the toolkit to
impute missing values;
encode categorical data;
train Random Forest and Logistic Regression models;
cross-validate predictive models; and apply predictions.
My scikit-learn skills are complemented by
significant experience processing and analyzing data
with pandas and numpy.
I am proficient in developing small- and medium-sized object-oriented programs in C++.
My academic training in C++ included a collaborative software development project,
best practices for object-oriented software development, and core language features.
I have basic proficiency in Scala, having completed course-based assignments
reviewing the fundamentals of functional and object-oriented programming.
I am proficient in MATLAB, having completed course-based assignments
involving data interpolation, error analysis, image compression, signal processing, and
computational linear algebra.
My other technical proficiencies include HTML, Git,
Markdown, LaTeX, Bash, Zsh, and reStructuredText.
My workflow features JetBrains PyCharm, JetBrains DataGrip,
RStudio, and Atom.