What I Do

Current Things

Earlybird Software, Data Scientist

At the beginning of a project I typically gather data from third-party sources and pipe it into a format we can use on our infrastructure. This usually means integrating with an API, but it has also meant scraping websites, or cleaning and deduping data stored in spreadsheets or text files. I’ve even gone so far as to build a parser that automates transforming Word doc bullet points into database tables. (scream emoji + Wilhelm scream sound)

Over the lifespan of a project, I figure things out about this data and the data we generate, and present these findings to stakeholders. Depending on the question, this has run the gamut from clustering customers, to sussing out the network relationships between different business locations, to training models that aim to predict people’s future behavior.

Open Source Software

Author, multicolor package, July 2018 (accepted to CRAN August 2018)

Author, postal package, June 2018 (accepted to CRAN July 2018)

Co-author, cowsay package, June 2018

Reviewer & contributor, rOpenSci oec package, June 2018, August 2018

Contributor, owmr package, October 2018

Contributor, rOpenSci rodev package, October 2018

Co-author, rOpenSci monkeylearn package, February 2018

Co-author, rOpenSci roomba package, May 2018

Contributor, beepr package, May 2018

What I Use

  • R
    • Tidyverse & base
    • Web scraping, working with RESTful APIs
    • Supervised machine learning, network analysis, cluster analysis
    • RMarkdown for reproducible presentations
  • git, GitHub, Bitbucket
  • SQL (MySQL, Postgres)
  • drake for GNU Make-style pipelines in R
  • Continuous integration on Travis and AppVeyor
  • Unit testing with testthat, code coverage on Codecov
  • Containerized environments with Docker
  • AWS (EC2, RDS, S3): for example, installing R, configuring RStudio Server, and working with the S3 API
  • Some Python, Shiny, Spark
  • The bare minimum of JavaScript, HTML, CSS :)

Former Things

Experience and Cognition Lab, University of Chicago, Lab Manager

I ran traditional hypothesis tests and other statistical analyses on experimental data collected in the lab. I also contributed to the design of experiments, tended to the lab website, and programmed online experiments run on Amazon Mechanical Turk.

Behavioral Biology Lab, University of Chicago, Research Fellow

I designed a behavioral economics experiment to separate baseline risk preference from irrational risk aversion. The approach subtly varied risk and expected values in a novel gambling game I programmed. We also measured participants’ physiological levels of stress hormones to study the effect of stress on decision making under uncertainty.

Education

University of Chicago

Degree: Bachelor of Arts with general and departmental honors in Psychology; minor in French Literature. June 2015.

Honors: Lillian Gertrude Selz Prize for Academic Excellence (2012), Dean’s List (2011–2015), Phi Beta Kappa honor society (2014)

Honors Thesis, Behavioral Economics: An Exploration of Stress, Gender, and Risk Preference in Financial and Prosocial Domains

Articles and Other Contributions

Former Co-Organizer, RLadies Chicago, 2017-2018

Data Skeptic Beer-in-Hand Data Science article, 2018

MonkeyLearn Sentiment Analysis: article; source code, 2018

Captain, UChicago Women’s Ultimate Frisbee Team, 2015

Talks, etc.

2018 rOpenSci unconf, Seattle, WA, May 2018

2018 class of NASA Datanauts, January 2018

Recipient of the 2018 rstudio::conf Diversity Scholarship, San Diego, CA, October 2017

RLadies Chicago, “Oktoberfest Edition: Beer-in-Hand Data Science,” Microsoft Technology Center, October 2017. Accompanying interview at Earlybird Software, December 2017

Chicago Women’s Ultimate Summit, “Women in Chicago Ultimate Data Analysis,” Chicago, IL, February 2017