Posts

Drake's Plan

I gave a talk on the drake package for workflow management to the wonderful RLadies of NYC.

In it, we hit the Twitter API to get NYCFireWire tweets, clean the raw tweet data, send the …

Read Post

Catching Kareem

Lightning round of basketball analysis!

My friend and coworker Brad, who designed this very blog, is a sports fan and curious person. He wanted to know whether LeBron James is on track to …

Read Post

98% green spaghetti, sliced and chopped

This is the latest stop in an analysis tour of free-range menu data.

One of the goals of fishing for real recipes is to be able to suss out patterns in how foods are combined and in what …

Read Post

Peeling back The Onion

In this post I’ll programmatically find The Onion article links, scrape them for content, and clean them up into a tidy format. I chose The Onion because while not real news, the site does a …

Read Post

Scraping Together a Recipe, Episode I

The Internet is full of amazing content. Like these names of actual recipes. Methodology for getting these to follow.

Recipe Name
Sea-Purb Seafood Pasta
Tuna Salad for Grown-ups …
Read Post

Scraping Together a Recipe, Episode II

One of the goals here is to see what portion of a menu tends to be devoted to, say, meat or spices or a word that appears in the recipe name, etc. In order to answer that, we’ll need to …

Read Post

Scraping Together a Recipe, Episode III

Converting to Grams

Rather than rolling our own conversion dictionary, let’s turn to the measurements package, which sports the conv_unit() function for going from one unit to another. For …

Read Post