August 22, 2014

Looking at Data: 2014 Tour of Utah

It’s been a while since I posted anything in my Looking at Data series, but with the 2014 edition of the Larry H. Miller Tour of Utah wrapping up last week, I thought I’d take a look at the data. I’m, you know, into cycling, and this race took place in my backyard, so to speak.

First off, where do the riders hail from?

Not too surprisingly for a Tier-2 race, most riders come from the host country, in this case the U.S.

Second up, the stage results:

Finally, some meat and potatoes:

It’s interesting to note that the spread in time by the end of the race (stage 7) is quite large. The rider with the cumulatively slowest time was over 1.5 hours behind the overall winner, Tom Danielson. Perhaps it’s due to a particularly poor showing in stages 4, 6, and 7? If you look at the box and whisker plot of time spread, you’ll notice that those stages have large spreads in times. Stage 4, from Ogden to Powder Mountain, was a pretty hilly stage, with a nasty climb to the mountain top finish. Stage 6, from Salt Lake City to Snowbird, was also a grueling stage, with over 12,000 ft of elevation gain, and Stage 7, although a little shorter, featured two really hard climbs at Wolf Creek Pass and Empire Pass. The common denominator, besides the endless vertical racked up over the course of the stages, was the high altitude. Most of the mountain finishes, and the hard climbs, took place above 8,000 ft in elevation. I bet a lot of riders weren’t used to that…
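For anyone who wants to put numbers on "large spread," here is a minimal sketch of the five-number summary that sits behind each box in a box and whisker plot. The function name `spread_summary` and the sample gaps are made up for illustration; they are not the actual Tour of Utah results.

```python
from statistics import quantiles

def spread_summary(gaps_s):
    """Five-number summary (plus IQR and range) for one stage's time gaps, in seconds."""
    ordered = sorted(gaps_s)
    # "inclusive" treats the data as the whole population, so the quartiles
    # always fall between the observed minimum and maximum.
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "iqr": q3 - q1,
        "range": ordered[-1] - ordered[0],
    }

# Hypothetical gaps to the stage winner, in seconds (NOT real race data).
flat_stage = [0, 2, 4, 6, 8]
mountain_stage = [0, 300, 600, 900, 1200]

for name, gaps in [("flat", flat_stage), ("mountain", mountain_stage)]:
    summary = spread_summary(gaps)
    print(f"{name}: IQR = {summary['iqr']} s, range = {summary['range']} s")
```

A stage with a wide interquartile range and a long upper whisker is one where the field blew apart, which is what you would expect on the big climbing days.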

All that climbing is probably also why you see attrition increasing throughout the race, especially after stages 5 and 6. There was a 7% drop-off in the number of riders starting stage 7 compared to stage 6!
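The drop-off figure is just the stage-to-stage percent change in starters. A quick sketch of the arithmetic is below; the starter counts are hypothetical placeholders, chosen only so that the last step works out to 7%, and `stage_dropoff` is a name I made up for the example.

```python
def stage_dropoff(starters):
    """Percent fewer starters at each stage compared with the stage before it."""
    return [100 * (prev - cur) / prev for prev, cur in zip(starters, starters[1:])]

# Hypothetical starter counts for stages 1-7 (NOT the real race numbers).
starters = [128, 127, 125, 124, 120, 100, 93]

for stage, pct in enumerate(stage_dropoff(starters), start=2):
    print(f"Stage {stage}: {pct:.1f}% fewer starters than stage {stage - 1}")
```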

April 30, 2014

Looking at Data: 2007 NSF research expenditures and research outcomes

In this week’s installment of Looking at Data, we’re digging into data from the National Science Foundation and NASA. The data can be found at, and covers 2007 research grants from the NSF and NASA, and how these grants fared in terms of publications and conference proceedings. They have data all the way up until FY 2013, but I chose 2007 because research takes time, and, well, I wanted to have as much data as possible on research outcomes (in this case publications and conference proceedings).

The question I wanted to look at this week is whether “scrappiness is the Mother of invention,” which came up in a conversation I had with my friend @DaveyDeMille over lunch last week. In other words, is being well funded correlated with higher research quality, as measured by the proportion of NSF grants that lead to a publication?

First off, just some descriptive stats:

Now moving onto our question:

So although this is pretty cursory, it appears that there’s not really a relationship between getting lots of NSF funding and having lots of publication-yielding projects. I’d say, without digging more into this, that scrappiness is, if not the Mother of invention, at least the aunt.
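One way to put a number on "not really a relationship" is a Pearson correlation between funding and the share of grants that produced a publication. The sketch below assumes one row per institution; the funding totals, publication shares, and the `pearson_r` helper are all invented for illustration and are not taken from the NSF/NASA dataset.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-institution values (NOT real NSF data):
# total funding in millions of dollars, and the share of grants with a publication.
funding_musd = [1.2, 3.5, 8.0, 15.0, 40.0]
pub_share = [0.60, 0.45, 0.70, 0.50, 0.55]

r = pearson_r(funding_musd, pub_share)
print(f"Pearson r between funding and publication share: {r:.2f}")
```

An r near zero would fit the "aunt, not Mother" reading. Pearson only picks up linear relationships, though, so it is still worth looking at the scatter plot.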

EDIT: I’ve been reading up on UX lately, following 52 Weeks of UX (awesome resource, by the way), and came to Week 4 today. They have an excellent post on how constraints force creativity; go read it here: Constraints Fuel Creativity.