## Tuesday, September 25, 2012

### Going the Distance: Position in the Pack

On Saturday, September 22nd, I ran the Wörthersee Trail Maniak 57k ultramarathon (the race report is here) and, as with the previous ultra I ran, I decided to see how my time stacked up against the rest of the field.  This exercise has nothing to do with the pursuit of my Ph.D., but it does give me an excuse to program in Stata and generate a nifty graph.

As I stated in my race report, I felt like I was haulin' ass for most of this race, and given my time goal (sub-seven hours for ~35 miles) I'd have to be moving.  And I was.  At least I thought I was, until it seemed like the entire field was pulling away from me, one runner at a time.  My mediocre (relatively speaking) performance was confirmed once I examined the distribution of finishing times and my position therein.  The Stata code is pasted below, with the graph following.

```stata
capture log close
log using worth57k_graph, replace
datetime    // not a built-in Stata command; presumably a user-written program

// program:  worth57k_graph.do
// task:  graph distribution of finish times w/ my time highlighted
// project:  drivel
// author:    cjt
// born on date:  20120925

// #0
// program setup

version 11.2
clear all
macro drop _all
set more off

// #1
// insheet CSV
insheet using "C:\Documents and Settings\cjt\Desktop\Worthersee57k\Worthersee57kResultsTrnsfr.csv"

// #2
// convert string time variables to numeric time variables
// (clock() returns milliseconds, so each time is stored as elapsed ms)
foreach var of varlist velden pyramiden finish {
    gen double `var'_temp = clock(`var', "hms")
    drop `var'
    rename `var'_temp `var'
    format `var' %tcHH:MM:SS
}
*end;

// #3
// identify quartiles, median, and mean for plotting on histogram
// (results are left in r(p25), r(p50), r(p75), and r(mean))
quietly su finish, det

// #4
// plot the finish times via a -histogram- along w/ lines denoting the 25th percentile,
//    median, mean, 75th percentile, as well as my time (27112000 ms = 07:31:52)
hist finish, freq xtick(16200000(1800000)37800000) xlabel(18000000(3600000)36000000) ///
    addplot(pci 0 27112000 38 27112000, lwidth(medthick) || pci 0 `r(p25)' 40 `r(p25)'  ///
    0 `r(p50)' 40 `r(p50)'  0 `r(p75)' 40 `r(p75)', lpattern(shortdash) || pci 0 `r(mean)' 40 `r(mean)', ///
    lpattern(longdash)) text(15 `r(p25)' "25th Percentile", orientation(vertical) place(nw)) ///
    text(15 `r(p50)' "Median", orientation(vertical) place(nw)) text(15 `r(mean)' "Mean", ///
    orientation(vertical) place(ne)) text(15 `r(p75)' "75th Percentile", orientation(vertical) place(ne)) ///
    text(39 27112000 "Me", place(c)) text(0.5 27112000 "07:31:52", orientation(vertical) place(nw)) ///
    scheme(s1color) legend(off) xtitle("Finish Time (Hours Elapsed)") ytitle("Frequency") ///
    title("Worthersee Trail Maniak 57k") subtitle("Distribution of Finish Times")

// #5
// -export- graph
gr export worth57k_graph.png, replace

log close
exit
```
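
For anyone puzzling over the large hard-coded numbers in the `hist` call: Stata's `%tc` values are stored in milliseconds, so every time on the x-axis is elapsed milliseconds. Here is a quick check of the arithmetic, using only values that already appear in the do-file (the hour and minute conversions are mine, so treat them as a sanity check rather than part of the original code):

$$
\begin{aligned}
07{:}31{:}52 &= (7 \times 3600 + 31 \times 60 + 52) \times 1000\ \text{ms} = 27{,}112{,}000\ \text{ms} \\
\texttt{xtick}:\quad 16{,}200{,}000\ \text{ms} &= 4.5\ \text{h}, \quad 37{,}800{,}000\ \text{ms} = 10.5\ \text{h}, \quad \text{step } 1{,}800{,}000\ \text{ms} = 30\ \text{min} \\
\texttt{xlabel}:\quad 18{,}000{,}000\ \text{ms} &= 5\ \text{h}, \quad 36{,}000{,}000\ \text{ms} = 10\ \text{h}, \quad \text{step } 3{,}600{,}000\ \text{ms} = 1\ \text{h}
\end{aligned}
$$

So the axis is ticked every half hour from 4.5 to 10.5 hours and labeled every hour from 5 to 10 hours, with my line drawn at 27,112,000 ms.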

The resulting graph:

As the graph shows, I wasn't imagining the field pulling away from me.  The distribution of times is essentially normal (the mean and median are nearly identical), with my time approximately 30 minutes slower than that of the statistical middle-of-the-packer.  Ordinarily, I don't give much thought or attention to my relative position among the finishing times, but with this race I couldn't resist, simply because it was obvious while running that either (a) I was running slower than it felt or (b) the field as a whole was running faster than I'm accustomed to.  As is evident from the graph, it was probably a combination of both.