## Tuesday, January 31, 2012

### PFS and OS Curves Overlaid

The plotting of Kaplan-Meier (K-M) survival curves is about as straightforward and basic as you can get when doing survival analysis.  The curves, however, are simply a plotting of the values obtained when calculating the Kaplan-Meier estimate (product-limit method) of the survivor function.  Briefly explained, the estimates are obtained by the successive multiplication of the proportion of subjects that haven't yet "failed" divided by the number of subjects alive at the beginning of a respective interval.  Each interval is designed such that one "fail" (death) time is contained in the interval.  Subjects whose outcomes are censored (e.g. lost to follow-up, study terminated before reaching outcome, etc.) are included in the interval in which they occur although they are not used to calculate the probability of survival/failure at any subsequent time point.  Symbolically, the K-M survivor function can be expressed thus,

where n is the number of subjects alive in the kth interval and d is the number that fail (death).  As instructive as it is to manually calculate the survivor function and plot it, the ease with which you can generate and output the curves using Stata, SAS, and R relegates any pencil-and-paper calculations to being a learning exercise (I'll be using Stata in this post).

In a current collaboration with a friend of mine we needed K-M curves for overall survival (OS) and progression-free survival (PFS) and we needed them plotted on the same graph.  (A previous post explained the differences between the primary definitions of survival used in the medical literature, OS & PFS obviously included.)  This is a reasonable and, superficially, simple request but upon further thought about how to implement this in Stata it became less obvious.  In Stata you must declare your data to be survival time via -stset- before invoking any survival analysis commands and in doing so you specify when subjects enter the analysis time, when they exit, and what constitutes censorship.  These parameters -- particularly the censorship variable -- are going to differ depending on whether you are examining overall survival, progression-free survival, or event-free survival.  So in order to get around this, I -stset- the data twice -- first for OS then again for PFS -- and saved the survivor function from each.  The syntax is simple:  -stsset- followed by a -sts gen- for OS then again for PFS after you've cleared the first -stset-.  The code follows:

* **OS
stset dox_os, failure(status==1) origin(dxdt)
sts gen s_os = s        // survivor function
gen double t_os = _t    // analysis time

* **PFS
stset, clear
stset dox_pfs, failure(death==1) origin(dxdt)
sts gen s_pfs = s       // survivor function
gen double t_pfs = _t   // analysis time

In order to overlay the two curves on the same graph I initially thought I would need to use the -twoway line yvar1 xvar || line yvar2 xvar, sort - syntax but this returned a graph that properly plotted one curve but not the other.  I think I may have eventually had success with this approach but not without manipulating the data structure and creating two "fake" groups -- one for OS and one for PFS -- then plotting using a -by- option.  I wasn't convinced this was the most efficient and elegant way so I searched the Statalist archives and, as expected, someone else had posed this same problem a few years ago.  Turns out the solution requires nothing more than the addition of -addplot- to the -sts graph- command where the survivor function you are adding is the one saved from a previous -stset-.  The code, as well as the resultant graph, follow:

sts graph, addplot(line s_os t_os, sort c(J) ///
legend(order(1 "PFS" 2 "OS"))) scheme(sj) ytitle("Survival") ///
xtitle("Days Post Diagnosis") ylabel(0(.25)1) ytick(0(.25)1) ///
xlabel(0(5)25) xtick(0(1)25) title("Survival:  PFS vs OS")

## Friday, January 27, 2012

### Pedal Faster!

 Pedal faster!
I think I might be getting a little soft because, well, I'm substituting more and more sessions on the bicycle trainer rather than heading out for cold runs in the dark.  I even feel kind of bad about it, but when it's windy and the thermometer is below freezing my motivation plummets and the trainer looks even more inviting.  I concede it can be a bit dull at times and not knowing either the distance cycled or the effort expended is a bit annoying but I suppose when the elements are just too daunting, it'll do.  So much for personal grit.  Oh well, at least I can antagonize the cats while spinning by shouting in a high-pitched voice, "Get out of the way, kitties!", when they wander in front of the trainer.

## Thursday, January 26, 2012

### The Reel Deal

A guest columnist over at iRunFar just posted an article about the evolution and maturity of running videos and how their quality is improving and their appeal is broadening.  I think that may be understating it.  I've posted links to videos before featuring Anton Krupicka and Killian Journet and although those two focus on two specific athletes, these two videos are more general.  The first one was put together by Matt Hart and is, essentially, a celebration of mountain and trail running.  Awesome.  The second is an ultrarunner's contribution to the budding "Sh*t __________ Say" videos sprouting up all over YouTube.  Pretty amusing.

Props for the song selection on this one:

My favorite line:  "If you can run a marathon you can run a 50k."  Ha!