## Tuesday, November 29, 2011

In the spirit of prudence -- I'm not sure what boundaries exist with respect to disclosure of my research topic to a larger, non-academic audience -- I haven't described exactly what my research topic is.  Hell, I haven't even given a general description.  And aside from the title of this post -- "Optimism" -- not much else will be disclosed.  That being said, I'm not feeling terribly optimistic lately about the progress of my research.  A timeline I charted a few months ago had my committee reviewing a first draft of my proposal at this point -- I have yet to even assemble and write a first draft.   The problem isn't the actual writing, per se, but the absence of a detailed and executable statistical analysis plan.  In an ideal situation, the statistical analysis plan would follow naturally from the research question/objective/hypothesis and data structure (cohort, case/control, RCT, cross-sectional, survey, etc.), but alas, an ideal is just that:  ideal and rarely realized.  In my situation, I know where I need to go -- Google Maps is cued up and ready to go -- but I can't find my car.  I'm not even sure I'm searching for it in the right parking lot.  But I continue to search, albeit with the occasional distraction:  the most recent being the writing of a Stata program that created a series of variables each containing a random shuffling of the numbers [1,6] with no two variables sharing the same sequence.  This exercise was for the hosting of a beer tasting/testing party by my wife and me where the sequence of beers each person would taste/test would differ and be randomly generated (per the Stata random number generator).  In a previous blog posting, I described my approach to verifying that no two variables shared the same number sequence; what follows is the entirety of the .do file I used to create and output the randomly shuffled number sequences.

capture log close _all
log using BeerTestRandomNumbers, name(log1) replace
datetime

// program:  BeerTestRandomNumbers.do
// task:  create list of beers for each person w/ list randomized for each person
// project:  Team Clisa Beer Testing/Tasting Party
// author:    cjt
// born on date:  20111024

// #0
// program setup

version 11.2
clear all
macro drop _all
set more off

// #1
// declare number of observations (beers to taste)
set obs 6

// #2
// set randomization seed to ensure reproducibility
set seed 20111119

// #3
// create format label for numbers...
label define beerf 1 "Ottakringer" 2 "Gosser" 3 "Stiegl" 4 "Puntigamer" ///
5 "Schwechater" 6 "Zipfer"

// #4
// create 28 variables containing numbers [1,6] randomly shuffled
foreach num of numlist 1/28 {
gen int seqnum' = _n
label values seqnum' beerf
gen randnum' = runiform()
sort randnum'
drop randnum'
}

// #5
// generate 27! variables for pairwise comparisons to verify that randomization // order isn't the same between any two variables.
// !!all randomization orders are different!!
forvalues i = 1(1)27 {
forvalues j = i'(1)27 {
display "-------------------------------------------------"
display "Variables being compared are seqi' and seq++j'"
gen vari'_j' = 1 if seqi' == seqj'
quietly sum vari'_j' if i' != j'
assert r(sum)' < 6
drop vari'*
}
}
*end

// #6
// label the variables w/ guest names (note that this code was taken from a Stata Journal
// article ("Speaking Stata:  Problems With Lists"), SJ3,2.
local vars "28 PARTY GUEST NAMES INSERTED HERE"
forvalues i = 1/28 {
local v : word i' of vars'
rename seqi' v'
}

codebook, compact

// #7
// -list- beer tasting key for each person
capture log BeerKeys close
log using BeerKeys, name(log2) replace
foreach var of varlist z_Todd-z_Unknown2 {
list `var', sep(6)
}

log close _all
exit