Thursday, August 30, 2012

Logit Model Maximum Likelihood Estimate

In the previous post, I derived the log likelihood for the logit model.  In this post, I'll pick up where I left off and present the score functions followed by the maximum likelihood estimates (MLEs) for $\alpha$ and $\beta$.  

The score vector, $U(\theta) = [U(\theta)_\alpha \: U(\theta)_\beta]^T$, is arrived at by taking the derivative of the log likelihood with respect to $\alpha$ and $\beta$, respectively.

Recall the log likelihood: $\ell(\theta) = (m_1)\alpha + a\:\beta - (n_1)log(1 + e^{\alpha + \beta}) - (n_2)log(1 + e^{\alpha})$

Now take the derivatives (recall the properties of taking derivatives of logs) then substitute the $\alpha$ and $\beta$ expressions with the equivalent $\pi_i$ expressions. 

$U(\theta)_\alpha = \frac{\partial \ell}{\partial \alpha} = m_1 - n_1\frac{e^{\alpha + \beta}}{1 + e^{\alpha + \beta}} - n_2\frac{e^{\alpha}}{1 + e^{\alpha}} = m_1 - n_1\pi_1 - n_2\pi_2$

$U(\theta)_\beta = \frac{\partial \ell}{\partial \beta} = a - n_1\frac{e^{\alpha + \beta}}{1 + e^{\alpha + \beta}} = a - n_1\pi_1$
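As a sanity check, the score expressions can be compared against numerical derivatives of the log likelihood. The sketch below uses made-up 2x2 cell counts (a, b, c, d are illustrative values, not from the text):

```python
import math

# Made-up 2x2 cell counts (a, b, c, d) for illustration only
a, b, c, d = 30, 20, 10, 40
n1, n2 = a + c, b + d   # group (column) totals
m1 = a + b              # positive-response (row) total

def loglik(alpha, beta):
    """Log likelihood: m1*alpha + a*beta - n1*log(1+e^(a+b)) - n2*log(1+e^a)."""
    return (m1 * alpha + a * beta
            - n1 * math.log(1 + math.exp(alpha + beta))
            - n2 * math.log(1 + math.exp(alpha)))

def score(alpha, beta):
    """Closed-form score vector [U_alpha, U_beta] from the derivation above."""
    pi1 = math.exp(alpha + beta) / (1 + math.exp(alpha + beta))
    pi2 = math.exp(alpha) / (1 + math.exp(alpha))
    return m1 - n1 * pi1 - n2 * pi2, a - n1 * pi1

# Compare against central finite differences at an arbitrary point
alpha, beta, h = 0.3, -0.5, 1e-6
num_alpha = (loglik(alpha + h, beta) - loglik(alpha - h, beta)) / (2 * h)
num_beta = (loglik(alpha, beta + h) - loglik(alpha, beta - h)) / (2 * h)
u_alpha, u_beta = score(alpha, beta)
print(abs(num_alpha - u_alpha) < 1e-4, abs(num_beta - u_beta) < 1e-4)
```

If the score formulas are right, both comparisons print True.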

In order to obtain the MLEs, the two score equations above need to be set equal to zero and then solved for the desired parameters. Because there are two equations with two unknowns, let's first take the difference, $U(\theta)_\alpha - U(\theta)_\beta$.

$U(\theta)_\alpha - U(\theta)_\beta = m_1 - n_1\pi_1 - n_2\pi_2 - a + n_1\pi_1 = m_1 - n_2\pi_2 - a$

Solving for $\pi_2$, we get $\pi_2 = \frac{m_1 - a}{n_2}$, and since $m_1 - a = b$ (per the cell values from a standard 2x2 table configuration), this simplifies to $\pi_2 = \frac{b}{n_2}$, which can then be re-expressed in terms of $\alpha$.

Substituting, we have $\frac{e^{\alpha}}{1 + e^{\alpha}} = \frac{b}{n_2}$ which is now in a form that can be solved for $\hat \alpha$:

$\frac{e^{\hat \alpha}}{1 + e^{\hat \alpha}} = \frac{b}{n_2} \ \Rightarrow \ e^{\hat \alpha} = \frac{b}{n_2}(1 + e^{\hat \alpha}) \ \Rightarrow \ e^{\hat \alpha} = \frac{b}{n_2} + \frac{b}{n_2}e^{\hat \alpha}$

Collecting the $e^{\hat \alpha}$ terms gives $e^{\hat \alpha}(1 - \frac{b}{n_2}) = \frac{b}{n_2}$, or $e^{\hat \alpha} = \frac{b}{n_2 - b}$. Recognizing that $n_2 - b = d$ (per the configuration of a standard 2x2 table), we get $e^{\hat \alpha} = \frac{b}{d}$. Since $\hat \alpha$ itself is sought, take the natural log of both sides, yielding the MLE for $\alpha$: $\hat \alpha = log(\frac{b}{d})$.
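To confirm the result numerically, the sketch below (with made-up cell counts) plugs $\hat \alpha = log(b/d)$ back into the logistic expression and recovers the observed group 2 proportion $b/n_2$:

```python
import math

# Made-up 2x2 cell counts for illustration only
b, d = 20, 40
n2 = b + d

alpha_hat = math.log(b / d)   # MLE for alpha: log(b/d)
pi2_hat = math.exp(alpha_hat) / (1 + math.exp(alpha_hat))

# pi2 evaluated at the MLE equals b/n2, the observed proportion in group 2
assert abs(pi2_hat - b / n2) < 1e-9
print(pi2_hat)
```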

The MLE for $\beta$ follows in a similar manner (set the $\beta$ element from the score vector to zero then solve by way of substitution and algebraic manipulation). 

$U(\theta)_\beta = a - n_1\pi_1 = a - n_1\frac{e^{\hat \alpha + \hat \beta}}{1 + e^{\hat \alpha + \hat \beta}}  = 0$

We know that $\hat \alpha = log(\frac{b}{d})$ and if we substitute accordingly, we get:
$a - n_1\frac{e^{log(\frac{b}{d}) + \hat \beta}}{1 + e^{log(\frac{b}{d}) + \hat \beta}} = a - n_1\frac{{(\frac{b}{d})e^{\hat \beta}}}{1 + (\frac{b}{d})e^{\hat \beta}} = 0$

After more algebra, grouping like terms $(e^{\hat \beta})$, and making the appropriate substitutions, we eventually get $e^{\hat \beta} = \frac{ad}{bc}$; taking the natural log of both sides gives $\hat \beta = log\Bigl(\frac{ad}{bc}\Bigr)$.  The ratio of $ad$ to $bc$ in a binary logit setting is, of course, the odds ratio, so the MLE of $\beta$ is the log odds ratio, $\hat \beta = log(OR)$.
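A quick numeric check (again with made-up counts) that $\hat \alpha = log(b/d)$ and $\hat \beta = log(ad/bc)$ jointly zero the $\beta$ element of the score vector:

```python
import math

# Made-up 2x2 cell counts for illustration only
a, b, c, d = 30, 20, 10, 40
n1 = a + c

alpha_hat = math.log(b / d)               # MLE for alpha
beta_hat = math.log((a * d) / (b * c))    # MLE for beta: the log odds ratio

# At the MLEs, pi1 reduces to a/n1, so U_beta = a - n1*pi1 = 0
pi1_hat = math.exp(alpha_hat + beta_hat) / (1 + math.exp(alpha_hat + beta_hat))
u_beta = a - n1 * pi1_hat
assert abs(u_beta) < 1e-9
print(beta_hat)
```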

Wednesday, August 29, 2012

What an idiot!

Sometimes I remember and reflect on various experiences I've had over the years and although I've made some good decisions and been on the receiving end of some fortuitous circumstances, I've also done things and acted in ways that elicit a huffy "What an idiot!" reaction.  

One such occasion was a company holiday party I attended in (I think) 1997.  My then girlfriend (maybe she was my fiancée at that time?) --- and who would eventually become my ex-wife after a long 16 months of marriage --- worked for a now-defunct car company (Saturn) at the height of Saturn's glory.  The cars were made in America at a spiffy auto manufacturing plant somewhere in Tennessee and the dealerships embraced a "no-haggle" sales policy --- the company revolutionized the car industry, at least for a short time.  At any rate, the dealerships were locally owned by one businessman (I think there were two Saturn dealerships in the Salt Lake area at the time) and when the Christmas season rolled around, the head honcho threw a catered Christmas holiday party at a local resort.  It was swanky, or so it seemed:  full spread of appetizers, main courses, and desserts.  (No alcohol though --- this was Utah after all --- so no chance of the office harlot drunk-dancing on the tables.)

I'm not sure what the hell we were thinking, but sometime between when we sat down at our table and the dealership CEO/president welcomed everyone with a short speech (and someone blessed the food), we sauntered over to the buffet line, piled our plates high with food, then returned to our table and started to eat while completely oblivious to the fact that no one else was yet eating.  When it finally dawned on me that we were supposed to wait to eat until after the host's welcoming remarks and the prayer, I was mortified.  I think I even tried to hide my plate under my napkin and may have even placed the plate on my lap.  What the hell was I thinking?  Oh yes, I wasn't thinking and I certainly wasn't relying on those years of never-received etiquette training.  In retrospect, waiting to eat until the host toasts the guests, blesses the food --- whatever he/she wants to do --- seems like common sense.  But it clearly wasn't common sense to me (or my companion).  Chalk up my oversight to whatever you want --- naivety, inexperience, obtuseness, obliviousness --- but it was, without question, very embarrassing and not one of my finer moments.  I still shudder when I think about it.

Fortunately, 1997 was a long time ago and I'm no longer living in one of the most repressive places in the United States.  I've moved, I've grown, and I've picked up on the common sense things that at one point weren't so common sense.  And I've sure as hell learned to delay stuffing my face until everyone has food in front of them or the host gives the go-ahead. 

Tuesday, August 21, 2012

Logit Model Likelihood Function

In a previous post, I mapped out the relationship between the inverse logit and logistic function.  In this post, I'll present the likelihood function followed by the log likelihood.  

Since the likelihood is expressed in terms of frequencies (according to the text I'm referencing, "Biostatistical Methods: The Assessment of Relative Risks" by John Lachin), consider the following 2x2 table where the Response is some dependent variable, the Group is a binary independent variable (e.g. exposure), and the cell values denote the frequency in each cell.  The marginal totals are represented by $m_1$, $m_2$, $n_1$, and $n_2$, and the grand total is $N$.


              Group 1            Group 2            Total
Response +    a $(\pi_1)$        b $(\pi_2)$        $m_1$
Response -    c $(1 - \pi_1)$    d $(1 - \pi_2)$    $m_2$
Total         $n_1$              $n_2$              $N$


The generic likelihood function, $L(\theta)$, is "the total probability of the sample under the assumed model" (p. 465), denoted thus:
$L(y_1, \cdots , y_N; \theta) = \prod_{i=1}^N f(y_i; \theta)$

If we express the generic likelihood function in terms of the frequencies from the table above (a, b, c, d) then the likelihood function becomes
$L(\pi_1, \pi_2) = \pi^a_1 (1-\pi_1)^c \pi^b_2 (1-\pi_2)^d$
since each cell probability, $\pi_i$, is raised to the power of the number of subjects in the corresponding cell (a, b, c, d).

The log likelihood is just the log of the above:
$\ell(\pi_1, \pi_2) = a\:log(\pi_1) + c\:log(1-\pi_1) + b\:log(\pi_2) + d\:log(1-\pi_2)$

Since we'll eventually want to derive the maximum likelihood estimates for $\alpha$ and $\beta$, the log likelihood should be expressed in terms of $\alpha$ and $\beta$ (the substitutions for $\pi_i$ follow from the inverse logit to logistic post):
$\ell(\theta) = a\:log \Bigl[\frac{e^{\alpha + \beta}}{1 + e^{\alpha + \beta}}\Bigr] + c\:log \Bigl[\frac{1}{1 + e^{\alpha + \beta}}\Bigr] + b\:log \Bigl[\frac{e^{\alpha}}{1 + e^{\alpha}}\Bigr] + d\:log \Bigl[\frac{1}{1 + e^{\alpha}}\Bigr]$

Expanding the above (per logarithmic properties), we get
$\ell(\theta) = a\:log\:e^{\alpha + \beta} - a\:log(1 + e^{\alpha + \beta}) - c\:log(1 + e^{\alpha + \beta}) + b\:log\:e^{\alpha} - b\:log\:(1 + e^{\alpha}) - d\:log(1 + e^{\alpha})$

Simplifying and combining terms we get
$\ell(\theta) = a(\alpha + \beta) + b\:\alpha  - (a + c)log(1 + e^{\alpha + \beta}) - (b + d)log(1 + e^{\alpha})$
$\ell(\theta) = (a + b)\alpha + a\:\beta - (n_1)log(1 + e^{\alpha + \beta}) - (n_2)log(1 + e^{\alpha})$

After one more substitution $(a + b = m_1)$, the log likelihood function is as follows, expressed in terms of the frequencies and marginal totals from the 2x2 table. 
$\ell(\theta) = (m_1)\alpha + a\:\beta - (n_1)log(1 + e^{\alpha + \beta}) - (n_2)log(1 + e^{\alpha})$
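The substitution steps can be verified numerically: for any $(\alpha, \beta)$, the $\pi_i$ form of the log likelihood and the final frequency/marginal form should agree. A sketch with made-up cell counts:

```python
import math

# Made-up 2x2 cell counts for illustration only
a, b, c, d = 30, 20, 10, 40
n1, n2, m1 = a + c, b + d, a + b

def loglik_pi(alpha, beta):
    """pi-form: a log(pi1) + c log(1-pi1) + b log(pi2) + d log(1-pi2)."""
    pi1 = math.exp(alpha + beta) / (1 + math.exp(alpha + beta))
    pi2 = math.exp(alpha) / (1 + math.exp(alpha))
    return (a * math.log(pi1) + c * math.log(1 - pi1)
            + b * math.log(pi2) + d * math.log(1 - pi2))

def loglik_ab(alpha, beta):
    """Final form: m1*alpha + a*beta - n1 log(1+e^(a+b)) - n2 log(1+e^a)."""
    return (m1 * alpha + a * beta
            - n1 * math.log(1 + math.exp(alpha + beta))
            - n2 * math.log(1 + math.exp(alpha)))

alpha, beta = 0.25, -1.0   # arbitrary test point
print(abs(loglik_pi(alpha, beta) - loglik_ab(alpha, beta)) < 1e-9)
```

Agreement at arbitrary points is good evidence the logarithmic expansion and the $a + c = n_1$, $b + d = n_2$, $a + b = m_1$ substitutions were carried through correctly.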

With the log likelihood in this form, the score functions for $\alpha$ and $\beta$ can then be derived and the maximum likelihood estimates obtained (planned for a future blog post).