Altman & Bland review the use of the odds ratio (OR), standard error (SE), and confidence interval (CI), with some examples, in (Bland & Altman, 2000), the 42nd paper in the list of Statistics Notes in their BMJ series (lit-altman_bland.md): 42. Bland JM, Altman DG. (2000) The odds ratio. BMJ 320, 1468.¹
In reproducing their examples I denote eczema by X and hay fever by Y.
As an example dataset, they cite the following table: association between hay fever (Y) and eczema (X) in 11-year-old children.
| | Y Yes | Y No | Total |
|---|---|---|---|
| X Yes | 141 | 420 | 561 |
| X No | 928 | 13 525 | 14 453 |
| Total | 1069 | 13 945 | 15 014 |
The probability that a child with X will also have Y is estimated by the proportion $141/561$ (25.1%) and the odds is estimated by $141/420$.
Similarly, for children without X the probability of having Y is estimated by $928/14453$ (6.4%) and the odds is $928/13525$.
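They compare the groups in several ways:
the difference between the proportions, $141/561 - 928/14453 = 0.187$ (or 18.7 percentage points);
the ratio of the proportions, $(141/561)/(928/14453) = 3.91$ (also called the relative risk);
or the OR, $(141/420)/(928/13525) = 4.89$.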
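As a minimal sketch, the proportions, the odds, and the three comparisons (difference of proportions, ratio of proportions, and OR) can be computed directly from the table counts. The dictionary layout and the function names `proportion` and `odds` are illustrative choices, not part of the original note.

```python
# Counts from the table: groups are defined by X (eczema),
# outcomes by Y (hay fever).
x_yes = {"y_yes": 141, "y_no": 420}
x_no = {"y_yes": 928, "y_no": 13525}


def proportion(group):
    """Proportion of the group with Y: events / group total."""
    return group["y_yes"] / (group["y_yes"] + group["y_no"])


def odds(group):
    """Odds of Y in the group: events / non-events."""
    return group["y_yes"] / group["y_no"]


difference = proportion(x_yes) - proportion(x_no)
ratio = proportion(x_yes) / proportion(x_no)  # relative risk
odds_ratio = odds(x_yes) / odds(x_no)

print(f"difference = {difference:.3f}")
print(f"ratio      = {ratio:.2f}")
print(f"odds ratio = {odds_ratio:.2f}")
```

Keeping `proportion` and `odds` as separate functions makes the distinction explicit: the proportion divides by the group total, the odds by the number without the outcome.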
Looking at the table the other way round, what is the probability that a child with Y will also have X?
The proportion is $141/1069$ (13.2%) and the odds is $141/928$.
For a child without Y, the proportion with X is $420/13945$ (3.0%) and the odds is $420/13525$.
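Comparing the proportions this way,
the difference is $141/1069 - 420/13945 = 0.102$ (or 10.2 percentage points); the ratio is $(141/1069)/(420/13945) = 4.38$; and the OR is $(141/928)/(420/13525) = 4.89$.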
The OR is the same whichever way round we look at the table, but the difference and ratio of proportions are not. This is because the two ORs are
$\frac{141/420}{928/13525}$ and $\frac{141/928}{420/13525}$, which can both be rearranged to give $\frac{141 \times 13525}{928 \times 420}$.
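The rearrangement can be checked with exact rational arithmetic. This is only a sketch: the variable names `a`, `b`, `c`, `d` for the four cells are my own labels, and Python's standard `fractions` module is used so that the equality is exact rather than subject to floating-point rounding.

```python
from fractions import Fraction

a, b = 141, 420    # X yes: Y yes, Y no
c, d = 928, 13525  # X no:  Y yes, Y no

# Odds of Y within each X group, then their ratio.
or_by_rows = Fraction(a, b) / Fraction(c, d)
# Odds of X within each Y group, then their ratio.
or_by_columns = Fraction(a, c) / Fraction(b, d)
# The common cross-product form.
cross_product = Fraction(a * d, b * c)

assert or_by_rows == or_by_columns == cross_product
print(f"{float(cross_product):.2f}")
```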
Swapping the order of both the rows and the columns produces the same OR.
Swapping the order of only the rows or only the columns produces the reciprocal of the OR, 1/4.89=0.204.
Thus, the OR can indicate the strength of the relationship. The OR cannot be negative but is not limited in the positive direction, producing a skew distribution. Reversing the order of the categories for one variable simply reverses the sign of the log OR:
log(4.89)=1.59,
log(0.204)=−1.59.
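The effect of reordering categories can be illustrated with a short sketch. The helper `odds_ratio` and the nested-list table layout `[[a, b], [c, d]]` are assumptions made for this example.

```python
import math


def odds_ratio(table):
    """Cross-product OR for a 2x2 table given as [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


table = [[141, 420], [928, 13525]]
rows_swapped = table[::-1]
cols_swapped = [row[::-1] for row in table]
both_swapped = [row[::-1] for row in table[::-1]]

or_original = odds_ratio(table)
or_rows = odds_ratio(rows_swapped)  # reciprocal of the original
or_cols = odds_ratio(cols_swapped)  # reciprocal of the original
or_both = odds_ratio(both_swapped)  # same as the original

# On the log scale, taking the reciprocal only flips the sign.
print(f"log OR = {math.log(or_original):.2f}")
print(f"log OR after swapping rows = {math.log(or_rows):.2f}")
```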
The standard error (SE) can be calculated for the log OR and hence a confidence interval (CI).
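The SE of the log OR is estimated simply by the square root of the sum of the reciprocals of the four frequencies. For the example,
$$\mathrm{SE}(\log_e \mathrm{OR}) = \sqrt{\frac{1}{141} + \frac{1}{420} + \frac{1}{928} + \frac{1}{13525}} = 0.103.$$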
A 95% confidence interval (CI) for the log OR is obtained as 1.96 standard errors on either side of the estimate.
For the example, the log OR is $\log_e(4.89) = 1.588$ and the confidence interval is $1.588 \pm 1.96 \times 0.103$, which gives 1.386 to 1.790.
Taking the antilog of these limits gives a 95% CI for the OR itself, from exp(1.386)=4.00 to exp(1.790)=5.99.
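The whole calculation, from the four frequencies to the interval on the OR scale, can be sketched in a few lines. The variable names are illustrative, and `z = 1.96` is the multiplier used in the text.

```python
import math

a, b, c, d = 141, 420, 928, 13525

log_or = math.log((a * d) / (b * c))
# SE of the log OR: square root of the sum of reciprocal frequencies.
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

z = 1.96  # multiplier for a 95% interval, as in the text
lower_log = log_or - z * se_log_or
upper_log = log_or + z * se_log_or

# Back-transform the limits to the OR scale.
lower, upper = math.exp(lower_log), math.exp(upper_log)

print(f"log OR = {log_or:.3f}, SE = {se_log_or:.3f}")
print(f"95% CI for log OR: {lower_log:.3f} to {upper_log:.3f}")
print(f"95% CI for OR: {lower:.2f} to {upper:.2f}")
```

Because the interval is symmetric on the log scale and then exponentiated, the limits on the OR scale are not equidistant from the estimate.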
The observed OR, 4.89, is not in the centre of the confidence interval because of the asymmetrical nature of the OR scale. For this reason, in graphs ORs are often plotted using a logarithmic scale. The OR is 1 when there is no relationship. We can test the null hypothesis that the OR is 1 by the usual $\chi^2$ test for a two by two table.
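A minimal sketch of that test for the example table follows. It assumes the Pearson statistic without continuity correction, which the source does not specify, and uses only the standard library: with 1 degree of freedom the upper tail probability of the chi-squared distribution equals `erfc(sqrt(chi2 / 2))`.

```python
import math

a, b, c, d = 141, 420, 928, 13525
n = a + b + c + d

# Pearson chi-squared statistic for a 2x2 table
# (shortcut formula, no continuity correction).
chi2 = n * (a * d - b * c) ** 2 / (
    (a + b) * (c + d) * (a + c) * (b + d)
)

# Upper tail probability for 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-squared = {chi2:.1f}, p = {p_value:.3g}")
```

The statistic is large relative to the 1 degree of freedom reference distribution, which agrees with the 95% CI for the OR excluding 1.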
Despite their usefulness, ORs can cause difficulties in interpretation. Altman & Bland state that they will review the debate over interpretation, and discuss ORs in logistic regression and case-control studies, in future Statistics Notes.
Footnote 1: This article is almost identical to the original version, in acknowledgment of Altman and Bland. It is adapted here as part of a set of curated, consistent, and minimal examples of statistics required for human genomic analysis.