## Ordinal regression

When the outcome variable is ordinal, the methods described in the earlier chapters are inadequate. One solution would be to dichotomise the data and use logistic regression as discussed in Chapter 3. However, this is inefficient, and possibly biased if the cut point for the dichotomy is chosen by looking at the data. The main model for ordinal regression is known as the proportional odds or cumulative logit model. It is based on the cumulative response probabilities rather than the category...
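For reference, one common way of writing the cumulative logit model is sketched below. The notation is an assumption made here for illustration: $k$ ordered categories, cumulative probabilities $C_j$, intercepts $a_j$ and coefficients $b_1, \dots, b_p$. Some packages use the opposite sign for the coefficients.

$$
C_j = P(Y \le j), \qquad \log\frac{C_j}{1 - C_j} = a_j + b_1 x_1 + \dots + b_p x_p, \qquad j = 1, \dots, k-1
$$

The proportional odds assumption is that the coefficients $b_1, \dots, b_p$ do not depend on $j$; only the intercepts $a_j$ change from one cut point to the next.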

## Significance tests

Significance tests such as the chi-squared test and the t test, and the interpretation of P values, were described in Statistics at Square One.1 The form of statistical significance testing is to set up a null hypothesis and then collect data. We then test whether the observed data are consistent with the null hypothesis. As an example, consider a clinical trial to compare a new diet with a standard diet to reduce weight in obese patients. The null hypothesis is that there is no...

## Statistical tests using models

A t test compares the mean values of a continuous variable in two groups. This can be written as a linear model. In the example above, weight after treatment was the continuous variable, under one of two diets. Here the primary predictor variable x is diet, which is a binary variable taking the value (say) 0 for standard diet and 1 for the new diet. The outcome variable is weight. There are no confounding variables. The fitted model is $\text{weight} = b_0 + b_1 \times \text{diet} + \text{residual}$. The FIT part of the model is $b_0 + b_1 \times \text{diet}$ and is what we...
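The equivalence between comparing two means and fitting a linear model with a 0/1 predictor can be checked numerically. The sketch below uses made-up weights (not data from any trial) and a hypothetical helper `least_squares`. It shows that the least-squares intercept equals the mean in the group coded 0 and the slope equals the difference in means.

```python
from statistics import mean

# Hypothetical weights (kg) after treatment; illustrative values only.
standard = [92.0, 88.5, 95.0, 90.5, 99.0]   # diet = 0
new_diet = [85.0, 89.5, 83.0, 91.0, 86.5]   # diet = 1

diet = [0] * len(standard) + [1] * len(new_diet)
weight = standard + new_diet


def least_squares(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(diet, weight)

# b0 is the mean under the standard diet; b1 is the difference in means.
print(f"b0 = {b0:.2f}, mean(standard) = {mean(standard):.2f}")
print(f"b1 = {b1:.2f}, difference = {mean(new_diet) - mean(standard):.2f}")
```

Testing whether the slope is zero in this model is then the same question as testing whether the two group means differ.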

## Multiple regression in action: analysis of covariance

We mentioned that model (2.3) is very commonly seen in the literature. To see its application in a clinical trial consider the results of Llewellyn-Jones et al.,3 part of which are given in Table 2.6. This study was a randomised controlled trial of the effectiveness of a shared care intervention for depression in 220 subjects over the age of 65. Depression was measured using the Geriatric Depression Scale, taken at baseline and after 9.5 months of blinded follow up. The figure that helps the...

## Survival analysis in action

Oddy et al.6 looked at the association between breast feeding and developing asthma in a cohort of children to six years of age. The outcome was the age at developing asthma and they used Cox regression to examine the relationship with breast feeding and to adjust for confounding factors sex, gestational age, being of Aboriginal descent and smoking in the household. They stated that regression models were subjected to standard tests for goodness-of-fit including an investigation of the need for...

## Interpreting a computer output

We now describe how to interpret a computer output for linear regression. Most statistical packages produce an output similar to this one. The models are fitted using the principle of least squares, which is explained in Appendix 2, and is equivalent to maximum likelihood when the error distribution is Normal.

### One continuous and one binary independent variable

We must first create a new variable Asthma = 1 for asthmatics and Asthma = 0 for non-asthmatics and create a new variable AsthmaHt...

## Poisson regression

Poisson regression is an extension of logistic regression for situations where the risk of an event to an individual is small, but there are a large number of individuals, so the number of events in a group is appreciable. We need to know not just whether an individual had an event, but for how long they were followed up, the person-years. This is sometimes known as the amount of time they were at risk. Poisson regression is used extensively in epidemiology, particularly in the analysis of cohort studies. For further details...
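As a small illustration of why person-years matter, the sketch below computes event rates and a rate ratio from hypothetical group totals; the counts, group names and the helper `rate` are assumptions made for illustration. It is not a fitted Poisson regression, only the quantity such a model would estimate for a single binary exposure.

```python
import math

# Hypothetical cohort summary: events and person-years at risk per group.
groups = {
    "unexposed": {"events": 12, "person_years": 4000.0},
    "exposed": {"events": 30, "person_years": 5000.0},
}


def rate(events, person_years):
    """Events per person-year of follow-up."""
    return events / person_years


rates = {name: rate(g["events"], g["person_years"]) for name, g in groups.items()}
rate_ratio = rates["exposed"] / rates["unexposed"]

# In a Poisson regression with a log link and log(person-years) as an offset,
# the coefficient for a single binary exposure is the log of this rate ratio.
log_rate_ratio = math.log(rate_ratio)

print(rates)
print(rate_ratio, log_rate_ratio)
```

Comparing raw event counts alone (30 versus 12) would overstate the difference here, because the exposed group also contributed more follow-up time.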

## Interpreting a computer output: matched case-control study

These data are taken from Eason et al.9 and described in Altman et al.6 Thirty-five patients who died in hospital from asthma were individually matched for sex and age with 35 control subjects who had been discharged from the same hospital in the preceding year. The adequacy of monitoring of the patients was independently assessed and the results given in Table 3.7. For a computer analysis this may be written as a datafile with 35 × 2 = 70 rows, one for each case and control as shown in Table 3.8....
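The layout with one row per subject can be sketched as follows. The pairs below are invented for illustration and are not the data from Table 3.7; the field names `pair`, `case` and `exposed` are likewise assumptions. The sketch also computes the usual matched-pairs odds ratio, which depends only on the discordant pairs.

```python
# Hypothetical matched pairs: (case_exposed, control_exposed), coded 0/1.
# These values are illustrative and are not the data from Table 3.7.
pairs = [(1, 0), (1, 1), (0, 0), (1, 0), (0, 1), (1, 0)]

# One row per subject: pair identifier, case indicator, exposure.
rows = []
for pair_id, (case_exposed, control_exposed) in enumerate(pairs, start=1):
    rows.append({"pair": pair_id, "case": 1, "exposed": case_exposed})
    rows.append({"pair": pair_id, "case": 0, "exposed": control_exposed})

# Only discordant pairs carry information about the matched odds ratio.
case_only = sum(1 for c, k in pairs if c == 1 and k == 0)
control_only = sum(1 for c, k in pairs if c == 0 and k == 1)
matched_odds_ratio = case_only / control_only

print(len(rows), "rows")
print("matched odds ratio:", matched_odds_ratio)
```

Keeping the pair identifier on every row is what lets an analysis respect the matching rather than treating the 70 subjects as independent.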

## Logistic regression in action

Lavie et al.4 surveyed 2677 adults referred to a sleep clinic with suspected sleep apnoea. They developed an apnoea severity index, and related this to the presence or absence of hypertension. The questions that they wished to answer are: (i) Is the apnoea index predictive of hypertension, allowing for age, sex and body mass index? (ii) Is sex a predictor of hypertension, allowing for the other covariates? The results are given in Table 3.3 and the authors chose to give the regression coefficients...
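When a paper reports logistic regression coefficients rather than odds ratios, the conversion is a matter of exponentiating. The sketch below uses a hypothetical coefficient and standard error (not values from Table 3.3) and two illustrative helpers, `odds_ratio` and `confidence_interval`, which are not taken from any particular package.

```python
import math


def odds_ratio(coefficient, unit_change=1.0):
    """Odds ratio for a change of `unit_change` in a predictor."""
    return math.exp(coefficient * unit_change)


def confidence_interval(coefficient, standard_error, z=1.96):
    """Approximate 95% confidence interval for the odds ratio."""
    lower = math.exp(coefficient - z * standard_error)
    upper = math.exp(coefficient + z * standard_error)
    return lower, upper


# Hypothetical coefficient and standard error; not values from Table 3.3.
b, se = 0.05, 0.01

print(odds_ratio(b))                   # per one-unit change in the predictor
print(odds_ratio(b, unit_change=10))   # per ten-unit change
print(confidence_interval(b, se))
```

For a continuous predictor the odds ratio per unit may look close to 1 even when the effect over a realistic range of the predictor is substantial, which is why the `unit_change` argument is included.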

## Two independent variables

We will start off by considering two independent variables which can be either continuous or binary. There are three possibilities: both variables continuous, both binary (0 or 1), or one continuous and one binary. We will anchor the examples in some real data. Consider the data given on the pulmonary anatomical deadspace and height in 15 children given in Swinscow.1 Suppose that of the 15 children, 8 had asthma and 4 bronchitis. The data are given in Table 2.1.

Table 2.1 Lung function data on 15...