Monday, July 06, 2009

Dementia - concurrent validation of a test

This paper reports a new 'screening' test for Alzheimer's disease. More below on the use of the word 'screening' in this context.

Good points:

a. validated against people with a firm clinical diagnosis, made by experts, of Alzheimer's disease (i.e. concurrent validation, against a 'gold standard');

b. ROC comparison against the current screening test - the Mini Mental State Exam. The correct way to compare two screening tests is the ROC curve.

c. age specific norms published - i.e. what 'normal' people score on this test.
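To make the ROC comparison in point b concrete: the area under a ROC curve equals the chance that a randomly chosen case scores worse on the test than a randomly chosen non-case. Here is a minimal sketch of that calculation. The scores are made up for illustration (they are not from the paper), and I have assumed that a higher score means more impairment.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve by pairwise comparison (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Made-up scores, for illustration only; higher = more impaired (assumption).
new_test_cases, new_test_controls = [8, 7, 6, 9, 5], [2, 3, 4, 1, 6]
old_test_cases, old_test_controls = [6, 4, 7, 3, 5], [2, 5, 4, 1, 6]

auc_new = auc(new_test_cases, new_test_controls)
auc_old = auc(old_test_cases, old_test_controls)
print(f"new test AUC = {auc_new:.2f}, old test AUC = {auc_old:.2f}")
```

An area of 0.5 means the test is no better than tossing a coin; 1.0 means it separates cases from non-cases perfectly. The test with the larger area is the better discriminator across all possible cut-offs.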

Although the test is described as a screening test I would classify it as a clinical algorithm, given its intended use in hospital and other clinics. I certainly hope it won't be used in the general population without more validation, though unfortunately the authors have put the test on a website for anyone to try.

There is some debate about whether tests like this are 'screening' or not. Both population screening and clinical testing start with a prior probability of disease which changes to a higher (or lower) probability after the result is known. In a clinical setting the prior probability of disease is higher: the patient has come to the clinic because something is wrong.
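The shift from prior to posterior probability is just Bayes' theorem. Below is a rough sketch showing how the same positive result means very different things in a population and in a clinic. The sensitivity, specificity and both prior probabilities are hypothetical figures chosen for illustration, not values from the paper.

```python
def post_test_probability(prior, sensitivity, specificity, positive=True):
    """Probability of disease after a binary test result (Bayes' theorem)."""
    if positive:
        true_pos = prior * sensitivity
        false_pos = (1 - prior) * (1 - specificity)
        return true_pos / (true_pos + false_pos)
    false_neg = prior * (1 - sensitivity)
    true_neg = (1 - prior) * specificity
    return false_neg / (false_neg + true_neg)


# Hypothetical test characteristics and priors (assumptions, not from the paper).
SENSITIVITY, SPECIFICITY = 0.90, 0.90
population = post_test_probability(0.01, SENSITIVITY, SPECIFICITY)  # prior 1%
clinic = post_test_probability(0.30, SENSITIVITY, SPECIFICITY)      # prior 30%

print(f"positive result, general population: {population:.1%}")
print(f"positive result, memory clinic:      {clinic:.1%}")
```

With these made-up figures a positive result leaves the person from the general population with under a one in ten chance of disease, while the clinic patient is at nearly four in five.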

On the other hand the ethics are certainly different. In population screening we invite ordinary people who are happily going about their business and tell them they may have some serious disease. With clinical tests, the doctor is sifting through the possibilities for someone who has asked what is wrong with them.

Thursday, July 02, 2009

Salt!

Here is an excellent summary of the scientific position on salt. Skip the tables and read the text. Like flu, this is a 'must know' for anyone in public health.

Friday, June 26, 2009

Home births

A great theorem in mathematics proves rigorously that some things can never be proved rigorously. It seems that in public health too we will never know the answer to some important questions, no matter how much they are researched.

The question of whether home births are safe has been around for twenty years or more, but still the debate rumbles on. Here is the latest attempt to answer the question; and here is an editorial shredding the attempt (though some of the criticisms seem unfounded to me).

Strictly speaking the paper compares outcomes in a group delivered at home by independent (i.e. non NHS) midwives versus a group delivered in NHS hospitals. Note that most home deliveries are by NHS midwives so this study looks at a specific detailed question - the independent midwife - rather than the general problem of home birth. 

And of course the problem is that no group can be found to match the sort of woman who wants a home delivery by an independent midwife. To my eye the two groups look comparable on most measures (Table 1) but the expert editorialists think not; and the authors themselves comment on some important differences in their discussion section.

So in the end (a) we find differences but (b) we have not compared like with like, leaving us no wiser than before. 


Monday, June 22, 2009

HPV vaccine efficacy

You don't often see trials of vaccine efficacy these days but here is an interesting paper on HPV vaccine in 25 - 44 year olds. The outcome is HPV infection, not cervical cancer.

Given that (1) the aim is to prevent infection with a sexually transmitted virus, and (2) the median age at first sexual intercourse for females is around 17, at least in western societies, vaccinating women after the 25th birthday seems to be leaving it too late. But the argument is that nowadays many women may start a new sexual relationship in their thirties, perhaps after a failed first marriage.

I note the use of a new phrase creeping into the literature - 'per protocol' analysis. This is opposed to 'intention to treat' analysis. 'Per protocol' seems to mean 'actually got the vaccine'.

'Intention to treat' seems a strict standard - you would judge the vaccine ineffective even if the problem was that most women never got it. But 'intention to treat' tells you what happens in the real world - 'I intended to treat all of these patients (but all sorts of things prevented this from happening as intended)'; this is sometimes called effectiveness. 'Per protocol' tells you how much effect the intervention has when considered on its own; this is sometimes called efficacy.

But 'per protocol' introduces all sorts of bias because the women who did not get vaccinated after being assigned to the intervention group will differ systematically from the women who did receive the vaccine. For example the sort of woman who attends to receive vaccine may, by comparison to the non-attender, be more health conscious and so more careful about her sexual relationships, use condoms more etc. and so put herself less at risk of HPV infection regardless of whether the vaccine works or not. Be very careful about what 'per protocol' analysis actually tells you.
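A back-of-envelope sketch of how this bias can arise. Every number here is hypothetical: I assume a vaccine that does nothing at all, that 60% of women are 'attender types' with a lower baseline risk, and that the vaccinated women are compared with the whole control arm.

```python
# All figures are hypothetical, chosen only to illustrate the selection effect.
ARM_SIZE = 1000
ATTENDER_SHARE = 0.6       # fraction of each arm who would turn up for vaccination
RISK_ATTENDER = 0.02       # infection risk in health-conscious attenders
RISK_NON_ATTENDER = 0.06   # infection risk in non-attenders
# The vaccine is assumed to have NO effect, so risks are the same in both arms.

attenders = ARM_SIZE * ATTENDER_SHARE
non_attenders = ARM_SIZE - attenders
infections_in_arm = attenders * RISK_ATTENDER + non_attenders * RISK_NON_ATTENDER

control_risk = infections_in_arm / ARM_SIZE
vaccine_arm_risk = infections_in_arm / ARM_SIZE            # everyone randomised
vaccinated_only_risk = attenders * RISK_ATTENDER / attenders  # attenders only

itt_risk_ratio = vaccine_arm_risk / control_risk
per_protocol_risk_ratio = vaccinated_only_risk / control_risk

print(f"intention to treat risk ratio: {itt_risk_ratio:.2f}")
print(f"per protocol risk ratio:       {per_protocol_risk_ratio:.2f}")
```

Intention to treat correctly reports a risk ratio of 1 (no effect), while the per protocol comparison makes a useless vaccine look as if it nearly halves infections, purely because of who turned up.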

Saturday, May 30, 2009

Cervical cancer after three negative smears

The results of this paper seem obvious once you know the answer. The question is this: if a woman has three consecutive negative smears, can you give her the all-clear and say she doesn't need to come again?

The details of method are quite complicated but the essence is simple - a very large national database in the Netherlands with results of smear tests and follow up data on which women developed cancer.

Before we answer that question, let's look at something which may at first sight appear rather similar. Thirty years ago, it was standard practice to do a test for blood in the stool (faecal occult blood) on patients suspected of having bowel cancer. One might be negative by chance; so how many tests to do before giving patients the all-clear? Each test seemed cheap, and sometimes the fifth or even sixth test was the first to be positive so to be on the safe side it was suggested that six consecutive negative tests were needed to rule out cancer. The problem was that finding a cancer with the sixth test was so rare that it worked out as costing $47m per extra cancer detected (i.e. a marginal cost of $47m).

D Neuhauser and A M Lewicki. What do we gain from the sixth stool guaiac? The New England Journal of Medicine 293 (5), 226-8 (31 Jul 1975)
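The logic of marginal cost can be sketched in a few lines. This is not a reconstruction of the published analysis: the sensitivity, prevalence, unit cost and group size are all hypothetical, and I assume that each test has an independent chance of picking up an existing cancer and, as a simplification, that everybody is retested at each round.

```python
# Hypothetical inputs (assumptions for illustration, not the published figures).
PEOPLE_TESTED = 10_000
PREVALENCE = 0.007        # fraction with an undetected cancer
SENSITIVITY = 0.9         # chance that any one test detects an existing cancer
COST_PER_TEST = 1.0       # dollars


def cancers_first_found_at(test_number):
    """Expected cancers missed by all earlier tests and caught by this one."""
    cancers = PEOPLE_TESTED * PREVALENCE
    missed_so_far = (1 - SENSITIVITY) ** (test_number - 1)
    return cancers * missed_so_far * SENSITIVITY


def marginal_cost_per_cancer(test_number):
    """Cost of one extra round of testing divided by the extra cancers it finds."""
    round_cost = PEOPLE_TESTED * COST_PER_TEST
    return round_cost / cancers_first_found_at(test_number)


for n in range(1, 7):
    print(f"test {n}: {cancers_first_found_at(n):10.5f} extra cancers, "
          f"${marginal_cost_per_cancer(n):,.0f} per extra cancer")
```

Each round costs the same but the yield shrinks by a constant factor, so the cost per extra cancer found grows geometrically. That is how a test that 'seemed cheap' ends up costing millions per case by the sixth round.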

 

But this is not the same situation as cervical cancer screening.

 

The six stool tests for bowel cancer are done in a short time frame - essentially repeat tests of the same situation. Cervical cancer screening every three years is to check if something has developed since the last test.

 

The finding that women can get cancer after three negative smears should not surprise us: cervical cancer is caused by human papilloma virus, so if the woman remains sexually active, she remains at risk of getting it.

Monday, May 25, 2009

Post traumatic stress disorder

 This is a good study from the US military looking at risk factors for PTSD. I particularly liked the careful definition of PTSD, based on questionnaire response but rooted in the DSM. The Diagnostic and Statistical Manual of Mental Disorders is the bible of psychiatric epidemiology. If the researchers don't use DSM criteria to make their diagnoses, you really have no idea of what they are choosing to call PTSD (or depression or schizophrenia or whatever).

The conclusion of the study may seem obvious with hindsight - that people with poorer health before deployment are more likely to develop PTSD - but it does enable military health services to target preventive action. More than two thirds of new cases of PTSD were in people with the lowest 15% of physical and mental health scores on the SF36 at baseline. 


Sunday, May 17, 2009

Alcohol in Slovenia

The statistical analysis in this paper is rather tricky but the basic design is simple - a study of the number of suicides before and after a change in law to restrict alcohol sales and consumption. 

The authors use the correct technique (ARIMA) to study change in the daily count of suicide. Time series analysis is very tricky because of a problem called serial autocorrelation: if the numbers are up one day they are likely to be slightly up the next day too. This happens because natural ups and downs which are continuous are sliced by us humans into counts which start afresh every midnight as a daily count. The same is true for daily counts of, say, hospital admissions during an epidemic (and at other times).

So the count for each day is correlated with the day before and the day after. As you know, independence of the observations (i.e. no correlation between them) is really important for routine statistical analyses: hence the need for a special technique to cope - ARIMA.
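A small simulation shows what serial autocorrelation looks like in numbers. This is only a sketch with simulated data: the 'carry-over' factor of 0.6 is an arbitrary assumption, and the series are not real suicide counts.

```python
import random


def lag1_autocorrelation(series):
    """Correlation between each value and the value one step before it."""
    n = len(series)
    mean = sum(series) / n
    numerator = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    denominator = sum((x - mean) ** 2 for x in series)
    return numerator / denominator


random.seed(1)
DAYS = 5000
CARRY_OVER = 0.6  # hypothetical: how much of yesterday's excess persists today

# Independent daily fluctuations: what routine statistics assume.
independent = [random.gauss(0, 1) for _ in range(DAYS)]

# Autocorrelated series: today = part of yesterday + a fresh fluctuation.
autocorrelated = [independent[0]]
for shock in independent[1:]:
    autocorrelated.append(CARRY_OVER * autocorrelated[-1] + shock)

r_independent = lag1_autocorrelation(independent)
r_autocorrelated = lag1_autocorrelation(autocorrelated)
print(f"independent series:    r = {r_independent:.2f}")
print(f"autocorrelated series: r = {r_autocorrelated:.2f}")
```

The first series should give a correlation near zero and the second one close to the carry-over factor. Treating the second series as if its days were independent would overstate how much information it contains, which is the problem ARIMA models are designed to handle.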

The message of the study seems clear, however: 1. Alcohol kills and 2. laws work.

Slovenia's starting position was 17 litres of alcohol per head - the highest in Europe. By 2003, according to the WHO European office database, it was down to about 9 litres per head: similar to France and the UK, but the French trend is down and ours is up.

A large report from SCHARR looked at the economics of alcohol consumption. Based on evidence of price elasticity (i.e. what happens when you put the price up) the CMO's call for a minimum price of 50p would preferentially reduce drinking among poor and young people i.e. do exactly what is needed. Gordon Brown dismissed it out of hand. But perhaps the CMO will outlast Gordon and try again.

Friday, May 08, 2009

Survival after cardiac surgery

I wonder what you think about this one. The dataset is large and the basic method simple, but I can't say I agree with the authors' headline.

The paper looks at survival after cardiac surgery, and the issue highlighted by the authors is deprivation: patients from deprived areas fare worse. But if you look at the numbers it seems that the effects of deprivation are dwarfed by the effects of smoking and diabetes.

It's always good to look at basic data, before any statistical manipulations are done. There is a table sorting the patients into quartiles by deprivation score, and tabulating the number of survivors at 3500 days after surgery:

n in quartile     n alive at 3500 days
10 803            536
10 854            554
10 872            568
10 748            532

Which looks a pretty small difference to me. Numerically the difference is reported as 2.4% increased mortality for a one unit increase in Carstairs score (i.e. a hazard ratio of 1.024 in a Cox proportional hazards model). This doesn't mean much if you don't know about Carstairs score, so you need to look for the range of Carstairs scores. Elsewhere the interquartile range for Carstairs is given as -2.2 to +2.3: a 4.5 point difference in score, which equates to roughly 4.5 * 2.4% i.e. about an 11% difference in mortality. Some of the deprivation effect is due to more smoking, obesity and diabetes in the higher deprivation categories; after adjusting for this in a multivariate analysis, the deprivation effect drops to 1.7% per unit of Carstairs score.
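For anyone who wants to check the scaling, here is the arithmetic as a short sketch. It uses only the figures quoted above; note that the quoted range of -2.2 to +2.3 spans 4.5 points. The one assumption is the usual one for a Cox model, namely that the per-unit hazard ratio compounds multiplicatively across the range.

```python
# Figures quoted in the post.
HR_PER_UNIT_CRUDE = 1.024      # unadjusted hazard ratio per Carstairs unit
HR_PER_UNIT_ADJUSTED = 1.017   # after adjusting for smoking, obesity, diabetes
IQR_LOW, IQR_HIGH = -2.2, 2.3  # interquartile range of Carstairs score

iqr_width = IQR_HIGH - IQR_LOW

# Simple multiplication (a quick approximation).
additive_excess = iqr_width * (HR_PER_UNIT_CRUDE - 1)

# Compounding, as a hazard ratio strictly requires.
crude_excess = HR_PER_UNIT_CRUDE ** iqr_width - 1
adjusted_excess = HR_PER_UNIT_ADJUSTED ** iqr_width - 1

print(f"width of interquartile range:     {iqr_width:.1f} points")
print(f"crude excess, simple multiplying: {additive_excess:.1%}")
print(f"crude excess, compounded:         {crude_excess:.1%}")
print(f"adjusted excess, compounded:      {adjusted_excess:.1%}")
```

Either way the deprivation effect across the interquartile range comes out at around a tenth, and less after adjustment.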

But the effect of diabetes is stated to be a 31% increase in mortality, and of smoking to be 29% (current smoker) or 25% (ex smoker). So surely the headline of the paper should have been about diabetes and smoking? Or have I misunderstood the numbers badly?

Friday, May 01, 2009

A parenting trial

 Here is a good trial which the BMJ short-changed by printing in its hard copy a ridiculously short summary. You need to read the full write-up online.

Little comment needed from me, except perhaps to point out that parenting trials, though important, give us a proxy result - better parenting - rather than the thing we want - better (i.e. happier and healthier) children. On the other hand, as the authors point out, there is good evidence that poor bonding between mother and infant reliably predicts trouble later on.

Tuesday, April 14, 2009

Risk adjustment and outcome comparisons

When a hospital (or a surgeon) has a high death rate, the first response is "but we treat sicker patients". At which point risk adjustment is called for.

Risk adjustment is basically a multiple regression analysis, with 'place of treatment' as the predictor of outcome, and everything else as a confounder. The 'everything else' in a risk adjustment is a set of indicators of severity (risk of death). That's ok if your measure of risk is something very objective such as blood pressure but it's not ok if you pick a measure such as 'percentage of patients admitted as an emergency'.

At first sight, it seems obvious that hospital A, where most patients are admitted as emergencies, must be treating sicker patients than hospital B, where there are few emergency admissions. But there is a problem: policies on emergency admission may differ. So the data item 'emergency admission' does not mean the same in both hospitals (whereas blood pressure 100/60 does). You can't use emergency admission in risk adjustments.

Here is an interesting worked example of all this using real data. The statistical warning sign is that a predictor (emergency admission) signifies a higher risk of death in hospital B than hospital A. The predictor carries a different meaning or value in the different places, so technically this is an interaction between the two predictors (hospital and emergency admission).
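The warning sign can be spotted with nothing more than a pair of odds ratios. In this sketch the counts are invented for illustration (they are not the data from the worked example): hospital A admits most patients as emergencies, hospital B only a few.

```python
def odds_ratio(deaths_exposed, n_exposed, deaths_unexposed, n_unexposed):
    """Odds of death in the exposed group divided by odds in the unexposed."""
    odds_exposed = deaths_exposed / (n_exposed - deaths_exposed)
    odds_unexposed = deaths_unexposed / (n_unexposed - deaths_unexposed)
    return odds_exposed / odds_unexposed


# Invented counts: (deaths, patients) for emergency and elective admissions.
hospital_a = {"emergency": (40, 800), "elective": (4, 200)}
hospital_b = {"emergency": (30, 200), "elective": (16, 800)}

or_a = odds_ratio(*hospital_a["emergency"], *hospital_a["elective"])
or_b = odds_ratio(*hospital_b["emergency"], *hospital_b["elective"])

print(f"hospital A: emergency vs elective odds ratio = {or_a:.2f}")
print(f"hospital B: emergency vs elective odds ratio = {or_b:.2f}")
print(f"ratio of the two odds ratios = {or_b / or_a:.2f}")
```

If 'emergency admission' meant the same thing in both hospitals the two odds ratios would be similar. A ratio well away from 1 is the interaction, and it suggests the label is being applied to different kinds of patient in the two places.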

Moral: Don't measure something till you understand it thoroughly. (Kelvin famously thought the opposite!)