Episode 27: I'd give that study 4 stars: considering the quality of research.
Release Date: 02/28/2017
Guest: David Spiegelhalter
David Spiegelhalter is Winton Professor for the Public Understanding of Risk in the Statistical Laboratory at the University of Cambridge, Chair of the Winton Centre for Risk and Evidence Communication, and President of the Royal Statistical Society. He is passionate about helping the public understand uncertainty and risk. His recent book, Sex by Numbers, describes scientific research that provides a view of the world of sex.
John Bailer: I’d like to welcome you to today’s Stats and Short Stories episode. Today’s guest is David Spiegelhalter, Winton Professor of the Public Understanding of Risk in the Statistical Laboratory, Centre for Mathematical Sciences, at the University of Cambridge and President of the Royal Statistical Society. He’s also author of the book Sex by Numbers: What Scientists can tell us about Sexual Behavior as well as many other impactful works. I’m John Bailer, I’m chair of the Department of Statistics at Miami University and I’m joined by my colleague, Richard Campbell, chair of the Department of Media, Journalism and Film. We’d like to welcome David Spiegelhalter to our short episode today. Welcome, David.
David Spiegelhalter: Hello.
Bailer: What I’d like to do, I’d like to start with a quick question for you. At the start of your book, Sex by Numbers, you had an interesting rating scheme for the quality of studies. You had four stars for the best, numbers you can indeed believe, down to one star, numbers that are unreliable, with two intermediate values in between. Why did you develop this scheme?
Spiegelhalter: Well I should say also, I did have a zero star for numbers that had no value whatsoever. This is an area of sexual statistics that vary in quality vastly. There are some really good numbers out there and there’s some complete and utter drivel. Because if you google, and I’m not suggesting you do, but if you search for sex statistics online, you’ll pick up all sorts of stuff that you’ll probably wish you didn’t have to look at. So, it’s an area where it’s very important to be able to communicate the quality of the evidence, and you can’t just look at the sample size. Some of the worst statistics have some of the biggest sample sizes. So it’s nothing to do with small samples, or statisticians would say there’s the margin of error, it’s to do with systematic biases. It’s the fact that this data has been asked of people who aren’t representative and the questions have been asked badly, and you don’t trust the answers anyway.
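The point that a huge sample does not rescue a biased one can be illustrated with a little arithmetic. The sketch below is not from the interview: the true share, the response probabilities, and the sample sizes are all made-up assumptions, and the helper names are hypothetical. It compares the systematic error from self-selection with the usual 95% margin of error, which only accounts for random sampling variation.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)


def self_selected_share(true_share: float, respond_if_yes: float, respond_if_no: float) -> float:
    """Expected share of 'yes' among respondents when the two groups respond at different rates."""
    yes = true_share * respond_if_yes
    no = (1 - true_share) * respond_if_no
    return yes / (yes + no)


# All numbers below are illustrative assumptions, not figures from any real survey.
TRUE_SHARE = 0.30        # assumed true proportion in the population
RESPOND_IF_YES = 0.10    # assumed response rate among people with the trait
RESPOND_IF_NO = 0.02     # assumed response rate among everyone else

big_biased = self_selected_share(TRUE_SHARE, RESPOND_IF_YES, RESPOND_IF_NO)
bias = big_biased - TRUE_SHARE

moe_big = margin_of_error(big_biased, 4400)   # large self-selected sample
moe_small = margin_of_error(TRUE_SHARE, 400)  # small random sample

print(f"Self-selected estimate: {big_biased:.3f} (true value {TRUE_SHARE:.3f})")
print(f"Systematic bias: {bias:.3f}")
print(f"Margin of error, n=4400 self-selected: {moe_big:.3f}")
print(f"Margin of error, n=400 random: {moe_small:.3f}")
```

Under these assumptions the large sample reports a tighter margin of error than the small random one, yet its systematic bias is several times larger than either margin, which is why sample size alone says little about quality.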
Richard Campbell: What are some examples of numbers that are unreliable, I think that’s one of your categories?
Spiegelhalter: Yeah. The ones that people just make up are things like, men think of sex every seven seconds, the average amount of time spent kissing in a lifetime is 20,000 minutes. I mean really, these are not very reliable statistics at all, but they’re the ones you’ll find, you know, on the web. Ones that are also unreliable, to be honest, are the earlier works by Shere Hite, you know, very huge, best-selling books, and these made some very bold claims in the 1970s about female sexuality. Reporting that 70% of men married for more than five years were having affairs, 84% of women were emotionally unsatisfied with their relationships. And those sort of claims received a lot of publicity, but in that…that book was based on 100,000 questionnaires and 4½ thousand responses, so a tiny response rate of what were probably a highly selected group of women. Actually the book’s rather a good read because the people wrote long stories and the book largely consists of some very moving quotations of people, of women, about their dissatisfaction with their emotional and sexual lives. I think it’s very powerful, very influential, quite right, but her statistics were grossly unreliable, and she got criticized quite strongly at the time about those.
Bailer: One aspect of the intermediate rating schemes that you described were within a certain believability range, you want to talk a little about that, can you give us an example?
Spiegelhalter: Essentially, four-star numbers in the star ratings are numbers I can really believe. Those would be very limited but they would be official statistics, based on national registrations. In the UK, how many births in the UK were outside marriage? About a half. In 1973 one in twenty sixteen-year-old girls in the UK got pregnant. Now those are good statistics, we can believe those numbers, they’re really based on counting specific instances. Numbers that are reasonably accurate, I’d say are ones I’d trust up to plus or minus 20%. Regardless, I don’t believe the confidence intervals quoted in surveys because that assumes a totally unbiased sample and I think in this area that’s very difficult to get. Those sorts of questions, I’d give that sort of rating if they were conducted in extremely good surveys, done very carefully. The sort of national surveys that are carried out in the UK and the US, relying on face-to-face interviews of a properly, randomly selected group of the population. Going down a bit, two-star numbers can really be a long way out. Plus or minus 50%, they could be half, they could be double the true answer. So now you’re getting to not-that-good surveys. Actually I’d put a lot of online stuff into that category, I think. Online panels are fine but actually, there’s no test about the honesty of the answers, about the reliability of the answers, and there’s not a representative group. I think some of Kinsey’s stuff could just about manage to get to two star, away from one star. So again, I’d trust them to give a very rough ballpark figure, which is about what Tukey and others who inspected Kinsey’s work in the 1940s said.
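The star scheme described above can be written down as a small lookup. This is only a sketch of one reading of the spoken description, not the book's definition: the function name is hypothetical, three stars is taken as plus or minus 20%, two stars is taken as "half to double" (the wider of the two descriptions given), and one-star and zero-star numbers are treated as having no usable range.

```python
from typing import Optional, Tuple


def plausible_range(reported: float, stars: int) -> Optional[Tuple[float, float]]:
    """Return a (low, high) range for the true value, or None if the number is too unreliable.

    The thresholds are an interpretation of the interview, not an official definition.
    """
    if stars == 4:
        # Official counts: taken at face value.
        return (reported, reported)
    if stars == 3:
        # Reasonably accurate: trusted to within plus or minus 20%.
        return (reported * 0.8, reported * 1.2)
    if stars == 2:
        # Rough ballpark: could be half or double the reported figure.
        return (reported / 2, reported * 2)
    if stars in (0, 1):
        # Unreliable or worthless: no meaningful range can be given.
        return None
    raise ValueError("stars must be an integer from 0 to 4")


# Example: the same reported figure of 50 under each rating.
for rating in (4, 3, 2, 1, 0):
    print(rating, plausible_range(50.0, rating))
```

A reported 50% would therefore be read as roughly 40% to 60% from a careful national survey, but anywhere from 25% to 100% from a typical online panel.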
Campbell: Last question. So we’re living in this world of fake news and alternative facts.
Spiegelhalter: Yeah.
Campbell: What do you think statisticians and journalists can do to combat this? It’s a real problem.
Spiegelhalter: Yeah, this is the question of the day. And I must say recently, from events in the UK and the US I’ve sometimes just felt like giving up, but no, but then I wake up in the morning revived with new energy, that this gives a new emphasis for the importance of statistics and good journalism and both professions and their vital role in society in the future. And I just strongly believe the way to combat that is not to come back with something extreme at the other end, to manipulate the evidence ourselves to counter claims by one side or another. Maybe I’m just idealistic and optimistic but I do believe that balanced reporting of actually just being straight about the truths, about the evidence, the limitations of the evidence, the magnitudes of both the benefits and the possible harms of alternative policies. Being straight, reporting that straight should, with skilled journalism be able to be made into a story that’s interesting in itself. I’ve seen some fantastic stuff on, for example FiveThirtyEight, where they were talking about evidence on breast screening, which is a deeply contested area where there could be very strong claims on one side or another, and by doing good journalism based on the statistical evidence, someone could say “actually it’s not convincing one side or another, I and my friend, based on exactly the same evidence have come up with different conclusions for ourselves, there’s no correct answer to this, we have to interpret the evidence and make up our own minds.” Maybe that’s expecting too much, but I do believe that this is the way forward. In order to be trusted you have to demonstrate trustworthiness and that means, in a sense, making yourself vulnerable and open to criticism by the openness of the way that you’re using your evidence. Maybe I’m being idealistic, but I do think that that’s the right way to do it.
Bailer: Well I think, around this table, we embrace that idealism and I think we’re ready to start marching forward with you David.
Spiegelhalter: Yeah exactly, under the banner.
Bailer: Indeed. Well that’s all the time we have for this episode of Stats and Short Stories. David, thank you so much for being here.
Spiegelhalter: No, thanks very much.
Bailer: It’s been our pleasure. Stats and Stories is a partnership between Miami University’s Departments of Statistics and Media, Journalism and Film, and the American Statistical Association. Stay tuned and keep following us on Twitter or iTunes. If you’d like to share your thoughts on our program send your email to statsandstories@miamioh.edu and be sure to listen for future episodes of Stats+Stories where we discuss the statistics behind the stories and the stories behind the statistics.