Tweeting the flu: Scientists tracking epidemic with social media

Jan 31, 2013

Flu season started early and came in swinging. Health officials say it’s been a moderate to severe flu season for most of the country.

Curtis Allen is a spokesman with the Centers for Disease Control and Prevention.

"In Michigan, it’s still at a high level of activity. Hopefully you’ll see less and less as we go on. But influenza is notoriously unpredictable and there could also be another peak," Allen says.

He says a strain of influenza virus called H3N2 has been circulating this year.

"The H3N2 strain tends to lead to more severe seasons and a greater number of illnesses and hospitalizations and deaths among those who are 65 or older and those who have underlying health conditions."

Allen says the CDC still recommends getting a flu shot this season if you haven’t yet.

Ugh... I feel terrible today.

It seems like every day somebody’s complaining about being sick on Facebook or Twitter.

It turns out that tweeting that you’re sick with the flu can actually be useful for science.

Mark Dredze is an assistant research professor in the Department of Computer Science at Johns Hopkins University. He’s designed a method of tracking flu cases using Twitter.

I asked him how he can tell who’s really sick with the flu and who’s just talking about some celebrity getting the flu.

"In order to figure out if someone's really sick, you need to go beyond just looking at what words they happen to use, like 'flu' or 'sick' and use a deeper level analysis of what they're saying to figure out if they're saying 'I am sick' or 'I'm worried about getting sick' or 'I hope I don't get sick.' And what we've done here is we've developed some new algorithms that can do exactly that. They go beyond just the shallow words and try and get the real meaning of what the person's trying to say."
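To make the distinction Dredze describes a little more concrete, here is a minimal sketch in Python. It is not the Johns Hopkins algorithm, which the segment does not detail; it is a simple rule-based stand-in, and the function name, labels, and word patterns are all assumptions made up for illustration. It only shows why matching the word "flu" alone is not enough.

```python
import re

# Hypothetical patterns for illustration only; NOT the researchers' method.
KEYWORDS = re.compile(r"\b(flu|influenza|sick)\b", re.I)
CONCERN = re.compile(r"\b(worried|afraid|scared|hope|don't want)\b", re.I)
INFECTION = re.compile(
    r"\b(i have|i've got|i got|i am sick|i'm sick|i caught|down with|home sick)\b",
    re.I,
)


def classify_tweet(text):
    """Return a rough label for one tweet.

    "unrelated": no illness keyword at all
    "concern":   worried about or hoping to avoid getting sick
    "infection": the author appears to report being sick
    "mention":   an illness keyword, but not about the author
    """
    if not KEYWORDS.search(text):
        return "unrelated"
    # Check concern first so "I hope I don't get sick" is not counted as a case.
    if CONCERN.search(text):
        return "concern"
    if INFECTION.search(text):
        return "infection"
    return "mention"
```

In a sketch like this, only tweets labeled "infection" would be counted toward a flu estimate. A real system would likely need a trained statistical model rather than hand-written patterns, since people describe being sick in far more ways than a short list can cover.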

He says Twitter is a great resource for tracking illness because it's publicly available.

"The Centers for Disease Control and Prevention do a very good job of putting together what the flu rate is in the United States, but they do it on a fairly broad level and it takes them a while to do that. It takes about two weeks for the CDC to put out today's flu rate. So, that's a little bit slow for making critical decisions about how to respond to flu epidemics, much like we're seeing this year. It would be much more valuable if we could have up to the day or up to the minute estimates of what the flu rate is, so we could respond much more quickly as epidemics arise, with the critical pieces of information."

I asked him how closely his Twitter method tracks with the CDC's data.

"So we looked at this for the current flu season, and it's still early because we're in the middle of that flu season, but it seems our new method tracks much more closely with CDC data than previous methods for doing things with Twitter."
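One common way to put a number on how closely two weekly series track each other is the Pearson correlation coefficient. The segment does not say which measure the researchers used, so the sketch below is an assumption, and the function and variable names are hypothetical. A result near 1 would mean the Twitter-based estimate rises and falls in step with the CDC figures.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up placeholder numbers, NOT real surveillance data.
twitter_weekly = [1.0, 2.0, 4.0, 3.0]
cdc_weekly = [1.5, 2.5, 4.5, 3.0]
agreement = pearson(twitter_weekly, cdc_weekly)
```

Correlation only measures whether the two series move together. It would not show whether the Twitter estimate runs consistently too high or too low.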

Grandma, what's your Twitter handle?

There aren’t a lot of elderly people or really young kids tweeting – and Dredze acknowledges that's a limitation of tracking diseases with tweets.

"The demographics of Twitter are really key here. Certainly if we were interested in studying the elderly population, Twitter would be a very bad resource. Beyond that, Twitter is actually a very good resource for studying the population, at least in the United States, as it's really become an incredibly popular tool. Right now, Twitter has about a half a billion tweets a day, and while many of those are from outside the United States, the U.S. makes up a significant percentage of that data. So, we really have a lot of people in the United States using this, which makes it great. Tracking the flu is really just the tip of the iceberg here. There's a lot more data here, and a lot more interesting applications, and I think we're going to start to see those emerge in the next couple of years."

Update: February 1, 2013

We got a good question from one of our Facebook friends, so I wanted to add a little more info.

Except, in my experience, most folks who think they have the flu are actually afflicted with a stomach virus. I listened to this earlier and wondered about the reliability of the data. -Paschal Hackler

My response: I asked Dr. Dredze that same question but didn't have enough room in the segment to include it. His response: "Certainly there's a limitation here in that everything we're picking up is self-reporting. If someone thinks they have the flu but really don't, then obviously our system is going to be fooled. For diseases like the flu, which are really common diseases, I don't think that's a big problem. That might be more of a problem for rare diseases, where people don't have a good way of diagnosing them."

I also asked the CDC's Curtis Allen about this kind of social media-disease research (and things like Google Flu Trends) and the spokesman said more surveillance is good - but they're sticking with their data collection system, which is a network of 3,000 "sentinel physicians" around the country who are watching and testing for influenza and reporting to the CDC. - RW