Google Flu Trends: Underlying issues of big data

05th Nov '14, 01:00 PM in Health / Pharma

Guest Contributor

A Google software engineer took to an official company blog last Friday to announce that the tech firm was changing its Flu Trends tool for the 2014-15 flu season. That’s potentially good news for healthcare for at least two reasons. First, it presumably means that the healthcare community will be getting more accurate data about flu incidence. Second, it provides a good opportunity to think about some of the challenges that so-called big data (massive databases combed through by computer power rather than human brain power) poses for healthcare’s future.

Google’s Flu Trends tool was based on the idea that the company could predict the incidence of influenza faster and more locally than the Centers for Disease Control and Prevention, which typically sent out its data with a lag. The Google tool initially used a group of 50 to 300 search queries that it believed were correlated with flu incidence and extrapolated from there. The company has since expanded the tool to 29 countries, and launched Dengue Trends in 10.

The thinking behind the tool was similar to the logic Google uses for its business model, namely that people often want or are experiencing what they search for. That makes it easy to sell advertisements against searches and is the same insight that’s driving its virtual visits trial: if you search “knee pain,” chances are decent you have knee pain and would like to do something about it.

The problem, noted a March 2014 paper in the journal Science, was that the tool over-predicted the prevalence of the flu in the 2011-12 and 2012-13 seasons. Hence the latest change from Google. The paper’s authors could not say for certain why the tool failed, but speculated that the underlying data set, Google searches, is not stable. The company is constantly changing how search works, which means that the assumptions built into the old version of Flu Trends might no longer hold.

Fortunately, as an Oct. 29 paper in Royal Society Open Science showed, it’s possible to improve predictions by combining the data from the CDC and Google.
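
To make the idea of combining the two sources concrete, here is a minimal sketch in Python. It is not the method used in the Royal Society Open Science paper, which this article does not describe. It simply assumes that the true flu rate can be approximated as a weighted sum of the most recent (lagged) CDC figure and the search-based estimate, and fits those two weights by least squares. The function names and all numbers are hypothetical and made up for illustration.

```python
def fit_two_signal_model(lagged_cdc, search_estimate, actual):
    """Fit weights so that actual ~= w_cdc * lagged_cdc + w_search * search_estimate.

    Solves the 2x2 least-squares normal equations directly.
    This is an illustrative assumption, not the published method.
    """
    s_cc = sum(c * c for c in lagged_cdc)
    s_gg = sum(g * g for g in search_estimate)
    s_cg = sum(c * g for c, g in zip(lagged_cdc, search_estimate))
    s_cy = sum(c * y for c, y in zip(lagged_cdc, actual))
    s_gy = sum(g * y for g, y in zip(search_estimate, actual))

    det = s_cc * s_gg - s_cg ** 2
    if det == 0:
        raise ValueError("the two signals are collinear; weights are not unique")

    w_cdc = (s_cy * s_gg - s_gy * s_cg) / det
    w_search = (s_gy * s_cc - s_cy * s_cg) / det
    return w_cdc, w_search


def combined_estimate(weights, lagged_cdc_value, search_value):
    """Blend one lagged CDC figure and one search-based figure."""
    w_cdc, w_search = weights
    return w_cdc * lagged_cdc_value + w_search * search_value


# Made-up weekly figures (percent of doctor visits for flu-like illness).
past_cdc = [1.0, 2.0, 3.0, 2.5]     # CDC figure available at the time (lagged)
past_search = [2.0, 3.0, 5.0, 3.0]  # search-based estimate for the same weeks
past_actual = [1.2, 2.1, 3.3, 2.4]  # rate the CDC eventually reported

weights = fit_two_signal_model(past_cdc, past_search, past_actual)
print(combined_estimate(weights, lagged_cdc_value=2.0, search_value=4.0))
```

The appeal of a blend like this is that each source can offset the other’s weakness: the CDC figure is reliable but late, while the search signal is timely but can drift when search behavior changes.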