Surely nobody who has the slightest awareness of what’s going on in the world can be unaware of the phrase ‘big data’. Newspapers and television refer to it almost every day, and it’s ubiquitous on the web. In November, a Google search for the phrase ‘big data’ yielded 1.8 billion hits. Google Trends shows that the rate of searches for the phrase is now about ten times what it was at the start of 2011.
The phrase defies exact definition: one can define it in absolute terms (so many gigabytes, petabytes, etc.), in relative terms (relative to your computational resources), or in other ways. The obvious way for data to be big is by having many units (e.g., stars in an astronomical database), but data could also be big in terms of the number of variables (e.g., genomic data), the number of times something is observed (e.g., high-frequency financial data), or by virtue of its complexity (e.g., the number of potential interactions in a social network).