Large web and social media datasets like Twitter’s — which the company is now opening up to a select group of researchers — are the basis of a tension that has been building for years between social scientists who want such data and the companies that keep it under lock and key. That Twitter is willing to open hundreds of billions of tweets up for academic research is great news, but its decision to limit the number of grant recipients highlights the problem.
The promise of unfettered access to a dataset like Twitter’s is already evident. Back in 2012, we reported on some university researchers who scoured publicly available tweets to identify bullies, their victims and their methods. On Thursday, MIT Technology Review reported on a study that identified clusters of individuals who share content about the Syrian civil war on Twitter and YouTube. These are two of probably thousands of studies that have been conducted on social media data since Facebook first took hold several years ago. Presumably, none of them had access to data beyond what the researchers were able to scrape or otherwise collect online.