Three ways to overcome the lack of right data in India

28th Jan '15, 06:11 PM in Analytics

Srikant Sastri Contributor

The digital footprint of society is expanding the world over into fragmented mediums (blogs, tweets, reviews etc) and technologies (mobile, web, cloud/SaaS etc). Data generated from mobile devices and the internet of things are the main contributors to this data explosion. While this provides organisations with significant business opportunities, it also presents several challenges in harnessing these information sources.

India’s digital landscape may be evolving quickly, but overall penetration remains low, with only one in five Indians using the internet (as of July 2014). Enterprises and businesses do have access to a veritable wealth of information. Some larger organisations have made a start in harnessing it: telecom providers, online travel agencies and online retail stores are among those using big data analytics to engage customers to a certain extent. Most Indian companies, however, are still learning how to collect and store big data.

To put it simply, big data analytics is still in its infancy in India. Beyond storage, there are several challenges in collecting the data sets themselves. Both past and current data are required to make big data analytics really useful, but historical data is scarce in both the public and private sectors in India. This scarcity can be traced to the following:

Late and slow computerisation: Healthcare, economic and statistical data, in both private and public sectors in India, is yet to be fully computerised. The main reason for this is the late adoption of IT in India. Unlike in the West, most industries in India made the transition from manual records to computerised information systems only during the last decade. Over the years, the state and central ministries have made the move towards e-governance. Efforts to deliver public services and to make access to these services easier are being made as well. While this is still a work in progress, huge amounts of data across many government sectors are yet to be digitised.

Poor quality inputs: It is not only the quantity of data that matters; the quality of the data being crunched also determines the quality of the insights. If the signal-to-noise ratio is low, the accuracy of results suffers because the data samples are less than optimal. Public social media information available for most individuals in India lacks reliable detail about the users. Random facts and figures in individual profiles, sharing of spam content and fake social media accounts created for bots are very common in India.

Spam: Social media sites are becoming increasingly vulnerable to spam attacks. The time a captive audience spends on social media sites opens up windows of opportunity for online threats and spammers. Social media spam, again, lowers the signal-to-noise ratio that defines the quality of big data, and this gets in the way of generating appropriate results.
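To make the signal-to-noise idea concrete, here is a minimal sketch in Python. It assumes a hypothetical `Profile` record with three made-up quality flags (`is_bot`, `is_spam`, `has_valid_location`); none of these names come from a real platform or data set, and the rule for what counts as "signal" is only illustrative.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    # Hypothetical fields, chosen only for illustration.
    handle: str
    is_bot: bool
    is_spam: bool
    has_valid_location: bool


def is_signal(profile: Profile) -> bool:
    """A record counts as signal only if it passes every quality check."""
    return (
        not profile.is_bot
        and not profile.is_spam
        and profile.has_valid_location
    )


def signal_to_noise(profiles: list) -> float:
    """Ratio of usable records to unusable ones (infinity if there is no noise)."""
    signal = sum(1 for p in profiles if is_signal(p))
    noise = len(profiles) - signal
    return float("inf") if noise == 0 else signal / noise


# Illustrative sample: one clean profile, one bot, one spam account,
# and one genuine user whose location field is unusable.
sample = [
    Profile("a", is_bot=False, is_spam=False, has_valid_location=True),
    Profile("b", is_bot=True, is_spam=False, has_valid_location=True),
    Profile("c", is_bot=False, is_spam=True, has_valid_location=False),
    Profile("d", is_bot=False, is_spam=False, has_valid_location=False),
]
ratio = signal_to_noise(sample)
```

In this made-up sample only one record in four is usable, so the ratio is well below 1. The lower the ratio, the more filtering is needed before any analysis can be trusted.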

Cultural and social influences: In most Western markets, insights generated through big data can be applied across a wide consumer base. But given the extensive cultural and linguistic variation across India, any insight generated for a consumer based, say, in Chandigarh will not be directly applicable to a consumer based in Chennai. This problem is made worse by the fact that a lot of local data lives in regional publications, in different languages and has limited online visibility.

Unstructured sources of data: Big data in India is not structured. Most transactional data in the healthcare and retail segments is stored purely for book-keeping purposes. In most developed countries, user data is rich enough to provide demographic or group-level markers that can be used to generate customised insights while maintaining individual privacy. The absence of such standard identifiers in Indian consumer data is one of the biggest bottlenecks in mapping transactional and social records in India.

Handsets and internet connectivity: Even though smartphones are driving the new handset market in India, feature phones still dominate everyday usage. Most connections in India are pre-paid and fewer than 10 per cent of users have access to 3G networks. In addition, internet connection speeds are among the lowest in Asia. As a result, consumer data, especially retail enterprise data, is limited.

As more people in India make the move to smartphones and internet connectivity improves, there will be an increase in the amount of usable data generated. That said, organisations need to make a huge effort to improve the quality of enterprise data. The good news is that the key contributors to the promise of big data analytics in India are steadily gaining ground. An increase in social media users, together with efforts by both public and private enterprises to collect and store transactional enterprise data properly, will contribute to better quality data sets and, in turn, to improved application of big data analytics.

Srikant Sastri is the co-founder of Singapore-based Big Data start-up Crayon Data. His article originally appeared on Business Standard.
