Big Data Made Simple: One source. Many perspectives.

Researchers harness machine learning to predict chemical reactions! (16 Dec 2018)

This is an age of machine learning algorithms! They can predict stock market fluctuations, control complex manufacturing processes, enable navigation for robots and driverless vehicles, and much more. Now, a group of researchers has found a way to predict and control chemical reactions with precision and speed.

The technique was developed by researchers at the NYU Tandon School of Engineering, tapping the capabilities of artificial intelligence, artificial neural networks, and infrared thermal imaging, and it allows chemical discoveries to take place quickly and with far less environmental waste than standard large-scale reactions.

“This system can reduce the decision-making process about certain chemical manufacturing processes from one year to a matter of weeks, saving tons of chemical waste and energy in the process,” said Ryan Hartman, lead author of the paper – “Artificial Neural Network Control of Thermoelectrically-Cooled Microfluidics using Computer Vision based on IR Thermography,” detailing the method in the journal Computers & Chemical Engineering.

How it works

Hartman and his team introduced a new class of miniaturized chemical reactors that brings reactions traditionally carried out in large-batch reactors with up to 100 liters of chemicals down to the microscale, using just microliters of fluid – a few small drops. They increased the utility of these reactors by pairing them with two additional technologies: infrared thermography — an imaging technique that captures a thermal map displaying changes in heat during a chemical reaction, and supervised machine learning — a discipline of artificial intelligence wherein an algorithm learns to interpret data based on inputs selected by researchers controlling the experiments.

Paired together, infrared thermography and machine learning allow researchers to capture changes in thermal energy during chemical reactions — as indicated by color changes on the thermal image — and to interpret these changes quickly. The research team is the first to train an artificial neural network to control and interpret infrared thermal images of a thermoelectrically cooled microfluidic device.
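To make the pairing concrete, here is a deliberately simplified sketch of supervised learning over thermal-image features. All data is synthetic, the function names are invented, and the tiny nearest-centroid "model" stands in for the team's actual artificial neural network, which is far more capable:

```python
import random
import statistics

def frame_features(frame):
    """Reduce a thermal frame (2-D list of temperatures, deg C) to two
    features: mean temperature and fraction of unusually hot pixels."""
    pixels = [t for row in frame for t in row]
    mean = statistics.mean(pixels)
    hot_fraction = sum(t > mean + 5.0 for t in pixels) / len(pixels)
    return (mean, hot_fraction)

def fit_centroids(frames, labels):
    """Tiny supervised learner: average feature vector per label."""
    groups = {}
    for frame, label in zip(frames, labels):
        groups.setdefault(label, []).append(frame_features(frame))
    return {label: tuple(statistics.mean(col) for col in zip(*feats))
            for label, feats in groups.items()}

def predict(centroids, frame):
    """Assign the label whose centroid is closest in feature space."""
    f = frame_features(frame)
    return min(centroids, key=lambda lab: sum((a - b) ** 2
               for a, b in zip(f, centroids[lab])))

rng = random.Random(0)
def make_frame(base):
    return [[rng.gauss(base, 1.0) for _ in range(8)] for _ in range(8)]

# Synthetic "labeled" frames: exothermic reactions run hotter.
frames = [make_frame(25) for _ in range(20)] + [make_frame(40) for _ in range(20)]
labels = ["idle"] * 20 + ["reacting"] * 20
centroids = fit_centroids(frames, labels)
print(predict(centroids, make_frame(41)))  # a hot frame classifies as "reacting"
```

The point of the sketch is the workflow, not the model: researchers pick the inputs (labeled thermal frames), and the algorithm learns to map new frames to reaction states.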

Potential impact

The potential impacts on both innovation and sustainability are significant. Large chemical companies may screen hundreds of catalysts while developing new polymers, for example, and each reaction can require more than 100 liters of chemicals and 24 hours or longer. Screening that many catalysts using current laboratory processes can take a year. Using Hartman’s approach, the entire process can be accomplished in weeks, with dramatically less waste and energy usage. Hartman estimates that a single industrial hood used to control fumes during large-scale chemical testing uses as much energy per year as the average U.S. home.

Other breakthroughs

Notably, early this year, another group of researchers at Princeton University and Spencer Dreher of Merck Research Laboratories found a way to predict reaction yields accurately while varying up to four reaction components by using an application of artificial intelligence known as machine learning. They have turned their method into software that they have made available to other chemists. They published their research in February in the journal Science.

In July 2018, Lee Cronin’s lab at the University of Glasgow, UK, also created an organic synthesis robotic AI system that can quickly explore the reactivity of a set of reagents from the bottom-up with no specific target. The autonomous system was able to predict with 86% accuracy the reactivity of the remaining 90% of reactions. The robot can perform up to 36 experiments per day – around 10 times more than a human.


Fingerprint sensors are not the guarantee to privacy (14 Dec 2018)

Biometric authentication is the latest, shiniest toy for large tech companies, especially smartphone manufacturers. Technology like fingerprint sensors, iris scans, facial recognition, and voice recognition used to be the stuff of sci-fi movies. But now, thanks to companies like Apple, biometric recognition systems (especially finger scanning) are turning modern smartphones into marvels of convenience. A finger touch can unlock a phone without a password. A simple fingerprint touch can help you pay bills, buy groceries, and transfer hundreds of dollars. However, while this may be super convenient, it can leave a massive gap in personal security.

In April 2017, New York University and Michigan State University published a report on fingerprint hacking. The fingerprint sensors on your phones and tablets are inherently imperfect, with weak security protection; they do not guarantee privacy. The researchers emphasize that the sensors are tiny and can capture only a small portion of the fingerprint, which increases the possibility of a match with another fingerprint that is even slightly similar to yours.

The report revealed that smartphones could easily be fooled by false fingerprints composed digitally of features commonly seen in individual prints. Through computer simulations, the researchers were able to create a set of artificial master prints. These master prints matched real prints similar to those stored by the phones as much as 65 percent of the time.

Human fingerprints are extremely difficult to fake. However, the finger scanners on a phone read only partial prints. When you set up fingerprint security on your smartphone, the phone typically records eight to ten images of a finger to make matching easy and fast. Since a single finger swipe has to match only one retained image to unlock the phone, nearly all phones are vulnerable to fake matches.
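The arithmetic behind this vulnerability is simple. If each stored partial template has some small chance of falsely matching a forged print, storing eight to ten templates multiplies the attacker’s odds. A hedged sketch, assuming independent comparisons (real matchers are more correlated than this):

```python
def false_accept_probability(p_single, n_templates):
    """Chance that a forged print matches at least one of the stored
    partial templates, assuming each comparison is independent
    (a simplification; the per-template rate here is illustrative)."""
    return 1 - (1 - p_single) ** n_templates

# A 1% per-template false-match rate grows quickly with 10 stored images:
print(round(false_accept_probability(0.01, 1), 4))   # 0.01
print(round(false_accept_probability(0.01, 10), 4))  # 0.0956
```

This is why partial-print matching against many stored images is structurally weaker than matching a single full print.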

What is fingerprint scanning?

Fingerprint scanners are biometric recognition systems. Over the past few years, fingerprint scanning has become nearly universal, and biometric technology has proven very useful to law enforcement agencies and various other organizations.

Fingerprint scanning is the process of obtaining and storing human fingerprints by electronic means. The digital image produced by the scan is known as a finger image.

It is a biometric procedure that includes the automatic capture, examination, and evaluation of specific characteristics of the human body. A device can capture the details of a finger image, such as the pattern of ridge branches and raised areas, by several methods. The most common are optical, tactile, and thermal, which work by analyzing visible light, pressure, and heat emission, respectively.

How does it work?

The scanning process begins when you put your finger on a glass plate, after which a CCD camera captures a picture of your finger. The scanner contains a light source: an array of light-emitting diodes that illuminates the ridges of the finger. Meanwhile, the CCD system produces an inverted image of the finger, in which the dark regions correspond to more reflected light and the lighter areas to less reflected light.

The scanner processor makes sure that the image obtained is clear, inspects the pixel darkness, and discards the scan if the image captured is not perfect, i.e., it is too dark or too light. After rejection, the scanner tries to scan the image again after adjusting exposure.

If the fingerprint image is of good definition, then a line flowing perpendicular to the raised areas will be made up of alternating sections of extremely dark pixels and extremely light pixels.

Once a sharp, crisp, properly exposed image is obtained, the processor compares the captured fingerprint with the other prints on file.
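The exposure check described above can be sketched in a few lines. The thresholds and return strings here are illustrative, not taken from any scanner specification:

```python
def check_exposure(pixels, low=0.2, high=0.8):
    """Accept or reject a scan based on average pixel darkness
    (0.0 = black, 1.0 = white). Thresholds are illustrative only."""
    mean = sum(pixels) / len(pixels)
    if mean < low:
        return "rescan: too dark"
    if mean > high:
        return "rescan: too light"
    return "ok"

print(check_exposure([0.5] * 256))   # ok
print(check_exposure([0.05] * 256))  # rescan: too dark
print(check_exposure([0.95] * 256))  # rescan: too light
```

A real scanner also checks ridge contrast (the alternating dark/light sections mentioned below), but the accept-or-rescan loop follows this same pattern.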

Fingerprint hacking

Fingerprint detection is now built into many smartphones. These biometric validations are especially critical in China, where smartphone-based e-wallets and mobile payments are highly popular.

A vital concern with fingerprint technology is that it can be hacked. Although it may seem complicated, it does happen. Some hackers use a 3D-printed mold made from a retained fingerprint image. Fingerprints can be stolen despite firewalls and security measures. PIN codes and passcodes can be changed quickly, but a fingerprint cannot, so a one-time credential theft becomes a lifetime of vulnerability.

A cybersecurity expert and military commentator of the People’s Liberation Army recently rekindled this warning. He said on a China Central Television (CCTV) program that the security protection can be hacked: malicious people can fake your fingerprint with tools as plain as a translucent film and a circuit scribe. The film, carrying ink from the circuit marker, is attached so that it covers half of the phone’s fingerprint reader, and the owner can then unlock the phone even though the sensor reads only half of the print.

The fingerprint sensing and matching algorithms used by Apple’s iOS and Android systems are based on machine learning. Each time the user puts a finger on the capacitive reader to unlock the phone, the system captures an image of the print and updates the print image already stored on the device. In the same way, however, a deceiving ink pattern on a translucent film can count as an update to the stored image.

China Central Television (CCTV) programs have demonstrated that, in some cases, low-tech knock-off fingerprints made of film and circuit-scribe ink have tricked and unlocked several phones, including models from some of the most renowned companies, such as Apple, Huawei, Samsung, and Xiaomi.

It is a clear warning that fingerprints can be hacked. However, unlike passcodes, you cannot change your fingerprints, so a single credential theft often leads to a lifetime of vulnerability. That is why most cybersecurity experts urge users to enable two-factor authentication, which closes off the easiest avenues for snooping.

To curb fingerprint hacking, Washington State lawmakers last year passed ground-breaking legislation that forbids organizations from collecting or selling biometric information without the individual’s consent. Reflecting rising concerns about stolen biometric identifiers being used to commit identity fraud, the bill passed 37-12 in the State Senate and 81-17 in the House.


AI has huge impact on Energy and Utilities – Frost & Sullivan’s analysis (14 Dec 2018)

The global energy market is undergoing a massive shift with the emergence of decarbonization, decentralization, and new technologies. Utilities, independent power producers (IPPs), and other energy companies are exploring effective ways to manage the imbalance in demand and supply caused by the growing use of unpredictable renewable energy source (RES) in power generation.

Therefore, utilities, energy companies, and grid operators are exploring ways to employ AI technologies to improve the accessibility and efficiency of renewable energy technologies. Despite its nascence, AI possesses tremendous potential to transform the energy & utilities sector.

“In combination with other technologies like Big Data, cloud, and Internet of Things (IoT), AI can support the active management of electricity grids by improving the accessibility of renewable energy sources,” said Swagath Navin Manohar, Research Analyst (Energy & Environment), while referring to Frost & Sullivan’s recent analysis – Impact of Artificial Intelligence (AI) on Energy and Utilities, 2018.

The study provides a detailed analysis of current and future applications of AI across the segments of renewables management, demand management, and infrastructure management. It examines technological, geographic, and industry trends, profiles the key companies that dominate the global AI market, and presents use cases with specific applications of AI.

Over the next several years, AI is expected to boost efficiencies across the renewable energy sector by automating operations in the solar and wind industries. It will also allow utilities and IPPs to launch new business and service models.

“In addition to making the electricity system intelligent and flexible, AI algorithms help utilities and energy companies understand and optimize consumer behavior and manage energy consumption across different sectors,” noted Manohar.

“Meanwhile, complex machine learning algorithms combined with real-time weather data from satellites, ground-based observation, and climate models can be used to forecast the electricity generated by RES like wind, solar, and ocean.”
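The forecasting idea in that quote can be sketched as an ordinary least-squares fit from a weather feature to generation. The numbers below are invented, and real forecasters fuse many more inputs (satellite, ground stations, climate models) with far richer models:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return a, mean_y - a * mean_x

# Toy history: forecast wind speed (m/s) vs. observed turbine output (MW).
wind = [3.0, 5.0, 7.0, 9.0, 11.0]
output = [0.4, 1.1, 2.0, 2.9, 3.8]
a, b = fit_line(wind, output)
print(round(a * 8.0 + b, 2))  # predicted MW at a forecast wind of 8 m/s
```

The grid operator’s real question is the same shape: given tomorrow’s weather forecast, how much renewable generation should the dispatch plan expect?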

AI-based applications can create further revenue opportunities for the energy and utility sector by:

  • Empowering software applications to analyze large data sets, identify patterns, detect anomalies, and make precise predictions.
  • Aiding the development of smart applications that can autonomously make accurate decisions based on learning. This drives AI’s integration with a wide range of applications.
  • Enabling customer-centric solutions that understand evolving customer needs and make automatic recommendations.
  • Using predictive analytics to improve equipment O&M and predict downtime, which can extend the lifetime of the equipment.
  • Facilitating active customer participation in demand-response programs using game theory algorithms and leveraging blockchain to protect data.
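As a toy illustration of the anomaly-detection bullet above, here is a z-score filter over meter readings. The data and thresholds are invented, and production systems use far richer models:

```python
import statistics

def flag_anomalies(readings, z=3.0):
    """Return indices of readings more than z standard deviations
    from the mean: a baseline anomaly detector, nothing more."""
    mu = statistics.mean(readings)
    sigma = statistics.stdev(readings)
    return [i for i, r in enumerate(readings) if abs(r - mu) > z * sigma]

hourly_load = [50, 52, 51, 49, 50, 53, 48, 51, 50, 120]  # kW; spike at the end
print(flag_anomalies(hourly_load, z=2.0))  # flags the index of the spike
```

Flagged readings would then feed the predictive-maintenance or fraud-detection workflows the report describes.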

Impact of Artificial Intelligence (AI) on Energy and Utilities, 2018 is part of Frost & Sullivan’s global Power Generation Growth Partnership Service program.

With decarbonization, decentralization, and the rollout of novel technologies, utilities, independent power producers (IPPs), and other energy companies are employing AI to manage the imbalance in demand and supply caused by the growing share of renewable energy sources.


The internet of things and cybersecurity: What startups need to know (13 Dec 2018)

The Internet of Things (IoT) is poised to be a transformative technology. As the inevitable result of boundless possibilities of the internet, it will generate and employ a vast sea of rich data on anything and everything that can be connected to a digital system — further dissolving the boundary that separates the online and offline worlds.

This connected future is realistically unavoidable (it’s already happening, after all). But it’s still some way off. And it’s an exciting time for startups willing to plan ahead. They can take some time to explore and experiment with IoT technology so that they get a grip of it before it becomes the standard.

But with the massive potential of IoT devices comes a similarly large array of risks, especially when it comes to cybersecurity. But why? And what do startups need to know about the security of IoT technology before they fully embrace it? Let’s answer these questions.

IoT devices open up countless points of vulnerability

Imagine that you ran a large family business and heard about a cost-effective team of remote freelancers. If you hired them, they would greatly aid your operation without incurring significant costs or getting in the way. But could you trust them? It only takes one weak link to severely compromise a chain, after all. And you likely couldn’t rely on them to follow your security policies. Give someone a keycard to enter your office, and they could lose it, or have it stolen.

In much the same vein, the introduction of myriad IoT devices to a formerly-secure network greatly increases its exposure to danger. The communicative channels that must be left open for devices to exchange data could plausibly be co-opted for nefarious purposes, and small devices could be taken, modified, and replaced to leak data or serve as backdoors for later invasion.

And even if your devices are physically secured and your network is locked down, an IoT device can still pose a major threat, as we’ll see next.

Having a strong update system is mandatory

Every IoT device runs on firmware that determines how it interacts with other devices and systems. And software of any kind can contain basic vulnerabilities. Any given operating system invariably hits the market with numerous security holes that are subsequently patched incrementally (often only after hackers have identified the issues). But patching is a complicated task for IoT devices.

Security updates can be carried out manually or pushed automatically, although each method invites difficulties. Manual updating can be awkward enough for a standard office but becomes enormously more complex when accounting for greatly expanded IoT device ranges. If just one device is overlooked, the vulnerability that required the update in the first place remains unresolved.

And while automatic security updating is more convenient, what happens if a device loses connectivity while its update is being pushed? You could configure a system to reject any device attempting to communicate while using outdated firmware. But that could be tremendously impractical if not arranged optimally. Finding the right balance between security and operational viability is a challenge at the best of times.
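One compromise between the two extremes is a gatekeeping policy with a grace list: out-of-date devices that are known to be mid-update get quarantined rather than rejected outright. A sketch, with invented version numbers and device IDs:

```python
MIN_FIRMWARE = (2, 4, 0)  # illustrative minimum patched version

def parse_version(v):
    """Turn '2.4.0' into a comparable tuple (2, 4, 0)."""
    return tuple(int(part) for part in v.split("."))

def admit_device(device_id, firmware, grace_ids=frozenset()):
    """Gate network access on firmware version. Devices on a
    'currently updating' grace list get limited access instead of a
    lockout, so a dropped connection mid-update is not fatal."""
    if parse_version(firmware) >= MIN_FIRMWARE:
        return "admit"
    if device_id in grace_ids:
        return "quarantine"  # limited access while the update is retried
    return "reject"

print(admit_device("cam-01", "2.5.1"))                        # admit
print(admit_device("cam-02", "2.3.9"))                        # reject
print(admit_device("cam-03", "2.3.9", grace_ids={"cam-03"}))  # quarantine
```

Tuning what "quarantine" permits, and for how long, is exactly the security-versus-viability balance described above.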

We’re awaiting a robust IoT security standard

The legal world has yet to fully catch up with matters of personal data privacy (the implementation of GDPR earlier this year was a notable advance, but not a global solution). It should come as no surprise that there are currently no security standards for IoT setups. This is in itself a reason for concern. If you moved too heavily towards using IoT devices and suffered a security problem, there would be no protocol for addressing it. What right would you have to seek compensation from the provider of a vulnerable device? It would need to be argued at length in court since this remains unexplored ground.

At the very least, efforts are being made to introduce laws and regulations that can cover this area. The U.S. Congress read the Internet of Things (IoT) Cybersecurity Improvement Act of 2017 last year. The U.K. government is aiming to move more responsibility for IoT device security from the end users to the manufacturers. But this consideration doesn’t amount to tangible progress, and it’s probably going to be a while before we see anything implemented. The GDPR didn’t come into effect until over two years after it was finalized.

Hopefully, we’ll reach a point at which investing in a range of IoT devices and technologies will be no different from buying a website for your business. The important factor will be the provider, and choosing a trustworthy tech company will be enough to ensure that all major security issues are handled on your behalf.

You should experiment with IoT technology — but carefully

It wouldn’t be unreasonable to conclude from everything we’ve looked at thus far that IoT technology should be avoided entirely until all the technical and legal quandaries have been solved. But I suggest going down a different route.

Instead of factoring IoT devices into your fundamental business model, look at them as interesting accessories — use them for entertainment or light assistance in areas that don’t require high security.

Having a full-fledged and all-encompassing IoT-driven business today isn’t actually all that valuable. There are various reasons for this:

  • IoT technology feeds off connectivity, and until the tipping point at which it becomes standard fare, its broader applications will be limited.
  • At this early stage, it could cause confusion and introduce more problems than it solves.
  • By the time IoT systems become mainstream, the hardware will have significantly matured, so an early investment is unlikely to provide lasting value.

The best thing a startup can do to prepare for an IoT-driven future is to become familiar with the concepts and the technology. Play around with ideas to identify ways in which you might be able to bolster your business with IoT technology down the line, and create closed networks with secured devices to get a feel for how they work.

Any ambitious startup (in the tech field or otherwise) can benefit most extensively from preparing for the IoT revolution without trying to make it broadly workable today. If you keep an eye on the industry, you can stay apprised of its level of maturity, and identify the point at which it is secure enough to be fully embraced.


How Grammarly & Google are using Artificial Intelligence for flawless writing (13 Dec 2018)

Earlier, in order to run a quick spelling and grammar check on any text, we would rely on Microsoft Word’s Check Document feature. While it was definitely helpful, one would have to copy the text into a Word document to run the check. Now there are many online editing tools – including Grammarly and Google Docs’ built-in grammar suggestions – that can do pretty much the same thing. These online tools are designed to improve your writing, helping not only with grammar and spelling but also with sentence structure. With the help of artificial intelligence (AI) and machine learning, they now offer more accurate results.

Google and Grammarly suggestions prevent embarrassing grammar mistakes caused by sheer carelessness. We can use these tools for writing emails, for formal documented conversations, and for important assignments. Professionals also depend on them to detect errors. They have made digital writing easier and more accurate than ever. Let us look at how these editing tools are used by writers of all categories and genres to avoid both major and minor errors.

Grammarly

Grammarly was introduced in 2009 and has since helped many people deliver quality content. Currently, it has around fifteen million active daily users. It is one of the top grammar checkers available online and, on its current trajectory, is slated to overtake the competition. It’s easy to use and flexible: you can download the tool on your mobile device, add an extension to your browser – be it Chrome, Safari, Firefox, or Microsoft Edge – or use the Grammarly desktop app. Right now, in order to gain more exposure and users, Grammarly offers free grammar checks.

How does it work?

With artificial intelligence of course!

It uses high-quality training data – examples of what correct writing should look like – to teach its algorithms. After extensive research and after feeding the algorithms data on proper grammar usage, the AI model can now apply grammar rules, covering spelling, punctuation, and the incorrect use of words or sentence structures, and correct these errors. It also uses natural language processing to understand and analyze language at the level of characters, words, and full paragraphs.

A human feedback system ensures continued improvement as well as adaptation to individual style. For example, when users do not accept a proposed suggestion, the system notes this and offers alternative wordings and suggestions. With more exposure to text, the machine can offer better suggestions. In short, the algorithm learns not only the correct usage of words and sentence formations but also the preferred individual style. This is why the company switched to a consumer service model in 2010: to gain more opportunities to improve its machine learning algorithms and to access major sources of data.
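That reject-and-adapt loop can be caricatured in a few lines. Grammarly’s real system is vastly more sophisticated; the class, rule names, and threshold below are all invented for illustration:

```python
from collections import defaultdict

class SuggestionModel:
    """Toy feedback loop: suggestions a user keeps rejecting are
    demoted for that user. Purely illustrative."""
    def __init__(self, reject_limit=2):
        self.rejections = defaultdict(int)
        self.reject_limit = reject_limit

    def record(self, rule, accepted):
        """Log whether the user accepted a suggestion from this rule."""
        if not accepted:
            self.rejections[rule] += 1

    def should_offer(self, rule):
        """Stop offering a rule once it has been rejected too often."""
        return self.rejections[rule] < self.reject_limit

model = SuggestionModel()
model.record("oxford_comma", accepted=False)
model.record("oxford_comma", accepted=False)
model.record("passive_voice", accepted=True)
print(model.should_offer("oxford_comma"))   # False: rejected twice
print(model.should_offer("passive_voice"))  # True
```

Aggregated across millions of users, this kind of signal is one of the "major sources of data" that a consumer service model provides.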

In 2017, investors including IVP, General Catalyst, and Spark Capital put $110 million into the profitable company, with the aim of enhancing its capabilities and adding more advanced features to benefit users. Since then, the company has made significant improvements to its software and is looking to expand into newer capabilities.

Grammarly’s software is constantly adding new checks, including ones that identify vagueness to help you write more precisely. Hopefully, the new funds will also enable the company to add more staff to continue improving the editing tool and its algorithms. At present, the tool tackles the basic mechanics of your writing – grammar, sentence structure, spelling, and inappropriate word use – and helps with the clarity and readability of a text. The next step is to offer content-specific suggestions.

Google Docs’ inbuilt grammar suggestion tool

Most of us are pretty familiar with Google Docs. It is one of the easiest ways for teams to share and edit content, and there are many ways Google can help you write better on a given topic. The good news? Grammarly now offers an extension for Google Docs for optimal grammar-checking.

However, Google wants to dominate the game by introducing a new grammar suggestions product, which also includes a language translation feature. With accuracy said to be on par with professional human translators and interpreters, this new addition can make a significant difference to the production of more globalized content. It will not only translate one language into another: a poorly written text can be translated into a corrected one in another language. Users can also review grammar errors, which the tool will highlight.

These developments are making significant changes in the writing industry. But that does not mean these tools are going to replace human writers anytime soon. As the Grammarly blog puts it, the company works with the objective of helping users express themselves in the best possible manner, whether for a job application or a simple text to a friend. The company wants to remain true to the image of an ideal communication assistant that helps users understand each other better, and with the right expression.


Interview with Pankaj Rai, Senior Vice President Strategy at Wells Fargo (12 Dec 2018)

The post Interview with Pankaj Rai, Senior Vice President Strategy at Wells Fargo appeared first on Big Data Made Simple.

]]>
We recently interviewed Pankaj Rai, a well-known expert in analytics and banking who is currently Senior Vice President, Strategy at Wells Fargo. The former Analytics Director at Dell International Services, Pankaj shared his views on some of the hottest topics in the financial and banking sector right now – digital transformation, innovation in fintech, new regulations, and blockchain, to name a few.

Read the complete interview below:

What does your day typically look like at Wells Fargo? How do you start and end your day?

(Laughs) My days are quite straightforward. I get up at 6:30 am, five days a week (I get up earlier on weekends for a run). At 7:45 am, I am in my car heading to work, and 7:45 to 8:30 (my travel time) is spent talking to someone – this could be old friends or people who have recently connected with me. So, my morning starts with talking to people. Then 8:30 am to 6:30 pm is typically my work time, filled with meetings, phone calls, brainstorming sessions, and thinking about what to do, how to do it, and getting things done. Late evenings are a mix of US calls and family time.

Financial institutions worldwide are exploring new technologies and business models that can help them compete in today’s digital age. From robo-advisors, open APIs, chatbots, and AI to blockchain technology, the financial industry is undergoing a radical transformation. Can you tell us the top innovations and trends to watch out for in 2019?

First of all, let me talk a bit about the digital transformation that you mention. Clearly, technology is entering every business – whether automobiles, finance, or any other industry – and disrupting and transforming it.

The future of banking will move away from the physical branch, and away even from a PC-based world to a user-centric, mobile-centric one. This makes it all the more important for organizations, big and small, to focus on user-centric design. Design thinking will become a skill set that everyone will have to pick up – whether it’s the process team developing the workflow, the UI/UX team building the interface, or the engineering team building the information infrastructure.

In markets including India, aggregators that help you manage your expenses (MoneyView), your credit cards (CRED), and your overall financial health will start becoming ubiquitous. We will also see the global expansion of some of India’s world-class app functionalities and features – we all know how YouTube Offline Download started as a feature for India’s price-sensitive data users. Similarly, Uber customized its app for the added safety of women passengers, a feature it is now rolling out in other countries. In fact, the Indian digital wallet Paytm is doing well in Canada and is one of the top finance/wallet apps there, because that market had never seen the concept of cashbacks until now.

So non-banking players – whether it is Paytm, Apple Pay, or similar services – are the ones going after new business opportunities in the financial services space, and they are impacting banks.

Today, everybody understands the power of data analytics and machine learning capabilities, in understanding the customer’s needs and building a holistic system to enable personalized and relevant interactions with the customers. However, few banks have embraced the holistic approach. Many traditional banks still operate hard-to-transform legacy systems. What are the key pain points and the reasons why banks still struggle with digital transformation?

Banking, at its core, is still a highly risk-averse industry. It is heavily regulated in all countries, and in the case of international banks, regulations span multiple countries. Dealing with regulation requires a lot of oversight, involvement, and effort, and the risk of going wrong is too high – over time, this leads to a tendency of ‘if it’s not broken, don’t try to fix it’. I believe it’s a cultural thing that’s unique to banks and common across the world, which is why you don’t see a lot of innovation coming from the banks themselves. Even today, there are banks out there whose systems are based on mainframe technology from the 1970s – those machines are not manufactured anymore, and the hardware can only be repaired by a handful of professionals. The world moved on long ago from the languages used to code those programs, but the situation remains. There are still banks so risk-averse that they will keep trying to use old tech that works, instead of proactively moving to the latest cutting-edge tech.

When we talk about customer-centricity being the future, the data arrives from the customer’s standpoint. Most of the innovative technology today, I would say, is connected to big data and analytics. However, data is a bit of a problem for most companies, let alone banks, and the reasons are several. One of them is that when the IT systems were designed for most companies, including banks, I don’t think they really gave a thought to the fact that data would become such an important element of the IT system – that you would have to make sure there is a data center, capabilities for data manipulation, and all of those things. The IT systems did not anticipate the big data revolution, and therefore they were not designed in a way that is adaptable to better data manipulation. So that is the technology side of it.

The other side of it is privacy regulations. Banks are much more careful about customer data. But the crux of the regulations now coming into effect, like the GDPR in Europe, is access to and control over data. Essentially, regulators and governing organizations are saying that the data is owned by the individual, not the company, and companies should exercise control so that no one can use the data for a purpose that has not been sanctioned by the data owners. That is a big deal. If you look at most data users today – companies like Google or Amazon – they are personalizing their services by using all the data, but they don’t always take all the necessary permissions to do so. As consumers, you and I have technically not given our permission to use that data in whatever way. We don’t realize it because we are getting a lot of great deals and no ‘real’ harm is coming our way. There are also some of us who do know, but don’t mind and don’t care. However, some of these companies have been heavily fined in Europe, as that continent is leading in data privacy regulation. So, more and more regulations are coming into effect that lean towards the source of the data, and towards not letting the custodians decide how it can be used.

For example, in certain banking cases I, as a customer in Europe, can say that you can use my data only for ad campaigns from your organization, or only for analysis, without sending me anything. And I can say I don’t want any AI model to make a business decision for me – you have to get a human operator to look at my data and tell me the relevant insights. I can choose to do that. Now, by doing that I will suffer, because I may not get certain benefits or services, simply because I chose to opt out. The regulators are giving users that authority to make a choice. So, I think consumer choice over the use of data is a very big deal, and companies did not anticipate how big data would be used and regulated. A lot of consumers have started selecting how their data may be used, and the usable data slowly becomes a biased sample. Therefore, the analysis you do on that data may or may not be complete or accurate.
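The biased-sample effect described above can be illustrated with a small hypothetical sketch (all records, field names, and consent flags below are invented for illustration): once analysis is restricted to opted-in records, the remaining subset may no longer resemble the full customer base.

```python
# Hypothetical illustration: consent-gated analytics can yield a biased sample.
customers = [
    {"id": 1, "age": 24, "consented": True},
    {"id": 2, "age": 31, "consented": False},
    {"id": 3, "age": 58, "consented": False},
    {"id": 4, "age": 27, "consented": True},
    {"id": 5, "age": 62, "consented": False},
]

def analyzable(records):
    """Only records whose owner opted in may feed the analysis."""
    return [r for r in records if r["consented"]]

full_avg = sum(c["age"] for c in customers) / len(customers)
subset = analyzable(customers)
subset_avg = sum(c["age"] for c in subset) / len(subset)

print(full_avg, subset_avg)  # the consented subset skews much younger here
```

In this toy data the opted-in customers happen to be the youngest, so any conclusion drawn from the analyzable subset silently misrepresents the full population – exactly the incompleteness problem described above.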

So, I am saying that a lot of unanticipated things are happening in the data world, thanks to innovators becoming more active and customers becoming more knowledgeable. How big data analysis and personalization will evolve remains to be seen.

I, for example, can see the personalized social media adverts on Facebook. But if I am given a choice, I would prefer not to see them, because I actually don’t need this level of personalization. I know what I need, and I know how to make that choice; I don’t need a company telling me what I need or don’t need. So, I don’t need targeted marketing – I am speaking from my own perspective. It’s a relief when data privacy regulations are passed. I think: okay, my data is mine, and I don’t need to worry about it leaking into the wrong hands. You can apply this to any model. This is how I think it should be, and lately more and more people have started thinking along similar lines. Some AI models and algorithms are thought to be very, very useful, but in reality, they might not be.

The other point to note is also more on the regulatory side, and I think it will become a part of how most industries function very soon: the concept of explainable AI. Today, many AI models, chatbots, and machine learning algorithms are based on mathematical models. A model shows an outcome, but one does not always know how or why that outcome was formed, mainly because of the large amounts of data being processed. Either way, the models come up with patterns that can analyze past events and predict future ones. But, at least in the banking industry, it is absolutely necessary to be able to ascertain why a particular result or decision occurred. For example, if a loan is denied, then the decision-maker must be able to explain that the decision was not based on bias around the applicant’s designation, gender, race, or other traits. You need to be able to explain this to auditors. That kind of explanation is something new that is coming about, and some of those things cannot yet be explained.
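One reason simple linear scoring models persist in regulated settings is exactly this auditability. Here is a hedged sketch (the weights, features, and threshold are invented for illustration, not any bank’s actual model) of how a linear model makes each feature’s contribution to a loan decision directly readable:

```python
# Hypothetical linear loan-scoring model; all weights and features are invented.
weights = {"income_k": 0.8, "debt_ratio": -1.5, "years_employed": 0.3}
bias = -20.0  # base score before any applicant data is considered

def score_with_explanation(applicant):
    """Return the decision plus each feature's additive contribution."""
    contributions = {f: w * applicant[f] for f, w in weights.items()}
    total = bias + sum(contributions.values())
    decision = "approve" if total >= 0 else "deny"
    return decision, total, contributions

applicant = {"income_k": 30, "debt_ratio": 4.0, "years_employed": 2}
decision, total, why = score_with_explanation(applicant)
print(decision, why)
# The per-feature breakdown shows the -6.0 from debt_ratio dragged the score
# below zero: that is the kind of explanation an auditor can follow.
```

A deep neural network trained on the same data might score more accurately, but it offers no such additive breakdown, which is the tension the interview describes.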

So, as I said before, there are a lot of new developments happening in the finance industry, and I think they will actually curtail the use of big data and AI. On the one hand, we are harnessing the power of data, and more and more data is being generated every day at an extremely fast pace. On the other hand, regulations and consumer awareness are going in a different direction – saying ‘I don’t want you to use the data, and if you do use it, I need an explanation.’ So, you cannot simply use a black-box model, because these models are sometimes quite counter-intuitive.

It is a very interesting time. There are two forces pulling in opposite directions, and it remains to be seen how they will play out. The only country where I think they are all more or less aligned is China, where there are no particular data regulations. Everywhere else in the developed and democratic world, things are happening in a completely different way.

According to McKinsey, origination and sales generate 65% of global banking profits, totaling about $1,152 billion in 2016. These profit drivers are threatened by increased competition from digital-only challenger banks and tech giants like Google and Amazon. Banks need to act urgently to get ahead in the digital sales game, or they risk becoming mere commodity providers with very thin margins in just a few years. What’s your take on this? How do you think banks should respond?

If you look at how banks make money, there are two ways. One is asset-based lending, in which you have the money and you give out loans. The other is that you perform a service with no risk to your assets – you are not putting money aside for the service. Not to mention that banks have a regulatory requirement for how much equity they must keep, and that equity is based on the risk of the assets they hold. This is the fundamental foundation of how a bank works.

If you look at the fintech upstarts, none of them are operating in the space with the highest amount of regulation – customer deposits. They operate in areas where their responsibilities are much lower, where they are not directly holding the customer’s money, and where there isn’t necessarily too much regulation.

Now, if you look at the parts of the system where you are not putting any assets or money aside, those are the areas that are very easy to disrupt, because a new player or a technology company does not need a lot of money or its own assets to run a business like that. Obviously, there is a connection between the two. One such area is payments and another is money transfers, and that is where the latest fintech innovation is aimed: a number of new players have come into play and are eating into banks’ profitability. Apart from these two, there are other areas like raising funds and investments. Before, there were investment banks that would say, ‘I will raise money for you and I will charge you a fee.’ Now there are digital crowd-sourcing platforms where I can go and raise funds on my own, thereby rendering the middleman obsolete. Beyond payment gateways and fundraising platforms, there are now robo-advisors for any individual looking for financial advice: simply plug financial data into an algorithm, and it can advise the user accordingly.

Now banks have also entered this field of fintech – it’s not as if banks are sitting idle. Banks are self-innovating. They will also need to partner with the surrounding technology ecosystems, because they cannot remain isolated and expect the same outcomes as before. The future game plan in almost every industry is to build an ecosystem where you can collaborate across verticals. So, given that banks have multiple points of relations with their customers, they are the one-stop shop for custom offers from the millions of travel, hospitality, dining, retail, and healthcare merchants. Many customers would rather deal with one merchant for a particular product or service than with 15 different vendors.

For example, you go to one grocery shop and buy all your vegetables; you may not want to go to different grocery stores to buy different things, because it is too much effort. I think it’s the same way for banks. They are offering very specialized offers to their customers. Looking at it through the grocery-store analogy, many customers would say, ‘I am not just shopping for vegetables or clothes – I’d rather have the entire package.’ Some fintech players will partner with department stores or malls, because those are superstores where everything is available.

Given regulatory sanction, these partnerships would be able to do several other things that fintech alone cannot do, as you can imagine. The fintech field is not regulated, so fintechs are unable to handle the many tasks and transactions that need to be conducted under regulation.

So, I think the future will belong to collaboration. Banks need to learn how to collaborate, which I think has given rise to a whole revolution of open APIs. In a world where you have to collaborate, APIs are a good mechanism to enable it. People who can creatively find collaboration opportunities and enable them can split the income between the collaborators. But since banks have usually had a closed-wall approach to outside parties, they will need to figure out how to compartmentalize their fintech efforts by creating some mechanism to separate them from the primary bank and its culture.

Apart from that, banks have to face reality – all human endeavor moves towards where the rewards are higher. If banks have been enjoying high profit margins, it is only a matter of time before other players enter to take a piece of the pie. Technology has been the greatest leveler we’ve seen in the last 50 years, so it’s no surprise that Google and Amazon, both digital-born world leaders, are the ones driving that charge.

My next question is about blockchain technology. It is currently the hottest innovation in banking. The technology, which underpins cryptocurrencies such as bitcoin, was initially treated with skepticism by banks. However, this is changing dramatically. What’s your take on this?

You might have heard the philosophical argument for blockchain: it is based on the concept of establishing trust without a central authority, and that concept can be extended to anything related to property or finance – even commerce. Therefore, the context for blockchain is a good one. Collaborators disintermediate the central authority, and the benefits of this disintermediation are few but valuable. First, the intermediate authority sometimes extracts rent, which you can get away from – you don’t have to pay any rent to a central authority. And secondly, sometimes the central authority can become too powerful and may not act in the best interest of all its constituents. So the benefit of blockchain technology is that a group of people can self-regulate and determine what is good for them.
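The ‘trust without a central authority’ idea can be sketched in a few lines. This is a toy hash chain, not a real blockchain (there is no consensus, mining, or networking): each block commits to the hash of the previous one, so any participant can verify the whole history without trusting an intermediary.

```python
import hashlib
import json

def block_hash(block):
    """Stable hash of a block's contents."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_block(chain, data):
    """Each new block commits to the previous block's hash."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev": prev, "data": data})

def verify(chain):
    """Anyone can check the links; no central authority is needed."""
    return all(chain[i]["prev"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain = []
append_block(chain, "Alice pays Bob 5")
append_block(chain, "Bob pays Carol 2")
print(verify(chain))   # True: the chain is internally consistent

chain[0]["data"] = "Alice pays Bob 500"
print(verify(chain))   # False: tampering breaks every later link
```

Rewriting any past block invalidates every link after it, which is why participants can detect tampering themselves instead of relying on a central record-keeper.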

Conceptually, blockchain is a good business case and it should be utilized. But depending on the context of the activity to which it is applied, it is yet to be seen how it plays out, because business is only one element of it – economic, political, and several other variables play out when you commercialize any technology. Just take bitcoin, a popular use case of blockchain technology. In my personal opinion, currency is an important political and economic weapon that countries use to exert control and to deal with other countries. Currency and trade activities are very, very critical issues of national importance.

So, I don’t think any country will allow Bitcoin, which is a disintermediated way of managing currency, because currency has a very big impact on the economy and the politics of a country. Therefore, it will always be regulated. And the moment you start to regulate it, it defeats its very purpose, because the main premise of blockchain was that there is no central authority to dictate regulations. So the current use case of blockchain is a difficult one. Maybe there are other creative ways of doing it, but beyond being an asset class that people can trade, it is unlikely to become the meaningful reality it was supposed to be – it is unlikely to become mainstream. That’s my personal view.

However, on the other hand, there are many other use cases where a peer group comes together to validate some of its activities – matters of no great national, economic, or political importance that are purely about reducing friction in our transactions with each other. Those kinds of use cases, I think, will make our systems far more efficient and will allow many things to happen that are not happening right now because the technology is not there. Every new technology, while it eliminates certain old ways of doing things, also creates new opportunity. Many times, people get worried about new technology; they think it will kill the existing scheme of things and they will no longer be profitable. We have seen this in the past, and it has definitely happened, but many new opportunities have also come, and human ingenuity and innovation have created newer opportunities for people to create value and therefore make money. That’s my personal view on blockchain.

If we look at cryptocurrency, 2018 has been a year of expectation reset for the community. No longer is crypto being looked at as a replacement for fiat, but as playing a complementary role. The good thing is that the bear market has dried out a lot of the stupid money that was being tapped by ICOs, many of which were nothing more than pipe dreams. The good, solid projects have a lot of work happening behind the scenes, and 2019 should throw up some real-world use cases that will see better adoption. There are projects out there that enable you to convert any crypto to any other crypto in real time and on the fly, so if you’re trying to make a payment using Ethereum but the merchant accepts only Bitcoin, that transaction can still happen. Similarly, people are developing web-based protocols that work inside your browser, so if you want to read an article, access content behind a paywall for two hours, or remove annoying pop-up ads, all of this will be possible through some upcoming protocols and integrations. The rise in cryptocurrency prices led to a spike in transaction charges, so much so that micro-transactions became unfeasible (you wouldn’t pay $20 in transaction fees to spend $3 on your coffee). However, with the upcoming ability to use off-chain transactions, micro-transactions will benefit the most.

My next question was about GDPR and new regulators, but you have already covered a lot about them. So, I am not going to ask that. I would like to ask what would be the future of the financial industry in the next five years?

Actually, five years is a very short time in the financial industry. Like I said, even with large companies and regulations, things change much more slowly than we would like or imagine. But still, if I were to look forward five years, the answer remains the same: all those financial companies will remain. So, I don’t think anything will change substantially. Fintech will remain. What I imagine five years from now is that we will have better and newer business models of ecosystem play that do not exist today.

Today my bank has all the information about me. If I have to deal with a fintech, I have to establish a separate relationship with them and do everything separately. If you look at the PSD regulations that have come in Europe, and at future regulations – this is my personal observation – Europe leads the way. The Europeans are more thoughtful about how regulations should evolve, and the Americans are forced to follow. The PSD regulations ask all the banks to open up their data for the customers.

So, my worldview five years from now is a very API-driven system, where the ecosystem helps you deal with things versus the bilateral relationships we have today with banks and fintechs. At that point in time, in some cases I might be dealing with a fintech that has access to my bank data, and in other cases I might be dealing with my bank itself for a certain service; internally they will have arrangements to distribute the fees, because at the backend they might be helping each other with certain services. And I might also get a fee. So, I would say it will become a big ecosystem play, not just in fintech – I think retailers will also start becoming an important part of this. Because if you look at customers, at the end of the day they love to spend money on retail and many other places, and I think those retail companies will find a way of getting into payments. Banks will have to figure out ways to collaborate with them to meet their customers’ requirements to send money and also get financing. So, I imagine it as a dynamic ecosystem powered by APIs.

We also need to recognize GDPR and similar regulations for what they are – they are a reaction to the ever-increasing reach of digital-only companies when it comes to acquisition, possession, and usage of user-related data, as deemed fit by those companies. The EU is leading the charge in the pushback against such companies, by sending a message that ‘with increased data, comes increased responsibility’. At the end of the day, GDPR is a move in the right direction, as it’s right for the customers. And the organizations that are smart will realize that it is in their best interests to align to what is right for the customers. And if they don’t, then someone else will. The EU, with GDPR, is the equivalent of the referee entering the field of a free-for-all soccer match in which the team of companies had the upper hand and were completely dominating the team of customers.

That being said, there will be some countries where the risk of localization might be higher due to back-door channels/requests that can come in from governments in power. Organizations will need to be careful in how they navigate these risks, because customers will be more unforgiving and harsh on organizations than on governments if their trust/security gets breached. In situations like these, it is important and admirable to do what Apple has done – put their customers’ priorities front and center, fight back on government overreach that requires backdoor access to user data, and refuse to dumb down the level of encryption/security to cater to governmental demands (irrespective of the justification).

The post Interview with Pankaj Rai, Senior Vice President Strategy at Wells Fargo appeared first on Big Data Made Simple.

]]>
https://bigdata-madesimple.com/interview-with-pankaj-rai-senior-vice-president-strategy-at-wells-fargo/feed/ 0
Top 10 cloud security training resources! https://bigdata-madesimple.com/top-10-cloud-security-training-resources/ https://bigdata-madesimple.com/top-10-cloud-security-training-resources/#respond Wed, 12 Dec 2018 11:52:36 +0000 https://bigdata-madesimple.com/?p=32494 Cloud computing skills are in high demand right now. However, on the down-side, statistics have shown that there is a shortage of cybersecurity professionals. Today, 40 percent of businesses are looking for ways to find employees who are qualified in this kind of computing.  By 2019, there will be 1.5 million job openings and the … Continue reading Top 10 cloud security training resources!

The post Top 10 cloud security training resources! appeared first on Big Data Made Simple.

]]>
Cloud computing skills are in high demand right now. On the down side, however, statistics have shown that there is a shortage of cybersecurity professionals. Today, 40 percent of businesses are looking for ways to find employees who are qualified in cloud computing. By 2019, there will be 1.5 million job openings, and the talent market is not expected to catch up to this demand. That doesn’t stop educators and organizations from working to bridge the gap, however. Thousands of quality resources are available to help both established engineers and budding techies get the education and skill sets they need to master cloud computing and security.

Cloud migration is widespread and is boosting investment from enterprises. This is an excellent chance for people who want to improve their knowledge of cloud security to get resources that can help them be more competitive in their jobs. Currently, there are a large number of certifications, and even more are being launched each month. You must make sure that you choose the right resources to help you achieve success. This list should make that choice a little bit easier for you.

1. CompTIA Cloud+

CompTIA is one of the top suppliers of vendor-neutral certifications worldwide and has been in existence since 1993. Alongside its long-standing industry certifications like A+ and Network+, CompTIA also has two cloud computing certifications: Cloud Essentials and Cloud+.

The CompTIA Cloud+ is a robust entry-level certification that gives professionals with at least two years’ experience working with networking systems, data centers, and storage the education they need. It should not be confused with CompTIA’s Cloud Essentials, which is designed to introduce the cloud to non-IT professionals.

If you pass the Cloud+ certification exams, you may be sure that you possess sufficient knowledge of virtualization, cloud models, security, infrastructure, business continuity, and resource management. With Cloud+, you can top the list of candidates, because it shows you can pick the right cloud technology for your employer, and it sets you apart from professionals who have done only the basic CompTIA certifications. If you have the Network+, Server+, and Security+ certifications, the next logical step is to get the Cloud+ certification.

The good news is that you will not need any prior CompTIA certification to acquire the Cloud+.

2. (ISC)2 CCSP

If you are thinking of migrating an entire business to the cloud, there are certain security flaws and resource-intensive projects that you should pay attention to. The market cap of such businesses is expected to be about $10 billion in 2019, so you should ensure that your business has all the necessary security measures in place.

By becoming a Certified Cloud Security Professional, you can be sure that you have relevant and effective data security skills. You can get the necessary training from non-commercial organizations like (ISC)2 or the Cloud Security Alliance, where a team of experts will teach you everything you need to become a security professional yourself.

The platforms provide the knowledge you need to become a CCSP through the necessary tests and exams, which are aimed at checking your competencies in cloud application security and other data infrastructure security. The courses will also teach you to focus on legality, compliance, privacy, and auditing processes, which are more relevant than ever now that GDPR has come into force.

3. AWS certified DevOps Engineer

This certification teaches you how to operate, provision, and manage distributed systems using AWS. It concentrates on the DevOps end of things, including continuous deployment, testing, and delivery. It additionally covers governance, controls, and compliance validation. The Ops side is all about logging systems, monitoring, and metrics, in addition to maintaining an operational system.

4. MCSE: Private Cloud

This course deals with Windows Server and other System Center-focused studies. It is excellent for people who have already completed the Microsoft Certified Solutions Associate certification, and if you are considering a career path in Microsoft technology, it is a great course to start with. Note that it is the only cloud-labeled certification offered by Microsoft, and keep in mind that it also touches on Azure web services.

5. Cloud Academy

If you don’t have the time or desire to attend a traditional college or university, this online resource is perfect for you. The website offers both AWS and non-AWS oriented online training, including security training for all the AWS certifications outlined above as well as basic courses covering subjects like OpenStack, Docker, and DevOps. It is an excellent resource for learning both the basics and the technology-specific details of cloud computing.

6. Practice tests

If you really want to be competitive in the industry, you must practice what you learn. A great way to test your knowledge is to take practice tests, such as those in the intrinsic Quiz Library, whose sample questions will help you see just how well you memorize and apply information.

7. IBM Interconnect

If you really want to learn about technology today, you must listen to what industry experts discuss at mobile and cloud-focused conferences. The IBM conference is a well-attended educational meeting where you can easily learn from both attendees and speakers. The content is great, and IBM is more open about allowing its employees to look at the best of the cloud security world than other vendors in the same area.

8. Google cloud next

This is a cloud-focused conference from a world-famous company, and you will learn a lot more than just Google’s cloud technology. It includes cloud security training for development, data processing, containers, storage, and other related topics. Google conferences tend to be focused on developers, and this one is no exception.

9. Excellent books for cloud computing pros

Well, if you do not want to attend conferences and have not started any classes yet, consider getting some excellent books on cloud computing services. Books like Amazon Web Services in Action by Andreas Wittig and Michael Wittig give business people a detailed introduction to moving their infrastructure to AWS, and help developers and admins get started down this path. The book explains the processes involved with both text and diagrams.

Another excellent book, Architecting the Cloud by Mike Kavis, gives beginners a great foundation in building cloud-based systems. It is a great read for developers and anyone interested in the technical details.

10. Cloud-focused blogs

One of the most popular sources of information is blogs. Numerous blogs, such as Lori MacVittie’s blog and Cloud Computing Log, offer much of the information you need about this type of computing. Most of these blogs are written by professionals in the field, so you can be confident that the information you find there will be invaluable in your daily operations. They also keep you on your toes and help you make use of the architectural patterns available.

Conclusion

With the resources in this list, you can be sure you will learn fast and have enough information to remain competitive in this digital age. However, remember that the path to learning cloud security is not limited to these resources; the best way is to choose your own path and follow it until you succeed.

The post Top 10 cloud security training resources! appeared first on Big Data Made Simple.

]]>
https://bigdata-madesimple.com/top-10-cloud-security-training-resources/feed/ 0
Data analytics use cases: How big data can help form better governments https://bigdata-madesimple.com/data-analytics-use-cases-how-big-data-can-help-form-better-governments/ https://bigdata-madesimple.com/data-analytics-use-cases-how-big-data-can-help-form-better-governments/#respond Tue, 11 Dec 2018 11:39:40 +0000 https://bigdata-madesimple.com/?p=32467 The best thing about big data analytics is its adaptability and widespread application. It can be (and has been) applied to a divergent number of industries. Even outside of these commercial industries, there is a number of use-cases and advantages for big data. For example, it can have a huge impact on the government sector. … Continue reading Data analytics use cases: How big data can help form better governments

The post Data analytics use cases: How big data can help form better governments appeared first on Big Data Made Simple.

]]>
The best thing about big data analytics is its adaptability and widespread applicability. It can be (and has been) applied to a diverse range of industries. Even outside these commercial industries, there are many use cases and advantages for big data. For example, it can have a huge impact on the government sector, both at the national level and on a global scale.

Today, with the current state of affairs, government organizations face a significant number of complex issues. Officials need to comprehend all the information they receive from various data sources – media, official reports, surveys, government records, etc. – and then make critical decisions that can influence the lives of huge numbers of individuals. In addition, we all know how difficult it is to filter through this mountain of information, not to mention the task of verifying its accuracy. Flawed and false information can have severe consequences. So what is the solution?

The answer is a big data management platform that can clean, filter, and analyze the huge volume of records and reports. And given the variety of data generated every day, the constructive outcomes it can enable are almost unending.

Such a model of analysis is essential to governance. It enables governing bodies to pinpoint the areas that require special attention, and the outcomes become additional data that inform future use cases. In this continuous cycle of data generation and actionable insight, constant examination becomes fundamental. Governments gain the ability to make quicker decisions and to monitor those decisions, rapidly enacting changes if necessary. If you’re wondering exactly which areas governments can improve by analyzing large data sets, here are some examples.

Education

Education is a basic need for constant growth and progress in society, and how to improve education and training has always been a hotly debated issue across nations. Big data enables customized programs in which hundreds of thousands of students benefit from blended learning, a combination of online and offline instruction. It can also track the records of students who drop out, which helps generate awareness about education. And it becomes a crucial tool for targeted international recruiting, where information about any number of past students is easy to retrieve.

Farming

A majority of countries derive a significant share of their GDP from the farming sector, which includes not only food grains but also land and livestock around the globe. In agriculture, it is necessary to monitor cattle and land fertility to estimate future needs, and maintaining records at this immense scale is difficult: the information changes every moment as governments oversee and support farmers and their assets. Big data is expected to have a special effect on smart farming, where smart sensors and devices produce large amounts of data across the whole supply chain for enhanced decision-making.

Healthcare

Healthcare is one of the most important areas for any government and requires special attention. A huge amount of data is generated in this area every day, and new records are constantly added, which makes the data troublesome to track. To overcome this, big data can be used to keep patients’ health records online, so that instead of creating a new record each time, providers can pick up from the existing data. This reduces complexity and enhances work performance.

National security

Countries can utilize advanced analytics to improve national security. Even though threats and risks are unpredictable, AI models that monitor and analyze unusual civilian behavior can predict the potential threats most likely to occur. Security agencies and police can analyze data such as security camera feeds, keeping a sharp eye on information gathered from separate sources so they can respond quickly to crime, attacks, and other hazardous situations in the country.

Social and economic policies

With years of gathered information, governments can make better decisions for the coming years. Government agencies hold a significant amount of data generated from surveys, welfare programs, and public banks. By implementing predictive analytics, governments can come up with smarter, citizen-focused policies. This offers a wide range of benefits to individuals and can improve the overall growth of the country.

To conclude…

Big data technology thus provides many benefits in the government sector. It reduces time and effort by enabling quick responses, and it makes data maintenance easy and useful. It gives governments the tools to draft the future changes that a massive population in any country needs, and it offers safeguards for the serious security concerns involved. We hope to see better governance in the coming years through the implementation of big data technology.

The post Data analytics use cases: How big data can help form better governments appeared first on Big Data Made Simple.

]]>
https://bigdata-madesimple.com/data-analytics-use-cases-how-big-data-can-help-form-better-governments/feed/ 0
Learning Scala Spark basics using spark shell in local https://bigdata-madesimple.com/learning-scala-spark-basics-using-spark-shell-in-local/ https://bigdata-madesimple.com/learning-scala-spark-basics-using-spark-shell-in-local/#respond Mon, 10 Dec 2018 10:29:20 +0000 https://bigdata-madesimple.com/?p=32396 Apache Spark™ is a unified analytics engine for large-scale data processing. It can be used for a variety of things like big data processing, machine learning, stream processing and etc., Environment I had tested it in Mac OS and Ubuntu. May or may not work on Windows. Prerequisites Java 8 installed and available as java in command line. … Continue reading Learning Scala Spark basics using spark shell in local

The post Learning Scala Spark basics using spark shell in local appeared first on Big Data Made Simple.

]]>
Apache Spark™ is a unified analytics engine for large-scale data processing. It can be used for a variety of things, like big data processing, machine learning, and stream processing.

Environment

I have tested this on macOS and Ubuntu; it may or may not work on Windows.

Prerequisites

Java 8 installed and available as java in command line.

Set up

Download

  • Download the Spark package from the Apache Spark website. The download link points to Spark 2.2.0, which I used; you can choose the latest version from the website.
  • Download the movies dataset from GroupLens. The download link points to a small dataset; you can choose a larger one if you have the infrastructure.

Unpack

Move both files to the same directory.

unzip ml-latest-small.zip
tar -xvzf spark-2.2.0-bin-hadoop2.7.tgz

Spark Shell

Move into the spark extracted directory.

cd spark-2.2.0-bin-hadoop2.7
./bin/spark-shell

It will take a few seconds to boot up; be patient. Once it is up, you will see:

Spark context Web UI available at http://172.16.3.139:4040
Spark context available as 'sc' (master = local[*], app id = local-1543057723300).
Spark session available as 'spark'.
Welcome to  
  ____              __  
 / __/__  ___ _____/ /__
 _\ \/ _ \/ _ `/ __/  '_/
/___/ .__/\_,_/_/ /_/\_\   version 2.2.0
   /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_161)
Type in expressions to have them evaluated.
Type :help for more information.
scala>

You can see the scala> prompt. We will be entering commands in this prompt.

Define data loading function

def loadDF(filepath:String) : org.apache.spark.sql.DataFrame 
= spark.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load(filepath)

You will see the following console output:

loadDF: (filepath: String)org.apache.spark.sql.DataFrame

Load Data into Dataframe

A data frame is similar to a table in the SQL world.

val moviesFile = "../ml-latest-small/movies.csv"
val ratingsFile = "../ml-latest-small/ratings.csv"
val tagsFile = "../ml-latest-small/tags.csv"

val moviesDF = loadDF(moviesFile)

// You can load other files into dataframe as well.

Let’s start the journey

Now we have moviesDF which loaded the data from movies.csv which we had downloaded earlier.
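
Since a data frame behaves like an SQL table, you can also register it as a temporary view and query it with SQL. A quick sketch (the view name movies is our own choice):

moviesDF.createOrReplaceTempView("movies")
spark.sql("SELECT title FROM movies WHERE movieId = 1").show

This runs the same kind of query as the data frame API calls below, just expressed in SQL.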

Refer to this documentation for various functions available in the data frame.

Number of records

It's simple: just call the .count method. In Scala, parentheses are optional for parameterless methods, so .count works without ().

moviesDF.count

It will show the number of records as follows

res1: Long = 9742

Schema of the Dataframe

moviesDF.printSchema

The output will show field names and data types.

root
|-- movieId: integer (nullable = true)
|-- title: string (nullable = true)
|-- genres: string (nullable = true)

Data frames support complex data types such as map, array, and struct, so schemas can be nested as well.
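
As a quick sketch of nesting, you can pack existing columns into a struct (the column name movie below is our own choice):

moviesDF.select(struct($"movieId", $"title").alias("movie")).printSchema

The printed schema shows movie as a struct with nested movieId and title fields.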

Show records from the data frame

moviesDF.show

By default, it shows the top 20 records in the data frame.

+-------+--------------------+--------------------+
|movieId|               title|              genres|
+-------+--------------------+--------------------+
|      1|    Toy Story (1995)|Adventure|Animati...|
|      2|      Jumanji (1995)|Adventure|Childre...|
|      3|Grumpier Old Men ...|      Comedy|Romance|
|      4|Waiting to Exhale...|Comedy|Drama|Romance|
|      5|Father of the Bri...|              Comedy|
|      6|         Heat (1995)|Action|Crime|Thri...|
|      7|      Sabrina (1995)|      Comedy|Romance|
|      8| Tom and Huck (1995)|  Adventure|Children|
|      9| Sudden Death (1995)|              Action|
|     10|    GoldenEye (1995)|Action|Adventure|...|
|     11|American Presiden...|Comedy|Drama|Romance|
|     12|Dracula: Dead and...|       Comedy|Horror|
|     13|        Balto (1995)|Adventure|Animati...|
|     14|        Nixon (1995)|               Drama|
|     15|Cutthroat Island ...|Action|Adventure|...|
|     16|       Casino (1995)|         Crime|Drama|
|     17|Sense and Sensibi...|       Drama|Romance|
|     18|   Four Rooms (1995)|              Comedy|
|     19|Ace Ventura: When...|              Comedy|
|     20|  Money Train (1995)|Action|Comedy|Cri...|
+-------+--------------------+--------------------+
only showing top 20 rows

We can also specify the number of records to show:

moviesDF.show(4)

Creating new data frames

Operations such as select on a data frame create another data frame.

val movieTitlesDF = moviesDF.select($"title")

$"title" means the column named title. select also accepts plain strings such as "title", but a single call cannot mix Column and String arguments.
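
A small sketch of the two forms; each works on its own, but they cannot be combined in one select call:

moviesDF.select($"title", $"genres").show(2)  // Column syntax
moviesDF.select("title", "genres").show(2)    // String syntax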

Console output,

movieTitlesDF: org.apache.spark.sql.DataFrame = [title: string]

Now we have created a new data frame called movieTitlesDF from the moviesDF

movieTitlesDF.show(3)
+--------------------+
|               title|
+--------------------+
|    Toy Story (1995)|
|      Jumanji (1995)|
|Grumpier Old Men ...|
+--------------------+
only showing top 3 rows

Creating new fields

In machine learning applications, feature engineering often involves deriving new fields.

Let's create a new data frame with the length of each movie title.

val titleDF = movieTitlesDF.select($"title", length($"title").alias("length"))

titleDF.show(2)
+----------------+------+
|           title|length|
+----------------+------+
|Toy Story (1995)|    16|
|  Jumanji (1995)|    14|
+----------------+------+
only showing top 2 rows

Let's find the movie with the longest title:

titleDF.sort($"length".desc).head

res2: org.apache.spark.sql.Row = [Dragon Ball Z the Movie: The World's Strongest (a.k.a. Dragon Ball Z: The Strongest Guy in The World) (Doragon bôru Z: Kono yo de ichiban tsuyoi yatsu) (1990),158]

So the movie is Dragon Ball Z the Movie: The World's Strongest (a.k.a. Dragon Ball Z: The Strongest Guy in The World) (Doragon bôru Z: Kono yo de ichiban tsuyoi yatsu) (1990), whose title is 158 characters long.

We used sort to order the rows in the data frame by length in descending order.

Let’s do pattern matching.

Find movies which have a (year) in the title.

titleDF.where($"title".rlike("[0-9]{4}")).count
res3: Long = 9730

So, of the 9742 movies, 9730 titles contain a four-digit year, which leaves 12 movies without a year in the title.
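
To see those 12 titles, one sketch of the inverse query is to negate the condition with !:

titleDF.where(!$"title".rlike("[0-9]{4}")).show(false)

Passing false to show disables truncation, so the full titles are printed.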

Let’s work with Array

If you looked at the moviesDF.show output closely, the genres column holds multiple values separated by |.

First, we will split the genres column on | to turn it into an array<string>.

val genresDF = moviesDF.select(split($"genres","\\|").alias("genres"))
genresDF: org.apache.spark.sql.DataFrame = [genres: array<string>]

genresDF.show(2)
+--------------------+
|              genres|
+--------------------+
|[Adventure, Anima...|
|[Adventure, Child...|
+--------------------+
only showing top 2 rows

As you can see, genres is an array now.

Find out top 3 genres in our records

Next, we will explode the array and count the occurrences of each genre.

genresDF.select(explode($"genres").alias("genre")).groupBy($"genre").count().show(3)

+--------+-----+
|   genre|count|
+--------+-----+
|   Crime| 1199|
| Romance| 1596|
|Thriller| 1894|
+--------+-----+
only showing top 3 rows

Let’s find the top 3 movies with the most tags

val tagsDF = loadDF(tagsFile)
tagsDF.groupBy($"movieId").count.sort($"count".desc).limit(3).
join(moviesDF, "movieId").select("title","count").show()
+--------------------+-----+
|               title|count|
+--------------------+-----+
| Pulp Fiction (1994)|  181|
|   Fight Club (1999)|   54|
|2001: A Space Ody...|   41|
+--------------------+-----+

That is not all

We have gone through some basic functions available in data frames.

  • count
  • printSchema
  • show
  • select
  • length
  • alias
  • sort, desc
  • split
  • rlike
  • where
  • head
  • groupBy

Spark data frames support many more operations and functions.

Enjoy learning Spark

Explore more and play with the data frame. You can use ratings.csv as well.
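
As one possible sketch of a next step (the aliases avgRating and n below are our own choices), you could join ratings with movies to find the three highest-rated movies that have at least 100 ratings:

val ratingsDF = loadDF(ratingsFile)
ratingsDF.groupBy($"movieId").
  agg(avg($"rating").alias("avgRating"), count($"rating").alias("n")).
  where($"n" >= 100).
  sort($"avgRating".desc).limit(3).
  join(moviesDF, "movieId").select("title", "avgRating").show(false)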

References

This post originally appeared here. Republished with permission.

The post Learning Scala Spark basics using spark shell in local appeared first on Big Data Made Simple.

]]>
https://bigdata-madesimple.com/learning-scala-spark-basics-using-spark-shell-in-local/feed/ 0
Data tends to lie a lot. How do you circumvent this? Interview with Suresh Shankar https://bigdata-madesimple.com/data-tends-to-lie-a-lot-how-do-you-circumvent-this-interview-with-suresh-shankar/ https://bigdata-madesimple.com/data-tends-to-lie-a-lot-how-do-you-circumvent-this-interview-with-suresh-shankar/#respond Fri, 07 Dec 2018 08:23:07 +0000 https://bigdata-madesimple.com/?p=32384 Consumer behaviour cannot be so easily predicted from social media reactions like Facebook Likes and Twitter shares. There’s no real way to know. There’s a lot of noise out there. There’s a lot of data available, but it’s not powerful in prediction. “Let me give you an example. Let’s say I post something online about … Continue reading Data tends to lie a lot. How do you circumvent this? Interview with Suresh Shankar

The post Data tends to lie a lot. How do you circumvent this? Interview with Suresh Shankar appeared first on Big Data Made Simple.

]]>
Consumer behaviour cannot be so easily predicted from social media reactions like Facebook Likes and Twitter shares. There is plenty of data available, but much of it is noise with little predictive power.

“Let me give you an example. Let’s say I post something online about a Ferrari and you “Like” it. Are you liking Ferrari or are you liking your post? There’s no real way to know. There’s a lot of noise out there. There’s a lot of data available, but it’s not powerful in prediction,” said Suresh Shankar, founder of Crayon Data, in an interview given to Kristie Neo of Deal Street Asia.

Founded in 2012, the Singapore-based Crayon Data has an AI algorithm which analyses consumer data to acquire customers for clients like Emirates Airlines, regional banks, and hotels. Crayon claims its algorithm is smart enough even to predict offline behaviour, a crucial aspect for Southeast Asia where a majority of transactions still take place off the grid.

Excerpts from the interview:

Suresh Shankar, founder, Crayon Data

Data tends to lie, a lot. So how do you circumvent this?

“This is one of the things that we’re training our algorithm to do. We’re basically saying that we’re living in an uncertain situation where at any given time, an enterprise may or may not have all the data about you. Even if they gave us data about you, there may not be data on the internet that I can match it to. Crayon is making the algorithm work extra hard in a model where you’re uncertain about how much data you have on each individual.”

Crayon spent three years being a big data analytics service and we realized that unless enterprises go digital, they’re not going to be able to leverage the power of our algorithm. We spent the last few years building the digital part of our Maya product and platform.

Today we tell our enterprise clients: give us your customers’ transaction data, and Maya will tell you all about your customers’ individual tastes – and also give you an API that enables you to create a taste-led digital storefront, one for each customer.

Here’s a personal example: I’ve been a customer of one bank for 17 years and hold 4 of its credit cards. This bank is still sending me offers to eat at steak restaurants (when I’m vegetarian!). You go to their portal and it’s got nothing to do with you – it’s not personalized at all. What we give them is an API that says: this is the page you have to give each customer, according to the way he or she shops, dines, and eats. The API can be used on the web, on mobile, in a bot, or as a widget on a webpage. That’s the digital storefront.

What led to the idea of Crayon?

What we believe is that consumers in the world have too much choice. If I want to pick a coffee shop, there are 15 of them out there. We believe that what people don’t have is time. People don’t have time to navigate all the confusing options out there.

It takes over 45 minutes to make a choice. You read about 6 to 12 reviews, visit 10 sites to make one decision – where to get good coffee. If it’s a complex process like planning a holiday, this could be much longer. But it doesn’t have to be this hard. That was the problem that we wanted to solve.

Our algorithm solves that problem for you: you can almost trust the smaller set of more relevant choices it presents. You save time. In a way, you almost surrender to the algorithm; it helps simplify your life in more ways than you can imagine, though there is a good and a bad side to such a surrender.

We want to build a platform for any consumer in the world, based on her personal tastes, to fundamentally be able to find the things that they want without spending the extra bit of time.

At Crayon Data, we put a very large emphasis on privacy. Our engines do not search individuals or track people on the Internet. We infer and decode your tastes and preferences based on what you do, as opposed to who you are.

It comes across to me that the space that you’re occupying is getting pretty crowded.

There are different forms of competition. First, there are the guys who sell the data. Then there are also the guys selling tools like SAP and Oracle. Finally, the digital guys who will provide the front-end personalization. The difference for us is that the moment you put the data into our platform, it does all of it for you. All it takes is three weeks. So, we’re a full stack solution.

I think client mindsets are changing slowly but surely. Today, some of the company CEOs we meet say, “We can hire ten people to do this, but we recognize that it will take a long time to build. Why not just partner with a specialist and get it out faster?” Most of the resistance tends to come from middle management.

Which sectors do you think are more receptive to Crayon today?

Crayon believes that airlines, hotels, and retailers will move faster. The reason is that their business models are being destroyed faster than in other industries. They’re under threat. They’re also living in a world where the internet has completely dominated their lives. They don’t know when someone like Google will take over everything. That’s where the big growth opportunity is for us.

We also see opportunities in the credit card, debit card business. Some players think everything is okay. But they’re beginning to realize that with Amazon Pay, Alipay, GoPay and GrabPay, everyone’s doing their own payment platform and wallet. The threat is not the payment revenues, but the invaluable data that each transaction carries about customer preferences. Lose that, and you lose ownership of the customer.

Would you consider partnering with e-wallets?

We’re working on that. One of the things we’re doing is going to these players and saying: you already have your own wallet and a set of services that you offer; we can help you quickly get into another vertical. Take Go-Jek: every month they’re adding a new service which you buy on the Go-Jek platform. We can say: why don’t you take over the travel business? I already do this for one of the world’s best airlines. Instead of building your own travel product, you can easily use my platform.

We also believe that the pure digital banks will begin to use more of us. We’re talking to a few in the US and one in the Middle East. We’re working with a completely new digital bank that just set up in the Middle East. We’re also working with one of the biggest banks in Yangon, Myanmar which is interesting. They want to move from a cash economy to a WeChat economy, where they want to skip all the intermediate steps and go directly to a mobile, digital model.

But most of the transactions in Southeast Asia are still offline. How is your product going to bridge this offline-online divide?

That’s what our algorithm aims to do. One thing we’ve learned is that if you’re trying to do something innovative that requires you to move the market, the markets themselves need time to adopt what you’re trying to do. This is especially so in a B2B context. It’s not just about my knowing what’s right, it’s also about convincing the market that this is right. Gartner predicted in 2016 that it will take two years for AI-led personalization to go mainstream, and it literally is happening as we speak, so we’re in the right place now.

Do you pay a significant financial penalty when it comes to data regulation and privacy?

In terms of GDPR, I would say everyone has been clamping down significantly on data privacy, with the exception of the US. The definitions that regulators use classify us as a data processor. This means we are not the enterprise that bears the financial penalty for the misuse of data. In other words, I’m not using the data myself; I’m using it on behalf of my clients. There are still consequences, but the main burden is with the owner of the customer.

The post Data tends to lie a lot. How do you circumvent this? Interview with Suresh Shankar appeared first on Big Data Made Simple.

]]>
https://bigdata-madesimple.com/data-tends-to-lie-a-lot-how-do-you-circumvent-this-interview-with-suresh-shankar/feed/ 0