Many companies have adopted scalable, flexible databases and open-source storage systems such as Hadoop to store and process vast and varied amounts of data. But storing and accessing data is only part of a data scientist's job. They also need to run calculations on that data at scale, and the available technology is not always sufficient.
Hence the rise of a concept called "big compute," which promises faster and more widely distributed crunching of data.
In a blog post published over the weekend, Michael Malak, an engineer working with data at Time Warner Cable and a board member of the Data Science Association, identified two specific hardware components that could help data scientists do big compute better: graphics-processing units (GPUs) and random-access memory (RAM).
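To make the appeal of this kind of hardware concrete, here is a minimal sketch of the idea behind big compute: the same aggregate computed one element at a time in plain Python versus as a single vectorized operation over data held in RAM. NumPy is used only as a stand-in illustration; GPU libraries extend the same pattern to thousands of parallel cores.

```python
# Illustrative sketch: element-by-element crunching vs. one vectorized
# pass over memory-resident data. (NumPy here is an assumption for
# illustration, not something named in the article.)
import numpy as np

rng = np.random.default_rng(0)
data = rng.random(1_000_000)  # one million samples held in RAM

# Scalar approach: the interpreter touches one element at a time.
scalar_sum = 0.0
for x in data:
    scalar_sum += x * x

# Vectorized approach: one call crunches the whole array at once,
# running in optimized native code over contiguous memory.
vector_sum = float(np.dot(data, data))

# Both approaches agree on the result; the vectorized pass is far faster.
print(abs(scalar_sum - vector_sum) / vector_sum < 1e-6)
```

The same contrast, scaled up, is the case for GPUs: once the data sits in fast memory, hardware that can apply one operation across many elements simultaneously turns a long sequential calculation into a short parallel one.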