Big Data is a field where losing even a single millisecond per event can add up to a significant cost over billions of events. Yet languages often regarded as slow, such as Python, have gained a lot of popularity in the past year. Recent articles and discussions in the Big Data community have reignited the debate over which programming language to choose for data science and Big Data.
According to Ville Tuulos, principal engineer at AdRoll, the raw performance of a language doesn’t matter. Ville presented his findings at a meetup in San Francisco in September 2013, where he described AdRoll’s backend stack, built around Python, and showed how it is able to outperform giants like Amazon’s Redshift. The key is that AdRoll designed the system around its own very specific use case, which allowed the team to optimize for that case alone. As Ville says, “You can use a high-level language to quickly implement domain-specific solutions that outperform generic solutions, regardless of the language they use.”