5 groups of Data Scientists: Which group are you in?

1st Jun '15, 02:17 PM in Data Science

Manu Jeevan, Contributor
A data scientist is someone who performs statistical analysis, data mining and retrieval processes on large amounts of data to identify trends, figures and other relevant information that help a business gain a competitive edge. But are all data scientists the same? Nope. They come from different backgrounds and attack problems from different angles. Thanks to Harlan D. Harris, Sean Patrick Murphy and Marck Vaisman for their introspective survey of data scientists. Based on that survey, they were able to identify five different groups of data scientists. Let’s go through each of them.

Data Business people

People like Kirthika are data scientists found in large organizations or running their own start-ups. In large organizations, they are mainly found in project management roles. They are great at dealing with other professionals and have a comprehensive knowledge of the data science process.

Data Creative

Data creatives are good at statistics, programming and big data technologies. They are a great asset for smaller companies, where flexibility is important. They are good at doing the day-to-day work of a data scientist.

Data Developer

A data developer’s day-to-day work involves getting data from different sources, storing it in large databases, querying those databases, and analyzing the results to derive meaningful information from them.

Data Researchers

They come from the academic world and have a strong background in statistics. They also tend to have PhDs. Business skills are not their strength, but they are excellent analysts.

Generic Data Scientists

Generic data scientists are similar to data business people but without the immense experience or the intense business focus. They are more balanced than the other four types of data scientists. They are flexible like data creatives, but with a better understanding of the business world. Generic data scientists are passionate about the field and have a T-shaped skill set.

Data science is a creative field that requires a professional to collaborate with various other people, such as database administrators, business people and software engineers, to complete a project. A data scientist should not only have a broad range of skills but also possess deep expertise in their area of specialization.

For example, a data scientist with a statistics background and deep skills in probability and descriptive and inferential statistics might find value in learning some machine learning algorithms and optimization techniques. The same data scientist should also possess sufficiently broad programming, big data, and business skills.

Skills of the other four types of data scientists (excluding generic data scientists)

Check out the two graphs below, which represent the skills of the four types of data scientists.

[Figures: two graphs showing the skills of the different types of data scientists.]

As you can see, Data Business people are the most likely to have primarily Business-related skills. Some Data Business people nonetheless have their strongest skill rankings in other areas, such as Statistics and ML/Big Data. Data Researchers are the most likely to have expertise in Statistics and Mathematics. Both Data Business people and Data Researchers were quite unlikely to rate Programming as their strongest skill. Data Creatives and Data Developers are likely to have expertise in Programming and Big Data, but Data Creatives are better at Statistics than Data Developers.

Conclusion

Which type of data scientist are you? To learn about your strengths and weaknesses as a data scientist, take this survey and see where you fit in. If you are an aspiring data scientist, I would recommend that you first concentrate on your core strength. If you come from a computer science background, develop your skills in Python, machine learning and big data, and become proficient in at least one of the lower-level languages (C, C++ or Java). If you have a business background, it is good to concentrate on statistics, business and data visualization.