• Samuel Marcik

What do Data Scientists do?

Updated: Jun 20, 2018

SearchTeam asked data science expert Dan Hnyk what Data Scientists actually do and how to enter the data science field.




What do you do as a Data Scientist?

  • cleaning the data and preparing it for analysis

  • analyzing data sets of various sizes from a data quality assurance perspective (missing values, errors, statistics)

  • visualizing the data (charts of distributions, tables, relations...)

  • modeling the data

  • transforming the data
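
As a rough illustration of the cleaning and quality-assurance steps above, here is a minimal sketch using Pandas. The sample data, the column names, the helper `quality_report` and the plausible age range of 0 to 120 are all assumptions made up for this example, not something from the interview.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real export; note the empty
# cells and the implausible age of 230.
RAW_CSV = """user_id,age,country,spend
1,34,CZ,120.5
2,,UK,80.0
3,230,CZ,
4,41,,45.25
"""


def quality_report(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize dtype, missing values and distinct values per column."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "missing": df.isna().sum(),
            "distinct": df.nunique(),
        }
    )


df = pd.read_csv(io.StringIO(RAW_CSV))
report = quality_report(df)
print(report)

# Flag values outside an assumed plausible range (0-120 is our assumption).
suspicious_age = df[(df["age"] < 0) | (df["age"] > 120)]
print(suspicious_age)

# Basic statistics for the numeric columns.
print(df.describe())

# A simple, explicit transformation: fill missing spend with the median.
# The original frame is left untouched so the step stays reproducible.
cleaned = df.assign(spend=df["spend"].fillna(df["spend"].median()))
print(cleaned)
```

Whether filling with the median is appropriate depends on the domain; the point is only that each transformation is explicit and easy to review.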


How can somebody become a Data Scientist?

  • learn an open source stack for working with tabular data: Python (Pandas library), which is preferred, or R; what else you need then depends on the size of the data sets and the domain

  • have strong domain knowledge of the problem (or have someone else who has it)

  • knowing maths and statistics helps


What are the traits of an excellent Data Scientist?

  • like every engineer: focuses on the product and knows what the goal is

  • their work is 100% reproducible and transparent

  • considers data sacred: never applies some weird transformation without thinking through possible side effects

  • is never dishonest about the analysis and never hides anything

  • never trusts intuition alone, always validates it against reality

  • is able to handle whatever input source of data they encounter


Do I have to know a programming language to become a Data Scientist?

  • yes, Python and R rule the world, in that order

  • Excel, SPSS, Tableau etc. sooner or later limit what you can do (automation, running on a headless Linux server, size of the data set, ...). Plus, they are expensive for no additional value


Which companies have Data Scientist positions?

  • tech companies, not surprisingly... Any company that collects some data and is big enough to hire a dedicated data scientist (more than 40 people is my humble guess). The data comes either from non-data products (user behavior, acquisition, various performance metrics, pricing models...) or from data products, such as data research (e.g. surveys), evaluating the performance of some processes, web analytics, ...


What is the best way to get an entry-level data science job?

  • know an open source stack well enough to do the job (so the company doesn't have to invest in you)

  • have hands-on experience with real data, e.g. through a Kaggle competition

  • if you can program (knowing Python is an advantage), it's usually much easier, since you can also act on the results of the analysis


What are the main challenges of a Data Science job?

  • having the data: real-world data sets are small, messy, full of empty values and errors, and lack a proper description or documentation. It's often more important to propose a better data acquisition mechanism and provide a simple analysis than to run a robust analysis on garbage data (there is a saying, "Garbage in, garbage out", and it's true). It's often not necessary to collect everything; a good subset to concentrate on is enough

  • predicting how long the work will take: it's just a dark art in DS. But all of us must meet deadlines.


What do you like most about the Data Science job?

  • its variety (no routine): no two data sets are the same, there are always surprises, you never know what's going to come

  • infinite number of approaches to solve a given problem

  • rapidly developing tech stack


What tools do Data Scientists use?

  • a PC/laptop with sufficient memory is all you need from a hardware perspective (depending on the size of the data set)

  • a programming stack of their choice

  • various input sources of the data/storage (e.g. SQL, CSV, HDF, Hadoop...)
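
To illustrate handling different input sources, here is a minimal sketch that loads the same kind of records from CSV text and from an SQL database into one common shape, using only the Python standard library. The table, the column names and the helpers `rows_from_csv` and `rows_from_sql` are hypothetical; an in-memory SQLite database stands in for a real SQL server.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; a real project would open a file instead.
CSV_TEXT = "city,visits\nPrague,10\nLondon,7\n"


def rows_from_csv(text: str) -> list[dict]:
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_from_sql(conn: sqlite3.Connection, query: str) -> list[dict]:
    """Run a query and return rows as dicts keyed by column name."""
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# In-memory SQLite database standing in for a real SQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (city TEXT, visits INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)", [("Prague", 10), ("London", 7)]
)

csv_rows = rows_from_csv(CSV_TEXT)
sql_rows = rows_from_sql(
    conn, "SELECT city, visits FROM visits ORDER BY visits DESC"
)
conn.close()

# CSV values arrive as strings, while SQL values keep their column types.
print(csv_rows[0])
print(sql_rows[0])
```

With Pandas, `pandas.read_csv` and `pandas.read_sql` play a similar role and return DataFrames directly.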


What, in your opinion, is the future of data science?

  • automation of data insights - data science as an automated service 


Are there any good data science courses? Are there any data science certifications?

  • yes, Coursera and Udacity offer great high-quality courses, and the choice depends on one's target area



© 2018 by SearchTeam