Data Scientist Interview Questions

In a data scientist interview, expect employers to ask questions that assess your data modeling, problem-solving, and programming skills. Be prepared to answer general questions that test your knowledge of statistics and data science. You should also be ready to answer open-ended questions that test your creativity, communication skills, and formal education in data modeling and programming.

Top Data Scientist Interview Questions & How to Answer

Question #1: Which data modeling techniques do you prefer and why?

How to answer: Turning data into understandable, actionable information is a critical part of a data scientist's job. This question lets employers gauge your data modeling skills and background. Name your preferred data modeling techniques and explain what makes each one useful to you, such as ease of use or flexibility.

Question #2: How would you detect bogus Instagram accounts used for scamming consumers?

How to answer: Questions like this one let an employer test your problem-solving skills. When answering open-ended questions, feel free to ask clarifying questions and to use a whiteboard to demonstrate your coding and diagramming skills. Share your thought process as you work through the problem.

Question #3: Describe circumstances that require a list, tuple, or set in Python.

How to answer: Interviewers will use questions such as this one to test your Python programming skills. Review Python basics such as lists, tuples, and sets before your interview. You should be able to explain when and how each tool is used by data scientists.
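
As a quick refresher for Question #3, the following is a minimal sketch of one situation suited to each type. The variable names and sample values are made up for illustration and do not come from any particular interview.

```python
# list: ordered and mutable, duplicates allowed.
# Suits a sequence of observations that grows over time.
readings = [0.8, 1.2, 1.2]
readings.append(0.9)

# tuple: ordered and immutable, hashable when its elements are hashable.
# Suits a fixed record, and can be used as a dictionary key.
point = (40.7, -74.0)
visits_by_location = {point: 3}

# set: unordered, unique elements, fast membership tests.
# Suits deduplication and "have I seen this value?" checks.
labels = ["cat", "dog", "cat"]
unique_labels = set(labels)
is_known = "dog" in unique_labels
```

A list cannot serve as a dictionary key because it is mutable and therefore unhashable, which is one practical reason to reach for a tuple.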

54,212 data scientist interview questions shared by candidates

Data Scientist

Interviewed at Amazon

3.5
Apr 13, 2018

Write code in R or SQL: Given a table with three columns (id, category, value), where each id has three or fewer categories (price, size, color), how can I find the ids for which the values of two or more categories match one another? For example: ID1 (price 10, size M, color Red), ID2 (price 10, size L, color Red), ID3 (price 15, size L, color Red). The output should then be two rows: ID1 ID2 and ID2 ID3.
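
One possible approach to the question above is a self-join on category and value, counting matches per pair of ids. The sketch below runs that SQL through Python's built-in `sqlite3` module so it can be tried without a database server. The table name `items`, the lowercase category names, and the assumption of at most one row per (id, category) are illustrative choices, not part of the original question.

```python
import sqlite3

# Sample data from the question, one row per (id, category).
rows = [
    ("ID1", "price", "10"), ("ID1", "size", "M"), ("ID1", "color", "Red"),
    ("ID2", "price", "10"), ("ID2", "size", "L"), ("ID2", "color", "Red"),
    ("ID3", "price", "15"), ("ID3", "size", "L"), ("ID3", "color", "Red"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT, category TEXT, value TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)

# Self-join: pair up rows that share a category and a value.
# a.id < b.id keeps each pair once and excludes self-matches.
# HAVING keeps only pairs that agree on two or more categories.
query = """
SELECT a.id AS id_a, b.id AS id_b
FROM items AS a
JOIN items AS b
  ON a.category = b.category
 AND a.value = b.value
 AND a.id < b.id
GROUP BY a.id, b.id
HAVING COUNT(*) >= 2
ORDER BY a.id, b.id
"""
pairs = conn.execute(query).fetchall()
conn.close()

print(pairs)  # [('ID1', 'ID2'), ('ID2', 'ID3')]
```

ID1 and ID3 share only the color, so that pair is filtered out. If the table could hold duplicate rows for the same id and category, `COUNT(DISTINCT a.category)` would be the safer count.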

Data Scientist

Interviewed at Amazon

3.5
May 5, 2018

- What is overfitting? How do you avoid it?
- What types of regularization are there? Which is simpler to use, L1 or L2?
- Explain decision trees. What are the different metrics used to classify a dataset?
- What is bagging?
- We have two models, one with 85% accuracy and one with 82%. Which one do you pick?
- What is a p-value, and how can we use it?
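
The 85% versus 82% question and the p-value question can be connected: whether a three-point gap is meaningful depends on how many examples the accuracies were measured on. The sketch below uses a two-proportion z-test, which is one common choice and not necessarily the answer the interviewer expected. The test-set sizes (200 and 5,000) are assumptions, since the question gives none, and the test treats the two evaluations as independent samples. When both models are scored on the same test set, a paired test such as McNemar's is usually more appropriate.

```python
import math

def two_proportion_p_value(correct_a, correct_b, n_a, n_b):
    """Two-sided p-value for the difference between two accuracies.

    Assumes independent samples and a normal approximation.
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical test-set sizes; the original question does not state them.
p_small = two_proportion_p_value(170, 164, 200, 200)       # 85% vs 82% on 200 examples
p_large = two_proportion_p_value(4250, 4100, 5000, 5000)   # 85% vs 82% on 5,000 examples

print(f"n=200:  p = {p_small:.3f}")
print(f"n=5000: p = {p_large:.6f}")
```

Under these assumptions the gap is not statistically significant at the 0.05 level with 200 examples but is with 5,000. A fuller answer would also weigh class balance, the cost of each error type, and model complexity, not accuracy alone.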
