Data Scientist Interview Questions

In a data scientist interview, expect employers to ask questions that assess your data modeling, problem-solving, and programming skills. Be prepared to answer general questions that test your knowledge of statistics and data science. You should also be ready to answer open-ended questions that test your creativity, communication skills, and formal education in data modeling and programming.

Top Data Scientist Interview Questions & How to Answer

Question #1: Which data modeling techniques do you prefer and why?

How to answer: Turning data into understandable and actionable information is a critical part of the data scientist's job. This question allows employers to understand your data modeling skills and background. List and discuss your preferred data modeling techniques, including benefits such as ease of use, flexibility, etc.

Question #2: How would you detect bogus Instagram accounts used for scamming consumers?

How to answer: Questions like this one allow an employer to test your problem-solving skills. When answering open-ended questions such as these, feel free to ask clarifying questions and use whiteboards to demonstrate your coding and diagramming skills. Share your thought process as you work through the problem.

Question #3: Describe circumstances that require a list, tuple, or set in Python.

How to answer: Interviewers will use questions such as this one to test your Python programming skills. Review Python basics such as lists, tuples, and sets before your interview. You should be able to explain when and how each tool is used by data scientists.
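
As a minimal sketch of how the three containers differ in practice, the example below uses made-up values (`page_views`, `location`, and `visits_by_location` are hypothetical names chosen for illustration, not part of the question).

```python
# Hypothetical data for illustration only.

# list: ordered and mutable, duplicates allowed. Suits a growing sequence of events.
page_views = ["home", "search", "home", "profile"]
page_views.append("search")

# tuple: fixed-size and immutable. Suits a record that should not change,
# and because it is hashable it can serve as a dictionary key.
location = (37.77, -122.42)
visits_by_location = {location: 3}

# set: unordered collection of unique elements with fast membership tests.
# Suits de-duplication and "have we seen this before?" checks.
unique_pages = set(page_views)
has_profile = "profile" in unique_pages
```

A reasonable rule of thumb to state in an interview: use a list when order and duplicates matter, a tuple when the values form a fixed record, and a set when only uniqueness or membership matters.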

54,195 data scientist interview questions shared by candidates

Data Scientist

Interviewed at Meta

3.6
Mar 29, 2017

How would you measure the health of Mentions, Facebook's app for celebrities? How can FB determine if it's worth it to keep using it? If a celebrity starts to use Mentions and begins interacting with their fans more, what part of the increase can be attributed to a celebrity using Mentions, and what part is just a celebrity wanting to get more involved in fan engagement?

Data Scientist Intern

Interviewed at Capital One

3.6
Oct 14, 2016

Case interview: the case is a car finance loan.
- What are the revenues and expenses?
- Given a model that predicts whether a customer is good (the loan should be approved) or bad (the loan should be declined), find:
1. the probability that the customer is good given that the model predicts good
2. the probability that the customer is bad given that the model predicts good
3. given a pentile graph of # of checked off loans / # of loans, which model is better than the current one, and which is the best model

Behavioral interview:
- Tell me about a time that you had to deal with changing objectives in your team/project.
- Tell me about a time that you had to deal with unexpected problems in your project.
- Tell me about a time that you had to persuade somebody.

Role interview: the case is a report on an airline with a low percentage of on-time flights. Read the report and give an evaluation of it and some recommendations to your boss. 15 minutes to read the report and remove anything unnecessary or spot errors. 20 minutes to present it to your boss. 15 minutes to discuss afterwards, data scientist to data scientist.
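
Items 1 and 2 of the case interview can be worked from a confusion matrix. The sketch below assumes item 2 means "given that the model predicts good", and the counts in the usage lines are invented for illustration; the function name `posterior_given_predicted_good` is hypothetical.

```python
def posterior_given_predicted_good(true_good, false_good):
    """Return (P(good | predicted good), P(bad | predicted good)).

    true_good:  customers who are actually good and were predicted good
    false_good: customers who are actually bad but were predicted good
    """
    predicted_good = true_good + false_good
    if predicted_good == 0:
        raise ValueError("the model never predicted 'good'")
    return true_good / predicted_good, false_good / predicted_good


# Hypothetical counts, not taken from the interview:
# of 1,000 loans the model approved, 900 were repaid and 100 were not.
p_good, p_bad = posterior_given_predicted_good(true_good=900, false_good=100)
```

The two probabilities always sum to 1, so answering item 1 answers item 2 as well. If the interviewer gives a base rate and the model's error rates instead of counts, the same quantities follow from Bayes' theorem.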

Data Scientist

Interviewed at Meta

3.6
Sep 27, 2017

A set of values is given: assume a table in SQL, or a list of dictionaries if using Python. Each row of data contains: whether it is a post or a comment, a row id, and some other data. Find the distribution of comments.

#comments  #posts
1          5000
2          6787
..         ..
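
One possible Python approach is sketched below. The question does not specify the schema, so the keys `id`, `type`, and `parent_id` (the id of the post a comment belongs to) are assumptions made for this example, as is the sample data.

```python
from collections import Counter


def comment_distribution(rows):
    """Map 'number of comments' -> 'number of posts with that many comments'.

    rows: list of dicts with assumed keys 'id', 'type' ('post' or 'comment'),
    and, for comments, 'parent_id' pointing at the post's id.
    """
    post_ids = [r["id"] for r in rows if r["type"] == "post"]
    # Step 1: count comments per post.
    comments_per_post = Counter(
        r["parent_id"] for r in rows if r["type"] == "comment"
    )
    # Step 2: count posts per comment count (posts with no comments count as 0).
    return Counter(comments_per_post[pid] for pid in post_ids)


# Hypothetical sample: post 1 has two comments, post 2 has one, post 3 has none.
rows = [
    {"id": 1, "type": "post"},
    {"id": 2, "type": "post"},
    {"id": 3, "type": "post"},
    {"id": 4, "type": "comment", "parent_id": 1},
    {"id": 5, "type": "comment", "parent_id": 1},
    {"id": 6, "type": "comment", "parent_id": 2},
]
distribution = comment_distribution(rows)
```

The same two-step shape carries over to SQL: a `GROUP BY` on the parent post to get comments per post, then a second `GROUP BY` on that count.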

Computer Scientist

Interviewed at Adobe

4.1
Sep 11, 2016

1. Given a BST consisting of n nodes and an array consisting of n numbers. The n nodes of the BST have already been arranged in some fashion (i.e., the tree structure is not empty), but none of the nodes hold any data, so we have to pick the n numbers from the given array and fill in the given BST. We have to make sure that the structure of the BST doesn't change: the left subtree and right subtree at any given node should not change at all.

2. We have a function which returns a value among {1, 0, -1}. When the function returns -1, we have to terminate. We have to keep calling this function until we get -1. This gives a series of 1's and 0's which we have to treat as a bit pattern and check whether the resulting number is divisible by 3. For example, the function calls return the output below: 101-1 => 101 => it's 5, which is not divisible by 3.
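
Both problems have compact solutions; the sketch below is one possible approach, not necessarily what the interviewer expected. For problem 1 it relies on the fact that an in-order traversal of a BST visits values in ascending order, so writing the sorted numbers in that order preserves the shape. For problem 2 it tracks only the remainder modulo 3 as bits arrive. The `Node` class and the `next_bit` callable are hypothetical stand-ins for whatever the interviewer provides.

```python
class Node:
    """Hypothetical tree node with no data yet; only the shape is given."""

    def __init__(self, left=None, right=None):
        self.value = None
        self.left = left
        self.right = right


def fill_bst(root, numbers):
    """Write the sorted numbers into the nodes in in-order sequence.

    Links between nodes are never modified, so the structure is unchanged.
    Recursive for brevity; a very deep tree would need an explicit stack.
    """
    values = iter(sorted(numbers))

    def visit(node):
        if node is None:
            return
        visit(node.left)
        node.value = next(values)
        visit(node.right)

    visit(root)


def divisible_by_3(next_bit):
    """Read bits (most significant first) until -1; keep only value mod 3."""
    remainder = 0
    while True:
        bit = next_bit()
        if bit == -1:
            return remainder == 0
        # Appending a bit doubles the number and adds the bit.
        remainder = (remainder * 2 + bit) % 3


# Usage with the example from the question: 1, 0, 1, then -1 (binary 101 = 5).
stream = iter([1, 0, 1, -1])
five_is_divisible = divisible_by_3(lambda: next(stream))
```

Sorting makes the fill O(n log n), and the remainder approach uses constant memory no matter how long the bit stream is. An empty stream is treated as 0, which this sketch reports as divisible by 3.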

Glassdoor has 54,195 interview questions and reports from data scientist interviews.