Data Scientist Interview Questions

In a data scientist interview, expect employers to ask questions that assess your data modeling, problem-solving, and programming skills. Be prepared to answer general questions that test your knowledge of statistics and data science. You should also be ready to answer open-ended questions that test your creativity, communication skills, and formal education in data modeling and programming.

Top Data Scientist Interview Questions & How to Answer

Question #1: Which data modeling techniques do you prefer and why?

How to answer: Turning data into understandable and actionable information is a critical part of the data scientist's job. This question allows employers to understand your data modeling skills and background. List and discuss your preferred data modeling techniques, including benefits such as ease of use, flexibility, etc.

Question #2: How would you detect bogus Instagram accounts used for scamming consumers?

How to answer: Questions like this one allow an employer to test your problem-solving skills. When answering open-ended questions such as these, feel free to ask clarifying questions and use whiteboards to demonstrate your coding and diagramming skills. Share your thought process as you work through the problem.

Question #3: Describe circumstances that require a list, tuple, or set in Python.

How to answer: Interviewers will use questions such as this one to test your Python programming skills. Review Python basics such as lists, tuples, and sets before your interview. You should be able to explain when and how each tool is used by data scientists.
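As a quick refresher for Question #3, the sketch below shows one typical situation for each container. The variable names and values are made up for illustration and do not come from any real dataset.

```python
# list: ordered and mutable, duplicates allowed.
# Suits a sequence you keep appending to, such as a running series of readings.
daily_sales = [120, 135, 120]
daily_sales.append(150)

# tuple: ordered and immutable, so it is hashable and can be a dict key.
# Suits a fixed record such as a (latitude, longitude) pair.
location = (37.45, -122.18)
visits_by_location = {location: 3}

# set: unordered, unique elements, fast membership tests.
# Suits deduplication and overlap checks between two groups.
user_ids_week1 = [1, 2, 2, 3, 3, 3]
user_ids_week2 = [3, 4, 4, 5]
unique_week1 = set(user_ids_week1)
unique_week2 = set(user_ids_week2)
returning_users = unique_week1 & unique_week2  # users seen in both weeks

print(daily_sales)
print(visits_by_location[location])
print(sorted(returning_users))
```

In an interview it usually helps to tie the choice to a property: mutability for lists, hashability for tuples, uniqueness and fast lookup for sets.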

54,342 data scientist interview questions shared by candidates

Marketing Data Scientist

Interviewed at Google

4.4
Nov 19, 2020

1. Started with a detailed explanation of a past project: what was the business question, how did you come up with the solution, what was your hypothesis, how did you design the A/B test, why did you make certain choices, what was the result, etc. Prepare 1-2 examples from your past where you can talk in depth about the technical elements of your project.
2. Let's say we have a dataset with attributes for a house (square footage, locality, etc.) and house price. How will you predict the house price from these attributes? (Build a multiple regression model.)
3. For this multiple regression model, explain the end-to-end process. What steps will you take before building the model, how will you impute missing values, how will you handle outliers, etc.? What are the underlying assumptions of a regression model?
4. Once the model is built, how will you infer the relationship (sign and magnitude) between the house attributes and house price? How will you explain it to someone who's not a technical person?
5. For the regression coefficients, how will you interpret them (p-values, confidence intervals, etc.)? How will you explain a p-value to a layman?
6. Next question was about "how will you segment customers" in order to serve a business requirement, such as determining which customers to show a given ad. (I answered with clustering, because the business problem wasn't very specific; he just described it very generally.)
7. For clustering, how does it work, and how do you choose the value of K in k-means? I also said we can use Gaussian mixture models for clustering, which he didn't seem to know because he asked me to clarify what I mentioned.

There might have been a few more questions that I don't remember, but the theme of the interview was to check how well you know the basics of Stats/ML.
I believe I answered most of the questions correctly, so receiving the feedback that I wasn't up to the mark technically seemed like a case of Google not wanting to reveal the real reason, whatever it was. Either way, make sure you confirm the format of the interview with the recruiter. Because I was already interviewing with other companies, I had brushed up on my Stats/ML basics, but you might not be as lucky. Good luck!
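For anyone practising the house-price questions above, here is a minimal sketch of fitting a multiple regression and reading off the sign and magnitude of each coefficient. It is only an illustration under stated assumptions: the five houses are invented and noise-free, the helper names `fit_ols` and `predict` are hypothetical, and the fit uses the plain normal equations in pure Python rather than a statistics library. It does not cover missing-value imputation, outlier handling, assumption checks, p-values, or confidence intervals.

```python
from typing import List, Sequence


def fit_ols(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Return [intercept, b1, b2, ...] minimising the squared error."""
    # Prepend a column of ones so the first coefficient is the intercept.
    A = [[1.0] + list(row) for row in X]
    k = len(A[0])
    # Normal equations: (A^T A) beta = A^T y
    ata = [[sum(r[i] * r[j] for r in A) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    M = [row + [b] for row, b in zip(ata, aty)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot solve")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]


def predict(coefs: Sequence[float], row: Sequence[float]) -> float:
    """Intercept plus the weighted sum of the features."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))


# Invented toy data: [area in hundreds of sq ft, bedrooms] -> price in $1000s.
features = [[10, 2], [15, 3], [12, 3], [20, 4], [18, 3]]
prices = [190, 260, 230, 330, 290]

coefs = fit_ols(features, prices)
intercept, per_100_sqft, per_bedroom = coefs

# Sign and magnitude, phrased for a non-technical audience.
print(f"Each extra 100 sq ft goes with about {per_100_sqft:.1f}k more in price, "
      f"for houses with the same number of bedrooms.")
print(f"Each extra bedroom goes with about {per_bedroom:.1f}k more, "
      f"for houses of the same size.")
print(f"Predicted price for 1,600 sq ft, 3 bedrooms: {predict(coefs, [16, 3]):.1f}k")
```

The "for houses with the same number of bedrooms" phrasing is the plain-language version of "holding the other features fixed", which is usually the part interviewers listen for.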

Data Scientist

Interviewed at Uber

3.7
Apr 8, 2015

There is a pool of people who took Uber rides from two cities that were close in proximity, for example Menlo Park and Palo Alto, and any data you could think of could be collected. What data would you collect so that the city the passenger took a ride from could be determined? If some type of supervised classification algorithm was used to split the two populations, for example Support Vector Machine, what type of business decision would you make on points that appear close to the hyperplane?
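One way to reason about the second half of this question is to measure how far each point sits from the separating hyperplane and treat points inside a narrow band as ambiguous rather than forcing a city label on them. The sketch below assumes a linear classifier whose weights `w` and bias `b` have already been learned; it does not train an SVM. The weights, the `band` width, the two-feature layout, and the function names are all hypothetical.

```python
import math
from typing import Sequence


def signed_distance(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed geometric distance from x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def route_decision(w: Sequence[float], b: float, x: Sequence[float],
                   band: float = 0.5) -> str:
    """Label a rider by city, or 'ambiguous' when too close to the boundary."""
    d = signed_distance(w, b, x)
    if abs(d) < band:
        # Near the hyperplane the model is least certain, so defer the decision.
        return "ambiguous"
    return "city_a" if d > 0 else "city_b"


# Hypothetical learned parameters and three riders in a 2-feature space.
w, b = (3.0, 4.0), -5.0
for rider in [(3.0, 4.0), (0.0, 0.0), (1.0, 0.5)]:
    print(rider, round(signed_distance(w, b, rider), 2), route_decision(w, b, rider))
```

A reasonable business answer built on this is that riders in the ambiguous band could be treated as belonging to both markets (for example, shown promotions for either city) or flagged for collecting more data, since misclassifying them either way is cheap to avoid.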

Glassdoor has 54,342 interview questions and reports from Data scientist interviews. Prepare for your interview. Get hired. Love your job.