Data Engineer Interview Questions

Question 1: Complete a function that returns the number of times a given character occurs in the given string.

For example:
- input string = "mississippi"
- char = "s"
- output: 4

Data Engineer

Interviewed at Meta

3.6★

Jun 8, 2020

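A minimal sketch of one way to answer this question. The function name `count_char` is hypothetical, and the explicit loop assumes the interviewer wants the logic spelled out rather than a call to the built-in `str.count`.

```python
def count_char(text: str, char: str) -> int:
    """Return how many times `char` occurs in `text` (case sensitive)."""
    count = 0
    for current in text:
        # Compare each character of the string with the target character.
        if current == char:
            count += 1
    return count


print(count_char("mississippi", "s"))  # 4
```

The loop runs in O(n) time and O(1) extra space. In everyday code, `"mississippi".count("s")` gives the same result.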

Given a multi-step product feature, write SQL to see how well this feature is doing (loading times, step completion %). Then use Python to constantly update average step time as new values stream in, given that there are too many to store in memory.

Data Engineer

Interviewed at Meta

3.6★

Apr 30, 2018

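For the Python half of this question, one common approach is an incremental (running) mean, which keeps only a count and the current average per step instead of storing every value. The sketch below is an assumption about what the interviewer was looking for; the class name `RunningMean` and the step names are hypothetical.

```python
from collections import defaultdict


class RunningMean:
    """Tracks an average in O(1) memory as values stream in."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, value: float) -> float:
        # Incremental update: new_mean = old_mean + (value - old_mean) / n
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


# One running mean per step of the feature (step names are made up).
step_means = defaultdict(RunningMean)
for step, seconds in [("load", 2.0), ("load", 4.0), ("submit", 1.0), ("load", 6.0)]:
    step_means[step].update(seconds)

print(step_means["load"].mean)    # 4.0
print(step_means["submit"].mean)  # 1.0
```

Updating the mean by a correction term, rather than keeping a running sum and dividing, avoids letting an accumulator grow without bound on a very long stream.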

SQL:
1. Percentage increase in revenue for promoted compared to non-promoted products.
2. Product classes that have the highest number of transactions.
3. Count of customers who bought two item types (A, B).
4. Don't remember.

Python:
1. Average word length (letters per word).
2. Parse an IP address (this is a favourite FB question).
3. [[A],[A,B],[A,C],[B,D],[C,A]]: find the letter with the most neighbors. (I wasn't able to solve it because of the time limit, but the interviewer said she understood what I was trying to convey. I gave her an algorithm for what I would have done.)

Data Engineer

Interviewed at Meta

3.6★

Sep 21, 2018

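The third Python question is stated loosely, so the sketch below rests on an assumed interpretation: letters that share an inner list are neighbors of each other, neighbors are counted once each, and ties are all reported. The function name `most_neighbors` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def most_neighbors(groups):
    """Return (letters with the most distinct neighbors, that neighbor count)."""
    neighbors = defaultdict(set)
    for group in groups:
        for item in group:
            neighbors[item]  # register letters that appear alone, e.g. ["A"]
        # Every pair inside one inner list is treated as an undirected edge.
        for left, right in combinations(set(group), 2):
            neighbors[left].add(right)
            neighbors[right].add(left)
    if not neighbors:
        return [], 0
    best = max(len(found) for found in neighbors.values())
    winners = sorted(k for k, found in neighbors.items() if len(found) == best)
    return winners, best


data = [["A"], ["A", "B"], ["A", "C"], ["B", "D"], ["C", "A"]]
print(most_neighbors(data))  # (['A', 'B'], 2)
```

Under this reading, A (neighbors B and C) and B (neighbors A and D) tie with two neighbors each. If repeated pairs such as [A,C] and [C,A] should count twice, the sets would be replaced with counters.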

In Python, given a JSON object with nested objects, write a function that flattens all the objects into a single key-value dictionary. Do not use a library that performs this function. {a:{b:c,d:e}} becomes {a_b:c, a_d:e} (not {a:"b:c,d:e"}).

Data Engineer

Interviewed at Amazon

3.5★

Apr 29, 2020

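A minimal recursive sketch of the flattening described above, assuming the JSON has already been parsed into Python dictionaries. The function name `flatten` and its `parent_key` and `sep` parameters are hypothetical, and lists inside the object are left as plain values.

```python
def flatten(obj: dict, parent_key: str = "", sep: str = "_") -> dict:
    """Flatten nested dictionaries into one level, joining keys with `sep`."""
    flat = {}
    for key, value in obj.items():
        # Prefix the key with the path of its parents, if there is one.
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat


print(flatten({"a": {"b": "c", "d": "e"}}))  # {'a_b': 'c', 'a_d': 'e'}
```

Two caveats with this sketch: an empty nested object produces no key at all, and keys that already contain the separator can collide (for example `{"a_b": 1, "a": {"b": 2}}`).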

Python questions:
1) Print the max element of a given list
2) Print the median of a given list
3) Print the first non-recurring element in a list
4) Print the most recurring element in a list
5) Greatest common factor

Data Engineer

Interviewed at Meta

3.6★

May 24, 2016

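The five questions above can be sketched as short functions. All function names are hypothetical, and the sketch assumes non-empty lists of comparable values and integer inputs for the greatest common factor.

```python
from collections import Counter


def max_element(items):
    """Largest element, found with a single pass instead of max()."""
    largest = items[0]
    for value in items[1:]:
        if value > largest:
            largest = value
    return largest


def median(items):
    """Middle value of the sorted list; mean of the two middles if even."""
    ordered = sorted(items)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def first_non_recurring(items):
    """First element that appears exactly once, or None if there is none."""
    counts = Counter(items)
    for value in items:
        if counts[value] == 1:
            return value
    return None


def most_recurring(items):
    """Element with the highest count (first seen wins a tie)."""
    return Counter(items).most_common(1)[0][0]


def gcd(a: int, b: int) -> int:
    """Greatest common factor via the Euclidean algorithm."""
    while b:
        a, b = b, a % b
    return abs(a)


data = [3, 1, 3, 2, 5, 2, 3]
print(max_element(data))          # 5
print(median(data))               # 3
print(first_non_recurring(data))  # 1
print(most_recurring(data))       # 3
print(gcd(12, 18))                # 6
```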

SQL: Select the value of one column based on the max of a second column within each group of a third column. Given Column A, Column B, and Column C: for each group based on Column A, give the value of Column B where Column C is the max for that group.

Data Engineer

Interviewed at Amazon

3.5★

Apr 29, 2020

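One possible answer joins the table to a per-group maximum. The sketch below runs the query through Python's built-in `sqlite3` module so it can be tried directly; the table name `t`, the column names `a`, `b`, `c`, and the sample rows are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b TEXT, c INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("x", "b1", 10), ("x", "b2", 30), ("y", "b3", 5), ("y", "b4", 2)],
)

# For each group of a, keep the row whose c equals that group's maximum.
query = """
SELECT t.a, t.b
FROM t
JOIN (SELECT a, MAX(c) AS max_c FROM t GROUP BY a) AS m
  ON t.a = m.a AND t.c = m.max_c
ORDER BY t.a
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('x', 'b2'), ('y', 'b3')]
```

If several rows in a group share the maximum, this query returns all of them. Where window functions are allowed, ranking rows with `ROW_NUMBER()` partitioned by Column A is a common alternative that returns exactly one row per group.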

Question 3: Complete a function that returns a list containing all the mismatched words (case sensitive) between two given input strings.

For example:
- string 1: "Firstly this is the first string"
- string 2: "Next is the second string"
- output: ['Firstly', 'this', 'first', 'Next', 'second']

Data Engineer

Interviewed at Meta

3.6★

Jun 8, 2020

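A minimal sketch that reproduces the example output, assuming words are separated by whitespace and that the expected order is the unmatched words of the first string followed by those of the second. The function name `mismatched_words` is hypothetical.

```python
def mismatched_words(first: str, second: str) -> list:
    """Words that appear in only one of the two strings (case sensitive)."""
    first_words = first.split()
    second_words = second.split()
    first_set = set(first_words)
    second_set = set(second_words)
    # Keep the original word order: first string's leftovers, then the second's.
    only_first = [word for word in first_words if word not in second_set]
    only_second = [word for word in second_words if word not in first_set]
    return only_first + only_second


print(mismatched_words("Firstly this is the first string", "Next is the second string"))
# ['Firstly', 'this', 'first', 'Next', 'second']
```

A word repeated within one string would appear more than once in the result. If order does not matter, the symmetric difference `first_set ^ second_set` gives the same words as a set.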

1. What difference have you made in your current team apart from regular work?
2. What steps do you follow to rebuild a table in a database?
3. How did you do performance tuning?
4. How do you find the skewness of data in a table?
5. What is the difference between RDBMS and dimensional modeling?

SQL

1) purchase

customer_id  product_id  quantity  purchase_date
1            111         1         01/01/2017
1            111         2         01/02/2017
1            222         2         01/02/2017
2            111         3         01/04/2017
2            222         1         01/03/2017
3            222         1         01/05/2017
3            222         1         01/06/2017
3            111         1         01/06/2017
3            111         1         01/04/2017

Q: How many customers bought each product how many times during the week?

Product_Id  Number_of_Customers  Number_of_Times
111         2                    2
111         1                    1
222         2                    1
222         1                    2

2) daily_usage

account_id  usage_amount  usage_date
1           10            1
1           20            2
1           15            3
1           30            4

Q: a) How do you print the usage_amount of the previous/consecutive rows? b) How do you do it without using window functions?

Data Engineer

Interviewed at Amazon

3.5★

Feb 16, 2017

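Both SQL questions can be tried out with Python's built-in `sqlite3` module. This is a sketch under stated assumptions, not necessarily the answer the interviewer wanted: the sample rows all fall within one week (treating the year 2107 in the sample data as a typo for 2017), so no date filter is applied, and `usage_date` is assumed to be a gap-free integer so that the previous row is simply `usage_date - 1`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchase (customer_id INT, product_id INT, quantity INT, purchase_date TEXT)")
conn.executemany("INSERT INTO purchase VALUES (?, ?, ?, ?)", [
    (1, 111, 1, "2017-01-01"), (1, 111, 2, "2017-01-02"), (1, 222, 2, "2017-01-02"),
    (2, 111, 3, "2017-01-04"), (2, 222, 1, "2017-01-03"), (3, 222, 1, "2017-01-05"),
    (3, 222, 1, "2017-01-06"), (3, 111, 1, "2017-01-06"), (3, 111, 1, "2017-01-04"),
])
conn.execute("CREATE TABLE daily_usage (account_id INT, usage_amount INT, usage_date INT)")
conn.executemany("INSERT INTO daily_usage VALUES (?, ?, ?)",
                 [(1, 10, 1), (1, 20, 2), (1, 15, 3), (1, 30, 4)])

# 1) Count purchases per (customer, product), then count customers per
#    (product, number of purchases).
purchase_rows = conn.execute("""
    SELECT product_id, COUNT(*) AS number_of_customers, number_of_times
    FROM (SELECT customer_id, product_id, COUNT(*) AS number_of_times
          FROM purchase
          GROUP BY customer_id, product_id) AS per_customer
    GROUP BY product_id, number_of_times
    ORDER BY product_id, number_of_customers DESC
""").fetchall()
print(purchase_rows)  # [(111, 2, 2), (111, 1, 1), (222, 2, 1), (222, 1, 2)]

# 2b) Previous row's usage_amount without window functions: a self-join
#     that matches each row to the row one usage_date earlier.
usage_rows = conn.execute("""
    SELECT cur.usage_date, cur.usage_amount, prev.usage_amount AS previous_amount
    FROM daily_usage AS cur
    LEFT JOIN daily_usage AS prev
      ON prev.account_id = cur.account_id
     AND prev.usage_date = cur.usage_date - 1
    ORDER BY cur.account_id, cur.usage_date
""").fetchall()
print(usage_rows)  # [(1, 10, None), (2, 20, 10), (3, 15, 20), (4, 30, 15)]
```

For part a), where window functions are allowed, `LAG(usage_amount) OVER (PARTITION BY account_id ORDER BY usage_date)` is the usual way to read the previous row, and `LEAD` reads the following one.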

SQL: top 3 products by sales, percentages using CASE, a basic HAVING clause, and one set-operator (INTERSECT) type question. Python: average word length, IP address parsing, dictionaries, lists of lists, flattening a list of lists. (Similar to previous interview experiences.)

Data Engineer

Interviewed at Meta

3.6★

Nov 26, 2018

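The exact wording of these Python questions is not given, so the sketches below are guesses at typical versions: average word length over whitespace-separated words, parsing a dotted IPv4 address into four integers without a library, and flattening arbitrarily nested lists. All function names are hypothetical.

```python
def average_word_length(sentence: str) -> float:
    """Mean number of characters per whitespace-separated word."""
    words = sentence.split()
    if not words:
        return 0.0
    return sum(len(word) for word in words) / len(words)


def parse_ipv4(address: str) -> list:
    """Split a dotted IPv4 string into four integers, or raise ValueError."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 octets, got {len(parts)}")
    octets = []
    for part in parts:
        # Each octet must be plain ASCII digits in the range 0..255.
        if not (part.isascii() and part.isdigit()) or int(part) > 255:
            raise ValueError(f"invalid octet: {part!r}")
        octets.append(int(part))
    return octets


def flatten_lists(nested: list) -> list:
    """Flatten a list of lists of any depth into one flat list."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            flat.extend(flatten_lists(item))
        else:
            flat.append(item)
    return flat


print(average_word_length("ab abcd"))      # 3.0
print(parse_ipv4("192.168.0.1"))           # [192, 168, 0, 1]
print(flatten_lists([[1, [2, 3]], [4], 5]))  # [1, 2, 3, 4, 5]
```

Punctuation attached to a word counts toward its length in `average_word_length`; stripping it first is a reasonable follow-up to raise with the interviewer.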

4 questions in HackerRank

Data Engineer

Interviewed at McKinsey & Company

4.1★

Nov 2, 2017

Top Data Engineer Interview Questions & How To Answer

Question #1: Can you describe in detail your level of expertise with programming languages?

Question #2: Explain data engineering in your own words.

Question #3: Can you describe your experience working with Apache Hadoop and cloud data management environments?

20,118 data engineer interview questions shared by candidates
