Data Interview Questions

132,690 data interview questions shared by candidates

Python 1 #1.returns the number of times a given character occurs in the given string s1='missisipi' #print(s1.find('s')) res=[] for i in range(len(s1)): #print(s1[i]) if s1[i]=='s': res.append('s') print(len(res)) #2.[1,None,1,2,None} --> [1,1,1,2,2] arr=[None,1,2,None] new_l=[] for i in range(0,len(arr)): if arr[i] != None: new_l.append(arr[i]) else: new_l.append(arr[i-1]) print(new_l) #2. (python) Given two sentences, construct an array that has the words that appear in one sentence and not the other. A = "Geeks for Geeks" B = "Learning from Geeks for Geeks" d={} for w in A.split(): if w in d: d[w]=d.get(w,0)+1 else: d[w]=1 for w in B.split(): if w in d: d[w]=d.get(w,0)+1 else: d[w]=1 unmatchedW=[w for w in d if d[w]==1] print (unmatchedW) 3. d = {"a": 4, "c": 3, "b": 12} [(k, v) for k, v in sorted(d.items(), key=lambda x: x[1], reverse=True)] #[('b', 12), ('a', 4), ('c', 3)] SQL # # sales # products # +------------------+---------+ +---------------------+---------+ # | product_id | INTEGER |>--------| product_id | INTEGER | # | store_id | INTEGER | +---<| product_class_id | INTEGER | # | customer_id | INTEGER | | | brand_name | VARCHAR | # +---<| promotion_id | INTEGER | | | product_name | VARCHAR | # | | store_sales | DECIMAL | | | is_low_fat_flg | TINYINT | # | | store_cost | DECIMAL | | | is_recyclable_flg |… Show More 1. find top 5 sales products having promotions Select Sum(s.store_sales), brand_name, count(p.product_id) from products p inner join sales s p.product_id = s.product_id where promotion_id is not null group by brand_name having count(p.product_id) =1 /* single-channel media type */ order by 1 desc limit 5 2. # -- % Of sales that had a valid promotion, the VP of marketing # -- wants to know what % of transactions occur on either # -- the very first day or the very last day of a promotion campaign. select sum(case when valid_promotion = 1 then 1 else 0 end)/count(*) * 100 as percentage from sales where day = First_day(date) or day = last_day(date) or select sum(case when transaction_date = (select min(transaction_date) from sales) then 1 else 0)/count(*) as first_day_sales, sum(case when transaction_date = (select max(transaction_date) from sales) then 1 else 0)/count(*) as last_day_sales from sales or select avg(transaction_date in (p.start_date,p.end_date))*100 as first_last_pct from sales s join promotions p using(promotion_id)
avatar

Data Engineer

Interviewed at Meta

3.6
Aug 25, 2020

Python 1 #1.returns the number of times a given character occurs in the given string s1='missisipi' #print(s1.find('s')) res=[] for i in range(len(s1)): #print(s1[i]) if s1[i]=='s': res.append('s') print(len(res)) #2.[1,None,1,2,None} --> [1,1,1,2,2] arr=[None,1,2,None] new_l=[] for i in range(0,len(arr)): if arr[i] != None: new_l.append(arr[i]) else: new_l.append(arr[i-1]) print(new_l) #2. (python) Given two sentences, construct an array that has the words that appear in one sentence and not the other. A = "Geeks for Geeks" B = "Learning from Geeks for Geeks" d={} for w in A.split(): if w in d: d[w]=d.get(w,0)+1 else: d[w]=1 for w in B.split(): if w in d: d[w]=d.get(w,0)+1 else: d[w]=1 unmatchedW=[w for w in d if d[w]==1] print (unmatchedW) 3. d = {"a": 4, "c": 3, "b": 12} [(k, v) for k, v in sorted(d.items(), key=lambda x: x[1], reverse=True)] #[('b', 12), ('a', 4), ('c', 3)] SQL # # sales # products # +------------------+---------+ +---------------------+---------+ # | product_id | INTEGER |>--------| product_id | INTEGER | # | store_id | INTEGER | +---<| product_class_id | INTEGER | # | customer_id | INTEGER | | | brand_name | VARCHAR | # +---<| promotion_id | INTEGER | | | product_name | VARCHAR | # | | store_sales | DECIMAL | | | is_low_fat_flg | TINYINT | # | | store_cost | DECIMAL | | | is_recyclable_flg |… Show More 1. find top 5 sales products having promotions Select Sum(s.store_sales), brand_name, count(p.product_id) from products p inner join sales s p.product_id = s.product_id where promotion_id is not null group by brand_name having count(p.product_id) =1 /* single-channel media type */ order by 1 desc limit 5 2. # -- % Of sales that had a valid promotion, the VP of marketing # -- wants to know what % of transactions occur on either # -- the very first day or the very last day of a promotion campaign. select sum(case when valid_promotion = 1 then 1 else 0 end)/count(*) * 100 as percentage from sales where day = First_day(date) or day = last_day(date) or select sum(case when transaction_date = (select min(transaction_date) from sales) then 1 else 0)/count(*) as first_day_sales, sum(case when transaction_date = (select max(transaction_date) from sales) then 1 else 0)/count(*) as last_day_sales from sales or select avg(transaction_date in (p.start_date,p.end_date))*100 as first_last_pct from sales s join promotions p using(promotion_id)

Building a histogram of post reply count in SQL (number of posts with x replies, x+1 replies, etc). Building a table with a summary of feature usage per user every day (keep track of the last action by user and roll that up every day). Basic conditional probabilities (check out brilliant.org for their source of inspiration)
avatar

Data Scientist

Interviewed at Meta

3.6
May 16, 2017

Building a histogram of post reply count in SQL (number of posts with x replies, x+1 replies, etc). Building a table with a summary of feature usage per user every day (keep track of the last action by user and roll that up every day). Basic conditional probabilities (check out brilliant.org for their source of inspiration)

Viewing 141 - 150 interview questions

Glassdoor has 132,690 interview questions and reports from Data interviews. Prepare for your interview. Get hired. Love your job.