A: They asked me to walk through a real-world Spark use case I’d implemented (a streaming ETL pipeline for clickstream data) and to detail how I managed stateful windowed aggregations, tuned `spark.sql.shuffle.partitions`, and fixed data skew with a custom partitioner.
Senior Data Engineer Interview Questions
2,562 senior data engineer interview questions shared by candidates
Design a big data solution for two streaming datasets using any cloud or open-source technology.
Craft demo, mostly demonstrating working knowledge of Spark and distributed systems.
Merge two already-sorted arrays into one sorted array.
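One common way to approach this question is the two-pointer merge step from merge sort. The sketch below assumes both inputs are Python lists sorted in ascending order; the function name `merge_sorted` is made up for illustration.

```python
def merge_sorted(a, b):
    """Merge two ascending lists into one ascending list in O(len(a) + len(b))."""
    merged = []
    i = j = 0
    while i < len(a) and j < len(b):
        # On ties, take from `a` first so the merge is stable.
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # At most one of these slices is non-empty.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged
```

For example, `merge_sorted([1, 3, 5], [2, 4, 6])` walks both lists once and never re-sorts.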
Design an LRU cache.
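A minimal sketch of one possible answer, assuming a fixed capacity and O(1) `get`/`put`. It leans on `collections.OrderedDict` rather than a hand-written hash map plus doubly linked list, which an interviewer may ask for instead. The class name and method names are illustrative.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # a read counts as a use
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
```

With capacity 2, putting `a` and `b`, reading `a`, and then putting `c` evicts `b`, because `a` was touched more recently.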
Implement an iterator that takes three iterators as input and yields their elements in sorted order.
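The question does not say whether the three inputs are already sorted. The sketch below assumes they are (each ascending), which allows a lazy k-way merge with a min-heap; if they are not, the elements would have to be collected and sorted first. `MergedIterator` is a hypothetical name, and the standard library's `heapq.merge` offers the same behavior ready-made.

```python
import heapq


class MergedIterator:
    """Lazily merge several ascending iterators into one ascending stream."""

    def __init__(self, *sources):
        self._heap = []
        for index, source in enumerate(sources):
            self._push_next(index, iter(source))

    def _push_next(self, index, iterator):
        # Pull one value from `iterator` and queue it; exhausted inputs drop out.
        try:
            value = next(iterator)
        except StopIteration:
            return
        # `index` is unique per input, so ties on `value` never compare iterators.
        heapq.heappush(self._heap, (value, index, iterator))

    def __iter__(self):
        return self

    def __next__(self):
        if not self._heap:
            raise StopIteration
        value, index, iterator = heapq.heappop(self._heap)
        self._push_next(index, iterator)
        return value
```

The heap holds at most one pending element per input, so memory stays constant in the number of elements regardless of how long the inputs are.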
Analysis and problem solving.
Implement scalable topK words for Amazon product descriptions using count-min sketch.
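A single-process sketch of the idea, under several assumptions not stated in the question: words are split on whitespace and lowercased, the sketch dimensions (`width`, `depth`) are arbitrary defaults, and the top-K candidates are tracked in a small dictionary alongside the sketch. All names here are illustrative. Count-min estimates never undercount but may overcount when words collide, so the returned counts are approximate.

```python
import hashlib


class CountMinSketch:
    """Approximate counter: estimates are >= true counts, never below."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        data = item.encode("utf-8")
        for row in range(self.depth):
            # Salting BLAKE2b with the row number gives one hash function per row.
            salt = row.to_bytes(8, "little")
            digest = hashlib.blake2b(data, digest_size=8, salt=salt).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):
        return min(self.table[row][col] for row, col in self._cells(item))


def top_k_words(descriptions, k, width=2048, depth=4):
    """Return up to k (word, estimated_count) pairs, highest estimate first."""
    sketch = CountMinSketch(width, depth)
    candidates = {}  # word -> latest estimate; never more than k entries
    for text in descriptions:
        for word in text.lower().split():
            sketch.add(word)
            estimate = sketch.estimate(word)
            if word in candidates or len(candidates) < k:
                candidates[word] = estimate
            else:
                weakest = min(candidates, key=candidates.get)
                if estimate > candidates[weakest]:
                    del candidates[weakest]
                    candidates[word] = estimate
    return sorted(candidates.items(), key=lambda pair: (-pair[1], pair[0]))
```

Memory is bounded by `width * depth` counters plus `k` candidates, independent of vocabulary size. Sketches built on separate partitions with the same dimensions can be combined by adding their tables cell by cell, which is what would make a distributed version of this approach feasible.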
Real time and practical questions
Given a Python list of lists describing parent-child relationships between nodes, find the node that has the most connections.
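The input format is not spelled out, so the sketch below assumes each inner list is a `[parent, child]` pair and that a node's connections are all edges touching it, whether as parent or as child. The function name `most_connected` is made up for illustration.

```python
from collections import Counter


def most_connected(edges):
    """Return (node, connection_count) for the node touching the most edges.

    Returns None for an empty edge list. Ties go to the node seen first.
    """
    degree = Counter()
    for parent, child in edges:
        degree[parent] += 1
        degree[child] += 1
    if not degree:
        return None
    # max() keeps the first maximal key in insertion order.
    node = max(degree, key=degree.get)
    return node, degree[node]
```

For `[["a", "b"], ["a", "c"], ["b", "d"], ["a", "e"]]`, node `a` touches three edges and `b` touches two, so the function returns `("a", 3)`. If only children should count as connections, the `degree[child] += 1` line would be dropped.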