Sr Data Engineer Interview Questions

Which model returns results from a cube with low latency? What are the data models used in a warehouse? How do you use a MERGE statement, and in which scenarios? What motivates you to work? The interviewer also said that timelines sometimes come with a good plan and sometimes have to be met at gunpoint, and asked how comfortable I am with that.

Senior Data Engineer

Interviewed at Mott MacDonald

3.9★

Oct 2, 2022
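
For the MERGE question above, one common scenario is an upsert: rows from a staging table update matching rows in a target table and insert the rest. The sketch below is only an illustration under stated assumptions. SQLite has no MERGE statement, so it swaps in SQLite's `INSERT ... ON CONFLICT DO UPDATE` (available from SQLite 3.24) to get the same effect, and the table names `target` and `staging` and the function names are hypothetical.

```python
import sqlite3

# For reference, the equivalent statement in engines that support MERGE
# would look roughly like this (not executed here):
#   MERGE INTO target t USING staging s ON t.id = s.id
#   WHEN MATCHED THEN UPDATE SET name = s.name
#   WHEN NOT MATCHED THEN INSERT (id, name) VALUES (s.id, s.name);


def merge_staging_into_target(conn: sqlite3.Connection) -> None:
    """Upsert every staging row into target, keyed on id."""
    conn.execute(
        """
        INSERT INTO target (id, name)
        SELECT id, name FROM staging WHERE 1  -- WHERE avoids a parse ambiguity with ON CONFLICT
        ON CONFLICT(id) DO UPDATE SET name = excluded.name
        """
    )
    conn.commit()


def demo() -> list:
    """Build two small hypothetical tables, run the upsert, return target rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO target VALUES (?, ?)", [(1, "alice"), (2, "bob")])
    conn.executemany(
        "INSERT INTO staging VALUES (?, ?)", [(1, "alice-updated"), (3, "carol")]
    )
    merge_staging_into_target(conn)
    rows = conn.execute("SELECT id, name FROM target ORDER BY id").fetchall()
    conn.close()
    return rows
```

Here id 1 is updated, id 2 is left untouched, and id 3 is inserted, which is the behaviour a MERGE with a matched and a not-matched branch would give.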

What's the hardest data set you've come to work with and why?

Senior Data Engineer

Interviewed at Citation

4★

Feb 8, 2023

They asked SQL and Python questions. The questions were challenging, but it would have been fair if we had been given a code playground to test our queries before submitting the final answer.

Senior Big Data Engineer

Interviewed at dunnhumby

4.3★

Jun 4, 2025

A. Core Data Engineering Concepts
SQL (joins, window functions, performance tuning)
Data Modeling (star vs snowflake, normalization)
ETL/ELT pipelines (batch vs streaming, orchestration tools like Airflow)

B. Apache Spark / PySpark
Catalyst Optimizer & Tungsten
Narrow vs Wide transformations
Joins (broadcast, sort-merge), Skew handling
AQE (Adaptive Query Execution)
Partitioning, Predicate Pushdown
Execution Plan (DAG → Stage → Tasks)
Spark UI and Job Debugging
SCD Type 2 Implementation in PySpark

C. AWS
S3, Glue, Athena, Lambda, EMR, Redshift
Event-driven design (S3 → EventBridge → Lambda)
Security: IAM roles, bucket policies, encryption
CI/CD in AWS (CodePipeline, CloudFormation)

D. Python
Writing modular, reusable code
Working with Pandas, Boto3 (for AWS interaction)
Exception handling, logging
Lambda functions and decorators

E. Kafka / Streaming
Kafka topic partitioning, consumer groups
Offset management
Integration with Spark Structured Streaming

Senior Data Engineer

Interviewed at EPAM Systems

4★

Jul 21, 2025
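
The SCD Type 2 topic in this entry can be illustrated with a small sketch. This is plain Python, not PySpark, and it is not what the interviewer asked candidates to write. It only shows the core rule: when a tracked attribute changes, the current row is closed and a new current row is appended. The row layout (`key`, `attrs`, `valid_from`, `valid_to`, `is_current`) and the function name `apply_scd2` are assumptions made for illustration.

```python
from datetime import date
from typing import Any, Dict, List


def apply_scd2(
    dimension: List[Dict[str, Any]],
    incoming: Dict[Any, Dict[str, Any]],
    load_date: date,
) -> List[Dict[str, Any]]:
    """Return a new dimension table with SCD Type 2 changes applied.

    dimension: existing rows (hypothetical layout, see lead-in).
    incoming:  business key -> latest attribute values from the source.
    """
    # Shallow-copy each row so the caller's list of rows is not mutated.
    result = [dict(row) for row in dimension]
    current = {row["key"]: row for row in result if row["is_current"]}

    for key, attrs in incoming.items():
        existing = current.get(key)
        if existing is not None and existing["attrs"] == attrs:
            continue  # nothing changed, keep the current version open

        if existing is not None:
            # Close the old version instead of overwriting it.
            existing["valid_to"] = load_date
            existing["is_current"] = False

        new_row = {
            "key": key,
            "attrs": dict(attrs),
            "valid_from": load_date,
            "valid_to": None,  # assumption: None marks an open-ended row
            "is_current": True,
        }
        result.append(new_row)
        current[key] = new_row

    return result
```

In a PySpark job the same rule is usually expressed as a join between the dimension and the incoming batch followed by a union of closed and new rows, but the row-level logic is the same as above.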

PySpark memory optimization; different types of keys in SQL.

Sr Data Engineer

Interviewed at EPAM Systems

4★

Sep 2, 2025

Why does it take so long to release the offer letter after the interview is done?

Senior Data Engineer

Interviewed at Capgemini

4.2★

Sep 8, 2025

Linked list reversal with pointers.

Senior Data Engineer

Interviewed at EPAM Systems

4★

Sep 23, 2025
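
The linked list question above is typically answered with the iterative three-pointer technique. A minimal sketch, assuming a singly linked list; the `Node` class and the helper names `from_list` and `to_list` are hypothetical and exist only to make the example runnable:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None


def reverse(head: Optional[Node]) -> Optional[Node]:
    """Reverse a singly linked list in place and return the new head."""
    prev = None
    curr = head
    while curr is not None:
        nxt = curr.next   # remember the rest of the list
        curr.next = prev  # point the current node backwards
        prev = curr       # advance prev
        curr = nxt        # advance curr
    return prev


def from_list(values: List[int]) -> Optional[Node]:
    """Build a linked list from a Python list (helper for the example)."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_list(head: Optional[Node]) -> List[int]:
    """Collect node values into a Python list (helper for the example)."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out
```

The loop visits each node once and uses constant extra space, which is usually the point the interviewer is probing.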

Experience, SQL queries based on big data.

Senior Data Engineer

Interviewed at Indeed

3.8★

Mar 8, 2025

Explain the difference between a Dataset and a DataFrame.
Spark: cluster, jobs, optimization
PySpark, Scala
Real-time project-based questions

Senior Data Engineer

Interviewed at EPAM Systems

4★

Apr 5, 2025

Questions about the project.

Senior Big Data Engineer

Interviewed at Microsoft

4★

Apr 25, 2018

2,563 sr data engineer interview questions shared by candidates
