Design an ETL solution in Go that uploads CSV files from a directory to a Postgres database. Functionality should include:
- listing the status of CSV files (has each file already been uploaded?)
- manual upload of a provided CSV file to the database
- real-time syncing (if a file arrives in the folder and has not been uploaded yet, ingest its data into the database)
Senior Data Engineer Interview Questions
2,562 senior data engineer interview questions shared by candidates
Signed confidentiality agreement not to convey technical interview information
What was your college GPA?
SQL:
1. Create pivot tables.
2. Find the sequence gaps and aggregate the result.
Python:
1. Tricky Python questions, such as predicting the output of a function.
2. Normal, basic-level coding.
PySpark:
1. Review these PySpark topics: window functions, date_diff, case when, watermark, and aggregation.
How is Kafka different from MQ?
Round 1
Coding questions:
- Write a program to count the number of binary 1s in a number. Example: for 7, the output should be 3.
- Write a program to check whether the given input contains valid parentheses.
SQL question:
- List all employees whose salary is greater than the average salary of their respective department.
Scala theory question:
- What is implicit in Scala?
Spark questions:
- How do you copy data from HDFS to the local file system?
- What is the command for spark-submit?
- What is the difference between reduceByKey and groupByKey?
- What happens during a broadcast join in Spark, and how does it reduce shuffling?
- What happens when we use the collect() action in Spark?
- How do you define the minimum and maximum number of executors in Spark?
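The two Round 1 coding questions can be sketched in Python as follows. This is one possible answer, not the interviewer's expected solution. The function names are hypothetical. The question does not say which bracket types count as parentheses, so the sketch assumes `()`, `[]`, and `{}` and ignores every other character.

```python
def count_set_bits(n: int) -> int:
    """Count the 1 bits in the binary form of a non-negative integer."""
    if n < 0:
        raise ValueError("n must be non-negative")
    count = 0
    while n:
        n &= n - 1  # clears the lowest set bit on each pass
        count += 1
    return count


def has_valid_parentheses(text: str) -> bool:
    """Return True if every bracket in text is closed in the right order.

    Assumption: (), [] and {} are the bracket types; other characters
    are skipped.
    """
    closer_to_opener = {")": "(", "]": "[", "}": "{"}
    openers = set(closer_to_opener.values())
    stack = []
    for ch in text:
        if ch in openers:
            stack.append(ch)
        elif ch in closer_to_opener:
            # A closer must match the most recently opened bracket.
            if not stack or stack.pop() != closer_to_opener[ch]:
                return False
    return not stack  # leftover openers mean the input is unbalanced


print(count_set_bits(7))              # 7 is 0b111, so this prints 3
print(has_valid_parentheses("({})"))  # True
print(has_valid_parentheses("(]"))    # False
```

The bit count runs one loop iteration per set bit, and the parentheses check is a single pass with a stack, so both are linear in the size of their input.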
Times: X{open, close}, Y{open, close}
rec_type, status, time
x1, open, 930
x1, close, 1030
x2, open, 1035
y1, open, 1040
y2, open, 1041
x2, close, 1100
x3, open, 1110
x3, close, 1115
y1, close, 1120
y2, close, 1121

|---x1, open, 930
|---x1, close, 1030
|-----x2, open, 1035
|     y1, open, 1040----|
|     y2, open, 1041 ---+---|
|-----x2, close, 1100   |   |
|---x3, open, 1110      |   |
|---x3, close, 1115     |   |
      y1, close, 1120---|   |
      y2, close, 1121-------|

Find the pairs of x-type and y-type records that have any time overlap between them.
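One way to approach the overlap question is to collapse each record's open and close rows into a single interval, then compare every x interval with every y interval. The sketch below is a minimal Python version using the records from the question. It assumes that each record has exactly one open and one close row, that the times are same-day integers in HHMM form, and that intervals which only touch at an endpoint do not count as overlapping. The function names are hypothetical.

```python
from collections import defaultdict

# Rows from the question: (record id, status, time as an HHMM integer).
RECORDS = [
    ("x1", "open", 930), ("x1", "close", 1030),
    ("x2", "open", 1035), ("y1", "open", 1040),
    ("y2", "open", 1041), ("x2", "close", 1100),
    ("x3", "open", 1110), ("x3", "close", 1115),
    ("y1", "close", 1120), ("y2", "close", 1121),
]


def build_intervals(records):
    """Collapse open/close rows into {record id: (open_time, close_time)}."""
    times = defaultdict(dict)
    for rec_id, status, time in records:
        times[rec_id][status] = time
    return {rec_id: (t["open"], t["close"]) for rec_id, t in times.items()}


def overlapping_pairs(records):
    """Return (x, y) id pairs whose intervals share any span of time."""
    intervals = build_intervals(records)
    xs = sorted(r for r in intervals if r.startswith("x"))
    ys = sorted(r for r in intervals if r.startswith("y"))
    pairs = []
    for x in xs:
        x_open, x_close = intervals[x]
        for y in ys:
            y_open, y_close = intervals[y]
            # Two intervals overlap when each opens before the other closes.
            if x_open < y_close and y_open < x_close:
                pairs.append((x, y))
    return pairs


print(overlapping_pairs(RECORDS))
# [('x2', 'y1'), ('x2', 'y2'), ('x3', 'y1'), ('x3', 'y2')]
```

The nested loop is quadratic in the number of records. For large inputs, a sweep over the time-sorted events that tracks the currently open x and y records would avoid comparing every pair.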
- Java OOP concepts
- Difference between an interface and an abstract class
- Java memory management optimization
- Checked vs. unchecked exceptions
- Definition of microservices
- What is new in Python 3?
- Different types of Python structures
- Definition of monkey patching
Mostly definitions around software engineering practices.
Various Python, Snowflake, and dbt questions. More heavily weighted to Python, as noted above. Some ETL vs ELT and OLAP vs OLTP questions.
Design an ETL pipeline that loads data every hour from system X to system Y with consistency and reliability, and describe the edge cases.