The interview mainly focused on:
Python fundamentals
Object-Oriented Programming (OOP) concepts
Data structures and algorithms
PySpark internals (how it works behind the scenes)
Scenario-based data pipeline design for batch and streaming processing
Writing code in SQL, PySpark, and Python
One of the coding questions was about grouping anagrams in Python.
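The exact wording of the anagram question isn't recorded here, so the following is only a minimal sketch. It assumes the common form of the problem: given a list of strings, return the words grouped so that each group contains anagrams of one another. The function name `group_anagrams` and the sample input are illustrative, not taken from the interview.

```python
from collections import defaultdict


def group_anagrams(words):
    """Group words that are anagrams of each other.

    Assumed problem statement: input is a list of strings, output is a
    list of groups (lists) of mutually anagrammatic words.
    """
    groups = defaultdict(list)
    for word in words:
        # Anagrams contain the same letters, so sorting the letters
        # gives every word in a group the same key.
        key = "".join(sorted(word))
        groups[key].append(word)
    # Dicts preserve insertion order, so groups appear in the order
    # their first member was seen.
    return list(groups.values())


print(group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
```

In an interview, this is a natural place to talk through the trade-offs: sorting each word costs O(k log k) for a word of length k, so the whole pass is O(n · k log k) for n words. If the input is known to be lowercase ASCII, a tuple of 26 letter counts could serve as the key instead and bring each word down to O(k). Edge cases worth mentioning include an empty input list, empty strings, and whether case or whitespace should be normalized first.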
During the interview, it was emphasized that when solving problems, you shouldn’t jump straight into writing code. Instead, you should explain your thought process:
Why you are choosing a particular approach
What the next step is and why it makes sense
How your solution handles edge cases or scales
So the expectation was not only to provide the solution but also to walk through the reasoning step-by-step.