What is a distribution you may use to model data whose range of input values is [0, N]?
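A possible answer, sketched in Python: if the data are continuous on [0, N], a Beta distribution rescaled by N is one candidate; if they are integer counts from 0 to N, a binomial is another. The question does not say which case applies, so both are shown. The function names, the shape parameters (2, 5), the success probability 0.3, and N = 10 are illustrative assumptions, not part of the question.

```python
import random


def sample_scaled_beta(n_max, alpha, beta, rng):
    """Continuous value in [0, n_max]: a Beta(alpha, beta) draw stretched by n_max."""
    return n_max * rng.betavariate(alpha, beta)


def sample_binomial(n_max, p, rng):
    """Integer count in {0, ..., n_max}: successes in n_max Bernoulli(p) trials."""
    return sum(1 for _ in range(n_max) if rng.random() < p)


# Assumed values, chosen only for illustration.
rng = random.Random(0)
N = 10
continuous = [sample_scaled_beta(N, 2.0, 5.0, rng) for _ in range(1000)]
counts = [sample_binomial(N, 0.3, rng) for _ in range(1000)]
```

Both samplers stay inside [0, N] by construction, which is the property the question asks about. A discrete uniform on {0, ..., N} would also satisfy the range constraint if no value is more likely than another.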
Data Interview Questions
that they'd never seen someone do machine learning from the ground up in 6 hours. At least I know the basics.
What are the possible approaches, and why did you choose yours?
How can you add value to our IT department?
Find a path through a 2D matrix.
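One common way to approach this, sketched in Python: treat the matrix as a grid graph and run a breadth-first search. The question gives no details, so this sketch assumes that 0 marks an open cell, 1 marks a blocked cell, and moves are allowed in four directions. The function name `find_path` and the sample grid are hypothetical.

```python
from collections import deque


def find_path(grid, start, goal):
    """Return a shortest list of (row, col) cells from start to goal, or None.

    Assumptions: grid is a non-empty list of equal-length rows,
    0 = open cell, 1 = blocked cell, moves are up/down/left/right.
    """
    rows, cols = len(grid), len(grid[0])
    if grid[start[0]][start[1]] == 1 or grid[goal[0]][goal[1]] == 1:
        return None

    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in parent):
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = find_path(grid, (0, 0), (2, 0))
```

Breadth-first search returns a shortest path in number of steps. If the interviewer only wants any path, a depth-first search works too, and if cells carry weights, Dijkstra's algorithm would be the usual replacement.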
What is your greatest weakness?
What would your family and friends say is your best asset?
Do you know how to use spreadsheets, Excel, and general Microsoft Office tools?
Hi, I have seen your profile and it looks very interesting. You have been a data manager and then a SAS programmer. Tell me, why do you want to come back to data management?
Spark:
1. What is the difference between map and flatMap?
2. What is the difference between groupByKey and reduceByKey?
3. Which file format does Spark save files in? The expected answer was ".orc" files only. (It took me multiple attempts to understand the question, and ".txt", ".csv", "xml", or "any flat-file format" was not accepted as the correct answer. LOL)
4. What is the difference between coalesce and repartition?
5. Some more questions on RDDs.

Hive:
1. What is the difference between ORDER BY and SORT BY?
2. Select * from . How many mappers will be created?
3. Analytical functions: what is RANK?

HDFS:
1. What is the difference between a block and a split?

SQL:
1. What is indexing on a single column? (I was stopped in the middle of my answer.) This was followed by: is the sort order of the index key ascending or descending? (How does it matter? It is a primary key, not a composite key.)
2. Can you have two primary keys? (I do not think there would be a need for two primary keys, as one by itself does the job.) "No, tell me about the implementation: is it possible, can you define two primary keys?" (No.)

One of the panel members got irritated and fired off a few more questions after saying bye and answering my questions about the profile and position. (Literally, panel member one: "Okay, bye, thanks for your ..." Another panel member: "No, I will ask one last question. Which file format does Spark save files in?")