Interview Experience for Data Scientist Role | Udaan

With one year of experience as a data scientist, I received an interview invitation from Udaan.com in Bengaluru. The HR outlined the five rounds as follows:

  1. ML Coding Round
  2. ML Past Projects Discussion
  3. Diving Deep into ML Concepts
  4. Product Understanding
  5. HR Discussion

ML Coding Round

The Task at Hand

In the ML Coding Round at Udaan, my challenge was to build a basic classifier on a creditworthiness dataset. The interviewer's emphasis was clear: this was not a quest for the perfect model but an assessment of my proficiency in Python coding. 🕵️‍♂️

Given my inclination towards deep learning, my familiarity with sklearn utilities like LabelEncoder and the preprocessing module was a bit rusty. However, armed with knowledge of logistic regression, I built the preprocessing steps from scratch: handling NaN values, imputing with the mean or mode, and encoding string values into numerical equivalents. The interviewer was a guiding presence, offering assistance whenever I hit roadblocks, and the flexibility to look up basic syntax online eased the process. 🧙‍♂️
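My from-scratch preprocessing followed roughly this shape. A minimal sketch, with a hypothetical toy frame standing in for the actual credit data (column names and values are illustrative only):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical toy stand-in for the creditworthiness dataset.
df = pd.DataFrame({
    "income": [50_000, 62_000, np.nan, 48_000, 75_000, 39_000],
    "employment": ["salaried", "self", "salaried", None, "self", "salaried"],
    "defaulted": [0, 0, 1, 1, 0, 1],
})

# Impute NaNs: mean for numeric columns, mode for categorical ones.
for col in df.columns.drop("defaulted"):
    if df[col].dtype.kind in "if":
        df[col] = df[col].fillna(df[col].mean())
    else:
        df[col] = df[col].fillna(df[col].mode()[0])

# Hand-rolled label encoding: map each string value to an integer code.
for col in df.select_dtypes(include="object").columns:
    df[col] = pd.factorize(df[col])[0]

X, y = df.drop(columns="defaulted"), df["defaulted"]
# Standardize features so logistic regression converges cleanly.
X = (X - X.mean()) / X.std()
model = LogisticRegression().fit(X, y)
print(f"train accuracy: {model.score(X, y):.2f}")
```

Nothing clever here, but under interview pressure getting imputation and encoding right without the usual sklearn transformers is exactly the kind of thing that gets tested.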

The Verdict: 60%

Despite my model achieving only 60% accuracy on the final test data, the interviewer expressed satisfaction. It was evident that the focus was on the coding process rather than achieving perfection in the model. 😌

Post-Assessment Discussion

Following the model evaluation, a brief discussion unfolded on potential improvements. I suggested leveraging advanced tree algorithms for enhanced performance. Additionally, I highlighted the importance of domain knowledge in accurately selecting and encoding features. 🌲💡

With this, the ML Coding Round concluded, marking my successful progression to the next stage of the interview process. Onwards and upwards! 🚀

ML Past Projects Discussion

Project Unveiling

The ML Past Projects Discussion at Udaan was a thorough exploration of my professional journey, spanning decision-tree-based projects through to neural networks, transformers, and LoRA. Each question tied back to my project experience, keeping the discussion grounded in context. 📊🔍

Delving into ML Proficiency

Here is a list of questions I received during the interview:

  1. How do you split a node in a decision tree?

  2. In the context of decision trees, imagine you have a very small dataset. You donโ€™t want to split it into train and validation datasets to avoid losing data for training. What would you do in this case?

  3. How is XGBoost different from other boosting techniques?

  4. What is the difference between boosting and bagging?

  5. You have a black box model that cannot be changed or understood. Given a tabular data set, and the ability to feed the data into the model multiple times, can you determine the importance of each feature?

  6. How do you tackle an imbalanced dataset?

  7. What metrics do you use to measure performance in the case of an imbalanced dataset?

  8. When dealing with a dataset with too many features, how do you select the most important ones?

  9. How does PCA work, and what is the significance of eigenvalues and eigenvectors?

  10. How do you set the threshold for your classification problem?

  11. What is batch norm, and how is it useful?

  12. How do you train a batch norm layer? Is there a difference in its application between testing and training?

  13. How do you select an optimizer, and are there differences among existing optimizers in the market?

  14. How does the Adam Optimizer work?

  15. Explain the transformer architecture, focusing on key, query, and value vectors.

  16. How does an LSTM work?

  17. Explain LoRA.
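For the black-box feature-importance question (no. 5 above), the standard answer is permutation importance: shuffle one column at a time, re-feed the data, and measure how much the model's score drops. A minimal sketch, using a random forest as a stand-in for the black box:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Informative features come first (shuffle=False), the rest are noise.
X, y = make_classification(n_samples=500, n_features=6, n_informative=2,
                           n_redundant=0, shuffle=False, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)  # stand-in black box

rng = np.random.default_rng(0)
baseline = black_box.score(X, y)
importances = []
for j in range(X.shape[1]):
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])  # break feature j's link to y
    importances.append(baseline - black_box.score(X_perm, y))  # drop = importance
```

The key point for the interviewer is that this needs nothing but repeated prediction calls, which is exactly the access the question grants.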

STAY TUNED FOR UPDATES ON THE NEXT ROUND!!