Data Scientist Interview Questions
Comprehensive guide covering Statistics, Machine Learning, Deep Learning, NLP, Computer Vision, Model Deployment, and Business Scenarios.
Total Questions: 450
Difficulty Levels: Beginner, Intermediate, Advanced
Problem
Level
2.Explain the Central Limit Theorem and its significance.
Medium
3.What is the difference between Type I and Type II errors?
Easy
4.What is p-value and how do you interpret it?
Medium
5.Explain confidence intervals.
Easy
6.What is the difference between confidence interval and prediction interval?
Medium
7.What is statistical significance and practical significance?
Medium
8.Explain hypothesis testing process.
Easy
9.What is the difference between one-tailed and two-tailed tests?
Medium
10.What is power of a statistical test?
Hard
11.Explain Bayes' Theorem with an example.
Medium
12.What is the difference between frequentist and Bayesian statistics?
Hard
13.What is maximum likelihood estimation (MLE)?
Hard
14.What is the difference between parametric and non-parametric tests?
Medium
15.Explain the t-test, chi-square test, and ANOVA.
Medium
16.What is correlation vs causation?
Easy
17.What is the difference between Pearson and Spearman correlation?
Medium
18.What is covariance and how does it differ from correlation?
Medium
19.Explain probability distributions (normal, binomial, Poisson, exponential).
Medium
20.What is the law of large numbers?
Easy
21.What is sampling, and what are the different sampling methods?
Easy
22.What is stratified sampling vs random sampling?
Easy
23.What is the difference between mean, median, and mode?
Easy
24.Explain standard deviation and variance.
Easy
25.What is z-score and standardization?
Easy
26.What are skewness and kurtosis?
Medium
27.What is the 68-95-99.7 rule?
Easy
28.What is conditional probability?
Easy
29.What is the birthday paradox?
Medium
30.What is Monte Carlo simulation?
Medium
31.What is bootstrapping?
Medium
32.What is permutation test?
Hard
33.What is A/B testing and how do you design one?
Medium
34.How do you determine sample size for experiments?
Medium
35.What is the multiple hypothesis testing problem?
Hard
36.What is Bonferroni correction?
Hard
37.What is false discovery rate (FDR)?
Hard
38.What is survival analysis?
Hard
39.What are the basics of time series analysis?
Medium
40.What is stationary vs non-stationary time series?
Medium
41.What is the difference between supervised and unsupervised learning?
Easy
42.Explain semi-supervised and reinforcement learning.
Medium
43.What is the bias-variance tradeoff?
Medium
44.What are overfitting and underfitting?
Easy
45.How do you detect and prevent overfitting?
Medium
46.What is regularization and why is it important?
Medium
47.Explain L1 (Lasso) and L2 (Ridge) regularization.
Hard
48.What is Elastic Net?
Hard
49.What is cross-validation, and what are its different types?
Easy
50.What is k-fold cross-validation?
Medium
51.What is stratified cross-validation?
Medium
52.What is train-test-validation split?
Easy
53.What is the curse of dimensionality?
Hard
54.What is feature engineering?
Easy
55.What is feature selection vs feature extraction?
Medium
56.What is dimensionality reduction?
Medium
57.What is the difference between PCA and t-SNE?
Hard
58.What is PCA and how does it work?
Hard
59.What are eigenvectors and eigenvalues in PCA?
Hard
60.What is LDA (Linear Discriminant Analysis)?
Medium
61.What is the difference between PCA and LDA?
Hard
62.What is batch learning vs online learning?
Medium
63.What is instance-based vs model-based learning?
Medium
64.What is ensemble learning?
Easy
65.What is bagging vs boosting?
Medium
66.What is stacking in ensemble methods?
Hard
67.What is the difference between classification and regression?
Easy
68.What is multi-class vs multi-label classification?
Medium
69.What is an imbalanced dataset and how do you handle it?
Medium
70.What is SMOTE?
Hard
71.What is cost-sensitive learning?
Hard
72.What is anomaly detection?
Medium
73.What is one-class classification?
Hard
74.What is active learning?
Hard
75.What is transfer learning?
Medium
76.What is data augmentation?
Easy
77.What is the no free lunch theorem?
Hard
78.What is inductive bias?
Hard
79.What is Occam's razor in ML?
Medium
80.What is model interpretability vs explainability?
Hard
81.Explain Linear Regression in detail.
Easy
82.What are the assumptions of Linear Regression?
Medium
83.How do you interpret regression coefficients?
Easy
84.What is multicollinearity and how do you detect it?
Medium
85.What is VIF (Variance Inflation Factor)?
Hard
86.What is heteroscedasticity?
Hard
87.What are R-squared and adjusted R-squared?
Medium
88.What is the difference between R-squared and RMSE?
Medium
89.What is Logistic Regression?
Easy
90.What is the sigmoid function?
Easy
91.What is the difference between Linear and Logistic Regression?
Easy
92.How do you interpret logistic regression coefficients?
Hard
93.What is odds ratio?
Hard
94.What is softmax function for multi-class classification?
Medium
95.What is Decision Tree and how does it work?
Easy
96.What are entropy and information gain?
Medium
97.What is Gini impurity?
Medium
98.What is the difference between Gini and Entropy?
Hard
99.What is pruning in decision trees?
Medium
100.What are the advantages and disadvantages of Decision Trees?
Easy
101.What is Random Forest?
Easy
102.How does Random Forest reduce overfitting?
Medium
103.What is out-of-bag (OOB) error?
Hard
104.What is Gradient Boosting?
Medium
105.What is XGBoost and its advantages?
Medium
106.What is LightGBM?
Hard
107.What is CatBoost?
Hard
108.What is the difference between Random Forest and Gradient Boosting?
Medium
109.What is AdaBoost?
Medium
110.What is Support Vector Machine (SVM)?
Medium
111.What is the kernel trick in SVM?
Hard
112.What are different kernel functions (linear, RBF, polynomial)?
Hard
113.What is the margin in SVM?
Medium
114.What is C parameter in SVM?
Hard
115.What is K-Nearest Neighbors (KNN)?
Easy
116.How do you choose K in KNN?
Medium
117.What is distance metric in KNN (Euclidean, Manhattan, Minkowski)?
Medium
118.What are advantages and disadvantages of KNN?
Easy
119.What is Naive Bayes classifier?
Easy
120.What is the 'naive' assumption in Naive Bayes?
Medium
121.What is clustering?
Easy
122.What is K-Means clustering?
Easy
123.How do you choose the number of clusters (K)?
Medium
124.What is the elbow method?
Medium
125.What is silhouette score?
Hard
126.What is Hierarchical clustering?
Medium
127.What is dendrogram?
Medium
128.What is DBSCAN?
Hard
129.What is the difference between K-Means and DBSCAN?
Hard
130.What are Gaussian Mixture Models (GMM)?
Hard
131.What is the difference between K-Means and GMM?
Hard
132.What is association rule learning?
Medium
133.What is Apriori algorithm?
Hard
134.What are support, confidence, and lift in association rules?
Medium
135.What is collaborative filtering?
Medium
136.What is content-based filtering?
Easy
137.What is matrix factorization?
Hard
138.What are the different recommendation system approaches?
Medium
139.What is topic modeling?
Medium
140.What is Latent Dirichlet Allocation (LDA)?
Hard
141.What is a neural network?
Easy
142.What is a perceptron?
Easy
143.What is an activation function?
Medium
144.Explain different activation functions (ReLU, Sigmoid, Tanh, Leaky ReLU).
Medium
145.Why is ReLU preferred over Sigmoid?
Hard
146.What is the vanishing gradient problem in deep networks?
Hard
147.What is backpropagation?
Medium
148.What is gradient descent?
Easy
149.What is stochastic gradient descent (SGD)?
Medium
150.What is mini-batch gradient descent?
Easy
151.What is momentum in gradient descent?
Medium
152.What is Adam optimizer?
Hard
153.What is learning rate and learning rate scheduling?
Medium
154.What is the difference between epoch, batch, and iteration?
Easy
155.What is weight initialization and why is it important?
Medium
156.What is Xavier/He initialization?
Hard
157.What is a Convolutional Neural Network (CNN)?
Medium
158.What is convolution operation?
Medium
159.What is pooling (max pooling, average pooling)?
Easy
160.What are stride and padding?
Easy
161.What is a filter/kernel in CNN?
Medium
162.What are popular CNN architectures (VGG, ResNet, Inception)?
Hard
163.What is transfer learning in CNNs?
Medium
164.What is Recurrent Neural Network (RNN)?
Medium
165.What is the vanishing gradient problem in RNN?
Hard
166.What is LSTM (Long Short-Term Memory)?
Hard
167.What is GRU (Gated Recurrent Unit)?
Hard
168.What is the difference between LSTM and GRU?
Hard
169.What is bidirectional RNN?
Medium
170.What is sequence-to-sequence model?
Hard
171.What is attention mechanism?
Hard
172.What is Transformer architecture?
Hard
173.What is self-attention?
Hard
174.What is BERT?
Medium
175.What is GPT?
Medium
176.What is the difference between BERT and GPT?
Hard
177.What is fine-tuning in NLP?
Easy
178.What is word embedding (Word2Vec, GloVe, FastText)?
Medium
179.What is autoencoder?
Medium
180.What is Generative Adversarial Network (GAN)?
Hard
181.What is accuracy and when is it not a good metric?
Easy
182.What are precision and recall?
Easy
183.What is F1-score?
Medium
184.What is the difference between micro and macro averaging?
Hard
185.What is confusion matrix?
Easy
186.What is ROC curve?
Medium
187.What is AUC-ROC?
Medium
188.What is PR curve (Precision-Recall)?
Medium
189.When should you use a ROC curve vs a PR curve?
Hard
190.What is mean absolute error (MAE)?
Easy
191.What is mean squared error (MSE)?
Easy
192.What is root mean squared error (RMSE)?
Easy
193.What is mean absolute percentage error (MAPE)?
Medium
194.What is log loss?
Hard
195.What is Cohen's Kappa?
Hard
196.What is Matthews Correlation Coefficient (MCC)?
Hard
197.What are specificity and sensitivity?
Medium
198.What are the true positive rate and false positive rate?
Easy
199.What is balanced accuracy?
Medium
200.What is top-k accuracy?
Medium
201.What is perplexity in NLP?
Hard
202.What is BLEU score?
Medium
203.What is mean reciprocal rank (MRR)?
Hard
204.What is NDCG (Normalized Discounted Cumulative Gain)?
Hard
205.How do you choose the right evaluation metric?
Medium
206.What is tokenization?
Easy
207.What is stemming vs lemmatization?
Easy
208.What is bag of words (BoW)?
Easy
209.What is TF-IDF?
Medium
210.What is n-gram?
Easy
211.What is part-of-speech (POS) tagging?
Easy
212.What is named entity recognition (NER)?
Medium
213.What is sentiment analysis?
Easy
214.What is topic modeling?
Medium
215.What is word embedding?
Medium
216.What is Word2Vec (Skip-gram and CBOW)?
Hard
217.What is the difference between Word2Vec and GloVe?
Hard
218.What is contextual embedding?
Hard
219.What is BERT and how does it work?
Hard
220.What is text classification?
Easy
221.What is sequence labeling?
Medium
222.What is language modeling?
Medium
223.What is machine translation?
Medium
224.What is text summarization (extractive vs abstractive)?
Hard
225.What is question answering system?
Hard
226.What is information extraction?
Medium
227.What is coreference resolution?
Hard
228.What is dependency parsing?
Hard
229.What is attention mechanism in NLP?
Hard
230.What are challenges in NLP?
Easy
231.What is image classification?
Easy
232.What is object detection?
Medium
233.What is the difference between classification and detection?
Easy
234.What is semantic segmentation?
Hard
235.What is instance segmentation?
Hard
236.What is image augmentation?
Easy
237.What is YOLO (You Only Look Once)?
Medium
238.What are R-CNN and Fast R-CNN?
Hard
239.What is U-Net architecture?
Hard
240.What is face recognition vs face detection?
Easy
241.What is optical character recognition (OCR)?
Medium
242.What is image captioning?
Hard
243.What is style transfer?
Medium
244.What are ResNet and residual connections?
Hard
245.What are common preprocessing techniques for images?
Easy
246.What is time series data?
Easy
247.What is stationarity and why is it important?
Medium
248.How do you test for stationarity (ADF test)?
Medium
249.What is differencing in time series?
Easy
250.What is autocorrelation (ACF)?
Medium
251.What is partial autocorrelation (PACF)?
Hard
252.What is ARIMA model?
Medium
253.What are AR, MA, and ARMA?
Medium
254.What is seasonal ARIMA (SARIMA)?
Hard
255.What is exponential smoothing?
Medium
256.What is Holt-Winters method?
Hard
257.What are trend and seasonality?
Easy
258.How do you decompose time series?
Medium
259.What is Prophet by Facebook?
Medium
260.What is LSTM for time series forecasting?
Hard
261.What is walk-forward validation?
Medium
262.What is rolling window approach?
Easy
263.What are lag features in time series?
Easy
264.What is change point detection?
Hard
265.What is anomaly detection in time series?
Medium
266.What is feature engineering and why is it important?
Easy
267.What is feature scaling (normalization vs standardization)?
Easy
268.When should you use normalization vs standardization?
Medium
269.What is one-hot encoding?
Easy
270.What is label encoding?
Easy
271.What is target encoding?
Hard
272.What is binning/discretization?
Easy
273.What are polynomial features?
Medium
274.What are interaction features?
Medium
275.What is feature hashing?
Hard
276.How do you handle missing values?
Medium
277.How do you handle categorical variables?
Medium
278.How do you handle date-time features?
Easy
279.What are feature selection methods?
Medium
280.What is recursive feature elimination (RFE)?
Hard
281.What Python libraries do you use for data science?
Easy
282.What is NumPy and its advantages?
Easy
283.What is pandas DataFrame?
Easy
284.What is the difference between loc and iloc?
Easy
285.How do you handle missing data in pandas?
Easy
286.What is scikit-learn?
Easy
287.What is TensorFlow vs PyTorch?
Medium
288.What is Keras?
Easy
289.What is the difference between fit, transform, and fit_transform?
Medium
290.What is pipeline in scikit-learn?
Medium
291.How do you save and load models?
Easy
292.What is pickle in Python?
Easy
293.What is lambda function?
Easy
294.What is list comprehension?
Easy
295.What is generator in Python?
Medium
296.How do you optimize Python code?
Hard
297.What is vectorization in NumPy?
Medium
298.What is broadcasting in NumPy?
Hard
299.How do you parallelize code in Python?
Hard
300.What is virtual environment?
Easy
301.How do you use SQL in data science projects?
Easy
302.What is the difference between JOIN types?
Easy
303.What is window function in SQL?
Medium
304.How do you calculate running totals?
Medium
305.What are GROUP BY and HAVING?
Easy
306.How do you find duplicates in SQL?
Medium
307.What is subquery vs CTE?
Medium
308.How do you optimize SQL queries?
Hard
309.What is the difference between WHERE and HAVING?
Easy
310.How do you handle NULL values in SQL?
Easy
311.What is UNION vs UNION ALL?
Easy
312.How do you pivot data in SQL?
Hard
313.What is the difference between RANK and DENSE_RANK?
Medium
314.How do you calculate percentiles in SQL?
Hard
315.How do you sample data in SQL?
Medium
316.What is A/B testing?
Easy
317.How do you design an A/B test?
Medium
318.What is statistical power?
Medium
319.How do you calculate sample size for A/B test?
Hard
320.What is p-hacking?
Medium
321.What is the multiple testing problem?
Hard
322.How do you handle novelty effect?
Hard
323.What is selection bias?
Medium
324.What is Hawthorne effect?
Medium
325.What is multivariate testing?
Medium
326.What is sequential testing?
Hard
327.What is Bayesian A/B testing?
Hard
328.How long should you run an A/B test?
Easy
329.What is the difference between statistical and practical significance?
Easy
330.How do you analyze A/B test results?
Medium
331.How do you deploy a machine learning model?
Medium
332.What is model serving?
Medium
333.What is the difference between batch and real-time prediction?
Easy
334.What is REST API for ML models?
Easy
335.What is Flask/FastAPI for deployment?
Medium
336.What are Docker and containerization?
Easy
337.What is model versioning?
Medium
338.What is model monitoring in production?
Medium
339.What is data drift?
Hard
340.What is concept drift?
Hard
341.How do you detect model degradation?
Hard
342.What is A/B testing for models?
Medium
343.What is shadow deployment?
Hard
344.What is canary deployment?
Hard
345.What is model retraining strategy?
Medium
346.What is feature store?
Hard
347.What is MLOps?
Medium
348.What is CI/CD for ML?
Hard
349.How do you ensure model reproducibility?
Hard
350.What is model explainability in production?
Hard
351.What is Apache Spark for data science?
Medium
352.What is PySpark?
Easy
353.How do you train models on big data?
Medium
354.What is distributed machine learning?
Hard
355.What is Dask for parallel computing?
Medium
356.What is GPU computing for deep learning?
Easy
357.What is the difference between CPU and GPU training?
Easy
358.What is distributed training in deep learning?
Hard
359.What is data parallelism vs model parallelism?
Hard
360.How do you handle large datasets that don't fit in memory?
Medium
361.How do you translate business problems into data science problems?
Medium
362.How do you communicate technical results to non-technical stakeholders?
Medium
363.What KPIs have you worked with?
Easy
364.How do you measure ROI of a data science project?
Medium
365.How do you prioritize data science projects?
Medium
366.What is the data science project lifecycle?
Easy
367.How do you handle stakeholder expectations?
Medium
368.How do you present model results?
Easy
369.What is storytelling with data?
Easy
370.How do you justify model decisions?
Medium
371.How do you handle ethical considerations in ML?
Hard
372.What is bias in machine learning?
Medium
373.What is fairness in ML?
Hard
374.How do you ensure model fairness?
Hard
375.What are privacy concerns in data science?
Hard
376.How would you build a recommendation system for e-commerce?
Hard
377.Design a fraud detection system.
Hard
378.How would you predict customer churn?
Medium
379.Design a credit scoring model.
Medium
380.How would you build a spam detection system?
Easy
381.Design a sentiment analysis system for social media.
Medium
382.How would you forecast sales for a retail company?
Medium
383.Design an image classification system.
Easy
384.How would you build a chatbot?
Hard
385.Design a predictive maintenance system.
Hard
386.How would you detect anomalies in network traffic?
Hard
387.Design a personalized marketing campaign model.
Medium
388.How would you build a price optimization model?
Hard
389.Design a demand forecasting system.
Hard
390.How would you build a customer segmentation model?
Medium
391.Your model has 95% accuracy but performs poorly in production. Why?
Medium
392.How would you handle imbalanced data in fraud detection?
Medium
393.Your model is overfitting. What would you do?
Easy
394.How would you improve model performance?
Easy
395.How do you handle missing data in a dataset with 40% nulls?
Medium
396.How would you detect fake news?
Hard
397.Design a movie recommendation system.
Medium
398.How would you predict employee attrition?
Medium
399.Design a medical diagnosis system.
Hard
400.How would you build a speech recognition system?
Hard
401.Design a real-time bidding system.
Hard
402.How would you optimize delivery routes?
Hard
403.Design a face recognition system.
Medium
404.How would you predict stock prices?
Medium
405.Design a customer lifetime value model.
Hard
406.How would you build a question-answering system?
Hard
407.Design a document classification system.
Easy
408.How would you detect plagiarism?
Medium
409.Design an object detection system for autonomous vehicles.
Hard
410.How would you build a music recommendation system?
Medium
411.Tell me about your most challenging data science project.
Hard
412.How do you approach a new data science problem?
Easy
413.Describe a time when your model failed in production.
Medium
414.How do you stay updated with the latest ML trends?
Easy
415.Tell me about a time you had to explain a complex model to stakeholders.
Medium
416.How do you handle disagreements with team members?
Medium
417.Describe your experience with cross-functional collaboration.
Medium
418.How do you manage multiple projects?
Easy
419.Tell me about a time you had to make a trade-off between accuracy and interpretability.
Hard
420.How do you handle ambiguous requirements?
Medium
421.Describe a time you found an unexpected insight.
Medium
422.How do you approach debugging ML models?
Hard
423.Tell me about a time you had to learn a new technique quickly.
Easy
424.How do you ensure reproducibility in your work?
Medium
425.Describe your code review process.
Easy
426.How do you handle tight deadlines?
Easy
427.Tell me about a time you optimized a slow model.
Hard
428.How do you handle negative feedback?
Easy
429.Describe a time you failed and what you learned.
Medium
430.How do you mentor junior data scientists?
Medium
431.What's your approach to experimentation?
Medium
432.How do you balance exploration vs exploitation?
Hard
433.Tell me about a time you had to make a decision with incomplete data.
Hard
434.How do you prioritize feature requests?
Easy
435.Describe your experience with agile methodology.
Easy
436.How do you handle model bias issues?
Hard
437.Tell me about a time you automated a manual process.
Easy
438.How do you ensure data quality?
Medium
439.Describe your approach to documentation.
Easy
440.Why do you want to be a Data Scientist?
Easy