LLM & GenAI Engineer Interview Questions
The definitive guide to LLM architecture, Prompt Engineering, RAG, Fine-tuning, Agents, and Production MLOps. Essential for acing GenAI roles in 2026.
Total questions: 260
Difficulty levels: Easy, Medium, Hard
2. What's the difference between GPT-3.5, GPT-4, and GPT-4 Turbo? (Medium)
3. How does the Transformer architecture work? Explain self-attention. (Hard)
4. What's the difference between encoder-only (BERT), decoder-only (GPT), and encoder-decoder (T5) models? (Medium)
5. What's tokenization and why does it matter in LLMs? (Easy)
6. Explain byte-pair encoding (BPE) vs WordPiece vs SentencePiece. (Hard)
7. What's the context window and why is it important? (Easy)
8. How do you handle inputs longer than the context window? (Medium)
9. What's positional encoding in Transformers? (Hard)
10. Explain the difference between absolute and relative positional encoding. (Hard)
11. What's the role of the feedforward layer in Transformers? (Medium)
12. How does multi-head attention work? (Medium)
13. What's the difference between masked and causal attention? (Hard)
14. Why do we use layer normalization in Transformers? (Medium)
15. What's the purpose of the residual connections in Transformers? (Hard)
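To prepare for question 3, it helps to have written attention by hand at least once. Below is a minimal NumPy sketch of single-head scaled dot-product self-attention; the matrix sizes and random weights are purely illustrative, not taken from any real model:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Project input tokens into queries, keys, and values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # scaled dot-product scores
    # Row-wise softmax: how strongly each token attends to every other token.
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V                               # attention-weighted values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                    # 4 tokens, model dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)            # shape (4, 8)
```

Multi-head attention (question 12) repeats this with several smaller projections in parallel and concatenates the results; causal attention (question 13) adds a mask that zeroes out attention to future positions before the softmax.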
16. What's prompt engineering and why is it important? (Easy)
17. Explain zero-shot, one-shot, and few-shot prompting with examples. (Easy)
18. What's chain-of-thought (CoT) prompting? (Medium)
19. What's the difference between CoT and zero-shot CoT? (Medium)
20. What's tree-of-thought prompting? (Hard)
21. Explain ReAct (Reasoning and Acting) prompting. (Hard)
22. What's the role of system prompts vs user prompts? (Easy)
23. How do you prevent prompt injection attacks? (Medium)
24. What's prompt leaking and how do you prevent it? (Medium)
25. What are delimiters and why use them in prompts? (Easy)
26. How do you structure prompts for better results? (Medium)
27. What's role prompting (e.g., 'You are an expert...')? (Easy)
28. How do you handle ambiguous user queries? (Medium)
29. What's negative prompting? (Easy)
30. How do you use examples effectively in prompts? (Medium)
31. What's the difference between instructional and conversational prompts? (Easy)
32. How do you optimize prompts for cost and quality? (Medium)
33. What's prompt chaining? (Medium)
34. Explain self-consistency in prompting. (Hard)
35. What's constitutional AI prompting? (Hard)
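Several of the questions above (17, 25, 27, 30) come together in a single artifact: a few-shot prompt template. A plausible sketch, with a hypothetical sentiment-classification task, combining a role prompt, worked examples, and delimiters around the untrusted input:

```python
FEW_SHOT_TEMPLATE = """You are a strict sentiment classifier.

Classify the review inside the ### delimiters as positive or negative.

Review: "Loved it, would buy again."
Sentiment: positive

Review: "Broke after two days."
Sentiment: negative

###
Review: "{review}"
###
Sentiment:"""

def build_prompt(review: str) -> str:
    # The delimiters mark user input as data, not instructions, which is
    # one (partial) mitigation for prompt injection (question 23).
    return FEW_SHOT_TEMPLATE.format(review=review)
```

Zero-shot prompting would drop the two examples; one-shot would keep exactly one.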
36. What's the difference between pre-training and fine-tuning? (Easy)
37. When should you fine-tune vs use prompt engineering? (Medium)
38. What's instruction tuning? (Medium)
39. Explain RLHF (Reinforcement Learning from Human Feedback). (Hard)
40. What's the difference between RLHF and supervised fine-tuning? (Medium)
41. What's LoRA (Low-Rank Adaptation) and why use it? (Hard)
42. How does QLoRA differ from LoRA? (Hard)
43. What's PEFT (Parameter-Efficient Fine-Tuning)? (Medium)
44. Explain adapter-based fine-tuning. (Hard)
45. What's catastrophic forgetting in fine-tuning? (Medium)
46. How do you prevent catastrophic forgetting? (Hard)
47. What's the difference between full fine-tuning and PEFT? (Medium)
48. What dataset size do you need for fine-tuning? (Easy)
49. How do you prepare training data for fine-tuning? (Easy)
50. What's data contamination and how do you avoid it? (Medium)
51. What's the optimal learning rate for fine-tuning LLMs? (Medium)
52. How many epochs should you fine-tune for? (Easy)
53. What's warmup in training and why use it? (Medium)
54. How do you evaluate a fine-tuned model? (Medium)
55. What's alignment in LLMs? (Easy)
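For question 41, the core of LoRA fits in a few lines: freeze the pretrained weight W and train only a low-rank update B @ A, so the effective weight is W + (alpha / r) * B @ A. A NumPy sketch with illustrative sizes (real implementations live in libraries like Hugging Face PEFT):

```python
import numpy as np

d, r, alpha = 1024, 8, 16                 # hidden size, LoRA rank (r << d), scaling
rng = np.random.default_rng(0)
W = rng.normal(size=(d, d))               # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01        # trainable down-projection
B = np.zeros((d, r))                      # trainable up-projection (zero init)

def lora_forward(x):
    # Base path uses frozen W; only the low-rank correction is trained.
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.normal(size=(2, d))
full, lora = W.size, A.size + B.size      # ~1M frozen vs ~16k trainable params
```

Because B is zero-initialized, the model's behavior is unchanged at the start of fine-tuning, which is part of why LoRA sidesteps much of the catastrophic forgetting risk of full fine-tuning (questions 45-47). QLoRA (question 42) adds 4-bit quantization of the frozen W.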
56. What's RAG and why is it important? (Easy)
57. When should you use RAG vs fine-tuning? (Medium)
58. Explain the RAG pipeline step-by-step. (Medium)
59. What's chunking and why does chunk size matter? (Easy)
60. What's the optimal chunk size for RAG? (Medium)
61. What's chunk overlap and why use it? (Medium)
62. How do you handle structured vs unstructured data in RAG? (Hard)
63. What's an embedding and how is it used in RAG? (Easy)
64. What embedding models do you know? (Easy)
65. What's the difference between dense and sparse embeddings? (Hard)
66. What's cosine similarity and why use it for retrieval? (Medium)
67. What's semantic search? (Easy)
68. What's the difference between keyword search and semantic search? (Easy)
69. What's a vector database? (Easy)
70. Compare vector databases: Pinecone vs Weaviate vs Chroma vs Qdrant. (Medium)
71. What's FAISS and when would you use it? (Medium)
72. How do you choose k in top-k retrieval? (Medium)
73. What's re-ranking and why is it important? (Hard)
74. What's hybrid search (keyword + semantic)? (Hard)
75. How do you handle multi-modal RAG (text + images)? (Hard)
76. What's metadata filtering in vector search? (Medium)
77. How do you update documents in a RAG system? (Medium)
78. What's the cold start problem in RAG? (Easy)
79. How do you evaluate RAG system performance? (Hard)
80. What's retrieval precision vs recall in RAG? (Medium)
81. How do you reduce hallucinations in RAG? (Medium)
82. What's context stuffing in RAG? (Medium)
83. How do you handle multi-document retrieval? (Hard)
84. What's query transformation/rewriting? (Hard)
85. What's HyDE (Hypothetical Document Embeddings)? (Hard)
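Questions 58-61 and 66 can be exercised end-to-end in a toy pipeline: chunk, embed, score by cosine similarity, retrieve top-k. This sketch substitutes a bag-of-words counter for a real embedding model, purely so it runs standalone; a production system would call an actual embedding model and a vector database:

```python
import math
from collections import Counter

def chunk_text(text, size=80, overlap=20):
    # Fixed-size character chunks with overlap, so content split across a
    # boundary still appears whole in at least one chunk (question 61).
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def embed(text):
    # Toy stand-in "embedding": bag-of-words counts, not a learned vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    # Rank all chunks by similarity to the query and keep the top-k.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

docs = ("Paris is the capital of France. "
        "The Eiffel Tower is in Paris. "
        "Mount Fuji is the highest peak in Japan.")
top = retrieve("capital of France", chunk_text(docs), k=1)
```

The retrieved chunks would then be stuffed into the prompt as context for the generator; re-ranking (question 73) would insert an extra scoring pass between `retrieve` and generation.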
86. What's an LLM agent? (Easy)
87. Explain the ReAct framework for agents. (Medium)
88. What's function calling in LLMs? (Medium)
89. How do you implement tool use with LLMs? (Medium)
90. What's the difference between function calling and tool use? (Medium)
91. How do you design tools for LLM agents? (Hard)
92. What's LangChain and what problems does it solve? (Easy)
93. What's LlamaIndex (formerly GPT Index)? (Easy)
94. Compare LangChain vs LlamaIndex. (Medium)
95. What's AutoGPT and how does it work? (Medium)
96. What's the planning-execution loop in agents? (Hard)
97. How do you handle agent errors and retries? (Medium)
98. What are multi-agent systems? (Hard)
99. How do you implement memory in LLM agents? (Medium)
100. What's the difference between short-term and long-term memory? (Easy)
101. What's conversation history management? (Medium)
102. How do you implement agent observability? (Hard)
103. What are agent evaluation metrics? (Hard)
104. What's the difference between autonomous and semi-autonomous agents? (Easy)
105. How do you prevent infinite loops in agents? (Medium)
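The planning-execution loop (question 96) and the infinite-loop guard (question 105) can be shown in one skeleton. Here `plan` is a hypothetical stand-in for the LLM's decision step, and `tools` is a plain dict of callables; real agents would parse the model's function-calling output instead:

```python
def run_agent(task, tools, plan, max_steps=5):
    # Minimal plan-act loop. `plan` maps the history so far to either
    # ("call", tool_name, argument) or ("finish", answer).
    history = [("task", task)]
    for _ in range(max_steps):
        action = plan(history)
        if action[0] == "finish":
            return action[1]
        _, name, arg = action
        observation = tools[name](arg)        # tool use / function calling
        history.append((name, observation))
    return "stopped: step budget exhausted"   # hard cap prevents infinite loops

tools = {"add_one": lambda x: x + 1}

def plan(history):
    # Toy policy: keep incrementing the counter until it reaches 3.
    last = history[-1][1]
    return ("finish", last) if last == 3 else ("call", "add_one", last)

result = run_agent(0, tools, plan)            # -> 3
```

The appended `history` doubles as short-term memory (questions 99-101); observability (question 102) is usually a matter of logging each (action, observation) pair.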
106. What's temperature in text generation? (Easy)
107. How does temperature affect output randomness? (Medium)
108. What's top-k sampling? (Easy)
109. What's top-p (nucleus) sampling? (Medium)
110. When would you use top-k vs top-p? (Medium)
111. What's beam search? (Hard)
112. What's the difference between greedy decoding and sampling? (Easy)
113. What's repetition penalty? (Medium)
114. What's frequency penalty vs presence penalty? (Hard)
115. How do you control output length? (Easy)
116. What's the role of the max_tokens parameter? (Easy)
117. What's early stopping in generation? (Medium)
118. How do you ensure deterministic outputs? (Medium)
119. What's logit bias and when would you use it? (Hard)
120. What's constrained generation? (Hard)
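Temperature and nucleus sampling (questions 106-110) are easiest to explain with a concrete sampler. A sketch over a toy logit vector, assuming the standard definitions (temperature rescales logits; top-p keeps the smallest set of tokens whose cumulative probability reaches p):

```python
import numpy as np

def sample(logits, temperature=1.0, top_p=1.0, seed=None):
    # Lower temperature sharpens the distribution (more deterministic);
    # higher temperature flattens it (more random).
    rng = np.random.default_rng(seed)
    z = np.asarray(logits, dtype=float) / max(temperature, 1e-6)
    probs = np.exp(z - z.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]           # most likely token first
    # Nucleus: smallest prefix of tokens whose cumulative mass reaches top_p.
    nucleus = order[: np.searchsorted(np.cumsum(probs[order]), top_p) + 1]
    p = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=p))

token = sample([0.0, 5.0, 1.0], temperature=0.1, top_p=0.5)   # -> 1
```

Top-k sampling would truncate `order[:k]` instead of using the cumulative cutoff; greedy decoding is the limit temperature -> 0 with top_p covering only the argmax.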
121. How do you deploy an LLM to production? (Easy)
122. What's model serving for LLMs? (Medium)
123. What's the difference between hosted APIs (OpenAI) vs. self-hosted models? (Medium)
124. When should you use the OpenAI API vs. open-source models? (Easy)
125. How do you optimize LLM inference latency? (Hard)
126. What's batching in LLM inference? (Hard)
127. What's the KV cache and how does it speed up inference? (Hard)
128. What's quantization and how does it help? (Medium)
129. Explain 8-bit vs. 4-bit quantization. (Medium)
130. What's the trade-off between quantization and quality? (Medium)
131. What's model distillation? (Hard)
132. How do you reduce LLM API costs? (Medium)
133. What's caching in LLM applications? (Easy)
134. How do you implement semantic caching? (Hard)
135. What are streaming responses and when should you use them? (Easy)
136. How do you handle rate limits with LLM APIs? (Medium)
137. What's exponential backoff for API retries? (Easy)
138. How do you implement fallback strategies for LLMs? (Medium)
139. What's load balancing for LLM services? (Medium)
140. How do you monitor LLM applications in production? (Hard)
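Questions 136-137 expect you to write exponential backoff from memory. A minimal, library-agnostic sketch; `call` stands in for any rate-limited API request, and the delays are illustrative defaults:

```python
import random
import time

def with_retries(call, max_attempts=5, base_delay=0.5):
    # Retry a flaky call with exponentially growing delays plus jitter.
    # `call` is any zero-argument function that raises on transient failure.
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise                              # budget exhausted: surface the error
            delay = base_delay * 2 ** attempt      # 0.5s, 1s, 2s, 4s, ...
            time.sleep(delay + random.uniform(0, delay))  # jitter de-syncs clients
```

In practice you would catch only the provider's rate-limit and timeout exception types rather than bare `Exception`, and fall back to a secondary model (question 138) once retries are exhausted.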
141. How do you evaluate LLM outputs? (Easy)
142. What's BLEU score and is it good for LLMs? (Medium)
143. What's ROUGE score? (Medium)
144. What's perplexity? (Hard)
145. How do you measure hallucination? (Hard)
146. What's faithfulness in RAG evaluation? (Medium)
147. What's answer relevancy? (Medium)
148. What's context precision and recall in RAG? (Hard)
149. How do you implement LLM-as-judge evaluation? (Medium)
150. What's human evaluation vs automated evaluation? (Easy)
151. What's A/B testing for LLM applications? (Easy)
152. How do you track LLM performance over time? (Medium)
153. What metrics do you use for chatbots? (Easy)
154. How do you measure response quality? (Medium)
155. What's the role of user feedback? (Easy)
156. What's hallucination in LLMs and how do you reduce it? (Easy)
157. What's prompt injection? (Medium)
158. How do you prevent jailbreaking? (Hard)
159. What's data privacy in LLM applications? (Easy)
160. How do you handle PII (Personally Identifiable Information)? (Medium)
161. What's model alignment? (Hard)
162. What's red teaming for LLMs? (Medium)
163. How do you implement content filtering? (Easy)
164. What's toxicity detection? (Easy)
165. How do you handle bias in LLMs? (Medium)
166. What's constitutional AI? (Hard)
167. How do you implement guardrails in LLM apps? (Medium)
168. What's responsible AI deployment? (Easy)
169. How do you handle controversial topics? (Medium)
170. What's explainability in LLMs? (Hard)
171. Write code to call the OpenAI API. (Easy)
172. Implement a basic chatbot with conversation history. (Medium)
173. Code a RAG pipeline from scratch. (Hard)
174. Write a function to chunk documents. (Easy)
175. Implement semantic search with embeddings. (Medium)
176. Code a simple LangChain agent. (Medium)
177. Write a prompt template with variables. (Easy)
178. Implement streaming responses. (Medium)
179. Code error handling for LLM API calls. (Easy)
180. Write a retry mechanism with exponential backoff. (Medium)
181. Implement token counting. (Easy)
182. Code a prompt caching system. (Medium)
183. Write a function to truncate text to a token limit. (Medium)
184. Implement conversation summarization. (Medium)
185. Code a simple vector store. (Hard)
186. Write evaluation metrics (cosine similarity). (Easy)
187. Implement hallucination detection. (Hard)
188. Code a multi-turn conversation handler. (Easy)
189. Write unit tests for LLM functions. (Medium)
190. Implement logging for LLM calls. (Easy)
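As a sample answer shape for questions 183 and 188 combined, here is one way to keep a chat transcript inside a token budget. Word count stands in for a real tokenizer (you would use something like tiktoken for OpenAI models); the message format follows the common role/content dict convention:

```python
def trim_history(messages, max_tokens=50,
                 count=lambda m: len(m["content"].split())):
    # Keep the system message plus the most recent turns that fit the budget.
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    used = sum(count(m) for m in system)
    kept = []
    for m in reversed(rest):                  # walk from newest to oldest
        if used + count(m) > max_tokens:
            break
        kept.append(m)
        used += count(m)
    return system + kept[::-1]                # restore chronological order
```

A stronger variant (question 184) summarizes the dropped turns with the model instead of discarding them outright.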
191. Design a customer support chatbot system. (Hard)
192. Design a document Q&A system (like ChatPDF). (Medium)
193. Design a code generation assistant. (Hard)
194. Design a content moderation system using LLMs. (Medium)
195. Design a personalized email generation system. (Medium)
196. Design a multi-language translation service. (Medium)
197. Design a resume screening system. (Hard)
198. Design a meeting summarization tool. (Medium)
199. Design a SQL query generator from natural language. (Hard)
200. Design a research assistant that searches academic papers. (Hard)
201. How would you build a multi-tenant LLM application? (Hard)
202. Design a system to handle millions of LLM requests. (Hard)
203. How would you implement user-specific customization? (Medium)
204. Design monitoring for LLM quality degradation. (Medium)
205. How would you version control prompts in production? (Medium)
206. What's the difference between GPT-4 and Claude? (Easy)
207. What's Gemini (Google)? (Easy)
208. What are LLaMA and LLaMA 2? (Easy)
209. What are the Mistral AI models? (Medium)
210. What's Mixtral (Mixture of Experts)? (Hard)
211. What's the difference between Llama-2-7B, 13B, and 70B? (Easy)
212. What's Falcon LLM? (Medium)
213. What's MPT (MosaicML)? (Hard)
214. What's Vicuna? (Medium)
215. What's Alpaca? (Medium)
216. What's the difference between base models and instruct models? (Easy)
217. What's a chat-tuned vs a base model? (Easy)
218. What's Code Llama? (Medium)
219. What's StarCoder? (Medium)
220. What's Codex? (Easy)
221. What's mixture of experts (MoE)? (Hard)
222. How does sparse activation work in MoE? (Hard)
223. What's retrieval-augmented pre-training? (Hard)
224. What's continual learning for LLMs? (Hard)
225. How do you adapt LLMs to new domains? (Medium)
226. What's federated learning for LLMs? (Hard)
227. What's on-device LLM inference? (Medium)
228. What's speculative decoding? (Hard)
229. What's flash attention? (Hard)
230. What's grouped-query attention (GQA)? (Hard)
231. What's sliding window attention? (Hard)
232. What's RoPE (Rotary Position Embedding)? (Hard)
233. What's ALiBi (Attention with Linear Biases)? (Hard)
234. What's model merging? (Medium)
235. What's the GGUF format? (Easy)
236. Your LLM is hallucinating 30% of the time. What do you do? (Medium)
237. Your RAG system retrieves irrelevant documents. How do you fix it? (Hard)
238. Your API costs are too high. How do you optimize? (Medium)
239. Your users complain that responses are too slow. What's your approach? (Hard)
240. Your prompt works 80% of the time. How do you improve it? (Easy)
241. You need to process 100k documents for RAG. What's your strategy? (Medium)
242. Your model violates content policy. How do you handle it? (Medium)
243. You need to support 50 languages. What's your approach? (Hard)
244. Your embeddings don't capture domain-specific meaning. What do you do? (Hard)
245. You need to explain LLM decisions to the compliance team. How? (Medium)
246. Your context window is full but the conversation must continue. Solutions? (Easy)
247. You detect prompt injection in production. Immediate steps? (Medium)
248. Your fine-tuned model forgets general knowledge. What happened? (Hard)
249. Users report inconsistent responses. How do you debug? (Easy)
250. You need to migrate from GPT-3.5 to GPT-4. What's your migration strategy? (Medium)
251. Walk me through an LLM project you built end-to-end. (Easy)
252. What's the most challenging part of working with LLMs? (Medium)
253. How do you stay updated with the rapidly evolving LLM space? (Easy)
254. Describe a time when your LLM application failed. (Medium)
255. How do you explain LLM limitations to stakeholders? (Easy)
256. What's your process for debugging LLM issues? (Medium)
257. How do you balance innovation vs. production stability? (Medium)
258. Describe your approach to prompt engineering. (Easy)
259. How do you make build vs. buy decisions for LLM features? (Medium)
260. What excites you most about LLMs? (Easy)