Data Scientist Interview Questions

Likely questions and prep pointers, drawn from current hiring patterns.

About Data Scientist interviews

Data Scientist interviews are among the most structurally demanding in tech, because hiring managers are screening for three quite different things at once: applied statistics and ML fundamentals, production-aware engineering, and the business judgement to translate a fuzzy stakeholder request into a measurable problem. Expect a four-to-five stage process: a recruiter screen, a hiring manager conversation focused on past projects and business impact, a technical loop covering coding (usually Python and SQL), an ML/statistics deep-dive, and often a take-home case study or live product-sense case. Larger firms (FAANG, fintech, top consultancies) add a system design round focused on ML system architecture. Panels are typically a mix of senior data scientists, an ML engineer, and a product or business stakeholder — the latter being the round candidates most often underestimate. The most common stumbles are: collapsing into excessive technical detail when asked a business-framed question, failing to articulate why a model choice was appropriate versus simply listing what was done, weak SQL under time pressure, and being unable to defend evaluation metrics beyond accuracy or AUC. Candidates who progress tend to lead with the problem framing, quantify impact in business units (revenue, retention, cost), and are honest about model limitations and what they would do differently with more time or data.

Typical stages

  • Recruiter screen
  • Hiring manager interview
  • Technical coding round (Python + SQL)
  • ML and statistics deep-dive
  • Case study or take-home
  • System design / ML design
  • Final stakeholder or values round

Common formats

  • Behavioral STAR
  • Live coding in Python and SQL
  • Whiteboard ML system design
  • Product sense / business case study
  • Take-home modelling exercise
  • Past-project deep-dive
  • Statistics and probability problem-solving

What hiring managers screen for

  • Ability to translate ambiguous business questions into well-scoped, measurable ML or analytical problems
  • Solid grounding in statistics, experimentation, and the assumptions behind common models
  • Production awareness: knows the difference between a notebook prototype and a deployed model
  • Clear communication of trade-offs (bias/variance, precision/recall, complexity/interpretability) to non-technical stakeholders
  • Evidence of measurable business impact, not just model metrics

Red flags to avoid

  • Defaulting to deep learning or complex models when a simpler baseline would suffice
  • Inability to explain why a chosen evaluation metric matches the business cost structure
  • No awareness of data leakage, train/test contamination, or how their model behaves in production
  • Talking about projects in 'we' terms with no clear personal contribution
  • Treating A/B testing as a checkbox rather than understanding power, MDE, and novelty effects

Primary questions (15)

Behavioural

Tell me about a data science project where the business outcome ended up being very different from what was originally requested. How did you navigate that?

Why this comes up: Data Scientists are constantly scoping fuzzy requests, and hiring managers want to see judgement, not just delivery.

Prep pointers
  • Pick a project where you actively reshaped the brief — not one where you just delivered what was asked.
  • STAR Situation should establish the original (flawed or vague) ask in one or two sentences.
  • STAR Action should focus on how you diagnosed the real problem — stakeholder interviews, exploratory analysis, framing alternatives.
  • STAR Result should quantify the business outcome and contrast it with what the original ask would have produced.
  • Avoid sounding like you ignored the stakeholder — frame it as collaborative reframing.
Technical

Walk me through how you would build a model to predict customer churn for a subscription business. Be explicit about your choices.

Why this comes up: This is a canonical end-to-end question that tests framing, feature engineering, modelling, evaluation, and deployment thinking in one prompt.

Prep pointers
  • Start with problem framing: define churn precisely (voluntary vs involuntary, time window) before touching models.
  • Discuss label leakage risks explicitly — features that only exist because the customer is about to churn.
  • Justify your baseline (logistic regression or gradient boosting) before mentioning anything more complex.
  • Choose an evaluation metric that maps to the business cost — e.g. PR-AUC or expected value under intervention cost.
  • Close with how you would monitor for drift and retrain cadence.
Technical

Explain the bias-variance tradeoff and how it has actually influenced a modelling decision you made.

Why this comes up: Tests whether the candidate understands core ML theory beyond textbook definitions and can apply it.

Prep pointers
  • Don't just define the terms — interviewers expect that. Move quickly to a concrete decision.
  • Reference a specific moment: choosing tree depth, regularisation strength, or rejecting a deep model for a small dataset.
  • Tie the decision to what you observed in learning curves or cross-validation behaviour.
  • Be ready for the follow-up: 'How did you know it was bias and not variance?'
Technical

How would you design and evaluate an A/B test for a new recommendation algorithm? What are the pitfalls?

Why this comes up: Experimentation literacy is a core Data Scientist competency, especially at product-led companies.

Prep pointers
  • Cover hypothesis, primary metric, guardrail metrics, and minimum detectable effect before discussing test mechanics.
  • Discuss power analysis and sample size calculation — interviewers expect numbers, not vibes.
  • Address pitfalls specific to recommenders: network effects, novelty effects, exposure bias, interleaving alternatives.
  • Be ready to discuss what you'd do if results were flat or inconclusive.
Behavioural

Describe a time you had to explain a complex modelling decision to a non-technical stakeholder who disagreed with you.

Why this comes up: Stakeholder communication is consistently cited as the differentiator between senior and junior Data Scientists.

Prep pointers
  • Choose an example where the stakeholder had legitimate concerns — not one where you 'won'.
  • STAR Action should describe the specific analogies, visualisations, or framings you used.
  • Show you adjusted based on their input, not just that you persuaded them.
  • Avoid jargon when retelling the story — the interviewer is testing how you communicate in real time.
Situational

Your model is performing well in offline evaluation but underperforming in production. Walk me through how you'd diagnose this.

Why this comes up: Tests production awareness, a common gap for candidates who come from purely academic or notebook backgrounds.

Prep pointers
  • Structure the answer: data drift, concept drift, training/serving skew, label delay, infrastructure bugs.
  • Mention concrete diagnostics: feature distribution comparisons, PSI, monitoring dashboards.
  • Distinguish between problems you fix in code versus problems you escalate to data engineering.
  • Show you would not retrain blindly — first understand the cause.
Situational

A stakeholder asks for a model in two weeks, but your analysis suggests the data isn't reliable enough. How do you handle it?

Why this comes up: Probes judgement, communication, and willingness to push back — qualities that separate competent practitioners from yes-men.

Prep pointers
  • Show you would diagnose data issues concretely (volume, labelling quality, recency) rather than dismissing in the abstract.
  • Discuss offering alternatives: heuristic baseline, scoped pilot, or instrumentation to improve data first.
  • Frame the pushback around business risk, not technical purity.
  • Avoid being either dogmatic ('I refused') or compliant ('I just built it anyway').
Competency

Walk me through a project end-to-end where you owned it from problem definition to deployed impact. What would you do differently now?

Why this comes up: The portfolio deep-dive is almost universal — interviewers want depth and reflection, not a CV recital.

Prep pointers
  • Pick one project — depth beats breadth. Two minutes on framing, four on technical detail, two on impact and learnings.
  • Be specific about your individual contribution versus the team's.
  • Have numbers ready: dataset size, model performance, business impact in currency or percent.
  • The 'what would you do differently' answer is where seniority is judged — be concrete and unflinching.
  • Anticipate drilling: 'Why that model?', 'Why that feature?', 'How did you validate?'
Competency

How do you decide when a problem warrants a machine learning solution versus a simpler heuristic or rules-based approach?

Why this comes up: Tests judgement and maturity — senior hires don't reach for ML by default.

Prep pointers
  • Reference factors: data availability, pattern complexity, cost of error, maintainability, stakeholder trust.
  • Have a concrete example where you chose the simpler approach and it was the right call.
  • Discuss the operational cost of ML — monitoring, retraining, on-call — not just build cost.
  • Avoid sounding anti-ML; you're showing discernment, not skepticism.
Behavioural

Tell me about a time you discovered a significant error in your own analysis after sharing it. How did you handle it?

Why this comes up: Integrity and self-correction are deal-breaker traits — interviewers want to see candidates who own mistakes early.

Prep pointers
  • Pick a real, material error — interviewers can smell sanitised examples.
  • STAR Action should emphasise the speed and transparency of escalation.
  • Discuss what you changed in your process afterwards (code review, validation checks, peer review).
  • Avoid blaming data, tools, or other people.
Technical

Given a SQL table of user events, how would you calculate 7-day rolling retention? Talk me through the query logic.

Why this comes up: SQL fluency is screened in nearly every Data Scientist loop, and rolling/window calculations are a common stumbling block.

Prep pointers
  • Clarify the definition of retention before writing anything — anchor day, return window, distinct users.
  • Be comfortable with self-joins or window functions; verbalise which you're using and why.
  • Mention edge cases: timezones, deduplication, users created mid-window.
  • Practise saying your SQL out loud — interviewers judge clarity of thought as much as syntax.
Behavioural

Describe a time you had to influence the roadmap or product direction using data. What was the resistance and how did you overcome it?

Why this comes up: Senior Data Scientists are expected to drive decisions, not just answer questions — this question screens for that maturity.

Prep pointers
  • Choose an example where the data pointed away from the prevailing view.
  • STAR Action should detail how you packaged the insight — memo, dashboard, exec readout.
  • Show you anticipated objections and addressed them pre-emptively.
  • STAR Result should describe the decision that was made and the downstream outcome.
Situational

You're given a dataset with 30% missing values in a key feature. How do you decide what to do?

Why this comes up: Missing data handling reveals whether a candidate thinks mechanically or thinks about the data-generating process.

Prep pointers
  • Start by asking why the data is missing (MCAR, MAR, MNAR) — not by reaching for imputation methods.
  • Discuss the trade-offs: imputation, indicator features, dropping rows, model-based handling.
  • Tie your choice to downstream model and use case — production constraints matter.
  • Show awareness that 'missingness' is sometimes itself a signal.
Culture fit

How do you stay current with developments in data science, and how do you decide what's worth adopting versus hype?

Why this comes up: Teams want curious practitioners who don't chase every new paper — filtering is a sign of maturity.

Prep pointers
  • Name specific sources (papers, podcasts, communities) — vague answers feel rehearsed.
  • Have one example of something you adopted and one you deliberately ignored, with reasoning.
  • Tie learning to applied outcomes, not just consumption.
  • Avoid LLM hype name-dropping unless you can speak about real production use.
Culture fit

What kind of data science team and environment do you do your best work in?

Why this comes up: Mismatch on team structure (centralised vs embedded, research vs product-focused) is a common cause of early attrition.

Prep pointers
  • Be honest — saying you thrive in 'any environment' lands as evasive.
  • Reference specific preferences: pace, autonomy, collaboration with engineering, proximity to product.
  • Have done your homework on how this specific team operates and reflect it back authentically.
  • Acknowledge what you find harder, not just what you love — self-awareness reads well.

More practice questions (15)

Technical

Explain the difference between L1 and L2 regularisation and when you'd choose each.

Why this comes up: Common warm-up question to verify foundational ML knowledge.

Technical

How does a random forest differ from gradient boosting, and when would you prefer one over the other?

Why this comes up: Tree-based models are the workhorse of applied data science and interviewers expect fluency.

Technical

What is p-hacking, and how do you guard against it in your own analyses?

Why this comes up: Probes statistical integrity and experimentation discipline.

Technical

How would you detect and handle multicollinearity in a regression model?

Why this comes up: Tests practical statistics knowledge beyond surface-level model fitting.

Technical

Explain how you would handle a severely imbalanced classification problem (e.g. 1% positive class).

Why this comes up: Imbalanced classes are extremely common in fraud, churn, and conversion problems.

Technical

Walk me through how cross-validation works and when standard k-fold is the wrong choice.

Why this comes up: Time series and grouped data require non-standard validation and many candidates miss this.

Situational

Your model's accuracy drops 10 points after a feature pipeline change. What do you do first?

Why this comes up: Tests debugging instincts and discipline under pressure.

Situational

A senior leader asks for a one-number forecast for next quarter. You only have noisy data. How do you respond?

Why this comes up: Probes communication of uncertainty to executive audiences.

Behavioural

Tell me about a time you collaborated closely with an ML engineer or data engineer. What worked and what didn't?

Why this comes up: Cross-functional collaboration is a core competency for productionising models.

Behavioural

Describe a project that failed or was deprioritised. What did you learn?

Why this comes up: Self-awareness and resilience are screened more in senior loops than candidates expect.

Competency

How do you prioritise between competing requests from different stakeholders?

Why this comes up: Data Scientists are often a shared resource and prioritisation is a daily skill.

Competency

How do you document and hand off a model so that others can maintain it?

Why this comes up: Tests engineering maturity and team-orientation — frequently missing in solo practitioners.

Technical

Given two coins, one fair and one biased 70/30, how would you statistically determine which is which with the fewest flips?

Why this comes up: Probability puzzles are common in quant-leaning interview loops.

Culture fit

What does 'impact' mean to you in a Data Scientist role?

Why this comes up: Reveals whether the candidate is metric-driven, research-driven, or product-driven — important for fit.

Situational

You realise after launch that your model is producing biased predictions against a subgroup of users. What's your response?

Why this comes up: Fairness and responsible ML are increasingly screened, particularly in regulated industries.

Get a prep pack tailored to your experience

describe.me matches these questions against your real work history, flags your prep priorities, and gives you a STAR scaffold per question.

Start free →

Your prep stays yours. Opt-in by design, never shared without your say-so. Read the data promise