The Spectrum Problem in Data Science Hiring
Data science spans a wide spectrum: exploratory analysis, statistical modeling, machine learning engineering, and decision support. These areas demand different skills and should be evaluated differently. Hire a model-builder with an analysis-focused rubric and you'll get a poor-fit hire who looks good on paper.
Define the job shape first. Is this role primarily about producing insights that influence decisions, or about building and deploying production models? The evaluation process should map to that answer, not to a generic 'data scientist' template.
Core Evaluation Dimensions
Statistical Rigor: Can they identify when a result is not actually significant? Ask them to critique an analysis: poor candidates describe what the data shows; strong candidates immediately ask about confounders, sample size, and measurement validity.
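A cheap way to probe this dimension is a small multiple-comparisons trap. The sketch below (hypothetical data; numpy and scipy assumed available) generates twenty metrics with no real effect; a strong candidate asked to interpret the "winning" metric should flag the multiple-testing problem rather than quote the p-value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Twenty metrics measured on the same experiment; no real effect in any of them.
control = rng.normal(0.0, 1.0, size=(20, 200))
treated = rng.normal(0.0, 1.0, size=(20, 200))

pvals = [stats.ttest_ind(c, t).pvalue for c, t in zip(control, treated)]
print(f"smallest p-value across 20 metrics: {min(pvals):.3f}")
# With 20 independent tests at alpha = 0.05, roughly 64% of runs show at least
# one "significant" metric purely by chance (1 - 0.95**20 ~= 0.64).
```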
Business Translation: Can they explain the implications of a model's output to a non-technical stakeholder, without losing what actually matters? This is rarer than it should be.
Uncertainty Communication: How do they handle not knowing? Strong data scientists are explicit about confidence levels and error ranges. Candidates who present every result as clean and conclusive are a risk.
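What "explicit about confidence levels" looks like in practice is easy to demonstrate. A minimal sketch, assuming a hypothetical conversion sample: the habit to look for is reporting an interval alongside the point estimate rather than the bare number.

```python
import numpy as np

rng = np.random.default_rng(0)
conversions = rng.binomial(1, 0.07, size=1_500)  # hypothetical A/B test arm

# Bootstrap the rate so the stakeholder sees a range, not just a point estimate.
boot_means = [rng.choice(conversions, size=conversions.size, replace=True).mean()
              for _ in range(2_000)]
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"conversion rate: {conversions.mean():.1%} (95% CI {lo:.1%} to {hi:.1%})")
```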
Tooling Judgment: Do they know when a complex model is not needed? Over-engineering is as common a failure mode as under-engineering in data science work.
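One way to surface this judgment in a review conversation, sketched below on a stock sklearn dataset (an illustration, not a prescribed exercise): check whether the candidate established a simple baseline before reaching for the complex model, and whether they can say what the extra complexity bought.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# The habit to look for: fit the simple baseline first, then ask whether the
# complex model buys enough lift to justify its operational cost.
for name, model in [("logistic baseline", LogisticRegression(max_iter=5000)),
                    ("gradient boosting", GradientBoostingClassifier())]:
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean AUC {auc:.3f}")
```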
Assessment Design
A structured take-home using a real (or realistic) dataset outperforms whiteboard statistics for most data science roles. Reviewing the approach, not just the output, tells you how they think. Ask candidates to walk through their choices: why this model, why these features, what would change with more data?
Avoid 'trick question' statistical problems that test knowledge of obscure concepts rather than practical judgment. The goal is to surface how they think through real analytical problems, not whether they've memorized edge cases.
What References Should Confirm
Ask references whether the candidate's work actually changed decisions, not just whether their analyses were correct. Data science that isn't used is indistinguishable from data science that doesn't exist. Impact evidence is the differentiating signal at the senior level.
Also ask about communication under uncertainty: 'Were they comfortable telling stakeholders that the data didn't support a clear answer?' As with the uncertainty dimension above, a track record of nothing but confident conclusions is a warning sign in any analytical role.