Scoring Methodology Fields
The six methodology fields that control how the scorer evaluates responses.
These are the most critical science-owned fields. They directly determine scoring quality and fairness.
1. Scoring Philosophy (scoring_philosophy)
Current default: "Score conservatively. A 3 is solid performance that meets expectations. A 5 requires truly exceptional, specific evidence with measurable outcomes. When in doubt between two adjacent scores, choose the lower score."
What it controls: The overall anchoring of the scoring scale. A permissive philosophy can inflate scores; a conservative one can deflate them.
Your review question: Is "conservative" the right stance? Should borderline cases round up or down?
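The tie-break in the default philosophy reduces to a one-line rule. A minimal sketch, purely illustrative; the function name and the boolean "in doubt" input are assumptions, not part of the spec:

```python
# Hypothetical helper; the rule text specifies only the tie-break, not an API.
def resolve_adjacent(lower: int, upper: int, in_doubt: bool) -> int:
    """Conservative default: 'When in doubt between two adjacent
    scores, choose the lower score.'"""
    return lower if in_doubt else upper
```

Flipping this single branch is what "should borderline cases round up or down?" amounts to in practice.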
2. Ownership Rule (ownership_rule)
Current default: "Exclusively collective language ('we', 'our team', 'the group') without any indication of the candidate's personal contribution scores no higher than 2 on the ownership dimension. First-person language alone is not sufficient — the candidate must describe their specific decisions, actions, and impact."
What it controls: How the scorer treats "we did X" vs. "I did X." This significantly affects scores on Ownership and related competencies.
Your review question: Is a hard cap at 2 too strict? Should there be a gradient?
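The hard cap can be sketched as a post-processing step on the raw dimension score. This is an illustrative reading of the rule, assuming the scorer exposes the answer text and a separate "specific contribution described" signal; the regexes are a crude stand-in for real language detection:

```python
import re

# Crude pronoun heuristics, purely illustrative.
COLLECTIVE = re.compile(r"\b(we|our|us|the team|the group)\b", re.IGNORECASE)
FIRST_PERSON = re.compile(r"\b(I|my|me)\b")  # case-sensitive 'I' on purpose

def apply_ownership_cap(raw_score: int, answer: str,
                        has_specific_contribution: bool) -> int:
    """Cap the ownership score at 2 when the answer is exclusively
    collective. First-person language alone is not sufficient: the
    candidate must also describe specific decisions and actions."""
    exclusively_collective = bool(
        COLLECTIVE.search(answer) and not FIRST_PERSON.search(answer)
    )
    if exclusively_collective or not has_specific_contribution:
        return min(raw_score, 2)
    return raw_score
```

A gradient alternative would replace `min(raw_score, 2)` with a graded deduction, which is exactly the design question under review.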
3. Specificity Rule (specificity_rule)
Current default: "Vague answers without a specific situation, concrete actions taken, or measurable outcomes score no higher than 2. Specificity means naming the project, the timeline, the stakeholders, or the metrics — not just describing an approach in general terms."
What it controls: The penalty for abstract or theoretical answers without concrete examples.
Your review question: Should the standard be different for adaptive vs. spine interviews? Should it vary by competency?
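The specificity cap follows the same shape as the ownership cap. A sketch under one interpretation of the rule text, reading "without a specific situation, concrete actions taken, or measurable outcomes" as lacking all three markers; the boolean inputs are assumed upstream signals:

```python
def apply_specificity_cap(raw_score: int, has_situation: bool,
                          has_concrete_actions: bool,
                          has_measurable_outcome: bool) -> int:
    """Cap at 2 when the answer shows none of the three specificity
    markers named in the rule (situation, actions, outcomes)."""
    if not (has_situation or has_concrete_actions or has_measurable_outcome):
        return min(raw_score, 2)
    return raw_score
```

If the standard should vary by interview type or competency, the threshold here would become a parameter rather than a constant.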
4. Quantification Bonus (quantification_bonus)
Current default: "Quantified outcomes (percentages, dollar amounts, time saved, team size, user counts) tip borderline 3-vs-4 cases toward 4. Quantification alone does not guarantee a 4 — the underlying response must still demonstrate strong competency evidence."
What it controls: Whether numbers and metrics in answers provide a scoring boost.
Your review question: Is this fair? Could it disadvantage candidates from roles where outcomes are harder to quantify?
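The bonus is narrow by design: it only resolves the 3-vs-4 borderline, and only when the underlying evidence is already strong. A sketch of that logic; the function signature and both boolean inputs are assumptions:

```python
def apply_quantification_bonus(borderline_low: int, borderline_high: int,
                               has_quantified_outcome: bool,
                               strong_competency_evidence: bool) -> int:
    """Quantified outcomes tip a borderline 3-vs-4 toward 4, but
    quantification alone does not guarantee a 4."""
    if ((borderline_low, borderline_high) == (3, 4)
            and has_quantified_outcome
            and strong_competency_evidence):
        return 4
    # Otherwise fall back to the conservative default: round down.
    return borderline_low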
5. Coached Response Flag (coached_response_flag)
Current default: "Suspiciously polished, buzzword-heavy answers that lack concrete specifics may indicate coaching. Note these in red_flags but do not automatically reduce the score — let the employer decide."
What it controls: Whether the scorer flags rehearsed-sounding answers.
Your review question: What actually constitutes "coached"? Is this operationalized well enough?
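To make the operationalization gap concrete: the simplest implementation is a keyword heuristic, which is almost certainly too crude to ship. A purely illustrative sketch, with an invented buzzword list, showing the one property the rule does pin down: the flag annotates `red_flags` and never touches the score:

```python
# Invented list for illustration only; not an operational definition of "coached".
BUZZWORDS = {"synergy", "leverage", "best-in-class", "paradigm", "alignment"}

def maybe_flag_coached(answer: str, specifics_count: int,
                       red_flags: list) -> None:
    """Append a note when an answer is buzzword-heavy yet short on
    concrete specifics. The score itself is left untouched, per the
    rule: let the employer decide."""
    words = (w.strip(".,") for w in answer.lower().split())
    buzz = sum(1 for w in words if w in BUZZWORDS)
    if buzz >= 2 and specifics_count == 0:
        red_flags.append("possible coached response: polished but non-specific")
```

Any real version would need a defensible definition of "polished" and "buzzword-heavy", which is the open review question.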
6. Adverse Impact Guidance (adverse_impact_guidance)
Current default: "If any question or probe touched a protected characteristic (race, gender, age, disability, religion, national origin), or if the candidate's response revealed protected information that could bias scoring, flag the entire interview as 'REVIEW' in the recommendation field. Do not adjust scores — flag for human review."
What it controls: The trigger for escalating to human review.
Your review question: Is "touched a protected characteristic" specific enough? What's the threshold?
The current per-candidate adverse impact check is NOT real adverse impact analysis. Real adverse impact is measured in aggregate across subgroups (the four-fifths rule: a subgroup's selection rate falling below 80% of the highest subgroup's rate). The current field only flags individual interviews for human review. See Phase 14 for the full methodology design.
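The aggregate check the note points to is a small computation. A sketch of the four-fifths rule as conventionally stated; the function name and the `(selected, total)` input shape are assumptions, and the Phase 14 design may differ:

```python
def four_fifths_check(selection_counts: dict) -> dict:
    """Aggregate adverse-impact screen: a subgroup shows potential
    adverse impact when its selection rate is below 80% of the
    highest subgroup's rate.
    `selection_counts` maps subgroup -> (selected, total)."""
    rates = {g: sel / total
             for g, (sel, total) in selection_counts.items() if total > 0}
    highest = max(rates.values())
    return {g: rate / highest < 0.8 for g, rate in rates.items()}
```

This is what distinguishes the aggregate analysis from the per-candidate flag above: it requires pooled outcomes across many interviews, not a judgment about any single transcript.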