Practical Assessments
How practical assessments work — disciplines, problems, scoring, and integrity
Practical assessments evaluate candidates through hands-on work — writing code, fixing bugs, reviewing code, drafting documents, or analyzing data. This is a separate evaluation phase from the conversational interview.
Three Assessment Modes
Each posting can be configured as one of:
- Conversational — voice/text interview only (existing behavior)
- Practical — hands-on assessment only, no Retell, zero voice API cost
- Both — conversational interview first, then practical assessment (sequential)
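The three modes above can be sketched as a small union type with a helper that decides which phases a candidate goes through. This is an illustrative sketch only; the type and function names are assumptions, not the actual codebase API.

```typescript
// Hypothetical model of a posting's assessment mode and the resulting phases.
type AssessmentMode = "conversational" | "practical" | "both";

interface Phase {
  kind: "interview" | "practical";
}

function phasesFor(mode: AssessmentMode): Phase[] {
  switch (mode) {
    case "conversational":
      return [{ kind: "interview" }]; // voice/text interview only
    case "practical":
      return [{ kind: "practical" }]; // no Retell, zero voice API cost
    case "both":
      // Conversational interview first, then practical assessment (sequential).
      return [{ kind: "interview" }, { kind: "practical" }];
  }
}
```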
Disciplines
Disciplines organize problems by domain. Each defines a workspace type:
| Discipline | Workspace | Example Problems |
|---|---|---|
| Software Engineering | Code Editor (Monaco) | FizzBuzz, Bug fixes, Code review |
| Technical Writing | Rich Text Editor | README writing, API docs, Editing |
| Data Science | Code Editor (Python/SQL/R) | SQL queries, Statistical analysis |
Disciplines are admin-managed at /admin/practical-assessments.
Problems
Each problem belongs to a discipline and has:
- Title and description — what the candidate sees
- Difficulty tier — Foundational, Intermediate, or Advanced
- Available languages — which programming languages the candidate can use (from the programming_languages table; all Monaco-supported languages)
- Starter content (optional) — pre-loaded code or text for bug-fix / review problems
- Starter language (optional) — locks the editor to this language when starter content exists
- Scorer hints (optional) — freeform text giving the AI specific things to look for beyond what the competency anchors already cover
- Time limit (optional) — per-problem time limit in minutes
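The fields above suggest a record shape along these lines. This is a hypothetical sketch; the field names are assumptions and may not match the actual JSON schema.

```typescript
// Hypothetical shape of a practical problem record, mirroring the fields above.
interface PracticalProblem {
  title: string;
  description: string;                                        // what the candidate sees
  difficulty: "foundational" | "intermediate" | "advanced";
  availableLanguages: string[];                               // from the programming_languages table
  starterContent?: string;       // pre-loaded code/text for bug-fix or review problems
  starterLanguage?: string;      // locks the editor when starterContent exists
  scorerHints?: string;          // freeform guidance for the AI scorer
  timeLimitMinutes?: number;     // per-problem time limit
}

// Example instance (illustrative data only).
const fizzbuzz: PracticalProblem = {
  title: "FizzBuzz",
  description: "Print 1..100, replacing multiples of 3/5 as usual.",
  difficulty: "foundational",
  availableLanguages: ["typescript", "python"],
  timeLimitMinutes: 15,
};
```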
Problems are stored as editable JSON files in data/practical-problems/*.json and seeded via scripts/seed-practical-problems.ts, which makes them easy for SMEs to review and edit.
Practical Scorer
The practical scorer (lib/practical-scorer-prompt.ts) evaluates candidate submissions against the same competency anchors used for conversational interviews. It receives:
- The candidate's submitted work (code or text)
- The problem description and optional scorer hints
- Starter content (for diff context on bug-fix problems)
- Integrity flags (paste events, tab switches)
- Time taken and whether the candidate timed out
The scorer outputs the same scorecard format as the conversational scorer, but with source: "practical" and evidence (from the work product) instead of quote (from a transcript).
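The scorer's inputs and outputs described above can be sketched as the following shapes. These are inferred from the prose, not the actual types in lib/practical-scorer-prompt.ts; every field name here is an assumption.

```typescript
// Hypothetical input shape for the practical scorer.
interface PracticalScorerInput {
  submission: string;                // candidate's submitted work (code or text)
  problemDescription: string;
  scorerHints?: string;              // optional freeform guidance
  starterContent?: string;           // diff context for bug-fix problems
  integrityFlags: { pasteEvents: number; tabSwitches: number };
  timeTakenSeconds: number;
  timedOut: boolean;
}

// Hypothetical scorecard entry: same format as the conversational scorer,
// but with source "practical" and evidence drawn from the work product.
interface ScorecardEntry {
  source: "practical";
  competency: string;
  score: number;
  evidence: string;                  // from the artifact, not a transcript quote
}

const example: ScorecardEntry = {
  source: "practical",
  competency: "Correctness",
  score: 4,
  evidence: "Handles the empty-input edge case explicitly.",
};
```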
Configuring the Practical Scorer
The practical scorer rubric is fully configurable at Admin > Practical Scorer (/admin/practical-scorer). This page follows the same versioned pattern as the conversational scorer:
- Scoring Philosophy — what a 3 vs 5 means, how to handle partial work
- Code Evaluation — how to assess correctness, quality, edge cases
- Text Evaluation — how to assess clarity, accuracy, structure
- Integrity Weighting — how paste events and tab switches affect scoring
- Scoring Rules — ordered bullet-point rules the AI follows
- Thresholds — Advance/Hold/Decline cutoffs
All changes are versioned with change type, name, and description. The output format is engineering-managed and not shown in the UI.
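A rubric version record implied by the sections above might look like this. The schema is not shown in the doc, so treat this as a hedged sketch with assumed names.

```typescript
// Hypothetical versioned rubric record for the practical scorer config.
interface RubricVersion {
  version: number;
  changeType: "create" | "update";   // recorded with each change
  name: string;
  description: string;
  scoringPhilosophy: string;         // what a 3 vs 5 means, partial work
  codeEvaluation: string;            // correctness, quality, edge cases
  textEvaluation: string;            // clarity, accuracy, structure
  integrityWeighting: string;        // how paste/tab-switch events affect scores
  scoringRules: string[];            // ordered bullet-point rules the AI follows
  thresholds: { advance: number; hold: number; decline: number };
}

// Illustrative helper: a new version bumps the number and records the change.
function nextVersion(
  prev: RubricVersion,
  changes: Partial<RubricVersion> & { name: string; description: string }
): RubricVersion {
  return { ...prev, ...changes, changeType: "update", version: prev.version + 1 };
}
```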
Unified Scoring
When a posting uses "Both" mode:
- One scorecard is created per practical assessment
- One scorecard is created for the conversational interview
- A unified scorecard merges all of them
Evidence in practical scorecards comes from the code/text artifacts, not transcript quotes.
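One way the unified merge could work is averaging each competency across all scorecards. The doc does not specify the merge strategy, so the logic below is purely illustrative.

```typescript
// Assumed minimal scorecard shape for the merge sketch.
interface Scorecard {
  source: "practical" | "conversational";
  scores: Record<string, number>; // competency -> score
}

// Illustrative merge: average each competency over the scorecards that rate it.
function unify(cards: Scorecard[]): Record<string, number> {
  const sums: Record<string, { total: number; n: number }> = {};
  for (const card of cards) {
    for (const [competency, score] of Object.entries(card.scores)) {
      const s = (sums[competency] ??= { total: 0, n: 0 });
      s.total += score;
      s.n += 1;
    }
  }
  return Object.fromEntries(
    Object.entries(sums).map(([competency, s]) => [competency, s.total / s.n])
  );
}
```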
Posting Configuration
Postings no longer have a discipline_id — employers select problems directly from any discipline via the Problem Browser. Assessment config is stored in the posting_assessments table (one row per posting with the problem IDs). Each problem carries its own time limit.
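The configuration described above suggests a row shape like the following. Column names here are assumptions; only the table name posting_assessments comes from the doc.

```typescript
// Hypothetical shape of a posting_assessments row: one row per posting,
// holding the problem IDs the employer selected from the Problem Browser.
interface PostingAssessmentRow {
  postingId: string;
  assessmentType: "conversational" | "practical" | "both";
  problemIds: string[]; // problems may come from any discipline
}

const row: PostingAssessmentRow = {
  postingId: "posting-123",
  assessmentType: "both",
  problemIds: ["prob-fizzbuzz", "prob-readme"],
};
```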
Status
Sprint 2 complete — full candidate-facing UI (Monaco workspace, timer, auto-save, problem sidebar), PostingForm with assessment type selector and expandable problem cards, admin practical scorer page with versioning. 895 unit tests.