Phase 13: Validity Data Collection
Schema review questions: 4 items
Time estimate: ~3 hours
Impact: BLOCKS validity data collection infrastructure
Full design doc: docs/PHASE_13_VALIDITY_DATA_COLLECTION.md in the repo
Context
We're building infrastructure to collect data for future validity studies. Every interview from this point forward will contribute to the evidence base, so the data model must capture everything required for criterion-related validity, content validity documentation, and reliability analysis.
4 Questions for You
1. Outcome Categories
Are the following outcome categories correct and complete?
- Hired
- Rejected
- Withdrew
- Terminated
- Performing well
- Performing poorly
Should we add:
- Promoted?
- Still employed at X months?
- Performance rating (if employer uses a scale)?
- Manager satisfaction?
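For reference while answering, the six categories currently proposed could be encoded as an enum along these lines. This is a minimal sketch, not the implemented schema: the name OutcomeCategory, the helper parse_outcome, and the snake_case stored values are assumptions, and the candidate additions above are deliberately left out until this question is answered.

```python
from enum import Enum


class OutcomeCategory(Enum):
    """Outcome categories as currently proposed (hypothetical stored values)."""

    HIRED = "hired"
    REJECTED = "rejected"
    WITHDREW = "withdrew"
    TERMINATED = "terminated"
    PERFORMING_WELL = "performing_well"
    PERFORMING_POORLY = "performing_poorly"


def parse_outcome(label: str) -> OutcomeCategory:
    """Map a human-entered label such as 'Performing well' to an enum member.

    Raises ValueError for any label that is not one of the six categories.
    """
    normalized = label.strip().lower().replace(" ", "_")
    return OutcomeCategory(normalized)
```

Rejecting unknown labels at parse time means a new category such as "Promoted" has to be added to the enum explicitly rather than slipping into the data as free text.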
2. Follow-Up Intervals
Are these follow-up intervals right for validity data?
- 30 days — initial hire/reject outcome
- 90 days — early performance signal
- 6 months — medium-term performance
- 12 months — long-term validation
Should any be added, removed, or adjusted?
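To make the proposed intervals concrete, the sketch below computes the four follow-up due dates from a single anchor date. It rests on assumptions the list above does not settle: the anchor is taken to be whatever date the team chooses (for example the interview date), "6 months" and "12 months" are treated as calendar months clamped to the end of shorter months, and the names FOLLOW_UP_INTERVALS, add_months, and follow_up_dates are hypothetical.

```python
import calendar
from datetime import date, timedelta

# Labels are hypothetical; the intervals are the four proposed above.
FOLLOW_UP_INTERVALS = [
    ("30d", {"days": 30}),
    ("90d", {"days": 90}),
    ("6mo", {"months": 6}),
    ("12mo", {"months": 12}),
]


def add_months(start: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to the month's end."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def follow_up_dates(anchor: date) -> dict[str, date]:
    """Compute the due date for each follow-up interval from the anchor date."""
    due = {}
    for label, offset in FOLLOW_UP_INTERVALS:
        if "days" in offset:
            due[label] = anchor + timedelta(days=offset["days"])
        else:
            due[label] = add_months(anchor, offset["months"])
    return due
```

Whether the month-based intervals should instead be fixed day counts (180 and 365 days) is a design choice the answer to this question could settle.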
3. Additional Interview Metadata
What interview metadata should we capture NOW that's hard to reconstruct later?
Currently captured:
- Prompt hash (exact prompt used)
- Score vector (all competency scores)
- Scorer version and interviewer version
- Timestamp and duration
- Interview mode, style, depth
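The fields listed above could be held in a record shaped roughly like the following. This is only a sketch under stated assumptions: the class name InterviewRecord, the field names and types, and the use of SHA-256 for the prompt hash are all hypothetical, since the list does not specify how the hash is computed or how scores are keyed.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime


def hash_prompt(prompt: str) -> str:
    """Hash the exact prompt text (SHA-256 is an assumption, not confirmed)."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class InterviewRecord:
    """One interview's metadata, mirroring the 'currently captured' list."""

    prompt_hash: str             # hash of the exact prompt used
    scores: dict[str, float]     # score vector, keyed by competency name
    scorer_version: str
    interviewer_version: str
    started_at: datetime         # timestamp
    duration_seconds: int        # duration
    mode: str                    # interview mode
    style: str                   # interview style
    depth: str                   # interview depth
```

Any metadata identified in the answers below would become additional fields on a record like this.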
Is anything missing that we would need for:
- Reliability analysis?
- Content validity documentation?
- Criterion-related validity studies?
4. Data Points for Criterion-Related Validity
What data points needed for criterion-related validity are not listed above?
Consider:
- Job family classification
- Level/seniority
- Location (for group comparisons)
- Interviewer language (the AI interviewer is consistent, but the model version is still worth tracking)
- Time of day (candidate alertness patterns)
How to Submit
Provide answers directly — we'll implement the schema and seed the tables. The answers become column definitions and enum values in the database.
What's Already Decided
- In-app notification bell — employers get reminders to report outcomes
- Automated follow-up notifications at 30d/90d/6mo/12mo
- notifications table with type, message, read/unread, link, recipient
- Email reminder pipeline deferred to follow-up sprint
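As an illustration of the notifications table described above, here is a minimal sketch using Python's built-in sqlite3 module. The column names follow the list (type, message, read/unread, link, recipient); the id column, the column types, the is_read name for the read/unread flag, the helper names, and the choice of SQLite are assumptions made for the example, not the decided implementation.

```python
import sqlite3

# Hypothetical DDL; only the five listed columns come from the decision above.
SCHEMA = """
CREATE TABLE notifications (
    id        INTEGER PRIMARY KEY,
    recipient TEXT NOT NULL,
    type      TEXT NOT NULL,
    message   TEXT NOT NULL,
    link      TEXT,
    is_read   INTEGER NOT NULL DEFAULT 0 CHECK (is_read IN (0, 1))
);
"""


def create_notifications_table(conn: sqlite3.Connection) -> None:
    """Create the notifications table on the given connection."""
    conn.executescript(SCHEMA)


def unread_for(conn: sqlite3.Connection, recipient: str) -> list[tuple]:
    """Return (type, message, link) for a recipient's unread notifications."""
    rows = conn.execute(
        "SELECT type, message, link FROM notifications "
        "WHERE recipient = ? AND is_read = 0 ORDER BY id",
        (recipient,),
    )
    return rows.fetchall()
```

A query like unread_for is roughly what the in-app notification bell would need in order to show pending outcome reminders.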