HR AI Skill
Interview Scorecard
Generate, distribute, collect, and aggregate interview scorecards. Use when standardizing post-interview evaluations, collecting structured interviewer feedback, scoring candidates against competency rubrics, identifying rating outliers, or triggering hire/...
Standardize post-interview evaluation and aggregate feedback into hire-ready recommendations.
Workflow
- Auto-generate a scorecard tailored to the interview type and role level.
- Distribute to each interviewer immediately after their interview session (push notification + email).
- Set deadline: scorecards must be submitted within 24 hours of interview completion.
- Send reminders at 1h, 12h, and 23h if the scorecard is still incomplete (see Scorecard Collection SLA).
- Once all scorecards are collected, aggregate the scores and identify consensus and outliers.
- Flag "Strong Hire" and "Strong No-Hire" signals for rapid decision-making.
- Generate a consolidated recommendation report for the hiring manager.
- If scores are mixed, trigger a calibration session with the interview panel.
Scorecard Structure
Every scorecard includes these sections:
Section 1: Interview Metadata
Interviewer: [Name, Title]
Candidate: [Name]
Role: [Title]
Interview Type: [Technical / Behavioral / Culture / Executive]
Date: [auto-filled]
Duration: [auto-tracked from calendar]
Section 2: Competency Ratings
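Rate each competency on a 5-point scale. No "pass/fail": use the full scale. The table anchors levels 1, 3, and 5; use 2 and 4 for performance that falls between adjacent anchors.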
| Competency | 1 (Does Not Meet) | 3 (Meets) | 5 (Exceeds) |
|------------|-------------------|-----------|-------------|
| Technical Ability | Cannot perform core tasks | Performs all core tasks independently | Solves novel problems beyond role scope |
| Communication | Unclear, disorganized | Clear, structured, adapts to audience | Persuasive, influences without authority |
| Problem Solving | Cannot break down problems | Methodical, reaches sound conclusions | Innovative approach, considers edge cases |
| Collaboration | Works in silo, dismissive | Cooperative, active listener | Elevates team, mentors others |
| Role-Specific Skill | [custom per role] | [custom per role] | [custom per role] |
Section 3: Evidence-Based Notes
Required: For each competency rated, provide one specific example from the interview.
Example (good):
"Technical Ability: 4 — Candidate designed a caching layer for our API problem.
Correctly chose Redis over Memcached for our use case and handled cache-invalidation
edge cases. Did not consider multi-region replication, which dropped the rating from 5 to 4."
Example (bad — do not accept):
"Technical Ability: 4 — Seemed knowledgeable."
Section 4: Overall Recommendation
| Recommendation | Meaning | When to Use |
|----------------|---------|-------------|
| Strong Hire | No doubts; advance immediately | All competencies ≥ 4, no red flags |
| Hire | Good fit; minor gaps acceptable | Most competencies ≥ 3, at least one 4–5 |
| Lean Hire | Tendency toward yes; needs calibration | Mixed scores, some 3s and 4s |
| Lean No-Hire | Tendency toward no; specific concerns | Some competencies at 2, concerns noted |
| No Hire | Clear mismatch | Any competency at 1, or multiple at 2 |
| Strong No-Hire | Immediate rejection | Critical failure, values misalignment, dishonesty |
Section 5: Open-Ended Fields
- Biggest strength observed: [free text]
- Biggest concern observed: [free text]
- Would you want this person on your team? Yes / No / Unsure
- Additional context: [free text]
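The five sections above could be captured in a small data model. The following is a minimal sketch, not a prescribed schema: the class and field names (`Scorecard`, `CompetencyRating`, `duration_minutes`, and so on) are assumptions made for illustration, and the validation only enforces the 1 to 5 scale and the evidence requirement from Section 3.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Recommendation(Enum):
    """Section 4: the six recommendation levels."""
    STRONG_HIRE = "Strong Hire"
    HIRE = "Hire"
    LEAN_HIRE = "Lean Hire"
    LEAN_NO_HIRE = "Lean No-Hire"
    NO_HIRE = "No Hire"
    STRONG_NO_HIRE = "Strong No-Hire"


@dataclass
class CompetencyRating:
    """Sections 2 and 3: one score plus its supporting evidence."""
    competency: str
    score: int      # 1-5 scale
    evidence: str   # specific example from the interview

    def __post_init__(self):
        if not 1 <= self.score <= 5:
            raise ValueError(f"{self.competency}: score must be 1-5")
        if not self.evidence.strip():
            raise ValueError(f"{self.competency}: evidence is required")


@dataclass
class Scorecard:
    # Section 1: interview metadata
    interviewer: str
    candidate: str
    role: str
    interview_type: str   # Technical / Behavioral / Culture / Executive
    interview_date: date
    duration_minutes: int
    # Sections 2-4
    ratings: list[CompetencyRating]
    recommendation: Recommendation
    # Section 5: open-ended fields
    biggest_strength: str = ""
    biggest_concern: str = ""
    want_on_team: str = "Unsure"   # Yes / No / Unsure
    additional_context: str = ""
```

Rejecting a rating with empty evidence at construction time is one way to enforce the "Required" rule in Section 3. Judging whether the evidence is specific enough still needs a human reviewer.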
Calibration Rules
When to Trigger Calibration
Auto-trigger a calibration session when:
- Score range across interviewers spans ≥ 2 points (e.g., one rates 2, another rates 5 on same competency)
- Any "Strong No-Hire" mixed with any "Hire" or "Strong Hire"
- More than 50% of interviewers select "Lean Hire" or "Lean No-Hire"
- Hiring manager's score differs from panel median by ≥ 2 points
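The four triggers above can be checked mechanically. This is a sketch under stated assumptions: the function name `calibration_reasons` and its input shapes are hypothetical, the hiring manager's scores are compared per competency against the median of the other panelists, and the hiring manager is assumed not to be included in `panel_scores`.

```python
from statistics import median

POSITIVE = {"Hire", "Strong Hire"}
LEAN = {"Lean Hire", "Lean No-Hire"}


def calibration_reasons(panel_scores, recommendations, hm_scores=None):
    """Return a list of reasons to trigger calibration (empty if none).

    panel_scores:    {competency: [score per interviewer]}
    recommendations: list of recommendation labels, one per interviewer
    hm_scores:       optional {competency: hiring manager's score} (assumed shape)
    """
    reasons = []

    # Trigger 1: score range of 2 or more points on any competency
    for competency, scores in panel_scores.items():
        if max(scores) - min(scores) >= 2:
            reasons.append(f"Score range >= 2 on {competency}")

    # Trigger 2: a Strong No-Hire alongside any Hire or Strong Hire
    labels = set(recommendations)
    if "Strong No-Hire" in labels and labels & POSITIVE:
        reasons.append("Strong No-Hire mixed with Hire/Strong Hire")

    # Trigger 3: more than 50% of interviewers chose a "Lean" option
    lean_count = sum(1 for r in recommendations if r in LEAN)
    if recommendations and lean_count / len(recommendations) > 0.5:
        reasons.append("More than 50% Lean Hire/Lean No-Hire")

    # Trigger 4: hiring manager differs from the panel median by 2 or more
    for competency, hm_score in (hm_scores or {}).items():
        scores = panel_scores.get(competency)
        if scores and abs(hm_score - median(scores)) >= 2:
            reasons.append(f"Hiring manager differs from panel median on {competency}")

    return reasons
```

Returning the reasons rather than a single boolean lets the calibration invite state why the session was triggered.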
Calibration Meeting Format
Duration: 30 minutes
Participants: All interviewers + Hiring Manager (facilitator)
Agenda:
1. Each interviewer shares their rating + key evidence (3 min each)
2. Discuss outliers: "Why did Alex rate technical 5 while Sam rated 2?"
3. Re-rate if convinced by evidence; otherwise, stand by original score
4. Converge on recommendation: Strong Hire / Hire / Lean Hire / Lean No-Hire / No Hire / Strong No-Hire
5. Document final decision + rationale
Aggregation Algorithm
For each competency:
→ Calculate median score (not mean — reduces outlier influence)
→ If range ≥ 2, flag for calibration
Overall recommendation:
→ Count recommendations: [Strong Hire, Hire, Lean Hire, Lean No-Hire, No Hire, Strong No-Hire]
→ If at least 60% of interviewers agree on one band → that band is the recommendation
→ If split → recommend calibration session
→ Any "Strong No-Hire" requires explicit discussion and override justification
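The aggregation steps above can be sketched in a few lines. The function names and return shapes are assumptions for illustration. The text does not define "band", so by default each of the six labels is treated as its own band; the optional `band_of` mapping (hypothetical) lets a team merge labels, for example counting Strong Hire as part of the Hire band.

```python
from collections import Counter
from statistics import median


def aggregate_competency(scores):
    """Median and range for one competency; range >= 2 flags calibration."""
    spread = max(scores) - min(scores)
    return {
        "median": median(scores),
        "range": spread,
        "needs_calibration": spread >= 2,
    }


def overall_recommendation(recommendations, band_of=None, threshold=0.6):
    """Apply the 60% agreement rule to a list of recommendation labels."""
    band_of = band_of or {}
    # Map each label to its band; unmapped labels are their own band.
    bands = [band_of.get(label, label) for label in recommendations]
    top_band, top_count = Counter(bands).most_common(1)[0]
    share = top_count / len(bands)
    agreed = share >= threshold
    return {
        "recommendation": top_band if agreed else None,
        "share": share,
        "needs_calibration": not agreed,
        # A Strong No-Hire always requires explicit discussion.
        "strong_no_hire_present": "Strong No-Hire" in recommendations,
    }
```

Note that `statistics.median` averages the two middle values for an even number of interviewers, so scores of 5, 4, 5, 4 give 4.5 rather than 4. With the four recommendations from the sample report in the Output section (Hire, Hire, Strong Hire, Lean Hire), the six-label reading gives a top share of 50% and therefore a calibration session, while `band_of={"Strong Hire": "Hire"}` gives 75% for Hire.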
Scorecard Collection SLA
| Timing | Action |
|--------|--------|
| T+0 (end of interview) | Push scorecard to interviewer's phone/email |
| T+1h | Gentle reminder if not started |
| T+12h | Firm reminder: "Scorecard due in 12 hours" |
| T+23h | Final reminder: "1 hour remaining" |
| T+24h | Auto-escalate to interviewer's manager if still incomplete |
| T+48h | Hiring manager proceeds with available scorecards; note missing feedback |
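The SLA table translates directly into a schedule computed from the interview end time. A minimal sketch, assuming hypothetical names (`SLA_STEPS`, `sla_schedule`, `pending_steps`) and leaving the actual delivery channel (push, email, Slack) out of scope:

```python
from datetime import datetime, timedelta

# (hours after interview end, action) taken from the SLA table
SLA_STEPS = [
    (0, "Push scorecard to interviewer"),
    (1, "Gentle reminder if not started"),
    (12, "Firm reminder: scorecard due in 12 hours"),
    (23, "Final reminder: 1 hour remaining"),
    (24, "Escalate to interviewer's manager"),
    (48, "Proceed with available scorecards; note missing feedback"),
]


def sla_schedule(interview_end):
    """Return (due time, action) pairs for one interviewer's scorecard."""
    return [(interview_end + timedelta(hours=h), action) for h, action in SLA_STEPS]


def pending_steps(interview_end, now, submitted=False):
    """Steps still ahead at `now`; nothing is pending once submitted."""
    if submitted:
        return []
    return [(due, action) for due, action in sla_schedule(interview_end) if due > now]


end = datetime(2025, 1, 23, 15, 0)
for due, action in sla_schedule(end):
    print(due.isoformat(timespec="minutes"), action)
```

A scheduler would cancel the remaining steps once the scorecard is submitted, which is what the `submitted` flag models here.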
Output
Consolidated Scorecard Report
CANDIDATE EVALUATION SUMMARY
============================
Name: Jane Doe | Role: Senior Backend Engineer
Interview Date: Jan 23, 2025 | Scorecards: 4/4 complete
COMPETENCY SUMMARY:
┌─────────────────────┬──────┬──────┬──────┬──────┬──────┐
│ Competency          │ Alex │ Sam  │ Lee  │ Zara │ Med  │
├─────────────────────┼──────┼──────┼──────┼──────┼──────┤
│ Technical Ability   │  5   │  4   │  5   │  4   │ 4.5  │
│ Communication       │  4   │  4   │  3   │  5   │  4   │
│ Problem Solving     │  4   │  5   │  4   │  4   │  4   │
│ Collaboration       │  4   │  3   │  4   │  4   │  4   │
│ System Design       │  5   │  4   │  5   │  3*  │ 4.5  │
└─────────────────────┴──────┴──────┴──────┴──────┴──────┘
* Zara noted concern about multi-region design — flag for HM discussion
RECOMMENDATIONS:
Alex: Hire | Sam: Hire | Lee: Strong Hire | Zara: Lean Hire
CONSENSUS: HIRE (3 of 4 at Hire or above, 75%)
CONCERN: Multi-region system design depth — recommend exploring in HM round
NEXT STEP: Schedule Hiring Manager round or proceed to offer discussion
Bias Guardrails
- No anchoring: Interviewers submit scorecards independently before seeing others' ratings
- No halo/horn effect: Each competency rated separately with evidence requirement
- Calibration training: All interviewers complete scorecard calibration training before first use
- Audit: HR reviews scorecard distributions monthly for rater leniency/severity patterns
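For the monthly audit, one simple starting point is to compare each rater's average score with the average across all raters. This is only a rough sketch: the function name `rater_bias` and the 0.75-point threshold are assumptions rather than part of the process above, and because raters interview different candidates, a flag here is a prompt for review and not proof of leniency or severity.

```python
from statistics import mean


def rater_bias(scores_by_rater, threshold=0.75):
    """Label each rater by how far their mean score sits from the overall mean.

    scores_by_rater: {rater: [all competency scores given in the period]}
    threshold:       assumed cutoff in scale points (hypothetical default)
    """
    all_scores = [s for scores in scores_by_rater.values() for s in scores]
    overall = mean(all_scores)
    report = {}
    for rater, scores in scores_by_rater.items():
        delta = mean(scores) - overall
        if delta >= threshold:
            label = "lenient"
        elif delta <= -threshold:
            label = "severe"
        else:
            label = "ok"
        report[rater] = (round(delta, 2), label)
    return report
```

For example, a rater who gives only 5s while the overall mean is 4 would be labeled "lenient" with a delta of 1.0.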
Integration Points
- ATS (Greenhouse, Lever): Auto-create scorecard records, push results back
- Email/Slack: Distribution and reminders
- Calendar: Trigger scorecard delivery post-interview
- HRIS: Store completed scorecards in candidate file
- Analytics: Track rater consistency, time-to-complete, recommendation accuracy