---
name: interview-scorecard
description: Generate, distribute, collect, and aggregate interview scorecards. Use when standardizing post-interview evaluations, collecting structured interviewer feedback, scoring candidates against competency rubrics, identifying rating outliers, or triggering hire/no-hire recommendations. Supports behavioral, technical, cultural, and executive interview scorecards. Triggers on phrases like "interview feedback", "scorecard", "post-interview evaluation", "rate this candidate", "collect interviewer notes", "calibration session".
---

# Interview Scorecard

Standardize post-interview evaluation and aggregate feedback into hire-ready recommendations.

## Workflow

1. Auto-generate a scorecard tailored to the interview type and role level.
2. Distribute to each interviewer immediately after their interview session (push notification + email).
3. Set deadline: scorecards must be submitted within 24 hours of interview completion.
4. Send reminders at 12h and 23h if incomplete.
5. Once all scorecards are collected, aggregate scores, identify consensus and outliers.
6. Flag "Strong Yes" and "Strong No" signals for rapid decision-making.
7. Generate a consolidated recommendation report for the hiring manager.
8. If scores are mixed, trigger a calibration session with the interview panel.

## Scorecard Structure

Every scorecard includes these sections:

### Section 1: Interview Metadata

```
Interviewer: [Name, Title]
Candidate: [Name]
Role: [Title]
Interview Type: [Technical / Behavioral / Culture / Executive]
Date: [auto-filled]
Duration: [auto-tracked from calendar]
```

### Section 2: Competency Ratings

Rate each competency on a 5-point scale. **No "pass/fail" — use the full scale.**

| Competency | 1 — Does Not Meet | 3 — Meets | 5 — Exceeds |
|------------|-------------------|-----------|-------------|
| Technical Ability | Cannot perform core tasks | Performs all core tasks independently | Solves novel problems beyond role scope |
| Communication | Unclear, disorganized | Clear, structured, adapts to audience | Persuasive, influences without authority |
| Problem Solving | Cannot break down problems | Methodical, reaches sound conclusions | Innovative approach, considers edge cases |
| Collaboration | Works in silo, dismissive | Cooperative, active listener | Elevates team, mentors others |
| Role-Specific Skill | [custom per role] | [custom per role] | [custom per role] |

### Section 3: Evidence-Based Notes

**Required:** For each competency rated, provide one specific example from the interview.

```
Example (good):
"Technical Ability: 4 — Candidate designed a caching layer for our API problem.
Correctly chose Redis over Memcached for our use case and handled cache-invalidation
edge cases. Did not consider multi-region replication, which dropped from 5 to 4."

Example (bad — do not accept):
"Technical Ability: 4 — Seemed knowledgeable."
```

### Section 4: Overall Recommendation

| Recommendation | Meaning | When to Use |
|---------------|---------|-------------|
| Strong Hire | No doubts; advance immediately | All competencies ≥ 4, no red flags |
| Hire | Good fit; minor gaps acceptable | Most competencies ≥ 3, at least one 4–5 |
| Lean Hire | Tendency toward yes; needs calibration | Mixed scores, some 3s and 4s |
| Lean No-Hire | Tendency toward no; specific concerns | Some competencies at 2, concerns noted |
| No Hire | Clear mismatch | Any competency at 1, or multiple at 2 |
| Strong No-Hire | Immediate rejection | Critical failure, values misalignment, dishonesty |

### Section 5: Open-Ended Fields

- **Biggest strength observed:** [free text]
- **Biggest concern observed:** [free text]
- **Would you want this person on your team? Yes / No / Unsure**
- **Additional context:** [free text]

## Calibration Rules

### When to Trigger Calibration

Auto-trigger a calibration session when:

- Score range across interviewers spans ≥ 2 points (e.g., one rates 2, another rates 5 on same competency)
- Any "Strong No-Hire" mixed with any "Hire" or "Strong Hire"
- More than 50% of interviewers select "Lean Hire" or "Lean No-Hire"
- Hiring manager's score differs from panel median by ≥ 2 points

### Calibration Meeting Format

```
Duration: 30 minutes
Participants: All interviewers + Hiring Manager (facilitator)

Agenda:
  1. Each interviewer shares their rating + key evidence (3 min each)
  2. Discuss outliers: "Why did Alex rate technical 5 while Sam rated 2?"
  3. Re-rate if convinced by evidence; otherwise, stand by original score
  4. Converge on recommendation: Strong Hire / Hire / Lean Hire / Lean No-Hire / No Hire
  5. Document final decision + rationale
```

## Aggregation Algorithm

```
For each competency:
  → Calculate median score (not mean — reduces outlier influence)
  → If range ≥ 2, flag for calibration

Overall recommendation:
  → Count recommendations: [Strong Hire, Hire, Lean Hire, Lean No-Hire, No Hire, Strong No-Hire]
  → If majority (≥ 60%) agree on one band → that's the recommendation
  → If split → recommend calibration session
  → Any "Strong No-Hire" requires explicit discussion and override justification
```

## Scorecard Collection SLA

| Timing | Action |
|--------|--------|
| T+0 (end of interview) | Push scorecard to interviewer's phone/email |
| T+1h | Gentle reminder if not started |
| T+12h | Firm reminder: "Scorecard due in 12 hours" |
| T+23h | Final reminder: "1 hour remaining" |
| T+24h | Auto-escalate to interviewer's manager if still incomplete |
| T+48h | Hiring manager proceeds with available scorecards; note missing feedback |

## Output

### Consolidated Scorecard Report

```
CANDIDATE EVALUATION SUMMARY
============================
Name: Jane Doe | Role: Senior Backend Engineer
Interview Date: Jan 23, 2025 | Scorecards: 4/4 complete

COMPETENCY SUMMARY:
┌─────────────────────┬──────┬──────┬──────┬──────┬──────┐
│ Competency          │ Alex │ Sam  │ Lee  │ Zara │ Med  │
├─────────────────────┼──────┼──────┼──────┼──────┼──────┤
│ Technical Ability   │   5  │   4  │   5  │   4  │  4   │
│ Communication       │   4  │   4  │   3  │   5  │  4   │
│ Problem Solving     │   4  │   5  │   4  │   4  │  4   │
│ Collaboration       │   4  │   3  │   4  │   4  │  4   │
│ System Design       │   5  │   4  │   5  │   3*│  4   │
└─────────────────────┴──────┴──────┴──────┴──────┴──────┘
* Zara noted concern about multi-region design — flag for HM discussion

RECOMMENDATIONS:
  Alex: Hire        Sam: Hire        Lee: Strong Hire    Zara: Lean Hire

CONSENSUS: HIRE (75% agreement)
CONCERN: Multi-region system design depth — recommend exploring in HM round

NEXT STEP: Schedule Hiring Manager round or proceed to offer discussion
```

## Bias Guardrails

- **No anchoring**: Interviewers submit scorecards independently before seeing others' ratings
- **No halo/horn effect**: Each competency rated separately with evidence requirement
- **Calibration training**: All interviewers complete scorecard calibration training before first use
- **Audit**: HR reviews scorecard distributions monthly for rater leniency/severity patterns

## Integration Points

- ATS (Greenhouse, Lever): Auto-create scorecard records, push results back
- Email/Slack: Distribution and reminders
- Calendar: Trigger scorecard delivery post-interview
- HRIS: Store completed scorecards in candidate file
- Analytics: Track rater consistency, time-to-complete, recommendation accuracy
