Performance metrics and operating rules for the Techies engineering team as we shift from traditional SDLC to an AI-assisted workflow built on Claude Code and GitHub.
Owner: Paul (Techies)Scope: All developers (team of 1–10)Use: Individual + team evaluationStatus: Outline for review
1Purpose & Principles
Why this policy exists and the values it must protect.
1.1 Purpose
Define clear, fair, and measurable expectations for developers in an AI-assisted workflow.
Align individual output with team and business outcomes.
Make the transition from traditional to AI-oriented SDLC explicit and accountable.
1.2 Guiding principles
Outcomes over activity — measure shipped value, not raw volume (lines, commits).
AI is a multiplier, not a metric — adoption is encouraged, but quality and reliability remain the bar.
Balanced scorecard — no single metric decides performance; velocity is weighed against quality.
Transparency — every developer can see how their numbers are calculated.
Coaching first — KPIs drive improvement conversations before they drive ratings.
2The SDLC Shift
What changes when we move to an AI-oriented lifecycle — and what it means for how we measure people.
Automation note: aim to auto-collect GitHub + CI metrics into a dashboard; reserve manual input for the qualitative items only. [Define dashboard owner & tooling.]
9Review Cadence & Scoring
How and how often KPIs are reviewed.
Cadence
Focus
Use
Weekly
Team flow metrics (velocity, CI, reviews)
Process tuning
Monthly
Quality & reliability trends
Coaching
Quarterly
Full scorecard, individual + team
Performance review
9.1 Scoring approach
Each category scored against target/baseline → weighted composite score.
Trend matters more than a single snapshot — improvement is rewarded.
Individual ratings always read alongside team context (small team = high variance).
10Guardrails & Anti-Gaming
Protect the metrics from distortion.
No single metric determines an outcome — balanced scorecard only.
Pairing metrics: velocity is always read with change-failure rate; coverage with escaped defects.
Banned-as-targets: commit count, lines of code, hours logged, raw AI usage volume.
Splitting PRs or padding tests to inflate numbers is treated as a quality issue, not a win.
KPIs inform conversations; they don't auto-trigger penalties.
11Rollout Plan
Phased introduction so the policy lands as support, not surveillance.
Phase
Activity
Timing
1 · Baseline
Collect current metrics, no targets yet
[Weeks 1–4]
2 · Calibrate
Set targets/weights from real data, share openly
[Weeks 5–8]
3 · Coach
Run cadence; metrics drive 1:1s only
[Quarter 1]
4 · Formalize
Scorecard enters review cycle
[Quarter 2+]
12Appendix & Glossary
DORA metrics — four research-backed delivery metrics: deployment frequency, lead time, change failure rate, MTTR.
Cycle time — elapsed time from starting work to merging it.
Change failure rate — share of changes that cause a failure requiring remediation.
CLAUDE.md — per-repo context file that guides Claude Code's behavior.
AI-assisted PR — a pull request where Claude Code materially contributed.
[Add company-specific definitions, SLAs, and links here.]