AIG-022 Human Oversight of AI Outputs

Tier 2+ | AI

Description

AI systems that produce outputs used in decisions affecting individuals have documented human oversight mechanisms. Oversight measures are proportionate to risk: Tier 3 systems require human review of AI-generated outputs before action is taken; Tier 2 systems require reviewable audit trails and human escalation paths. Oversight persons have defined competencies, appropriate training, and sufficient time to exercise meaningful review. Automation bias risks are explicitly addressed in operator guidance.
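The tier rule above can be sketched as a simple gate in front of any action taken on an AI output. This is a minimal illustration, not a prescribed implementation: the `AIOutput` type, its field names, and the choice to treat any tier of 3 or above like Tier 3 are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AIOutput:
    """An AI-generated output awaiting action (hypothetical structure)."""
    system_tier: int                   # risk tier of the producing system
    decision: str                      # e.g. "decline"
    reviewed_by: Optional[str] = None  # reviewer ID once a human has reviewed


def may_act_on(output: AIOutput) -> bool:
    """Return True if the output may be acted on under the tier rule."""
    if output.system_tier >= 3:
        # Tier 3 (assumed: and above): human review must precede any action.
        return output.reviewed_by is not None
    # Lower tiers are not gated here; Tier 2 still needs a reviewable
    # audit trail and an escalation path, which this sketch does not model.
    return True
```

In this sketch, `may_act_on(AIOutput(3, "decline"))` returns `False` until a reviewer ID is attached, while a Tier 2 output passes the gate and relies on the audit trail and escalation path instead.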

Rationale

Human oversight is the last line of defence against harmful AI outputs; it must be substantively designed, not nominal.

Framework Mappings (6)

EU AI Act 2024

EU-AI-Art.14.1	Human Oversight — System Design for Oversight	full
EU-AI-Art.14.2	Human Oversight — Capabilities Assigned to Oversight Persons	full
EU-AI-Art.26.2	Deployer Obligations — Human Oversight Assignment	full

NIST AI RMF 1.0

GOVERN 3.2	Human-AI Configuration Roles	full
MAP 3.4	Operator Proficiency Processes	partial
MAP 3.5	Human Oversight Process Definition	full

Evidence (2)

record | manual

Human oversight design document or operational procedures for each Tier 2+ AI system, specifying oversight mechanism, reviewer competency requirements, time allocation, and automation bias mitigation guidance.

Example: Human Oversight Procedure — AI Credit Decisioning System (Confluence), specifying that all AI-flagged decline decisions require human review within 4 hours, reviewer qualification requirements (credit underwriting certification), automation bias awareness training requirement, and escalation path for reviewer disagreement

Test: Request human oversight documentation for each Tier 2+ production AI system. Verify: (1) oversight mechanism is described (human review before action, audit trail with escalation, etc.), (2) oversight is proportionate to tier (Tier 3 requires pre-action review), (3) reviewer competency requirements are defined, (4) automation bias risk is explicitly addressed in operator guidance, (5) the oversight person has sufficient time allocation to conduct meaningful review (not nominal sign-off).

log | automated

Audit trail records showing human review and override events for AI-generated outputs, demonstrating that oversight is operationally active and not merely nominal.

Example: AI-Credit-System override log (Splunk, last 90 days): 1,247 AI decisions reviewed, 89 overrides recorded with reviewer ID, timestamp, and override reason category; override rate 7.1%, consistent with expected range 5–10%

Test: Request override and review event logs for a 90-day sample. Verify: (1) human review events are recorded with reviewer identifier, timestamp, and decision, (2) override events include a reason category, (3) override rate is within the documented expected range (an override rate of 0% may indicate rubber-stamping), (4) log confirms oversight is occurring at the required frequency and volume.
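Checks (1) to (3) of this test lend themselves to automation. The sketch below assumes review events have already been exported from the log platform into simple records; the `ReviewEvent` type, its field names, and the `"accept"`/`"override"` decision values are hypothetical, and check (4) on frequency and volume is left out because it depends on each system's documented requirements.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ReviewEvent:
    """One human review event from the audit trail (hypothetical schema)."""
    reviewer_id: str
    timestamp: str                     # assumed to be an ISO 8601 string
    decision: str                      # assumed values: "accept" or "override"
    override_reason: Optional[str] = None


def check_review_log(
    events: List[ReviewEvent], expected_min: float, expected_max: float
) -> Tuple[float, List[str]]:
    """Return (override_rate, findings) for a sample of review events."""
    findings = []
    for i, e in enumerate(events):
        # Check (1): reviewer identifier, timestamp, and decision are present.
        if not (e.reviewer_id and e.timestamp and e.decision):
            findings.append(f"event {i}: missing reviewer, timestamp, or decision")
        # Check (2): every override carries a reason category.
        if e.decision == "override" and not e.override_reason:
            findings.append(f"event {i}: override without reason category")
    if not events:
        return 0.0, findings + ["no review events in sample"]
    # Check (3): override rate against the documented expected range.
    overrides = sum(1 for e in events if e.decision == "override")
    rate = overrides / len(events)
    if rate == 0:
        findings.append("override rate is 0%: possible rubber-stamping")
    elif not (expected_min <= rate <= expected_max):
        findings.append(f"override rate {rate:.1%} outside expected range")
    return rate, findings
```

Applied to the example figures above (1,247 reviewed decisions, 89 overrides, expected range 5–10%), the computed rate is 89 / 1,247 ≈ 7.1%, so the range check raises no finding.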

Questions (2)

boolean

Do AI systems that produce outputs used in decisions affecting individuals have documented human oversight mechanisms proportionate to their risk level?

Human oversight is the last line of defence against harmful AI outputs. It must be substantively designed, not nominal: oversight persons must have defined competencies, training, and sufficient time to conduct meaningful review.

multi

Which of the following are true of your AI human oversight programme?

Oversight mechanism is documented and specific to each AI system
Oversight persons have defined competency requirements
Oversight persons receive training that includes automation bias awareness
Tier 3 systems require human review before action is taken on AI output
Override and escalation paths are documented and accessible
Override rates are monitored to detect rubber-stamping

All six characteristics indicate substantive oversight. An override rate of 0% over extended periods is a strong signal of nominal (rubber-stamp) oversight rather than genuine review.
