From NIST AI RMF to a shipping checklist

You've heard of the NIST AI Risk Management Framework — the acronym "AI RMF" gets thrown around in every vendor deck, every compliance briefing, every LinkedIn post from someone selling something. If you've ever tried to read the actual document, you probably closed it within five minutes.

It's 48 pages of careful, well-intentioned, deeply abstract policy language. Which is fine — it has to be. But it means most people walk away with the vague sense that they should be doing something and no idea what that thing is.

This post fixes that. In plain English, with a checklist at the end.

Who this is for

Anyone responsible for deploying AI inside an organisation — CTO, VP Engineering, head of IT, tech lead, PM. You don't need a compliance background. You do need the ability to say "we will actually do this" and mean it.

What NIST AI RMF actually is

It's a checklist for whether your AI system is safe to deploy. That's it. It doesn't tell you which model to use, which cloud to host on, or how to prompt. It tells you what to think about before, during, and after deployment — so you don't accidentally cause harm, leak data, or get your company sued.

The framework has four functions. Four. That's the whole structure. Each one is a verb.

Govern — who's in charge, who's accountable.
Map — understand the context you're deploying into.
Measure — test the thing before it hurts someone.
Manage — keep watching after it's live.

That's the whole framework. Everything else is explanation, sub-categories, and examples. If you get the four verbs right, you've got 80% of it.

1. Govern — "who's in charge?"

If nothing else lands from this post, get this: one named person has to be accountable for your AI. Not a committee, not "IT", not "the vendor". A person with a job title and a calendar invite.

Why? Because AI fails differently from conventional software. It can be 99% correct and 1% catastrophic. Conventional software tends to fail loudly: it works or it throws an exception. AI smiles at you and lies. If nobody owns the outcome, nobody's watching for the lie.

If you can't name the person accountable for your AI system in ten seconds, you don't have governance. You have a party everyone attends but no one hosts.

What this looks like in practice

A one-page written AI policy, signed by leadership.
A named owner per system: accountable for the outcome, not merely responsible for the day-to-day work.
A living inventory of every AI tool in use. Yes, including the ones embedded in your CRM, IDE, HR platform. Those count too.
A quarterly review where someone actually looks at the list.
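If the inventory lives in a spreadsheet nobody opens, it may help to keep it as data you can check. Here is a minimal sketch in Python. The `AISystem` record, the `governance_gaps` helper, the 92-day review interval, and the example systems are all illustrative assumptions, not anything NIST prescribes.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class AISystem:
    name: str
    owner: str           # one named person; empty string means nobody owns it
    embedded_in: str     # e.g. "CRM" for embedded tools, "" for standalone ones
    last_reviewed: date


# Assumption: "quarterly" is approximated as 92 days.
REVIEW_INTERVAL = timedelta(days=92)


def governance_gaps(inventory: List[AISystem], today: date) -> List[str]:
    """List systems with no named owner or an overdue review."""
    gaps = []
    for system in inventory:
        if not system.owner.strip():
            gaps.append(f"{system.name}: no named owner")
        if today - system.last_reviewed > REVIEW_INTERVAL:
            gaps.append(f"{system.name}: review overdue")
    return gaps


# Hypothetical inventory for illustration only.
inventory = [
    AISystem("support-chatbot", "Jane Doe", "", date(2025, 1, 15)),
    AISystem("crm-lead-scoring", "", "CRM", date(2024, 6, 1)),
]

for gap in governance_gaps(inventory, today=date(2025, 3, 1)):
    print(gap)
```

Running this against the real list at each quarterly review gives the reviewer something concrete to look at: the systems with no owner and the ones nobody has checked lately.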

2. Map — "what are we getting into?"

Before you deploy anything, sit down and write three things on a piece of paper:

What is this AI system for? (one sentence)
Who's affected by its decisions? (one list)
What's the worst thing that could happen? (one paragraph)

If you can't answer those three questions, don't deploy. Not yet. Mapping is the NIST AI RMF way of saying "look before you leap."

For any system that touches regulated processes — credit decisions, hiring, healthcare, insurance — you also need to classify its risk tier. The EU AI Act has become the de facto vocabulary: prohibited, high-risk, limited-risk, minimal. Use it.

Tip

Don't reinvent the risk classification. Grab the EU AI Act tiers even if you're not in the EU. Everyone else will eventually align to them. First-mover cost: 30 minutes.
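The three questions and the risk tier fit in one small record. The sketch below is one possible shape, assuming a hypothetical `UseCaseBrief` structure and `blockers` check. Deciding which tier a system falls into remains a human (and often legal) judgement; the code only checks that someone made that call.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class RiskTier(Enum):
    # Tier names as used in this post.
    PROHIBITED = "prohibited"
    HIGH = "high-risk"
    LIMITED = "limited-risk"
    MINIMAL = "minimal"


@dataclass
class UseCaseBrief:
    purpose: str                                                # one sentence
    affected_parties: List[str] = field(default_factory=list)   # one list
    worst_case: str = ""                                        # one paragraph
    risk_tier: Optional[RiskTier] = None                        # assigned by a person


def blockers(brief: UseCaseBrief) -> List[str]:
    """Return the reasons this system should not be deployed yet."""
    problems = []
    if not brief.purpose.strip():
        problems.append("purpose is missing")
    if not brief.affected_parties:
        problems.append("affected parties are not listed")
    if not brief.worst_case.strip():
        problems.append("worst case is not described")
    if brief.risk_tier is None:
        problems.append("risk tier is not assigned")
    elif brief.risk_tier is RiskTier.PROHIBITED:
        problems.append("prohibited tier: do not deploy")
    return problems


# Hypothetical example: a brief with the worst case and tier still missing.
draft = UseCaseBrief(
    purpose="Rank inbound CVs for recruiter review.",
    affected_parties=["job applicants", "recruiters"],
)
print(blockers(draft))
```

An empty list from `blockers` means the mapping homework is done. It does not mean the system is safe.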

3. Measure — "does it actually work?"

This is where most teams get lazy. "We tested it" means they typed a few prompts and it gave reasonable answers. That's not measurement.

Measurement is: you have a set of test cases — call them evals — that you run every time the model changes, or the prompt changes, or the data changes. Pass rate goes below a threshold, the change doesn't ship.

Start small. Twenty hand-written test cases are infinitely better than zero. Build up from there.
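A threshold that blocks shipping can be very little code. The following is a minimal sketch, not a full eval harness. It assumes the model is any function from prompt to text, that a case passes when the expected string appears in the answer (a crude grading rule chosen for illustration), and that 0.9 is the threshold. `EvalCase`, `pass_rate`, `may_ship`, and `fake_model` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    expected: str  # substring the answer must contain to pass


def pass_rate(model: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Fraction of cases whose answer contains the expected text."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = sum(
        1 for case in cases
        if case.expected.lower() in model(case.prompt).lower()
    )
    return passed / len(cases)


def may_ship(model: Callable[[str], str], cases: List[EvalCase],
             threshold: float = 0.9) -> bool:
    """Gate: the change ships only if the pass rate meets the threshold."""
    return pass_rate(model, cases) >= threshold


def fake_model(prompt: str) -> str:
    # Stand-in for a real model call, with canned answers.
    answers = {
        "Capital of France?": "The capital is Paris.",
        "2 + 2?": "4",
        "Refund window?": "You have 30 days.",
    }
    return answers.get(prompt, "I don't know.")


cases = [
    EvalCase("Capital of France?", "Paris"),
    EvalCase("2 + 2?", "4"),
    EvalCase("Refund window?", "30 days"),
    EvalCase("Support email?", "help@example.com"),
]

print(pass_rate(fake_model, cases), may_ship(fake_model, cases))
```

Wire `may_ship` into whatever runs before a release, and rerun it whenever the model, the prompt, or the data changes.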

What to measure

Accuracy — does it produce the right answer on known cases?
Refusal rate — does it know when to say "I don't know"?
Bias — does it perform equally well across the groups it affects?
Robustness — does it break under prompt injection or weird inputs?
Latency & cost — because real users will leave if it's slow or expensive.
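Two of these are easy to compute once eval results are logged. The sketch below assumes each result is a dict with a `group` label and a `correct` flag, and that a refusal can be spotted by a marker phrase. Both are simplifying assumptions, and the accuracy gap shown here is only one rough signal of bias among many.

```python
from collections import defaultdict
from typing import Dict, List


def accuracy_by_group(results: List[dict]) -> Dict[str, float]:
    """Accuracy per group, from rows like {"group": "A", "correct": True}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in results:
        totals[row["group"]] += 1
        if row["correct"]:
            hits[row["group"]] += 1
    return {group: hits[group] / totals[group] for group in totals}


def accuracy_gap(results: List[dict]) -> float:
    """Difference between the best-served and worst-served group."""
    if not results:
        raise ValueError("need at least one result")
    scores = accuracy_by_group(results)
    return max(scores.values()) - min(scores.values())


def refusal_rate(outputs: List[str], marker: str = "i don't know") -> float:
    """Fraction of outputs containing the refusal marker (case-insensitive)."""
    if not outputs:
        raise ValueError("need at least one output")
    refusals = sum(1 for text in outputs if marker in text.lower())
    return refusals / len(outputs)


# Hypothetical logged results for illustration.
results = [
    {"group": "A", "correct": True},
    {"group": "A", "correct": True},
    {"group": "B", "correct": True},
    {"group": "B", "correct": False},
]
print(accuracy_by_group(results), accuracy_gap(results))
```

A large gap is a prompt to investigate, not a verdict. Small groups produce noisy numbers.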

And once it's live, keep measuring. Production behaviour drifts. What passed your eval yesterday may not pass today — same model, same prompt, different user behaviour.
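One simple way to watch for drift is to compare a rolling production pass rate against the rate you measured before launch. The sketch below assumes you can label each production interaction as passed or failed (via spot checks, user feedback, or an automated grader). The `DriftMonitor` class, the window size, and the tolerance are illustrative choices, not a standard method.

```python
from collections import deque


class DriftMonitor:
    """Flags drift when the rolling pass rate falls below baseline - tolerance."""

    def __init__(self, baseline: float, window: int = 100,
                 tolerance: float = 0.05) -> None:
        self.baseline = baseline
        self.tolerance = tolerance
        # deque with maxlen keeps only the most recent `window` outcomes
        self.outcomes = deque(maxlen=window)

    def record(self, passed: bool) -> None:
        self.outcomes.append(passed)

    def rolling_rate(self) -> float:
        return sum(self.outcomes) / len(self.outcomes)

    def drifted(self) -> bool:
        # Wait until the window is full so a few early failures don't alarm.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_rate() < self.baseline - self.tolerance


# Hypothetical usage: the baseline comes from the pre-launch eval run.
monitor = DriftMonitor(baseline=0.9, window=10, tolerance=0.05)
for _ in range(10):
    monitor.record(True)
print(monitor.drifted())  # window full, all passing

monitor.record(False)
monitor.record(False)
print(monitor.drifted())  # last 10 outcomes now include two failures
```

When `drifted()` turns true, that is the signal to page the owner and rerun the full eval set.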

4. Manage — "what if something goes wrong?"

You have a runbook. If you don't, write one. It has four sections:

How we notice. Monitoring, alerts, the "something's off" channel.
How we respond. Who gets paged, what they do in the first fifteen minutes.
How we roll back. The single command, the single button, the single process. Tested.
How we learn. The post-incident review. The eval case that gets added. The guardrail that gets tightened.
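The four sections above can also be checked mechanically, so a half-written runbook gets caught before the incident instead of during it. This sketch assumes the runbook is stored as a plain dict with a `sections` mapping and a `rollback_last_tested` date. The field names and the 31-day limit are hypothetical.

```python
from datetime import date, timedelta
from typing import List

REQUIRED_SECTIONS = ("notice", "respond", "roll_back", "learn")


def runbook_problems(runbook: dict, today: date,
                     max_rollback_age: timedelta = timedelta(days=31)) -> List[str]:
    """Return missing sections and untested or stale rollback procedures."""
    sections = runbook.get("sections", {})
    problems = [
        f"missing section: {name}"
        for name in REQUIRED_SECTIONS
        if not sections.get(name, "").strip()
    ]
    last_tested = runbook.get("rollback_last_tested")
    if last_tested is None:
        problems.append("rollback never tested")
    elif today - last_tested > max_rollback_age:
        problems.append("rollback test is stale")
    return problems


# Hypothetical runbook with one empty section and an old rollback test.
runbook = {
    "sections": {
        "notice": "Alerts go to the #ai-incidents channel.",
        "respond": "On-call owner is paged and follows the first-15-minutes list.",
        "roll_back": "Redeploy the previous model version.",
        "learn": "",
    },
    "rollback_last_tested": date(2025, 1, 2),
}
print(runbook_problems(runbook, today=date(2025, 3, 1)))
```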

Also: have a retirement process. Models have end-of-life. Users expect warning. You need a written procedure for graceful shutdown, same as any other service.

The core idea

NIST AI RMF isn't a compliance document. It's a checklist for "is this thing safe to turn on?" Run it before, during, and after deployment. Everything else is commentary.

The Monday-morning checklist

If you only do thirteen things this week, do these. Copy, paste into a doc, tick them off.

Checklist

govern:
  - [ ] One-page AI policy, signed by leadership
  - [ ] Named accountable owner for each AI system
  - [ ] Live inventory of AI tools in use (including embedded)
  - [ ] Quarterly review on the calendar

map:
  - [ ] One-page use-case brief per system (purpose, who's affected, worst case)
  - [ ] EU AI Act risk tier assigned to each system
  - [ ] Stakeholders identified and documented

measure:
  - [ ] Eval harness with at least 20 test cases
  - [ ] Pass-rate threshold that blocks shipping
  - [ ] Production monitoring with drift detection

manage:
  - [ ] Incident runbook (detect, respond, rollback, learn)
  - [ ] Rollback tested monthly by someone who didn't write it
  - [ ] Retirement process documented

That's it. 48 pages of framework, 13 bullets. If you get these thirteen right, you're well ahead of most organisations currently deploying AI.

What next

If you want to know where your organisation stands today, before you start ticking boxes, take our AI Maturity Assessment. Fifteen questions across the same four functions NIST uses. You'll get a score, a breakdown, and the next five moves in each dimension.

If you want a second pair of eyes on your biggest gap, write to us. Senior-led engagements only. No decks.
