How we score risk
Methodology.
The exact formulas, sources and safeguards behind your risk scores and governance scorecard – reproducible, auditable and published in full.
Per-tool risk model
Every tool starts with a base risk score between 1.0 and 5.0 reflecting its intrinsic data-handling profile. Three multipliers sharpen that into your organisation's realised risk.
Tier multiplier · Free vs paid-consumer vs enterprise. Captures whether the tier contractually excludes training-on-inputs and supports SSO, audit logs and admin controls.
Regulated-data multiplier · Rises with PHI, MNPI, CJIS, PCI, GDPR-EU or generic PII. Tools handling regulated data are scored more harshly than those that only see marketing copy.
Governance-gap multiplier · Amplifies realised risk when controls are weak. A dangerous tool with strong controls lands lower than a mild tool with zero oversight.
Rationale strings · Every tier-specific attribute that pushed a score (no-DPA, no-SSO, no-data-residency) is recorded as a rationale string on the tool profile.
Risk bands
Score < 4
Everyone uses it safely – enterprise tier, DPA in place, admin controls wired up.
4 – 5.99
Acceptable when data is bounded. Watch tier drift and shadow installs.
6 β 7.99
Needs intervention. Move to a sanctioned tier or replace with a safer alternative.
≥ 8
Get it off your network today. Training-on-inputs default, regulated data, zero admin control.
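Read top to bottom, the bands reduce to a simple threshold cascade. A minimal Python sketch – the numeric cut-offs come from the table above, while the band names are illustrative labels, not published terminology:

def risk_band(tool_risk: float) -> str:
    # Map a clamped per-tool risk score (1.0 – 10.0) to its band.
    # Band names are illustrative; the thresholds are the published ones.
    if tool_risk >= 8.0:
        return "CRITICAL"  # get it off your network today
    if tool_risk >= 6.0:
        return "HIGH"      # needs intervention
    if tool_risk >= 4.0:
        return "MODERATE"  # acceptable when data is bounded
    return "LOW"           # everyone uses it safely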
Governance score
Each answer converts to a magnitude between 0 and 4 and is weighted by the question's weight; the weighted total is expressed as a 0–100 score. The result feeds directly into the governance-gap multiplier.
Question domains cover policy, procurement, visibility, access, privacy, training, incident response and monitoring.
Policy · Public AI-use policy signed by exec team
Procurement · AI tools go through vendor review
Visibility · Inventory of tools touching customer data
Access · SSO enforced on sanctioned AI tools
Privacy · DPAs on file for every in-use tool
Training · Annual AI-use training for every employee
+ 6 more questions covering monitoring, incident response, training and drills.
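To make the conversion concrete, here is a minimal Python sketch of the scoring step. The function names and the linear shape between the published multiplier endpoints (0.85 mature, 1.35 none) are assumptions; only the 0–4 magnitudes, the per-question weighting and the 0–100 scale come from the description above.

def governance_score(answers: dict[str, int], weights: dict[str, float]) -> float:
    # answers: question id -> magnitude (0–4); weights: question id -> weight.
    earned = sum(weights[q] * magnitude for q, magnitude in answers.items())
    possible = sum(weights[q] * 4 for q in answers)
    return 100.0 * earned / possible

def governance_gap_multiplier(gov_score: float) -> float:
    # Linear interpolation between the published endpoints: 1.35 at a
    # zero score (no governance) down to 0.85 at a perfect score
    # (mature). The linear shape between them is an assumption.
    return 1.35 - (1.35 - 0.85) * (gov_score / 100.0)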
Cost-of-risk formulas
Per-tool risk and overall organisational risk are both reproducible from your answers. Here are the formulas verbatim.
# Per-tool risk (clamped 1.0 – 10.0)
tool_risk = base_risk
  × tier_multiplier            # 0.75 (enterprise) – 1.4 (free-consumer)
  × regulated_data_multiplier  # 1.0 (no regulated) – 1.8 (multi-reg)
  × governance_gap_multiplier  # 0.85 (mature) – 1.35 (none)

# Overall organisation risk
if any(tool_risk >= 8.0): overall = CRITICAL
elif count(tool_risk >= 6.0) >= 3: overall = HIGH
else: overall = weighted_avg(tools) × (1.0 + 0.01 × (70 - gov_score))

The weighted_avg step favours higher-risk tools, so a single critical outlier isn't diluted by dozens of safe ones.
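For teams that want to run the numbers themselves, a direct Python transcription of the pseudocode above. One assumption is flagged inline: the weighting function behind weighted_avg is not itself shown above, so the sketch uses a self-weighted average (each tool weighted by its own score), which matches the stated intent.

def tool_risk(base_risk, tier_mult, reg_data_mult, gov_gap_mult):
    # Per-tool realised risk, clamped to 1.0 – 10.0.
    raw = base_risk * tier_mult * reg_data_mult * gov_gap_mult
    return max(1.0, min(10.0, raw))

def overall_risk(tool_risks, gov_score):
    # Organisation-level roll-up; returns a band label or a number,
    # mirroring the pseudocode above.
    if any(r >= 8.0 for r in tool_risks):
        return "CRITICAL"
    if sum(r >= 6.0 for r in tool_risks) >= 3:
        return "HIGH"
    # Assumed self-weighted average: each tool is weighted by its own
    # score, so one risky outlier isn't diluted by dozens of safe tools.
    weighted_avg = sum(r * r for r in tool_risks) / sum(tool_risks)
    return weighted_avg * (1.0 + 0.01 * (70 - gov_score))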
Registry verification
Step 01
Every 90 days
Each registry entry is checked against vendor-published terms, pricing, and certifications. Timestamped as last_verified_at.
Step 02
Weekly
A weekly freshness pass flags entries past the 90-day window for editorial re-review. No entry is allowed to decay past that window silently.
Step 03
Ad hoc
New incidents, policy changes and risk shifts land in ai_tools_changelog and surface on tool profile pages within 24 hours.
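The weekly pass in Step 02 amounts to a date comparison against last_verified_at. A minimal sketch – the registry shape is an assumption; the field name and the 90-day window are from above:

from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=90)

def stale_entries(registry: list[dict]) -> list[dict]:
    # Flag entries whose last_verified_at (a timezone-aware datetime)
    # has aged past the 90-day window, for editorial re-review.
    now = datetime.now(timezone.utc)
    return [entry for entry in registry
            if now - entry["last_verified_at"] > STALE_AFTER]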
Peer benchmarks
Benchmarks aggregate across audits completed in the last 12 months, bucketed by industry + company size. Minimum sample is 15 per bucket – below that we suppress the number rather than mislead.
Last 12 months Β· rolling window
Bucketed by industry + company size
Minimum 15 audits per bucket
Suppressed when data is thinner than threshold
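Bucketing and suppression are mechanical enough to sketch. The field names and the choice of median as the aggregate are assumptions; the industry + size bucketing and the 15-audit floor are from above:

from collections import defaultdict
from statistics import median

MIN_SAMPLE = 15  # published suppression threshold

def peer_benchmarks(audits):
    # audits: dicts with industry, size_band and overall_risk for the
    # trailing 12 months. Buckets below the floor return None (suppressed).
    buckets = defaultdict(list)
    for audit in audits:
        buckets[(audit["industry"], audit["size_band"])].append(audit["overall_risk"])
    return {bucket: (median(scores) if len(scores) >= MIN_SAMPLE else None)
            for bucket, scores in buckets.items()}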
Framework alignment
Recommendations carry the NIST AI RMF sub-category they address so your risk team can map findings onto existing tracking.
Tools that touch high-risk uses under the EU AI Act (Annex III) are flagged with the relevant Article references so compliance sees the exposure immediately.
Governance findings map back to the specific ISO/IEC 42001 AIMS controls they belong to – useful for teams already on the ISO path.
Sector overlays are auto-applied based on your industry + regulated-data selections in the context step.
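As an illustration of how an overlay lookup might work – the table below is purely hypothetical, not the production mapping; only the trigger (industry + regulated-data selections from the context step) comes from above:

# Hypothetical overlay table; keys and overlay names are illustrative.
SECTOR_OVERLAYS = {
    ("healthcare", "PHI"): ["HIPAA overlay"],
    ("finance", "PCI"): ["PCI DSS overlay"],
    ("public-sector", "CJIS"): ["CJIS overlay"],
}

def overlays_for(industry: str, regulated_data: set[str]) -> list[str]:
    # Auto-apply every overlay triggered by the context-step selections.
    applied = []
    for data_type in regulated_data:
        applied += SECTOR_OVERLAYS.get((industry, data_type), [])
    return applied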
What we don't do
Tool ranking is never influenced by commercial relationships. The weights are published and the data is auditable.
If a tool hasn't been verified, it's flagged – not invented. Missing inputs fall back to the category median rather than a fabricated number.
Every score comes from a formula you can reproduce. If you disagree with the output, you can trace exactly which multiplier changed it.
FAQ
How often is the registry re-verified?
Every 90 days at minimum, with a weekly freshness cron that flags entries needing attention. Changes – new incidents, policy updates, tier reshuffles – land in the changelog and propagate to tool profiles within 24 hours.
What if my team uses multiple tiers of the same tool?
Each tier is scored independently. The "ChatGPT" row in your audit resolves to the tier your team actually uses. If you mix tiers (some on free, some on Enterprise), we score both and flag the exposure from the weaker tier.
Can vendors pay to influence their scores?
No. Inclusion and scoring are editorial. Vendors can request review of inaccuracies and we respond within 48 hours, but they cannot alter the weights or the score formula.
How do my governance answers affect per-tool scores?
The governance-gap multiplier is applied to every tool. Stronger controls (DPA on file, SSO enforced, quarterly vendor review) bring down realised risk for the same tool. The rationale string on each tool profile shows the multipliers that produced the number.
What is the minimum sample for peer benchmarks?
15 completed audits per industry + size bucket. Below that threshold we suppress the benchmark rather than mislead with thin data.
Are the scoring formulas public?
Yes – see the Cost-of-risk formulas section on this page. Every multiplier is published and the math is reproducible from our documentation.
Found an error?
Spotted a stale score, an out-of-date DPA or a missing incident? Email us with the link to the source. We correct within 48 hours and log every change.
hello@buzzi.ai
Ready to audit?
The formulas above plug into your actual answers – producing a risk score per tool, a governance scorecard and a block-list for IT at the end.
Back to audit