How we score MCP risk and effort.

Rule-based, deterministic, OWASP-aligned. No ML in the hot path. Same inputs always produce the same architecture, the same auth recommendations, and the same effort range.

Architecture rules

When the engine picks each pattern

  • Local stdio — single-user dev tool, OR strict on-prem deployment with high/critical sensitivity throughout.
  • Remote (SSE / HTTP) — default for cloud deployments at any meaningful scale.
  • Hybrid — mixed sensitivity at scale: high-risk systems behind a gateway, low-risk direct.
  • Gateway — high-autonomy + write-capable servers, OR scale (10K+ MAU) + multi-tenant + ≥1 regulatory constraint.
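A minimal sketch of these rules in Python. The field names, the precedence order, and the reuse of the 10K MAU threshold for "at scale" in the hybrid rule are illustrative assumptions, not the mapper's actual code.

from dataclasses import dataclass

@dataclass
class Workload:
    single_user_dev: bool        # one developer, one machine
    on_prem_only: bool           # strict on-prem deployment
    all_high_sensitivity: bool   # high/critical sensitivity throughout
    mixed_sensitivity: bool      # some systems high-risk, some low-risk
    mau: int                     # monthly active users
    multi_tenant: bool
    regulatory_count: int        # number of regulatory constraints
    high_autonomy: bool          # agent acts with little human review
    write_capable: bool          # at least one server can mutate state

def pick_architecture(w: Workload) -> str:
    # Most restrictive pattern checked first; the order is an assumption,
    # but any fixed order keeps the result deterministic.
    if (w.high_autonomy and w.write_capable) or (
        w.mau >= 10_000 and w.multi_tenant and w.regulatory_count >= 1
    ):
        return "gateway"
    if w.single_user_dev or (w.on_prem_only and w.all_high_sensitivity):
        return "local_stdio"
    if w.mixed_sensitivity and w.mau >= 10_000:
        return "hybrid"   # high-risk behind a gateway, low-risk direct
    return "remote"       # SSE / HTTP: default for cloud deployments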

Effort formula

Always a range, never a point estimate

base_days = 5

per_server_days = {
  standard:  1.5,   // official server
  community: 3.0,   // community server, light adaptation
  custom:    8.0    // build in-house
}

gateway_days = 15.0 if a gateway is recommended, else 0   // one-time add-on

autonomy_multiplier = {
  read_only: 1.0,    suggest: 1.1,    pre_approval: 1.4,
  post_review: 1.6,  autonomous: 2.0
}

regulatory_multiplier = min(1.0 + count_regulatory * 0.3, 2.0)

total_days = (base_days + sum(per_server_days[tier] for each server) + gateway_days)
           * autonomy_multiplier
           * regulatory_multiplier

weeks_low  = total_days / 5
weeks_high = weeks_low * 1.5
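The same formula as a runnable Python sketch with a worked example; the constant and function names are illustrative, not the engine's actual API.

BASE_DAYS = 5.0
PER_SERVER_DAYS = {"standard": 1.5, "community": 3.0, "custom": 8.0}
GATEWAY_DAYS = 15.0
AUTONOMY_MULTIPLIER = {
    "read_only": 1.0, "suggest": 1.1, "pre_approval": 1.4,
    "post_review": 1.6, "autonomous": 2.0,
}

def effort_weeks(servers, autonomy, count_regulatory, gateway_recommended):
    """Return the (low, high) effort range in weeks."""
    regulatory_multiplier = min(1.0 + count_regulatory * 0.3, 2.0)
    total_days = (
        BASE_DAYS
        + sum(PER_SERVER_DAYS[s] for s in servers)
        + (GATEWAY_DAYS if gateway_recommended else 0.0)
    ) * AUTONOMY_MULTIPLIER[autonomy] * regulatory_multiplier
    weeks_low = total_days / 5
    return weeks_low, weeks_low * 1.5

# Two standard servers plus one custom build, pre-approval autonomy,
# two regulatory constraints, gateway recommended:
# (5 + 1.5 + 1.5 + 8 + 15) * 1.4 * 1.6 = 69.44 days -> about 13.9 to 20.8 weeks.
print(effort_weeks(["standard", "standard", "custom"], "pre_approval", 2, True))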

Risk hotspots

OWASP LLM Top 10 mapping

Pipeline node       OWASP refs      Mitigations
User → LLM          LLM01           input filtering · prompt hardening
LLM → MCP           LLM06, LLM08    tool allowlist · per-tool confirmation
MCP → Downstream    LLM08, LLM09    per-session scope · rate-limit gateway
Downstream → LLM    LLM01, LLM03    output scanning · injection filter
LLM → User          LLM02           PII redaction · audit log
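For concreteness, a minimal sketch of the LLM → MCP row's mitigations (tool allowlist plus per-tool confirmation). The tool names and the confirm callback are hypothetical; no MCP SDK hook is implied.

ALLOWED_TOOLS = {"search_docs", "read_ticket", "update_ticket"}  # explicit allowlist
CONFIRM_REQUIRED = {"update_ticket"}  # write-capable tools need a human yes

def gate_tool_call(tool, args, confirm):
    """Return True only if the call passes both mitigations (LLM06, LLM08)."""
    if tool not in ALLOWED_TOOLS:
        return False                        # deny anything not on the list
    if tool in CONFIRM_REQUIRED and not confirm(tool, args):
        return False                        # per-tool confirmation declined
    return True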

Integrity

Three commitments.

  • No vendor pay-to-play. Official servers are listed when they exist, regardless of vendor marketing spend.
  • No demos as benchmarks. We never cite a vendor demo as evidence of production readiness.
  • No hand-waved risks. Every severity label traces to a specific OWASP ID or published advisory.