AVE does not invent its own scoring system. Every record is scored using OWASP AIVSS v0.8 — the AI Vulnerability Severity Scoring standard. AVE implements AIVSS; it does not own or modify it. This page documents how that scoring works inside an AVE record, so a researcher writing a new record or a reviewer checking an existing one can verify the number is right.
The formula
Four inputs, each explained below:
| Term | Range | What it is |
|---|---|---|
| CVSS_Base | 0–10 | Standard CVSS 4.0 base score for the underlying flaw, independent of agentic context. Stored as aivss.cvss_base. |
| AARS | 0–10 | Agentic Amplification & Reachability Score — the sum of the 10 AARF factors below. Stored as aivss.aars. |
| ThM | 0.5–1.5 | Threat & Heuristic Multiplier: how well evidenced the threat is in practice, from purely theoretical to exploited in the wild. Stored as aivss.thm. |
| Mitigation_Factor | 0–1 | How much available mitigation reduces real-world risk. Stored as aivss.mitigation_factor. |
AIVSS does not replace CVSS — it extends it. Averaging CVSS_Base with AARS means a class with a low traditional severity but high agentic amplification (or the reverse) lands in the middle rather than at either extreme. A record is never scored purely on how “agentic” it is.
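The averaging behaviour described above can be sketched in Python. This is a minimal illustration, not the normative formula: the text states only that CVSS_Base is averaged with AARS and that ThM is a multiplier, so applying Mitigation_Factor as a second multiplier, capping at 10, and rounding to one decimal are all assumptions to check against the OWASP AIVSS v0.8 specification. The function name `aivss_score` and the input values are hypothetical and do not come from any AVE record.

```python
def aivss_score(cvss_base: float, aars: float,
                thm: float = 1.0, mitigation_factor: float = 1.0) -> float:
    """Illustrative AIVSS calculation under the assumptions stated above."""
    # Ranges are taken from the inputs table on this page.
    limits = {
        "cvss_base": (cvss_base, 0.0, 10.0),
        "aars": (aars, 0.0, 10.0),
        "thm": (thm, 0.5, 1.5),
        "mitigation_factor": (mitigation_factor, 0.0, 1.0),
    }
    for name, (value, low, high) in limits.items():
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}-{high}")

    # Stated in the text: CVSS_Base and AARS are averaged.
    blended = (cvss_base + aars) / 2

    # ASSUMPTION: ThM and Mitigation_Factor are both applied multiplicatively,
    # and the result is capped at 10 and rounded to one decimal place.
    return round(min(blended * thm * mitigation_factor, 10.0), 1)


# Low traditional severity with high agentic amplification, and the reverse,
# both land in the middle (made-up inputs).
print(aivss_score(2.0, 9.0))  # 5.5
print(aivss_score(9.0, 2.0))  # 5.5
```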
AARS — the 10 AARF factors
AARS is the sum of 10 Agentic Amplification and Risk Factors (AARF), each scored 0.0 (not applicable) to 1.0 (fully applicable). They live in aivss.aarf as an optional breakdown object.
| Factor | Score 1.0 when… |
|---|---|
| autonomy | the agent acts without human confirmation |
| tool_use | the component grants access to external tools or APIs |
| multi_agent | the attack chains across multiple agents |
| non_determinism | behavior varies unpredictably across runs |
| self_modification | the component can alter its own instructions at runtime |
| dynamic_identity | the component assumes roles or personas |
| persistent_memory | state is retained across sessions |
| natural_language_input | instructions are delivered via natural language |
| data_access | the component reads sensitive data (files, env, databases) |
| external_dependencies | the component loads remote code or content |
An intermediate value (0.5) is used when a factor only partially applies. AARS is simply the sum of all 10 factors, so a record where every factor scores 1.0 has an AARS of 10.
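A reviewer can recompute AARS from the aivss.aarf breakdown with a few lines of Python. The sketch below uses the ten factor names from the table. The helper name `compute_aars` is hypothetical, and treating a factor missing from the breakdown as 0.0 is an assumption, not something the schema is known to guarantee.

```python
AARF_FACTORS = (
    "autonomy", "tool_use", "multi_agent", "non_determinism",
    "self_modification", "dynamic_identity", "persistent_memory",
    "natural_language_input", "data_access", "external_dependencies",
)


def compute_aars(aarf: dict) -> float:
    """Sum the 10 AARF factors, validating names and the 0.0-1.0 range."""
    unknown = set(aarf) - set(AARF_FACTORS)
    if unknown:
        raise ValueError(f"unknown AARF factors: {sorted(unknown)}")

    total = 0.0
    for name in AARF_FACTORS:
        value = aarf.get(name, 0.0)  # ASSUMPTION: a missing factor counts as 0.0
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value} is outside 0.0-1.0")
        total += value
    return round(total, 2)


# Made-up breakdown: two factors fully apply, one partially applies.
example = {"autonomy": 1.0, "tool_use": 1.0, "data_access": 0.5}
print(compute_aars(example))  # 2.5
```

Comparing the result with the stored aivss.aars is a quick consistency check when both the total and the breakdown are present.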
ThM — Threat & Heuristic Multiplier
ThM reflects how real the threat is right now, independent of severity. It is set by the researcher authoring the record, based on observed evidence.
| ThM | When to use it |
|---|---|
| 0.75 | Theoretical — no known proof of concept |
| 0.90 | A working proof of concept exists |
| 1.0 | Exploited in the wild, or weaponised |
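The table above amounts to a lookup from evidence level to multiplier. A minimal Python sketch follows. The `Evidence` enum and `thm_for` helper are hypothetical names introduced for illustration, and the mapping covers only the three values listed on this page.

```python
from enum import Enum


class Evidence(Enum):
    """Evidence levels from the ThM table (names are illustrative)."""
    THEORETICAL = "theoretical"            # no known proof of concept
    PROOF_OF_CONCEPT = "proof_of_concept"  # a working proof of concept exists
    EXPLOITED = "exploited"                # exploited in the wild, or weaponised


THM_BY_EVIDENCE = {
    Evidence.THEORETICAL: 0.75,
    Evidence.PROOF_OF_CONCEPT: 0.90,
    Evidence.EXPLOITED: 1.0,
}


def thm_for(evidence: Evidence) -> float:
    """Return the ThM value this page lists for the given evidence level."""
    return THM_BY_EVIDENCE[evidence]


print(thm_for(Evidence.PROOF_OF_CONCEPT))  # 0.9
```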
Severity bands
The final AIVSS score maps to a severity band. A record's severity must agree with the band that its aivss.aivss_score falls in:
| Band | AIVSS range |
|---|---|
| CRITICAL | ≥ 9.0 |
| HIGH | 7.0–8.9 |
| MEDIUM | 4.0–6.9 |
| LOW | < 4.0 |
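The band table translates directly into a threshold check. The Python sketch below assumes scores carry one decimal place, so every score of at least 7.0 and below 9.0 is treated as HIGH. The function names and the top-level `severity` key in the example record are assumptions; only aivss.aivss_score is named on this page.

```python
def severity_band(score: float) -> str:
    """Map an AIVSS score (0-10) to the severity band from the table."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score {score} is outside 0-10")
    if score >= 9.0:
        return "CRITICAL"
    if score >= 7.0:
        return "HIGH"
    if score >= 4.0:
        return "MEDIUM"
    return "LOW"


def severity_agrees(record: dict) -> bool:
    """Check that the record's severity matches the band of its score.

    ASSUMPTION: severity is a top-level key; the page only says it must
    agree with aivss.aivss_score.
    """
    return record["severity"] == severity_band(record["aivss"]["aivss_score"])


# Made-up record fragment, not taken from a real AVE record.
print(severity_agrees({"severity": "HIGH", "aivss": {"aivss_score": 7.4}}))  # True
```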
Worked example — AVE-2026-00046
AVE's only CRITICAL-severity record: MCP tool hook hijacking. Here is its real aivss object and how the final score was derived.
Worked example — AVE-2026-00014
The lowest-severity record in the set: false authority claim via trust escalation. A useful contrast — high CVSS_Base alone does not guarantee a high final score.
confidence is never part of this calculation and never appears in an AVE record. AIVSS answers “how bad would this be if it fires”; confidence answers “how sure is the scanner that it fired.” The two are separate fields on a scanner Finding, computed independently. See Architecture for the full declares-vs-assigns contract.
Where this lives in a record
Field-level reference for every part of the aivss object — required/optional status and types — is on the Schema page. This page explains how the numbers are derived; Schema documents the field shapes that hold them.
OWASP AIVSS v0.8 specification: aivss.owasp.org