AVE Agentic Vulnerability Enumeration
Architecture · implementer guide

Architecture

The structure of the standard itself — what any tool must honor to implement AVE correctly and stay consistent with every other implementation.

AVE is a standard, not software. Its “architecture” is the structure of the standard: the record schema, the record/rule/fixture validation triangle, the contract between a static record and a runtime finding, the framework mappings, and the SARIF emission convention. None of this describes any one product’s internals — it describes what every implementer must honor.

The standard at a glance

Existing standards CVE · CVSS · OSV Map to package + version Blind to agent behavior Agent component threats Prompt injection, toxic flows, rug pulls, tool poisoning No package. No version. AVE fills the gap AVE — Agentic Vulnerability Enumeration Open, neutral, behavioral standard — the CVE for AI agents Stable IDs · AIVSS v0.8 scored · behavioral fingerprints Trusted frameworks OWASP MCP Top 10 MITRE ATLAS OWASP AIVSS v0.8 Scanner interop Bawbel · SkillSpector ClawScan · others one shared vocabulary Open governance Apache 2.0 Path to OWASP project No vendor lock-in The prize: findings from any tool interoperate One vocabulary the field shares — the CVE moment for AI agents
Problem → standard → why it wins → the prize

Conventional standards stop at the package: CVE identifies a flaw, OSV maps it to a version range. Agent component threats have no package and no version — the danger is behavioral. AVE assigns a stable id to each distinct behavioral vulnerability class, scores it with OWASP AIVSS v0.8, and maps it to the frameworks the field already uses. One record catches many textual variants of the same underlying attack.

Why this matters for implementers

If two scanners read the same record and produce divergent results, the standard has failed at its one job. The architecture below exists so any implementation agrees with any other.

How a record works

AVE record records/AVE-YYYY-NNNNN.json · validates against schema v1.1 Definition (static) ave_id · attack_class behavioral_fingerprint severity · aivss{ } owasp_mcp · mitre_atlas remediation · iocs Evidence declarations (v1.1) confidence_baseline evidence_kind_default detection_stage · detection_layer evidence_basis_engines derivable_into VALIDATION Rule pattern / yara / semgrep Positive fixture must trigger Negative fixture must NOT trigger has a CONSUMPTION — the record DECLARES, the scanner ASSIGNS Record DECLARES (static) Scanner ASSIGNS to Finding (runtime) confidence_baselineconfidence (then FP-adjusted) evidence_kind_defaultevidence_kind detection_stageevidence_stage (floor) evidence_basis_enginesevidence_basis derivable_intoToxicFlow.derived_from_findings Finding: confidence ≠ aivss_score — separate fields, always declares baselines OUTPUTS Finding → SARIF ave_id in ruleId + taxonomies → GitHub Security tab / CI Record set → PiranhaDB → api.piranha.bawbel.io → ave.bawbel.io Crosswalks SkillSpector and ClawScan finding types map to AVE ids — one vocabulary across scanners
anatomy → validation → consumption → output

Anatomy

A record has two halves. The static definition describes the vulnerability class and never changes per scan: ave_id (immutable), attack_class, behavioral_fingerprint, severity, aivss, and the framework mappings. The evidence declarations (added in v1.1) are optional and declare the defaults a scanner uses to assign per-finding metadata.

Validation — the record/rule/fixture triangle

Every record requires a detection rule (pattern, YARA, or Semgrep) that references the ave_id, a positive fixture that must trigger it, and a negative fixture — a benign lookalike that must not. A rule without a negative fixture is an incomplete false-positive guard.

Consumption — the record declares, the scanner assigns

This is the most important rule for implementers. A record is static; a finding is a runtime instance. Confidence belongs to the finding, never the record — the same class in a docs folder and in a live skill file deserves different confidence. So the record declares a confidence_baseline and the scanner does the per-detection math.

Record declares (static)Scanner assigns to finding (runtime)
confidence_baselineconfidence (then FP-adjusted)
evidence_kind_defaultevidence_kind
detection_stageevidence_stage (floor)
evidence_basis_enginesevidence_basis
derivable_intoToxicFlow.derived_from_findings
Core invariant

In any finding, confidence and aivss_score are separate fields and are never merged. AIVSS answers “how bad would this be”; confidence answers “how sure are we.” Putting baselines in the standard rather than in scanner code is what keeps two implementations consistent.

Output — how AVE travels

A scanner emits findings as SARIF with the ave_id in ruleId and under taxonomies, plus aivss_score, confidence, owasp_mcp, and mitre_atlas in the properties bag. Because SARIF is already consumed by the GitHub Security tab and CI, AVE ids reach those surfaces for free. The open record set can be consumed by any tool, and crosswalks let other scanners’ finding types resolve to AVE ids.

Build your own implementation

The schema, records, rules, and fixtures are all open under Apache 2.0. Honor the declares-vs-assigns contract and the SARIF convention, and your tool’s findings will interoperate with every other AVE implementation. See the schema reference and the repository.