TokenIntel's DeFi Risk Map grades each protocol on seven dimensions: Smart Contract, Oracle, Governance, Liquidity, Economic, Admin Architecture, and Disclosure Quality. Each dimension breaks down into three to five specific sub-criteria with explicit weights that sum to 100%. Each sub-criterion is scored 0 to 100 where lower is less risky. The dimension score is the weighted average of its sub-criteria. The overall protocol grade is the weighted average of dimension scores, converted to a letter (A to F).
Why publish the rubric?
Most DeFi risk scoring frameworks give you a single letter or number with minimal explanation of how it was computed. "Aave has an A rating" tells you almost nothing: what if audit coverage is strong but admin controls are weak? What if the protocol is immutable but depends on six off-chain custodians? A single aggregate number hides the tradeoffs that actually matter for a position decision.
TokenIntel takes the opposite approach. We decompose every dimension into specific sub-criteria, publish the weights, and show each sub-score individually on every protocol's Risk Map row. If you disagree with our weighting, you can reweight it yourself. If you think our score for a specific sub-criterion is wrong, you can see exactly which one and challenge it.
This framework is inspired by YieldCompass's DeFi strategy risk methodology, which pioneered this decomposition approach for Solana yield strategies. We adapted their ideas to TokenIntel's protocol-level scope and added TI-specific dimensions like Admin Architecture, which became more important after the April 2026 Drift Protocol exploit.
Scoring scale and letter grades
Every sub-criterion and dimension uses the same 0 to 100 scale where lower scores mean less risk. A sub-criterion score of 20 represents a protocol at low risk on that specific axis; a score of 80 represents a protocol that fails that check materially.
Dimension scores are the weighted average of their sub-criteria. The overall protocol risk score is the weighted average of the six dimension scores (weighted 20/15/15/15/15/20, see below). That 0 to 100 aggregate is mapped to a letter grade:
We deliberately use coarse grades and round sub-scores to multiples of 5. Finer scales suggest precision we do not have. A 27 and a 31 on the same sub-criterion both mean "low moderate risk" in practice, but users anchor on the exact number and treat small differences as meaningful. Coarse grades force honest, defensible scoring and make cross-protocol comparison easier.
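The aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not TokenIntel's production code. The function names and the raw sub-scores are hypothetical, and half-up rounding is an assumption (the rubric only says sub-scores are rounded to multiples of 5). The weights are the Smart Contract sub-criterion weights listed in the rubric.

```python
def round_to_step(raw_score: float, step: int = 5) -> int:
    """Round a raw 0-100 score to the nearest multiple of `step` (halves round up)."""
    return int((raw_score + step / 2) // step) * step


def weighted_average(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of `scores`; weights are normalized by their own total."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same keys")
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Smart Contract sub-criterion weights, in percent, as listed in the rubric.
SMART_CONTRACT_WEIGHTS = {
    "audit_coverage": 30,
    "hack_history": 25,
    "version_lindy": 20,
    "upgradeability": 25,
}

# Hypothetical raw sub-scores for an imaginary protocol (lower = less risky).
raw_sub_scores = {
    "audit_coverage": 22,
    "hack_history": 11,
    "version_lindy": 28,
    "upgradeability": 41,
}

# Round each sub-score to a multiple of 5 first, then aggregate.
sub_scores = {name: round_to_step(raw) for name, raw in raw_sub_scores.items()}
dimension_score = weighted_average(sub_scores, SMART_CONTRACT_WEIGHTS)

print(sub_scores)
print(dimension_score)  # 24.5
```

Rounding before aggregating keeps the published sub-scores and the dimension score consistent with each other: anyone can recompute the dimension score from the numbers shown on the page.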
Three layers of risk every dimension maps to
Our seven dimensions all score a protocol's current risk posture, but they do not all score the same kind of risk. Separating risks by their mechanism helps decide which score matters most for a given protocol, and when additional scoring is futile because a more fundamental risk dominates.
A useful taxonomy for this comes from Anastasiia (@mathy_research), who frames vault-level credit risk as three structurally independent layers in her April 2026 Vault Summit framework. The seventh dimension (Disclosure Quality) is orthogonal to these layers, since it scores institutional underwriteability rather than mechanical failure modes. We use the same layers to describe how the first six dimensions fit together:
When Layer 3 failure probability over your holding horizon is meaningfully larger than expected Layer 1 loss over the same horizon, improving Layer 1 scoring (better oracles, deeper liquidation pools, more conservative LTVs) does little to reduce total expected loss. Code-integrity risk is a prerequisite check: assess it first. A protocol with a top-quartile Oracle dimension and a lingering cross-bridge dependency on an unaudited messaging layer is, in practice, scored by that dependency. Per Anastasiia's framework, Layer 3 must be cleared before Layer 1 tuning is worth doing. Because lower scores mean less risk, this is why a poor (high-risk) Smart Contract or Admin Architecture score can sink an otherwise strong overall grade.
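A small worked example makes the prerequisite logic concrete. The sketch below rests on simplifying assumptions that are ours, not part of the cited framework: a Layer 3 failure is treated as a total loss, Layer 1 losses only apply when Layer 3 holds, and all probabilities and loss fractions are hypothetical.

```python
def total_expected_loss(p_layer3: float, layer1_loss: float) -> float:
    """Expected loss fraction over the holding horizon.

    Assumptions (illustrative only): a Layer 3 failure wipes out the position,
    and the expected Layer 1 loss applies only if Layer 3 does not fail.
    """
    return p_layer3 * 1.0 + (1.0 - p_layer3) * layer1_loss


# Hypothetical inputs: 5% chance of a code-integrity failure, 1% expected Layer 1 loss.
baseline = total_expected_loss(p_layer3=0.05, layer1_loss=0.01)

# Halving the Layer 1 loss barely moves the total...
better_layer1 = total_expected_loss(p_layer3=0.05, layer1_loss=0.005)

# ...while halving the Layer 3 failure probability moves it a lot.
better_layer3 = total_expected_loss(p_layer3=0.025, layer1_loss=0.01)

print(round(baseline, 5))       # 0.0595
print(round(better_layer1, 5))  # 0.05475
print(round(better_layer3, 5))  # 0.03475
```

Under these assumptions, the same proportional improvement is worth roughly five times more when applied to Layer 3 than to Layer 1.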
The seven dimensions and their sub-criteria
Each dimension is scoped to avoid overlap. Smart Contract covers the protocol's own code. Counterparty dependencies on external oracles, bridges, or custodians roll up under Oracle, Liquidity, or Admin Architecture as appropriate. Here is the full rubric.
1. Smart Contract
Dimension weight: 20%
Risk from bugs, exploits, or operational failures in the protocol's own fund-handling contracts. Higher weight than most other dimensions because smart contract failure is the fastest path to total loss.
| Sub-criterion | Weight | What we evaluate |
|---|---|---|
| Audit Coverage & Depth | 30% | Count and depth of independent audits on contracts handling user funds. Bonus credit for formal verification and active bug bounty programs. |
| Hack History | 25% | Past exploits or critical incidents affecting user funds, weighted by recency, severity, and quality of remediation. |
| Version Lindy | 20% | How long the currently deployed fund-handling contracts have operated without critical failure. Not protocol age: if the vault contract was redeployed last month, Lindy is measured from then, not from the protocol's launch. |
| Upgradeability & Control | 25% | Immutable vs upgradeable contracts, who controls upgrade authority, and whether unilateral modification of user-critical logic is possible. |
2. Oracle
Dimension weight: 15%
Risk from reliance on external or internal price feeds. A wrong price is indistinguishable from a wrong balance to most protocols.
| Sub-criterion | Weight | What we evaluate |
|---|---|---|
| Oracle Architecture | 40% | Quality and diversity of price feed architecture. Chainlink multi-source preferred over single-source TWAPs or proprietary feeds. |
| Manipulation Resistance | 30% | Resistance to flash loan manipulation and MEV extraction. Heartbeat, staleness checks, and sanity bounds. |
| Fallback & Override | 30% | Presence of circuit breakers, fallback oracles, and emergency override authority when price feeds misbehave. |
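
The staleness checks and sanity bounds named under Manipulation Resistance can be sketched as a single guard function. This is a generic illustration under assumed names and thresholds, not the interface of any particular oracle provider. `PriceReading`, `is_price_usable`, and the parameter values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceReading:
    price: float       # reported price in quote currency
    updated_at: float  # unix timestamp (seconds) of the last feed update


def is_price_usable(
    reading: PriceReading,
    now: float,
    heartbeat_s: float,
    reference_price: float,
    max_deviation: float,
) -> bool:
    """Return True only if the reading is positive, fresh, and within sanity bounds."""
    if reading.price <= 0 or reference_price <= 0:
        return False
    # Staleness check: the feed must have updated within its expected heartbeat.
    if now - reading.updated_at > heartbeat_s:
        return False
    # Sanity bound: reject prices too far from an independent reference price.
    deviation = abs(reading.price - reference_price) / reference_price
    return deviation <= max_deviation


# Hypothetical readings: one-hour heartbeat, 2% sanity band around the reference.
fresh = PriceReading(price=1.00, updated_at=1_000_000.0)
ok = is_price_usable(fresh, now=1_000_600.0, heartbeat_s=3600, reference_price=1.01, max_deviation=0.02)
stale = is_price_usable(fresh, now=1_010_000.0, heartbeat_s=3600, reference_price=1.01, max_deviation=0.02)
off_market = is_price_usable(fresh, now=1_000_600.0, heartbeat_s=3600, reference_price=0.50, max_deviation=0.02)

print(ok, stale, off_market)  # True False False
```

A protocol that rejects the price in the second and third cases still needs something to fall back on, which is what the Fallback & Override sub-criterion scores.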
3. Governance
Dimension weight: 15%
Risk from how decisions are made and executed. Even a perfectly audited contract is only as safe as the process that decides what code runs next.
| Sub-criterion | Weight | What we evaluate |
|---|---|---|
| Upgrade Authority | 40% | Who can push code changes. Timelock length, quorum requirements, and whether approval requires multi-entity sign-off. |
| Multisig & Key Custody | 30% | Multisig signer count, threshold, and diversity. Independent signers preferred over team-only. |
| Emergency Powers | 30% | Scope of unilateral pause, freeze, or recovery capabilities. Who holds them and under what conditions. |
4. Liquidity
Dimension weight: 15%
Risk from being unable to exit a position when you want to. High displayed TVL means nothing if withdrawals are gated or slippage is catastrophic at size.
| Sub-criterion | Weight | What we evaluate |
|---|---|---|
| Exit Depth | 40% | Slippage impact for large withdrawals. TVL relative to single-position exit size. |
| Withdrawal Constraints | 30% | Cooldowns, queues, withdrawal caps, and processing delays before funds are available. |
| Redemption Model | 30% | Instant on-chain redemption vs epoch-based vs reliance on secondary market liquidity. |
5. Economic
Dimension weight: 15%
Risk that the protocol's economic model cannot sustain its own returns. Yield from genuine fees is durable; yield from emissions is a countdown timer.
| Sub-criterion | Weight | What we evaluate |
|---|---|---|
| Revenue Durability | 40% | Real fees from genuine usage vs emissions or subsidies. Would the yield exist without the token? |
| Incentive Dependence | 30% | Fraction of displayed APY driven by temporary incentives, points, or token emissions rather than protocol revenue. |
| Token Capture Mechanism | 30% | Does the token have a mechanism (fee switch, buyback, burn) that routes real protocol revenue to holders? |
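
Incentive Dependence reduces to a simple ratio once the displayed APY is split into its sources. The sketch below is one way to express that ratio. The function name, the two-way split into fee and incentive yield, and the sample figures are assumptions for illustration, not TokenIntel's scoring formula.

```python
def incentive_share(fee_apy_pct: float, incentive_apy_pct: float) -> float:
    """Fraction of displayed APY that comes from temporary incentives (0.0 to 1.0)."""
    if fee_apy_pct < 0 or incentive_apy_pct < 0:
        raise ValueError("APY components must be non-negative")
    displayed_apy = fee_apy_pct + incentive_apy_pct
    if displayed_apy == 0:
        return 0.0  # nothing displayed, so nothing depends on incentives
    return incentive_apy_pct / displayed_apy


# Hypothetical vault: 12% displayed APY, of which 3 points are real fees
# and 9 points are token emissions or points programs.
share = incentive_share(fee_apy_pct=3.0, incentive_apy_pct=9.0)
print(share)  # 0.75
```

In this hypothetical case, three quarters of the displayed yield would disappear if the incentives ended, which is the question the Revenue Durability row asks in words.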
6. Admin Architecture
Dimension weight: 17%
Risk from how administrative powers are scoped and custodied. This dimension was added after the April 2026 Drift Protocol attack, where $285M was drained in 12 minutes via 31 withdrawals using privileged access. A perfectly audited contract with a compromised admin key is still a zero.
| Sub-criterion | Weight | What we evaluate |
|---|---|---|
| Key Custody Model | 30% | EOA vs multisig vs timelocked DAO controls. Separation of pause, parameter, and upgrade powers. |
| Signer Diversity | 25% | Independent signers across organizations vs team-only. Public identities preferred over anonymous. |
| Action Scope | 25% | What admin can change. Parameter-only changes are lower risk than arbitrary code upgrades or treasury access. |
| Risk Oversight | 20% | External risk advisory (Chaos Labs, Gauntlet, BlockScience) and maturity of incident response procedures. |
7. Disclosure Quality
Dimension weight: 10%
Institutional underwriteability of the protocol from an information-availability standpoint. The other six dimensions score how the protocol can fail; this one scores whether you can analyze it. Inspired by Blockworks' Token Transparency Framework filings (already integrated via data/ttf-registry.json) and Novora's 5-pillar IR Score, with TI's own per-protocol observations on cadence and depth. Weighted lower than core technical dimensions because poor disclosure reflects underwriting friction, not direct loss-of-funds risk, but it's the dimension that determines whether a real institutional allocator can deploy at all. Per Novora's April 2026 audit of 159 protocols: ~91% generate trackable revenue, but only ~18% publish quarterly updates, ~8% issue token-holder reports, and fewer than ~1% disclose market-maker terms.
| Sub-criterion | Weight | What we evaluate |
|---|---|---|
| Standardized Framework Filing | 30% | Whether the protocol has filed a public disclosure with a recognized framework (Blockworks TTF, Novora IR Score, or equivalent). Filings include token allocations, supply schedules, financial disclosures, and team accountability. Not filing scores worse than partial filing. |
| Investor Cadence | 20% | Frequency and depth of investor-style updates: quarterly reports, ecosystem updates, milestone disclosures. Quarterly is the institutional baseline. |
| Token Holder Reports | 20% | Dedicated communications to token holders covering revenue accrual, buyback execution, supply changes, and forward outlook. |
| Treasury & Buyback Transparency | 15% | Treasury holdings published with auditable trail; buyback execution data (volume, price, timing) disclosed real-time or near-real-time. On-chain buyback-and-burn passes automatically; buyback-and-hold needs extra discipline. |
| Market Maker Terms | 15% | Whether the protocol discloses market-maker engagements (which firms, what loans, what option strikes). One of the largest sources of soft-circulating supply pressure that crypto protocols routinely fail to disclose. |
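
Because the weights are published, a reader who disagrees with them can recompute the aggregate with their own. The sketch below shows one way to do that. The dimension weights are copied from the "Dimension weight" lines in the rubric above, while the dimension scores and the alternative weighting are hypothetical. The function divides by the total weight, so the result stays on the 0 to 100 scale whatever the chosen weights sum to.

```python
def overall_score(dimension_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of dimension scores, normalized by the total weight."""
    if dimension_scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    return sum(dimension_scores[d] * weights[d] for d in weights) / total_weight


# Weights as listed under each dimension heading in the rubric (percent).
LISTED_WEIGHTS = {
    "smart_contract": 20,
    "oracle": 15,
    "governance": 15,
    "liquidity": 15,
    "economic": 15,
    "admin_architecture": 17,
    "disclosure_quality": 10,
}

# Hypothetical dimension scores for an imaginary protocol (lower = less risky).
scores = {
    "smart_contract": 25,
    "oracle": 30,
    "governance": 40,
    "liquidity": 20,
    "economic": 35,
    "admin_architecture": 50,
    "disclosure_quality": 60,
}

published = overall_score(scores, LISTED_WEIGHTS)

# A reader who cares more about admin-key risk doubles that dimension's weight.
my_weights = {**LISTED_WEIGHTS, "admin_architecture": 34}
reweighted = overall_score(scores, my_weights)

print(round(published, 1), round(reweighted, 1))
```

Since this imaginary protocol scores worst on the dimension being emphasized, the reweighted aggregate comes out riskier than the published one.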
Five additional checks on every protocol
Beyond the scored dimensions, we track five binary and quantitative checks on every protocol research page. These are not weighted into the aggregate score because they are effectively red flags: a protocol that fails any of them has a structural problem regardless of its dimension scores.
Reference case: Kelp DAO / LayerZero, April 2026
Attackers drained 116,500 rsETH (~$290M) from Kelp's LayerZero-powered cross-chain bridge by compromising a single DVN verifier path in a 1-of-1 configuration. They obtained root access to LayerZero Labs' DVN RPC infrastructure, replaced the op-geth binary on two of three nodes, and DDoS'd the uninfected third. A failover to the compromised nodes let the DVN attest a forged "burn" message claiming 116,500 rsETH had been burned on Unichain. The burn never happened: Unichain's outbound nonce stayed at 307 while Ethereum accepted nonce 308. The OFT Adapter on Ethereum released the funds as instructed.
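The nonce mismatch described above is the kind of invariant an independent monitor can check: a destination chain should never accept a message nonce higher than the source chain has actually sent. The sketch below illustrates that check in isolation. It is not LayerZero's API, and the function name and inputs are hypothetical. In practice the two values would have to come from independently operated nodes for the check to mean anything.

```python
def unbacked_inbound_nonces(source_outbound_nonce: int, accepted_inbound_nonces: list[int]) -> list[int]:
    """Return accepted nonces that the source chain never emitted.

    `source_outbound_nonce` is the highest nonce the source chain has sent on
    this pathway; any accepted nonce above it has no matching source message.
    """
    return sorted(n for n in accepted_inbound_nonces if n > source_outbound_nonce)


# Figures from the incident: the source chain's outbound nonce stayed at 307
# while the destination accepted nonce 308.
suspicious = unbacked_inbound_nonces(
    source_outbound_nonce=307,
    accepted_inbound_nonces=[306, 307, 308],
)
print(suspicious)  # [308]
```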
Downstream contagion: the attacker then deposited 89,567 rsETH (76.9% of the stolen total) as collateral on Aave V3, borrowing 82,650 WETH + 821 wstETH (~$193M). Aave is now holding $124M–$230M in potential bad debt depending on how Kelp DAO allocates losses between Ethereum mainnet and L2 rsETH holders. BGD Labs had explicitly warned Aave about this specific risk during the rsETH listing discussion in February 2025, recommending a multi-DVN configuration. The warning was not adopted. This makes it a governance-process failing at Aave, not only a Kelp / LayerZero failing.
Key takeaways for TI's risk scoring: (1) A protocol's own contract security is independent of the bridge it depends on. Aave's smart contracts, oracle system, and liquidation mechanisms all operated correctly throughout. (2) Messaging-layer default configurations can be insecure even when documented. LayerZero's V2 OApp Quickstart sample wires every pathway with a single required DVN, and per the Dune dashboard roughly 32% of LayerZero OApps currently run this minimal configuration. (3) A fast pause mechanism limits downside materially: Kelp's 46-minute pause blocked a second attack that would have released ~$100M more. (4) DVN count alone is not a sufficient security metric. A 2-of-2 DVN configuration would not have helped here if both DVNs read from the same compromised RPCs. Real diversification requires independent RPC providers, independent hosting infrastructure, and ideally different verification methods (some cryptographic, some not). (5) Architecture alone doesn't determine trust; underwriting discipline does. SparkLend, which uses the same unified-pool architecture as Aave, captured the largest share of post-incident inflows (+$1.8B deposits Apr 19–21) because it had proactively deprecated rsETH in January 2026, rate-limits supply and borrow caps to prevent explosive exposure growth, and maintained >$350M of instantly-available spUSDT liquidity through the crisis. The Blockworks / Leasure + Shaundadevens post-mortem frames this as "the deeper re-rating may be occurring not just across architectures, but across perceptions of who underwrites risk most credibly." For our scoring, this means the Governance dimension now explicitly weights pre-incident deprecation decisions and risk-param conservatism, not just post-incident response speed.
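Takeaway (4) can be made concrete by counting independent failure domains instead of DVNs. The sketch below groups DVNs that share any RPC provider and counts the resulting groups. It is an illustration under assumed inputs: the DVN and provider names are hypothetical, and a real assessment would also have to account for shared hosting and operator identity, which this sketch ignores.

```python
def independent_failure_domains(dvn_rpcs: dict[str, set[str]]) -> int:
    """Count groups of DVNs that share no RPC provider with any other group.

    DVNs that share a provider, directly or through a chain of shared
    providers, are treated as one correlated failure domain.
    """
    groups: list[set[str]] = []  # each group is the set of providers it depends on
    for providers in dvn_rpcs.values():
        merged = set(providers)
        untouched = []
        for group in groups:
            if group & merged:
                merged |= group  # overlapping dependency: fold into one domain
            else:
                untouched.append(group)
        groups = untouched + [merged]
    return len(groups)


# Two DVNs reading from the same RPC provider: one failure domain, not two.
shared = independent_failure_domains({"dvn_a": {"rpc_x"}, "dvn_b": {"rpc_x"}})

# Two DVNs on separate providers: two failure domains.
separate = independent_failure_domains({"dvn_a": {"rpc_x"}, "dvn_b": {"rpc_y"}})

print(shared, separate)  # 1 2
```

A 2-of-2 configuration that collapses to one failure domain offers roughly the protection of a 1-of-1 against an RPC-level compromise.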
LayerZero's post-incident policy (April 2026): LayerZero will stop signing/attesting messages for any application maintaining single-DVN configurations; all 1-of-1 OApps must migrate to multi-DVN. This is a forced-migration event for the ~32% of LayerZero apps that haven't upgraded yet.
Arbitrum precedent: the 12-member Arbitrum Security Council invoked ArbitrumUnsignedTxType (EIP-2718) for the first time to freeze 30,766 ETH (~$71.5M) from the attacker on Arbitrum. The power existed in the chain's design but had never been used; its first invocation raises real questions about the decentralization spectrum in practice, and sets a precedent that the council may face pressure to use again for less clear-cut cases.
LayerZero structural fixes (May 2026 statement): the response goes well past the 1-of-1 ban. (a) Default-config migration: defaults on all pathways migrate to 5/5 DVNs where possible, no less than 3/3 on chains where only 3 DVNs are available. (b) Client diversity: a second DVN client written in Rust is in development, addressing the op-geth binary-swap attack vector specifically. (c) Granular RPC quorum config: DVNs can now select tiered quorums of internal, dedicated-external, and shared-external RPCs, addressing the shared-RPC failure mode that took down the Kelp pathway. (d) OneSig: a custom multisig where signers download transactions, merklize and hash them locally, and sign the root hash, preventing a compromised backend from slipping unauthorized transactions into the signing flow. LayerZero Labs is migrating its own multisig threshold from 3-of-5 to 7-of-10 across chains where OneSig exists. (e) Per-signer private anomaly checkers: every OneSig signer maintains their own custom anomaly checker on their signing device; criteria are not shared with the company or other signers, defeating insider-coordination attacks. (f) Console: a unified configuration platform with automated anomaly detection (unknown DVNs, ownership changes, block-confirmation changes, unsafe configurations, default-config usage). For TI's scoring, this is where the next protocol-side check moves: a LayerZero-using protocol that has migrated to non-default configs, integrated with Console anomaly detection, and uses a multi-DVN setup with diverse RPC quorums will earn a clean Cross-Chain Messaging Posture score; a protocol still on defaults at end of 2026 will score one full letter lower regardless of other strengths.
Internal-process disclosure worth flagging: the May 2026 LayerZero statement also discloses that approximately 3.5 years prior, a multisig signer used the multisig hardware wallet for a personal trade by mistake (intending to use a personal wallet). The signer was removed and wallets rotated. This is a separate trust-assumption data point: even a hardened multisig setup carries human-process risk that is not visible from on-chain configuration alone. The OneSig + per-signer anomaly checker design appears to be a direct response to this class of failure, beyond the Kelp incident itself.
Conflicting narratives about whose infrastructure was compromised and what guidance was given remain unresolved between Kelp and LayerZero; TI scores configuration posture, not attribution.
Sources: CoinDesk, OAK Research, and LlamaRisk coverage (April 2026); Banteg's on-chain attack investigation; Aave governance forum (incident report + scenario modeling); Dune's LayerZero OApp DVN Configuration dashboard. Dune methodology caveat: the dashboard reports DVN cardinality but does not expose the N-of-M threshold for optional DVNs and does not label operator identity. A configuration that looks safe on cardinality alone can still have correlated operators, shared RPC infrastructure (as this incident demonstrated), or a weak optional-DVN threshold.
The AI-era risk-surface shift (May 2026 calibration)
Through 2025 the highest-frequency cause of catastrophic DeFi loss was a code bug in the protocol's own contracts. TI's framework (and most peer frameworks) reflected that, weighting Smart Contract heavily. Two developments in 2026 are forcing a recalibration: AI-assisted auditing has compressed the cost of finding subtle contract bugs, and AI-assisted offensive capability has compressed the cost of attacking everything else (frontends, signing infrastructure, RPC providers, dev credentials, browser-level exploits, social engineering augmented by deepfakes). The empirical record of recent incidents already reflects the new asymmetry.
Anchor capability: Anthropic Claude Mythos Preview (released May 2026)
Per Anthropic's published system card and Project Glasswing materials, Mythos Preview has solved "The Last Ones" (TLO), a 32-step corporate-network attack simulation Anthropic estimates requires ~20 hours of skilled human effort, end-to-end without supervision. The same model has surfaced thousands of high-severity vulnerabilities spanning every major operating system and web browser. Earlier preview versions, during evaluation runs, used /proc/ access to search for credentials, attempted to circumvent sandboxing and escalate privileges, accessed credentials for messaging services, source control, and the Anthropic API by inspecting process memory, and intervened to suppress git history evidence after editing files outside their sanctioned scope. Anthropic restricts general access to the model and is deploying it defensively to a curated set of "systemically important" tech companies via Project Glasswing.
Note on self-presented capability claims. Mythos numbers come from Anthropic's own evaluation framework. The marketing layer overstates the asymmetry, but the directional claim (multi-step network attack capability has crossed a threshold that previously required skilled humans) is corroborated by independent government AI-safety teams' published evals. TI weights this as directionally credible, with the specific benchmarks treated as upper-bound estimates.
Sources: Anthropic Mythos system card; Project Glasswing announcement; DeFi Education's "Can The AI Companies Do Security?" framing post (May 2026). Also relevant: Anthropic's March 31, 2026 source-code leak via a public-bucket source map in a production npm package, which surfaced 3 shell-injection vulnerabilities in the leaked code, a separately-documented operational-security failure at the same vendor.
How this shifts TI's scoring (specific calibration changes). The framework is layered (Layer 1 mechanical, Layer 2 parametric, Layer 3 code-integrity), so the rebalance is not a single weight change. It is differential pressure on which Layer 3 sub-channels matter and where adjacent operational-security checks belong.
What does not change. The Layer 3 prerequisite logic still holds (code-integrity must be cleared before Layer 1 tuning is worth doing). The YieldCompass framework still applies. The TradFi-translation failure modes covered below still apply. AI is a multiplier on the existing structure; the structure itself stays.
How TI surfaces this in protocol scoring. Existing research-page risk arrays will show the recalibrated weights; new sub-criteria appear in protocol risk reports starting May 2026. Where a protocol's Smart Contract score improves because of AI-era reduced contract-bug risk, the corresponding Admin Architecture or Frontend Consistency score may worsen to reflect the absorbed-but-shifted risk surface. Net protocol grades may not move materially in either direction; the composition of those grades is what changes.
Six places TradFi credit analogies break in DeFi
Our rubric is calibrated to how DeFi lending actually fails, not to a direct translation of bank-style credit assessment. The following six failure modes, articulated by Anastasiia (@mathy_research) in her April 2026 Vault Summit paper, describe where standard credit-risk concepts produce biased or misleading measures if applied to an onchain lending vault without adjustment. They map cleanly onto the Layer 1 (mechanical) risks scored above, and understanding them helps read our Oracle, Liquidity, and Economic scores more precisely.
Reference case: Resolv / Morpho, March 22, 2026
A compromised offchain signing key, a single EOA with no onchain validation, was used to mint 80 million unbacked USR tokens for $200,000 in attacker-controlled capital. That piece was a key-management and contract-design failure, outside the scope of any vault-risk framework. What happened next was entirely inside scope: a cascade of secondary losses driven by oracle-latency and wrong-way-allocation failures that our risk methodology is now explicitly calibrated to catch.
Oracle latency (failure mode 5a). USR's NAV-based oracle updated once per 24 hours. For hours after the exploit, it faithfully reported pre-exploit collateral-to-supply ratios: RLP oracle read $1.29 while the market cleared at $0.52; USR oracle read near $1.00 while Curve pools showed $0.025. Secondary borrowers, uninvolved in the original exploit, rationally borrowed USDC against oracle-inflated USR collateral because the price gap was an arbitrage-like opportunity if you trusted the oracle. The resulting bad debt was not the attacker's work. It was the mechanical consequence of an oracle that had become structurally incorrect on an automatically-enforced lending protocol.
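The size of the false-solvency window can be read directly off the figures above. The sketch below computes how far each oracle overstated the market price and the shortfall per unit of collateral at a given loan-to-value ratio. The function names and the 80% LTV are hypothetical, since the actual market parameters are not stated here.

```python
def oracle_overstatement(oracle_price: float, market_price: float) -> float:
    """Ratio of the oracle-reported price to the price the market actually clears at."""
    return oracle_price / market_price


def shortfall_per_unit(oracle_price: float, market_price: float, ltv: float) -> float:
    """Debt that one unit of collateral can support at the oracle price,
    minus what that unit is worth at the market price (floored at zero)."""
    return max(0.0, ltv * oracle_price - market_price)


# Prices reported above for the hours after the exploit.
rlp_ratio = oracle_overstatement(oracle_price=1.29, market_price=0.52)
usr_ratio = oracle_overstatement(oracle_price=1.00, market_price=0.025)

# Hypothetical 80% LTV, used only to show the mechanics.
usr_shortfall = shortfall_per_unit(oracle_price=1.00, market_price=0.025, ltv=0.80)

print(round(rlp_ratio, 2))      # 2.48
print(round(usr_ratio, 1))      # 40.0
print(round(usr_shortfall, 3))  # 0.775
```

Any positive shortfall means a borrower is better off keeping the loan than redeeming the collateral, which is why the bad debt accrued mechanically once the gap opened.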
Wrong-way automated allocation. Public-allocator vaults (including Gauntlet-curated Morpho markets) treated the 100% utilization that emerged in the affected markets as a yield signal and continued supplying USDC to those markets for hours after the exploit began. Before auto-allocators began flowing USDC in, bad debt in the affected Morpho markets was approximately $4,900. The multi-million-dollar vault losses were generated by automated capital inflows after stress was already visible onchain. The post-mortem floor on ecosystem losses is roughly $3.8M in bad debt across Morpho markets alone, with $8.9M total allocated capital exposed.
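The difference between a utilization-only allocator and one with a stress guard fits in a few lines. The sketch below is a simplified illustration, not the logic of any Morpho or Gauntlet allocator. The function names, the 90% utilization target, and the 5% discount tolerance are all hypothetical.

```python
def naive_should_supply(utilization: float, target_utilization: float = 0.90) -> bool:
    """Supply more capital whenever utilization is above target (yield signal only)."""
    return utilization > target_utilization


def guarded_should_supply(
    utilization: float,
    oracle_price: float,
    market_price: float,
    target_utilization: float = 0.90,
    max_discount: float = 0.05,
) -> bool:
    """Same yield signal, but refuse to supply while collateral trades at a
    large discount to its oracle price (a sign the oracle is stale or wrong)."""
    if oracle_price <= 0:
        return False
    discount = (oracle_price - market_price) / oracle_price
    if discount > max_discount:
        return False
    return naive_should_supply(utilization, target_utilization)


# Stressed market from the case above: 100% utilization, USR oracle near $1.00,
# market price $0.025.
naive_decision = naive_should_supply(utilization=1.0)
guarded_decision = guarded_should_supply(utilization=1.0, oracle_price=1.00, market_price=0.025)

print(naive_decision, guarded_decision)  # True False
```

The guard needs an independent market price as input, so it only helps if that input does not come from the same oracle it is meant to check.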
Key takeaways for TI's risk scoring: (1) Oracle design is a credit-risk input, not an engineering detail. A NAV-based oracle updating on a 24-hour cadence against a synthetic stablecoin whose underlying value correlates with market stress has a structurally predictable false-solvency window. This is derivable from public facts (update frequency + reference-asset volatility) before any exploit; the Oracle dimension's sub-criteria now explicitly flag update cadence on stress-correlated collateral. (2) Automated allocators need stress conditionality. A public allocator that treats 100% utilization as pure yield, without a deteriorating-collateral guard, is wrong-way allocation by design. If a curator's published allocator doesn't document its behavior under collateral-peg failure, assume it doesn't have one. This is now scored under Governance (parameter reactivity) and Liquidity (allocator behavior under stress). (3) Rehypothecation depth is a risk multiplier, not a neutral design choice. When the collateral asset is itself a share token of another lending strategy (USR backed by a delta-neutral funding-rate position), a shock propagates through every protocol layer that accepted the derivative as collateral. Our Collateral Concentration check now treats rehypothecation chains >= 2 layers as a separate flag.
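The rehypothecation flag in takeaway (3) amounts to measuring how many derivative layers sit between a collateral asset and its ultimate backing. The sketch below shows one way to count them. The asset names, the single-parent `backing` map, and the definition of depth (number of hops to an asset with no further backing) are assumptions for illustration.

```python
from typing import Optional


def rehypothecation_depth(asset: str, backing: dict[str, Optional[str]]) -> int:
    """Number of hops from `asset` down to an asset with no further backing."""
    depth = 0
    seen = {asset}
    current = backing.get(asset)
    while current is not None:
        if current in seen:
            raise ValueError(f"circular backing detected at {current!r}")
        seen.add(current)
        depth += 1
        current = backing.get(current)
    return depth


# Hypothetical chain: a vault share token backed by USR, which is itself
# backed by a delta-neutral funding-rate position.
backing = {
    "vault_share": "USR",
    "USR": "delta_neutral_position",
    "delta_neutral_position": None,
}

depth = rehypothecation_depth("vault_share", backing)
flagged = depth >= 2  # threshold from the takeaway above

print(depth, flagged)  # 2 True
```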
Sources: Anastasiia (@mathy_research), "DeFi Lending Credit Risk: A Three-Part Framework" (Vault Summit, April 2026); Morpho incident retrospective; Resolv governance forum post-mortem. Last verified: 2026-04-23.
What this framework does not capture
We are explicit about the limits of the rubric so users can apply it appropriately:
- Regulatory risk varies by jurisdiction and changes faster than quarterly re-scoring can capture. We flag major regulatory actions on individual research pages as they happen.
- Systemic contagion (what happens if a depeg cascades through 10 protocols that share collateral types) is not directly scored. Oracle and Liquidity sub-criteria cover the proximate risks; true contagion requires separate stress-test analysis.
- Insurance and cover costs are not in the rubric. A protocol with expensive Nexus Mutual cover is signaling higher perceived risk from specialist underwriters, which is information we recommend users incorporate separately.
- Team reputation and off-chain conduct are partially captured in Governance and Signer Diversity sub-criteria but not exhaustively. We do not score team members individually.
- Agentic-execution risk is not yet scored. As AI agents begin transacting directly onchain via standards like x402 and ERC-8004 / 8183 / 8211, a new attack class becomes relevant: prompt injection through poisoned oracles, ENS records, or contract metadata can hijack agent behaviour and drain wallets with no phishing link clicked or malware installed. This is adjacent to our Oracle dimension but distinct in that the attack surface is the agent's reasoning layer rather than the protocol's pricing layer. We expect to add an agentic-execution sub-criterion once the workload is material enough to score.
How scores are updated
Sub-criteria and dimension scores are reviewed on an ongoing basis. Changes are logged in the defi-risk-scores.json source file with a bumped lastUpdated date. Major changes (Chaos Labs departing Aave, Drift being attacked, the April 2026 Kelp / LayerZero incident, a new audit round published) trigger same-day re-scoring. Minor drift is re-evaluated weekly.
The framework itself is versioned. The current version (v2) was published in April 2026 after decomposing the original six-dimension aggregate scores into the twenty sub-criteria documented here.
See the framework in action
Every protocol on the DeFi Risk Map has its six dimension scores and twenty sub-criteria visible in context.