TokenIntel Research · Methodology

How TokenIntel Scores DeFi Protocol Risk

Six dimensions. Twenty published sub-criteria. Explicit weights. No vibes, no vague ratings. Every score on the DeFi Risk Map traces back to a decomposed rubric you can inspect.

The short version

TokenIntel's DeFi Risk Map grades each protocol on six dimensions: Smart Contract, Oracle, Governance, Liquidity, Economic, and Admin Architecture. Each dimension breaks down into three or four specific sub-criteria with explicit weights that sum to 100%. Each sub-criterion is scored 0 to 100 where lower is less risky. The dimension score is the weighted average of its sub-criteria. The overall protocol grade is the weighted average of dimension scores, converted to a letter (A to F).
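In code, the aggregation is just two nested weighted averages. The sketch below uses the published 20/15/15/15/15/20 dimension weights; the sub-scores and dimension scores in the usage example are hypothetical, not live TokenIntel data.

```python
# Two-level weighted average, as described above. Dimension weights are the
# published 20/15/15/15/15/20 split; example scores are hypothetical.

DIMENSION_WEIGHTS = {
    "Smart Contract": 0.20, "Oracle": 0.15, "Governance": 0.15,
    "Liquidity": 0.15, "Economic": 0.15, "Admin Architecture": 0.20,
}

def dimension_score(sub_scores, sub_weights):
    """Weighted average of sub-criterion scores (0-100, lower = less risky)."""
    assert abs(sum(sub_weights.values()) - 1.0) < 1e-9, "weights must sum to 100%"
    return sum(sub_scores[name] * w for name, w in sub_weights.items())

def overall_score(dim_scores):
    """Weighted average of the six dimension scores."""
    return sum(dim_scores[d] * w for d, w in DIMENSION_WEIGHTS.items())

dims = {"Smart Contract": 20, "Oracle": 30, "Governance": 25,
        "Liquidity": 35, "Economic": 40, "Admin Architecture": 15}
overall_score(dims)  # 26.5 -> a B under the letter bands
```

The asserted weight-sum check matters in practice: a rubric edit that leaves weights summing to anything other than 100% silently distorts every score.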

Why publish the rubric?

Most DeFi risk scoring frameworks give you a single letter or number with minimal explanation of how it was computed. "Aave has an A rating" tells you almost nothing: what if audit coverage is strong but admin controls are weak? What if the protocol is immutable but depends on six off-chain custodians? A single aggregate number hides the tradeoffs that actually matter for a position decision.

TokenIntel takes the opposite approach. We decompose every dimension into specific sub-criteria, publish the weights, and show each sub-score individually on every protocol's Risk Map row. If you disagree with our weighting, you can reweight it yourself. If you think our score for a specific sub-criterion is wrong, you can see exactly which one and challenge it.

This framework is inspired by YieldCompass's DeFi strategy risk methodology, which pioneered this decomposition approach for Solana yield strategies. We adapted their ideas to TokenIntel's protocol-level scope and added TI-specific dimensions like Admin Architecture, which became more important after the April 2026 Drift Protocol exploit.

Scoring scale and letter grades

Every sub-criterion and dimension uses the same 0 to 100 scale where lower scores mean less risk. A sub-criterion score of 20 represents a protocol at low risk on that specific axis; a score of 80 represents a protocol that fails that check materially.

Dimension scores are the weighted average of their sub-criteria. The overall protocol risk score is the weighted average of the six dimension scores (weighted 20/15/15/15/15/20, see below). That 0 to 100 aggregate is mapped to a letter grade:

A (0 to 24)
Low risk across all dimensions. Top-tier audits, no hack history, deep liquidity, battle-tested admin architecture.
B (25 to 39)
Mostly strong with 1-2 moderate risks. Suitable for meaningful exposure with standard risk management.
C (40 to 54)
Mixed profile. Some dimensions strong, others concerning. Requires understanding which specific risks you are taking.
D (55 to 69)
Multiple material risks. Only appropriate for small, informed positions.
F (70 and up)
Severe risk on multiple dimensions. We do not recommend exposure regardless of yield.
Why not a finer scale?

We deliberately use coarse grades and round sub-scores to multiples of 5. Finer scales suggest precision we do not have. A 27 and a 31 on the same sub-criterion both mean "low moderate risk" in practice, but users anchor on the exact number and treat small differences as meaningful. Coarse grades force honest, defensible scoring and make cross-protocol comparison easier.
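The letter bands and the round-to-5 convention are mechanical enough to state directly as code. This sketch mirrors the bands listed above; the exact rounding helper is our illustration of the convention, not published TI code.

```python
def round_to_5(score: float) -> int:
    """Sub-scores are published only at multiples of 5."""
    return int(5 * round(score / 5))

def letter_grade(score: float) -> str:
    """Map a 0-100 aggregate (lower = safer) to the published letter bands."""
    if score < 25:
        return "A"
    if score < 40:
        return "B"
    if score < 55:
        return "C"
    if score < 70:
        return "D"
    return "F"

letter_grade(24)  # "A"
letter_grade(70)  # "F"
round_to_5(27)    # 25
```

Note that Python's built-in round uses banker's rounding, so exact midpoints like 22.5 round to the even multiple; for published scores that edge case never arises because inputs are already coarse.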

Three layers of risk every dimension maps to

Our six dimensions all score a protocol's current risk posture, but they do not all score the same kind of risk. Separating risks by their mechanism helps decide which score matters most for a given protocol — and when additional scoring is futile because a more fundamental risk dominates.

A useful taxonomy for this comes from Anastasiia (@mathy_research), who frames vault-level credit risk as three structurally independent layers in her April 2026 Vault Summit framework. We use the same layers to describe how our six dimensions fit together:

Layer 1 — Mechanical
Risk that arises from how the protocol executes when its contracts and parameters work exactly as designed. Oracle-to-execution price gaps, liquidation slippage against finite pool depth, gas and MEV competition consuming liquidation margin, utilization ceilings forcing withdrawal freezes. Our Oracle, Liquidity, and Economic dimensions primarily score this layer, and four of the five additional checks (Collateral Concentration, Dependency Count, Cross-Chain Messaging Posture, Frontend Contract Consistency) interact with it.
Layer 2 — Governance
Risk that the parameter-setting process cannot react faster than stress evolves. The gap between a protocol's response window (how quickly a parameter change would need to land to be protective) and its timelock duration (how long governance takes to ship that change) is a structural risk in itself, not just a process observation. Our Governance dimension scores this, with recent emphasis (post-Kelp) on pre-incident deprecation decisions and risk-param conservatism, not just post-incident response speed.
Layer 3 — Code integrity
Risk that the contract dependency graph executes contrary to specification — an undiscovered vulnerability, an exploit, an unauthorized upgrade. When this layer fails, none of the Layer 1 or Layer 2 scores provide advance warning: the loss is a code failure, not a risk-management failure. Our Smart Contract and Admin Architecture dimensions score this layer. Empirically, single-protocol vaults with strong audit + bug-bounty track records show annualized exploit probability in the low fractions of a percent; cross-protocol or bridge-dependent vaults are materially higher, because code-integrity risk compounds with each additional protocol layer in the dependency chain.
Dominance condition

When Layer 3 failure probability over your holding horizon is meaningfully larger than expected Layer 1 loss over the same horizon, improving Layer 1 scoring (better oracles, deeper liquidation pools, more conservative LTVs) does not reduce total expected loss. Code-integrity risk is a prerequisite check: assess it first. A protocol with a top-quartile Oracle dimension and a lingering cross-bridge dependency to an unaudited messaging layer is, in practice, scored by that dependency. Per Anastasiia's framework, Layer 3 must be cleared before Layer 1 optimization is worth doing. This is why a low Smart Contract or Admin Architecture score can sink an otherwise strong overall grade.
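The dominance condition reduces to a comparison of two expected-loss terms over the same holding horizon. All figures in the sketch below are hypothetical, not TI estimates for any real protocol.

```python
# Expected-loss comparison behind the dominance condition. Both terms are
# loss fractions over the same holding horizon; numbers are hypothetical.

def dominant_layer(p_exploit: float, lgd_exploit: float,
                   expected_mechanical_loss: float) -> str:
    """Layer 3 expected loss = exploit probability x loss-given-exploit.
    When it exceeds Layer 1 (mechanical) expected loss, code integrity
    dominates and further Layer 1 optimization does not reduce total
    expected loss."""
    layer3_expected_loss = p_exploit * lgd_exploit
    return "layer3" if layer3_expected_loss > expected_mechanical_loss else "layer1"

# A bridge-dependent vault: ~2% horizon exploit probability with near-total
# loss given exploit, vs ~0.3% expected slippage/liquidation loss.
dominant_layer(0.02, 0.95, 0.003)  # "layer3"
```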

The six dimensions and their sub-criteria

Each dimension is scoped to avoid overlap. Smart Contract covers the protocol's own code. Counterparty dependencies on external oracles, bridges, or custodians roll up under Oracle, Liquidity, or Admin Architecture as appropriate. Here is the full rubric.

1. Smart Contract

Dimension weight: 20%

Risk from bugs, exploits, or operational failures in the protocol's own fund-handling contracts. Higher weight than most other dimensions because smart contract failure is the fastest path to total loss.

Audit Coverage & Depth (30%)
Count and depth of independent audits on contracts handling user funds. Bonus credit for formal verification and active bug bounty programs.
Hack History (25%)
Past exploits or critical incidents affecting user funds, weighted by recency, severity, and quality of remediation.
Version Lindy (20%)
How long the currently deployed fund-handling contracts have operated without critical failure. Not protocol age: if the vault contract was redeployed last month, Lindy is measured from then, not from the protocol's launch.
Upgradeability & Control (25%)
Immutable vs upgradeable contracts, who controls upgrade authority, and whether unilateral modification of user-critical logic is possible.
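As a worked example, here is the Smart Contract dimension computed from its published 30/25/20/25 weights. The sub-scores are hypothetical multiples of 5, chosen for illustration.

```python
# Smart Contract dimension using the published sub-criterion weights.
# The sub-scores below are hypothetical illustrations.

SMART_CONTRACT_WEIGHTS = {
    "Audit Coverage & Depth": 0.30,
    "Hack History": 0.25,
    "Version Lindy": 0.20,
    "Upgradeability & Control": 0.25,
}

def smart_contract_score(sub_scores: dict) -> float:
    """Weighted average of the four Smart Contract sub-criteria."""
    return sum(sub_scores[name] * w for name, w in SMART_CONTRACT_WEIGHTS.items())

smart_contract_score({
    "Audit Coverage & Depth": 15,    # multiple deep audits + active bounty
    "Hack History": 10,              # no fund-loss incidents
    "Version Lindy": 30,             # current contracts redeployed recently
    "Upgradeability & Control": 25,  # upgradeable behind a timelocked multisig
})  # 19.25 -> published as 20 after rounding to a multiple of 5
```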

2. Oracle

Dimension weight: 15%

Risk from reliance on external or internal price feeds. A wrong price is indistinguishable from a wrong balance to most protocols.

Oracle Architecture (40%)
Quality and diversity of price feed architecture. Chainlink multi-source preferred over single-source TWAPs or proprietary feeds.
Manipulation Resistance (30%)
Resistance to flash loan manipulation and MEV extraction. Heartbeat, staleness checks, and sanity bounds.
Fallback & Override (30%)
Presence of circuit breakers, fallback oracles, and emergency override authority when price feeds misbehave.

3. Governance

Dimension weight: 15%

Risk from how decisions are made and executed. Even a perfectly audited contract is only as safe as the process that decides what code runs next.

Upgrade Authority (40%)
Who can push code changes. Timelock length, quorum requirements, and whether approval requires multi-entity sign-off.
Multisig & Key Custody (30%)
Multisig signer count, threshold, and diversity. Independent signers preferred over team-only.
Emergency Powers (30%)
Scope of unilateral pause, freeze, or recovery capabilities. Who holds them and under what conditions.

4. Liquidity

Dimension weight: 15%

Risk from being unable to exit a position when you want to. High displayed TVL means nothing if withdrawals are gated or slippage is catastrophic at size.

Exit Depth (40%)
Slippage impact for large withdrawals. TVL relative to single-position exit size.
Withdrawal Constraints (30%)
Cooldowns, queues, withdrawal caps, and processing delays before funds are available.
Redemption Model (30%)
Instant on-chain redemption vs epoch-based vs reliance on secondary market liquidity.

5. Economic

Dimension weight: 15%

Risk that the protocol's economic model cannot sustain its own returns. Yield from genuine fees is durable; yield from emissions is a countdown timer.

Revenue Durability (40%)
Real fees from genuine usage vs emissions or subsidies. Would the yield exist without the token?
Incentive Dependence (30%)
Fraction of displayed APY driven by temporary incentives, points, or token emissions rather than protocol revenue.
Token Capture Mechanism (30%)
Does the token have a mechanism (fee switch, buyback, burn) that routes real protocol revenue to holders?

6. Admin Architecture

Dimension weight: 20%

Risk from how administrative powers are scoped and custodied. This dimension was added after the April 2026 Drift Protocol attack, where $285M was drained in 12 minutes via 31 withdrawals using privileged access. A perfectly audited contract with a compromised admin key is still a zero.

Key Custody Model (30%)
EOA vs multisig vs timelocked DAO controls. Separation of pause, parameter, and upgrade powers.
Signer Diversity (25%)
Independent signers across organizations vs team-only. Public identities preferred over anonymous.
Action Scope (25%)
What admin can change. Parameter-only changes are lower risk than arbitrary code upgrades or treasury access.
Risk Oversight (20%)
External risk advisory (Chaos Labs, Gauntlet, BlockScience) and maturity of incident response procedures.

Five additional checks on every protocol

Beyond the six scored dimensions, we track five binary and quantitative checks on every protocol research page. These are not weighted into the aggregate score because they are effectively red flags: a protocol that fails any of them has a structural problem regardless of its dimension scores.

Frontend Contract Consistency
Does the official user interface route transactions exclusively to documented and verified contract addresses? A "no" here means the UI could be swapped or modified without users noticing, which is a real attack vector.
Deployment Address Clarity
Are the deployed contract addresses clearly documented in the protocol's official documentation and independently verifiable on-chain? "No" means users cannot confirm what code they are interacting with.
Dependency Count
Count of independent external entities (oracles, bridges, custodians, off-chain service providers, upstream protocols) whose correct functioning is required for the protocol to operate safely. More dependencies mean a broader blast radius in the event of any single failure.
Collateral Concentration
For any pooled lending reserve, what percentage of borrow collateral is sourced from a single asset class (e.g. ETH LSTs, a single stablecoin, RWA tokens)? A reserve where >70% of collateral comes from one class is not a diversified lending book — it is effectively financing a single concentrated strategy, and depositors are bearing the tail risk of that strategy without being compensated for it. Applies especially to protocols with a unified-pool architecture where all suppliers of an asset earn the same APR regardless of what their capital is ultimately financing. The April 2026 Kelp incident revealed that 98.5% of Aave's WETH borrow collateral came from ETH LSTs, turning aWETH depositors into de facto third-loss capital in a concentrated LST carry trade. We now flag any reserve where single-class concentration exceeds 70%, and score higher (worse) when the protocol doesn't offer collateral-specific borrow rates (which v4 Risk Premiums and modular platforms like Morpho do).
Cross-Chain Messaging Posture
If the protocol uses a cross-chain messaging layer (LayerZero, Wormhole, CCIP, Axelar, Hyperlane), evaluate the configuration along four dimensions: (1) DVN / validator redundancy (1-of-1 vs 3-of-5, etc.); (2) operator independence (signers from different organizations vs correlated operators); (3) infrastructure diversity — critically, do the DVNs read from different RPC providers with different hosting, and do they use different verification methods? (Multi-DVN with shared RPCs fails identically to single-DVN, as the April 2026 Kelp incident demonstrated when 2 of 3 RPCs on the same DVN were compromised and the third was DDoS'd.); (4) default vs hardened config (is the protocol using the messaging layer's quickstart / GitHub-default config, or did they customize?). This check was added after the April 2026 Kelp / LayerZero incident, where a 1-of-1 DVN configuration allowed attackers to drain ~$290M by compromising a single verifier pathway. Per Dune analytics, roughly 32% of LayerZero OApps ran 1-of-1 configurations at the time of the incident. LayerZero has since announced it will no longer sign messages for apps on single-DVN configurations, forcing migration.
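Two of the checks above reduce to simple computations. The sketch below implements the concentration flag (using the published 70% threshold) and a toy version of the four-part messaging-posture evaluation; the field names and scoring increments are illustrative assumptions, not TI's production rubric.

```python
from dataclasses import dataclass

def concentration_flag(collateral_by_class: dict, threshold: float = 0.70):
    """Flag a reserve when one collateral class exceeds the 70% threshold."""
    total = sum(collateral_by_class.values())
    top_class, top_amount = max(collateral_by_class.items(), key=lambda kv: kv[1])
    share = top_amount / total
    return top_class, share, share > threshold

concentration_flag({"ETH LSTs": 985, "stables": 10, "other": 5})
# -> ("ETH LSTs", 0.985, True), mirroring the 98.5% figure above

@dataclass
class MessagingConfig:
    required_dvns: int           # 1 for a 1-of-1 pathway
    independent_operators: int   # signers from distinct organizations
    distinct_rpc_providers: int  # unique RPC/hosting stacks across DVNs
    hardened: bool               # customized vs quickstart-default wiring

def messaging_posture_score(cfg: MessagingConfig) -> int:
    """0-100, lower = less risky. Increments are illustrative only.
    Shared RPCs collapse nominal multi-DVN redundancy to a single point
    of failure, so RPC diversity is scored independently of DVN count."""
    score = 0
    if cfg.required_dvns <= 1:
        score += 40   # single verifier pathway
    if cfg.independent_operators < cfg.required_dvns:
        score += 20   # correlated operators
    if cfg.distinct_rpc_providers < max(cfg.required_dvns, 2):
        score += 25   # shared RPC / hosting infrastructure
    if not cfg.hardened:
        score += 15   # GitHub-default configuration
    return min(score, 100)

# Kelp-style posture: 1-of-1, default config, one RPC stack.
messaging_posture_score(MessagingConfig(1, 1, 1, False))  # 80
```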

Reference case: Kelp DAO / LayerZero, April 2026

Attackers drained 116,500 rsETH (~$290M) from Kelp's LayerZero-powered cross-chain bridge by compromising a single DVN verifier path in a 1-of-1 configuration. They obtained root access to LayerZero Labs' DVN RPC infrastructure, replaced the op-geth binary on two of three nodes, and DDoS'd the uninfected third. A failover to the compromised nodes let the DVN attest a forged "burn" message claiming 116,500 rsETH had been burned on Unichain (the burn never happened — Unichain's outbound nonce stayed at 307 while Ethereum accepted nonce 308). The OFT Adapter on Ethereum released the funds as instructed.
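The forgery described above is, at its core, a nonce-accounting violation: a destination chain should never accept a message whose nonce exceeds the source chain's actual outbound count. A minimal sketch of that invariant (the function name and shape are ours, not LayerZero's API):

```python
def message_plausible(source_outbound_nonce: int, claimed_nonce: int) -> bool:
    """A cross-chain message claiming nonce N can only be genuine if the
    source chain has actually emitted at least N outbound messages."""
    return claimed_nonce <= source_outbound_nonce

# The incident pattern: Unichain's outbound nonce sat at 307 while
# Ethereum accepted a message at nonce 308.
message_plausible(source_outbound_nonce=307, claimed_nonce=308)  # False
```

The catch, of course, is that this check requires reading state on the source chain, which is exactly the job the compromised DVN was trusted to do.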

Downstream contagion: the attacker then deposited 89,567 rsETH (76.9% of the stolen total) as collateral on Aave V3, borrowing 82,650 WETH + 821 wstETH (~$193M). Aave is now holding $124M–$230M in potential bad debt depending on how Kelp DAO allocates losses between Ethereum mainnet and L2 rsETH holders. BGD Labs had explicitly warned Aave about this specific risk during the rsETH listing discussion in February 2025, recommending a multi-DVN configuration. The warning was not adopted. This makes it a governance-process failure at Aave, not only a Kelp / LayerZero failure.

Key takeaways for TI's risk scoring: (1) a protocol's own contract security is independent from the bridge it depends on — Aave's smart contracts, oracle system, and liquidation mechanisms all operated correctly throughout; (2) messaging-layer default configurations can be insecure even when documented — LayerZero's V2 OApp Quickstart sample wires every pathway with a single required DVN, and per the Dune dashboard roughly 32% of LayerZero OApps currently run this minimal configuration; (3) a fast pause mechanism limits downside materially (Kelp's 46-minute pause blocked a second attack that would have released ~$100M more); (4) DVN count alone is not a sufficient security metric — a 2-of-2 DVN configuration would not have helped here if both DVNs read from the same compromised RPCs. Real diversification requires independent RPC providers, independent hosting infrastructure, and ideally different verification methods (some cryptographic, some not); (5) architecture alone doesn't determine trust — underwriting discipline does. SparkLend, which uses the same unified-pool architecture as Aave, captured the largest share of post-incident inflows (+$1.8B deposits Apr 19–21) because it had proactively deprecated rsETH in January 2026, rate-limits supply and borrow caps to prevent explosive exposure growth, and maintained >$350M of instantly-available spUSDT liquidity through the crisis. The blockworks / Leasure + Shaundadevens post-mortem frames this as "the deeper re-rating may be occurring not just across architectures, but across perceptions of who underwrites risk most credibly." For our scoring, it means the Governance dimension now explicitly weights pre-incident deprecation decisions and risk-param conservatism, not just post-incident response speed.

LayerZero's post-incident policy (April 2026): LayerZero will stop signing or attesting messages for any application that maintains a single-DVN configuration; all 1-of-1 OApps must migrate to multi-DVN. This is a forced-migration event for the ~32% of LayerZero apps that have not yet upgraded.

Arbitrum precedent: the 12-member Arbitrum Security Council invoked ArbitrumUnsignedTxType (EIP-2718) for the first time to freeze 30,766 ETH (~$71.5M) from the attacker on Arbitrum. The power existed in the chain's design but had never been used; its first invocation raises real questions about where the chain sits on the decentralization spectrum in practice, and sets a precedent the council may face pressure to invoke again in less clear-cut cases.

Conflicting narratives about whose infrastructure was compromised and what guidance was given remain unresolved between Kelp and LayerZero; TI scores configuration posture, not attribution.

Sources: CoinDesk, OAK Research, and LlamaRisk coverage (April 2026); Banteg's on-chain attack investigation; Aave governance forum (incident report + scenario modeling); Dune's LayerZero OApp DVN Configuration dashboard. Dune methodology caveat: the dashboard reports DVN cardinality but does not expose the N-of-M threshold for optional DVNs and does not label operator identity. A configuration that looks safe on cardinality alone can still have correlated operators, shared RPC infrastructure (as this incident demonstrated), or a weak optional-DVN threshold.

Six places TradFi credit analogies break in DeFi

Our rubric is calibrated to how DeFi lending actually fails, not to a direct translation of bank-style credit assessment. The following six failure modes — articulated by Anastasiia (@mathy_research) in her April 2026 Vault Summit paper — describe where standard credit-risk concepts produce biased or misleading measures if applied to an onchain lending vault without adjustment. They map cleanly onto the Layer 1 (mechanical) risks scored above, and understanding them helps read our Oracle, Liquidity, and Economic scores more precisely.

1. Oracle-to-execution divergence
In a bank, the valuation agent may mark a collateral position slightly off clearing prices, but deviations are bounded and institutions absorb them. Onchain, the oracle is the protocol's pricing agent, and liquidations execute automatically against oracle marks. When oracle-reported price exceeds the true fillable execution price by more than the coverage buffer, a vault looks solvent on paper but is factually insolvent at the price it would actually realize. The oracle-to-execution wedge is a credit risk variable, not operational noise.
2. Recovery endogeneity
Standard Loss-Given-Default in TradFi is a fixed haircut derived from historical recoveries, treating the collateral market as deep enough that one liquidation doesn't move it. DeFi liquidations route through onchain venues with finite depth — the larger the liquidation, the worse the fill, the lower the recovery, the bigger the shortfall. Expected shortfall is nonlinear in liquidation mass, and accelerates faster than position size under correlated stress (all looped LST positions unwinding into the same pool at once is the canonical case). This is why concentration matters so much: a 90% LTV that's safe on one position is structurally unsafe across 119 positions sharing the same exit liquidity.
3. Full-information bank runs
In TradFi, depositors can't observe each other's intentions. That information asymmetry dampens run incentives because you don't know whether other depositors are about to run. Onchain, all state is public at the block level — utilization rate, withdrawal queue, and vault composition are all observable to every depositor. Under common information, what is probabilistic in TradFi becomes close to deterministic in DeFi: when users see utilization approaching the withdrawal ceiling and collateral quality deteriorating simultaneously, the rational move is to exit first. Runs onchain are faster, sharper, and more complete than their TradFi counterparts.
4. Parameter rigidity under timelocks
A TradFi collateral manager facing a deteriorating position can adjust eligibility and coverage tests intraday. Onchain parameter changes are gated by governance timelocks, often 24–72 hours. If a protocol's critical response window (the maximum horizon over which a parameter change remains protective) is shorter than its timelock duration, curator intervention is structurally unreachable during stress — not because governance is slow, but because the math doesn't close. This is why our Governance dimension now explicitly weights pre-incident deprecation decisions: the only effective parameter changes in fast stress are the ones already made.
5. Oracle latency and manipulation
The TradFi analogy treats oracle risk as valuation-agent risk, bounded by redundancy. Onchain, oracle error is adversarially targeted, correlated with stress regimes, and automatically trips liquidation logic. Two distinct failure channels: latency (a stale oracle reflects pre-stress prices during fast drawdowns, keeping undercollateralized positions open and letting secondary actors rationally borrow against an incorrect mark) and manipulation (feasible when the cost to displace the oracle's reference market falls below the manipulation payoff, a threshold that can be crossed on thin reference markets). The March 22, 2026 Resolv exploit is a worked case of the latency channel — see the callout below.
6. Congestion-dependent liquidation (wrong-way risk)
TradFi clearing infrastructure has dedicated default-management resources independent of market stress. Onchain liquidation requires blockspace, and blockspace cost is endogenous to the same stress that triggers the liquidation: gas prices and MEV competition spike exactly when large-scale liquidation is needed. The probability that a triggered liquidation is economically unexecutable is strictly higher conditional on stress than unconditionally. This is wrong-way risk in the formal sense — the protection mechanism weakens precisely when it is most needed.
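Three of these failure modes reduce to checks small enough to state as code. The sketch below illustrates the oracle-to-execution wedge (mode 1), endogenous recovery against a constant-product pool (mode 2), and the response-window vs timelock inequality (mode 4). All numbers are hypothetical; the constant-product pool stands in for any finite-depth onchain venue.

```python
# Mode 1: paper solvency at the oracle mark vs factual solvency at the
# fillable execution price.
def solvent(price: float, collateral_units: float, debt: float) -> bool:
    return collateral_units * price >= debt

solvent(1.00, 100, 90)  # True  (oracle mark)
solvent(0.80, 100, 90)  # False (fillable price: the wedge exceeds the buffer)

# Mode 2: average fill price against an x*y=k pool falls as liquidation
# size grows, so recovery is endogenous to liquidation mass.
def avg_fill_price(sell_amount: float, x_reserve: float, y_reserve: float) -> float:
    """Sell `sell_amount` of X into a constant-product pool; return the
    average Y received per unit of X."""
    k = x_reserve * y_reserve
    y_out = y_reserve - k / (x_reserve + sell_amount)
    return y_out / sell_amount

avg_fill_price(100, 10_000, 10_000)    # ~0.990
avg_fill_price(1_000, 10_000, 10_000)  # ~0.909 -- 10x the size, ~10x the impact

# Mode 4: intervention is structurally unreachable when the protective
# response window is shorter than the governance timelock.
def intervention_reachable(response_window_h: float, timelock_h: float) -> bool:
    return timelock_h <= response_window_h

intervention_reachable(response_window_h=12, timelock_h=48)  # False
```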

Reference case: Resolv / Morpho, March 22, 2026

A compromised offchain signing key — a single EOA with no onchain validation — was used to mint 80 million unbacked USR tokens for $200,000 in attacker-controlled capital. That piece was a key-management and contract-design failure, outside the scope of any vault-risk framework. What happened next was entirely inside scope: a cascade of secondary losses driven by oracle-latency and wrong-way-allocation failures that our risk methodology is now explicitly calibrated to catch.

Oracle latency (failure mode 5a). USR's NAV-based oracle updated once per 24 hours. For hours after the exploit, it faithfully reported pre-exploit collateral-to-supply ratios: RLP oracle read $1.29 while the market cleared at $0.52; USR oracle read near $1.00 while Curve pools showed $0.025. Secondary borrowers — uninvolved in the original exploit — rationally borrowed USDC against oracle-inflated USR collateral because the price gap was an arbitrage-like opportunity if you trusted the oracle. The resulting bad debt was not the attacker's work. It was the mechanical consequence of an oracle that had become structurally incorrect on an automatically-enforced lending protocol.

Wrong-way automated allocation. Public-allocator vaults (including Gauntlet-curated Morpho markets) treated the 100% utilization that emerged in the affected markets as a yield signal and continued supplying USDC to those markets for hours after the exploit began. Before auto-allocators began flowing USDC in, bad debt in the affected Morpho markets was approximately $4,900. The multi-million-dollar vault losses were generated by automated capital inflows after stress was already visible onchain. The post-mortem floor on ecosystem losses is roughly $3.8M in bad debt across Morpho markets alone, with $8.9M total allocated capital exposed.
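The stress-conditionality point generalizes: an allocator should treat high utilization as a yield signal only while collateral health holds. A minimal guard along those lines, with illustrative thresholds (the function and its parameters are our sketch, not any curator's published allocator logic):

```python
def should_supply(utilization: float, collateral_market_price: float,
                  reference_price: float = 1.00, peg_floor: float = 0.98) -> bool:
    """Refuse to chase utilization once the collateral's market price has
    broken materially below its reference value. Thresholds illustrative."""
    if collateral_market_price < peg_floor * reference_price:
        return False           # deteriorating-collateral guard trips
    return utilization > 0.90  # otherwise, high utilization reads as yield

# Resolv-style conditions: 100% utilization, USR trading at $0.025.
should_supply(1.00, 0.025)  # False -- the guard the public allocator lacked
```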

Key takeaways for TI's risk scoring: (1) Oracle design is a credit-risk input, not an engineering detail. A NAV-based oracle updating on a 24-hour cadence against a synthetic stablecoin whose underlying value correlates with market stress has a structurally predictable false-solvency window. This is derivable from public facts (update frequency + reference-asset volatility) before any exploit; the Oracle dimension's sub-criteria now explicitly flag update cadence on stress-correlated collateral. (2) Automated allocators need stress conditionality. A public allocator that treats 100% utilization as pure yield, without a deteriorating-collateral guard, is wrong-way allocation by design. If a curator's published allocator doesn't document its behavior under collateral-peg failure, assume it doesn't have one. This is now scored under Governance (parameter reactivity) and Liquidity (allocator behavior under stress). (3) Rehypothecation depth is a risk multiplier, not a neutral design choice. When the collateral asset is itself a share token of another lending strategy (USR backed by a delta-neutral funding-rate position), a shock propagates through every protocol layer that accepted the derivative as collateral. Our Collateral Concentration check now treats rehypothecation chains >= 2 layers as a separate flag.

Sources: Anastasiia (@mathy_research), "DeFi Lending Credit Risk: A Three-Part Framework" (Vault Summit, April 2026); Morpho incident retrospective; Resolv governance forum post-mortem. Last verified: 2026-04-23.

What this framework does not capture

We are explicit about the limits of the rubric so users can apply it appropriately:

  • Regulatory risk varies by jurisdiction and changes faster than quarterly re-scoring can capture. We flag major regulatory actions on individual research pages as they happen.
  • Systemic contagion (what happens if a depeg cascades through 10 protocols that share collateral types) is not directly scored. Oracle and Liquidity sub-criteria cover the proximate risks; true contagion requires separate stress-test analysis.
  • Insurance and cover costs are not in the rubric. A protocol with expensive Nexus Mutual cover is signaling higher perceived risk from specialist underwriters, which is information we recommend users incorporate separately.
  • Team reputation and off-chain conduct are partially captured in Governance and Signer Diversity sub-criteria but not exhaustively. We do not score team members individually.

How scores are updated

Sub-criteria and dimension scores are reviewed on an ongoing basis. Changes are logged in the defi-risk-scores.json source file with a bumped lastUpdated date. Major changes (Chaos Labs departing Aave, Drift being attacked, the April 2026 Kelp / LayerZero incident, a new audit round published) trigger same-day re-scoring. Minor drift is re-evaluated weekly.

The framework itself is versioned. The current version (v2) was published in April 2026 after decomposing the original six-dimension aggregate scores into the twenty sub-criteria documented here.

See the framework in action

Every protocol on the DeFi Risk Map has its six dimension scores and twenty sub-criteria visible in context.

Open the DeFi Risk Map →