A comprehensive guide to protocols building the future of open-source AI
Last reviewed: January 2025
Landscape Overview
The decentralized AI training space has evolved rapidly from theoretical concepts to functioning networks. Several teams are tackling different aspects of the challenge, each with distinct technical approaches and philosophies.
The Core Challenges
All decentralized training projects must solve these fundamental problems:
Communication Overhead — Reducing the data that must be shared between nodes during training
Verification — Proving that compute contributors are doing valid work
Incentives — Motivating participation without central coordination
Fault Tolerance — Handling nodes that go offline or behave unpredictably
Heterogeneous Hardware — Coordinating different GPU types with varying capabilities
Current State of Progress
Milestone — Status — Achieved By
1B parameter distributed training — Achieved — Prime Intellect (OpenDiLoCo)
10B parameter distributed training — Achieved — Prime Intellect (INTELLECT-1)
15B parameter distributed training — Achieved — Nous Research (DisTrO)
40B parameter training — In Progress — Nous Research (Consilience)
70B parameter training — In Progress — Templar (Templar III)
Fully permissionless network — Live — Templar (Bittensor)
Context Check
For comparison, leading centralized models like GPT-4 are estimated to exceed a trillion parameters. Decentralized training has made meaningful progress, but a significant gap to frontier scale remains.
Nous Research
Open-source AI research organization focused on creating and serving the best models in the open
Founded in 2022, Nous Research has become one of the most prolific contributors to both open-source AI models and decentralized training infrastructure. Their approach combines fundamental optimizer research with practical deployment systems.
Key Innovations
DeMo (Decoupled Momentum Optimization)
DeMo reduces communication overhead by 10x to 1,000x by splitting momentum into local and shared components. Instead of sharing all gradient updates, it selectively focuses on parameters changing most rapidly, then uses compression techniques (similar to JPEG image compression) to further reduce data transfer.
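The momentum-splitting idea can be sketched in a few lines. This is an illustrative interpretation, not Nous's actual implementation: `demo_style_step`, the 1% share fraction, and the magnitude-based selection rule are all assumptions for the sake of the example.

```python
import numpy as np

def demo_style_step(grad, momentum, beta=0.9, k_frac=0.01):
    """Illustrative sketch of decoupled momentum: accumulate momentum
    locally, but share only the largest-magnitude components each step.
    Everything not transmitted stays in the local residual."""
    momentum = beta * momentum + grad          # standard momentum update
    k = max(1, int(k_frac * momentum.size))    # fraction of entries to share
    idx = np.argsort(np.abs(momentum))[-k:]    # fastest-changing parameters
    shared = np.zeros_like(momentum)
    shared[idx] = momentum[idx]                # the only data sent to peers
    momentum = momentum - shared               # residual remains local
    return shared, momentum

rng = np.random.default_rng(0)
m = np.zeros(1000)
shared, m = demo_style_step(rng.normal(size=1000), m)
print(np.count_nonzero(shared))  # 10 of 1,000 entries transmitted
```

A real system would additionally compress the shared components (DeMo uses a DCT-based transform, the JPEG analogy above), but the bandwidth saving already comes from transmitting a small fraction of entries.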
DisTrO (Distributed Training Optimizer)
Building on DeMo, DisTrO is a broader framework that also addresses GPU synchronization, fault tolerance, and load balancing. In December 2024, Nous demonstrated DisTrO by training a 15B parameter model on a Llama-style architecture.
Psyche Network
Psyche is Nous's coordination framework for decentralized training. Key features include:
Epoch-based participation — Nodes can join and leave at natural breakpoints
Witness verification — Random subset of nodes verify each other's work
Solana integration — Blockchain tracks contributions and distributes rewards
Overlapping computation — Training continues while synchronization happens
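The witness-verification step above can be sketched as a simple sampling-and-voting routine. All names here (`witness_round`, the panel size, the hash-comparison check) are hypothetical stand-ins; Psyche's actual protocol is more involved.

```python
import hashlib
import random

def digest(update: bytes) -> str:
    """Content digest of a submitted training update."""
    return hashlib.sha256(update).hexdigest()

def witness_round(claimed_digest, update, witnesses, sample_size=3):
    """Illustrative sketch: a random subset of nodes recomputes the
    digest of a peer's submitted update and votes on whether it
    matches the digest the peer claimed."""
    panel = random.sample(witnesses, sample_size)
    votes = {w: digest(update) == claimed_digest for w in panel}
    return all(votes.values()), panel

update = b"pseudo-gradient bytes"
ok, panel = witness_round(digest(update), update, ["n1", "n2", "n3", "n4", "n5"])
print(ok)  # True: an honest update passes the sampled panel
```

Sampling a small random panel keeps verification cheap while making persistent cheating risky, since a dishonest node cannot predict who will check its work.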
Current Training Runs
Nous launched Consilience, a 40B parameter transformer being trained on roughly 20 trillion tokens across the Psyche network. This represents the largest decentralized training run by Nous to date.
Model Releases: Hermes Series
Beyond infrastructure, Nous has established credibility through successful model releases. The Hermes series of instruction-tuned LLMs has achieved competitive results on open leaderboards. Most recently, Hermes-4 focused on step-by-step reasoning while maintaining strong general instruction-following capabilities.
Funding
In April 2025, Nous closed a $50M Series A led by Paradigm, reaching a $1B valuation and becoming a leading unicorn in the Web3 AI space.
Prime Intellect
Infrastructure for decentralized AI development at scale
Founded in 2024 by Vincent Weisser and Johannes Hagemann, Prime Intellect began by aggregating compute from centralized and decentralized providers and evolved into building comprehensive infrastructure for distributed training.
Key Innovations
OpenDiLoCo
An open-source implementation of Google DeepMind's DiLoCo (Distributed Low-Communication) method. In July 2024, Prime Intellect demonstrated 90-95% GPU utilization while achieving comparable training results with 500x less communication than traditional approaches.
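The communication saving comes from DiLoCo's inner/outer loop structure: many cheap local steps, then one synchronization per round. The sketch below is a toy rendering of that structure under assumed hyperparameters (`inner_steps`, `outer_lr`), not the OpenDiLoCo codebase.

```python
import numpy as np

def diloco_round(start, worker_grads, inner_steps=50, lr=0.1, outer_lr=0.7):
    """Illustrative DiLoCo-style round: each worker takes many local SGD
    steps with no network traffic, then the network syncs once on the
    averaged pseudo-gradient (start minus final local parameters)."""
    local = []
    for grad in worker_grads:                 # workers train independently
        p = start.copy()
        for _ in range(inner_steps):
            p -= lr * grad(p)                 # purely local updates
        local.append(p)
    # the only communication in the round: average the pseudo-gradients
    delta = np.mean([start - p for p in local], axis=0)
    return start - outer_lr * delta           # outer optimizer step

# toy quadratic objective shared by four workers
target = np.array([3.0, -1.0])
p = diloco_round(np.zeros(2), [lambda q: q - target] * 4)
print(p)
```

With 50 inner steps per sync instead of one, each round does 50x the local work per communication event; this is the shape of the trade that yields the reported ~500x bandwidth reduction at scale.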
PRIME Framework
Allows training to adapt when compute unexpectedly enters and leaves ongoing runs. Innovations include ElasticDeviceMesh for dynamic node participation.
PRIME-RL
A fully asynchronous reinforcement learning framework that decouples the training process into three independent stages: generating candidate answers, training on selected ones, and broadcasting updated weights. This architecture works across unreliable, geographically dispersed networks.
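The three-stage decoupling can be sketched with queues connecting independent workers. This is a minimal structural sketch, assuming stand-in names (`generator`, `trainer`) and a trivial "gradient step"; PRIME-RL's real stages run on separate machines.

```python
import queue
import threading

rollouts, weights_out = queue.Queue(), queue.Queue()
POISON = None  # sentinel to end the demo cleanly

def generator(policy_version):
    """Stage 1: produce candidate answers with whatever weights it has,
    without blocking on the trainer."""
    for prompt in ["q1", "q2", "q3"]:
        rollouts.put((prompt, f"answer[v{policy_version}]"))
    rollouts.put(POISON)

def trainer():
    """Stage 2: consume rollouts asynchronously; Stage 3: broadcast
    each updated weight version to the rest of the network."""
    version = 0
    while rollouts.get() is not POISON:
        version += 1                 # stand-in for a training step
        weights_out.put(version)     # broadcast of updated weights

threads = [threading.Thread(target=generator, args=(0,)),
           threading.Thread(target=trainer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(weights_out.qsize())  # 3 weight broadcasts, one per rollout
```

Because no stage waits synchronously on another, a slow or dropped node stalls only its own queue rather than the whole run, which is what makes this layout tolerant of unreliable networks.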
SHARDCAST
A peer-to-peer system for distributing large files (like model weights) quickly across the network without centralized servers.
Training Milestones
INTELLECT-1 (October 2024): The first 10B parameter model trained in a distributed manner across three continents and five countries. Training took 42 days with 83% utilization across all compute. GPUs were sourced from both Web2 and Web3 providers, including Akash, Hyperbolic, and Olas.
INTELLECT-2 (April 2025): A 32B parameter reasoning model trained using reinforcement learning on QwQ-32B. This marked Prime Intellect's shift toward RL-based post-training, which is naturally suited to decentralized execution.
Funding
In February 2025, Prime Intellect raised $15M in seed funding led by Founders Fund, with participation from Andrej Karpathy, Clem Delangue, Dylan Patel, and Balaji Srinivasan. Total funding exceeds $20M.
Pluralis Research
Protocol Learning: decentralized, incentivized, trustless model training
Founded in April 2023 by Alexander Long, Pluralis takes a fundamentally different approach through "Protocol Learning"—a framework emphasizing model ownership and monetization in a decentralized context.
Protocol Learning Framework
Three key principles distinguish Pluralis:
1. Unmaterializable Models
Model weights are sharded across nodes such that no single participant ever has the full set. This ensures models are "in-protocol assets" with controlled access and leakage resistance.
2. Model Parallelism Over the Internet
Unlike Nous and Prime Intellect (which primarily use data parallelism), Pluralis employs model parallelism—splitting the model itself across nodes connected via low-bandwidth connections.
3. Partial Ownership for Incentives
Contributors earn ownership stakes proportional to their training contribution, granting future revenue share and governance rights. This is fundamentally different from pay-per-compute models.
Technical Achievements
In June 2025, Pluralis announced successful training of an 8B parameter LLM based on Meta's Llama 3. They demonstrated 99% compression of forward and backward passes through column-space sparsification, achieving 100x reduction in network traffic without hurting accuracy.
The training used low-end consumer GPUs across four continents, connected only by home internet links on the order of 80 megabits per second—demonstrating that model-parallel decentralized training over consumer connections is feasible.
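One way to picture column-space sparsification is keeping only the highest-energy columns of the matrices passed between pipeline stages. This is an illustrative reading of the idea, not Pluralis's algorithm: `column_sparsify`, the norm-based selection, and the 1% keep fraction are assumptions for the example.

```python
import numpy as np

def column_sparsify(mat, keep_frac=0.01):
    """Illustrative sketch: transmit only the columns carrying the most
    energy (largest L2 norm), plus their indices, instead of the dense
    matrix."""
    norms = np.linalg.norm(mat, axis=0)
    k = max(1, int(keep_frac * mat.shape[1]))
    idx = np.argsort(norms)[-k:]
    return idx, mat[:, idx]                   # ~99% of columns dropped

def reconstruct(idx, cols, shape):
    """Receiver rebuilds a sparse stand-in for the dense matrix."""
    out = np.zeros(shape)
    out[:, idx] = cols
    return out

rng = np.random.default_rng(0)
m = rng.normal(size=(64, 512))
idx, cols = column_sparsify(m)
rec = reconstruct(idx, cols, m.shape)
print(cols.shape)  # (64, 5): 5 of 512 columns sent over the wire
```

Transmitting 1% of columns is a ~100x reduction in traffic per pass, matching the scale of saving the announcement describes; the research question is choosing a subspace that preserves training accuracy.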
SWARM Asynchronous Training
Their paper on asynchronous pipeline parallel training was accepted by ICML (one of the leading AI conferences). SWARM removes two classic bottlenecks: memory capacity and tight synchronization—enabling consumer GPUs to participate meaningfully.
Gensyn
Verifiable execution layer for decentralized AI training
Gensyn published its first litepaper in February 2022, making it one of the earliest protocols focused specifically on verification for AI workloads. Rather than reinventing training paradigms, Gensyn builds the execution and verification layer that enables trustless compute.
Core Components
RL Swarm
A decentralized coordination mechanism for post-training reinforcement learning. The protocol uses a three-step loop:
Answer — Each participant generates model output
Critique — Other participants evaluate using a shared reward function
Resolve — Best responses are incorporated into the next model version
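The three-step loop above can be sketched as a round function. This is a toy rendering under assumed names (`swarm_round`, lambda "models", length as the shared reward), not the RL Swarm protocol itself.

```python
import statistics

def swarm_round(participants, reward_fn, prompt):
    """Illustrative sketch of one Answer / Critique / Resolve cycle."""
    # Answer: every participant generates an output for the prompt
    answers = {name: model(prompt) for name, model in participants.items()}
    # Critique: each peer scores every answer with the shared reward
    # function; scores are averaged across the group
    scores = {name: statistics.mean(reward_fn(ans) for _ in participants)
              for name, ans in answers.items()}
    # Resolve: the best response is selected for the next model version
    best = max(scores, key=scores.get)
    return best, answers[best]

# toy "models": trivial functions standing in for LLM policies
participants = {"a": lambda p: p + "!", "b": lambda p: p + "!!"}
best, answer = swarm_round(participants, reward_fn=len, prompt="2+2=")
print(best, answer)  # b 2+2=!!
```

Because every peer applies the same reward function, honest participants converge on the same ranking, which is what lets the Resolve step proceed without a central judge.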
Verde Verification
Gensyn's trust layer uses "refereed delegation." Every training task is dispatched to multiple independent providers. If their outputs match, the job is accepted. If they differ, a referee protocol locates the first point of divergence and re-computes only that single operation. This adds only a few percent of overhead, rather than the ~10,000x of full cryptographic proofs.
Skip-Pipe
Dynamic scheduling that skips or re-orders layers that would create delays, cutting iteration time by up to 55% and staying usable even if half the nodes fail.
Testnet Status
In March 2025, Gensyn deployed its testnet on a custom Ethereum rollup. Users can participate in RL Swarm, BlockAssist (training Minecraft agents), and Judge (verifiable AI evaluation).
Templar
Incentive-driven marketplace for decentralized AI on Bittensor
Templar launched in November 2024 as a subnet on the Bittensor network, distinguishing itself as the only protocol with a live, permissionless economic layer already integrated into its training framework.
Architecture
Templar uses data parallelism with two main actors:
Miners — Perform training tasks, synchronize with global model, compress gradients, submit updates
Validators — Download and decompress updates, apply them locally, compute loss deltas to score contributions
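The validator's scoring rule can be sketched as a loss delta: apply the miner's update locally and measure whether loss improved. The function names and the toy least-squares objective below are assumptions for illustration, not Templar's code.

```python
import numpy as np

def score_update(params, update, loss_fn, batch):
    """Illustrative validator check: apply a miner's update locally and
    score it by the resulting change in loss (positive = improvement)."""
    before = loss_fn(params, batch)
    after = loss_fn(params + update, batch)
    return before - after

def mse(params, batch):
    """Toy objective: mean squared error of a linear predictor."""
    x, y = batch
    return float(np.mean((x @ params - y) ** 2))

x = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([2.0, 2.0])
params = np.zeros(2)
good = np.array([1.0, 1.0])               # moves predictions toward y
print(score_update(params, good, mse, (x, y)) > 0)  # True
```

Scoring observed loss deltas rather than trusting self-reported work is what lets validators rank miners without re-running their full training step.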
SparseLoCo (formerly CCLoco)
Templar's communication-efficient training technique. Instead of sending full updates every step, it shares only the most important changes at set intervals while maintaining a running tally.
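The "running tally" is an error-feedback accumulator: whatever is not transmitted this interval is carried forward and gets its chance later. The class below is an illustrative sketch under assumed names (`SparseSync`, `k`), not the SparseLoCo implementation.

```python
import numpy as np

class SparseSync:
    """Illustrative sketch: share only the top-k accumulated changes at
    each sync interval; the residual (error feedback) keeps everything
    not yet transmitted so no update is permanently lost."""

    def __init__(self, dim, k):
        self.residual = np.zeros(dim)     # running tally of unsent updates
        self.k = k

    def sync(self, local_delta):
        self.residual += local_delta      # fold new changes into the tally
        idx = np.argsort(np.abs(self.residual))[-self.k:]
        msg = np.zeros_like(self.residual)
        msg[idx] = self.residual[idx]     # only these values go on the wire
        self.residual -= msg              # transmitted mass leaves the tally
        return msg

s = SparseSync(dim=8, k=2)
msg = s.sync(np.array([5.0, -4.0, 1.0, 0.5, 0.2, 0.1, 0.0, 3.0]))
print(np.count_nonzero(msg))  # 2 values transmitted; rest carried forward
```

Small but persistent gradients accumulate in the residual until they rank in the top k, so sparsifying the wire format does not silently discard slow-moving parameters.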
Gauntlet Scoring
Uses OpenSkill to track miner skill ratings. High-quality miners gain higher ratings, increasing their influence on model aggregation and earning more TAO (Bittensor's native token).
Training Runs
Templar I: 1.2B parameter model with ~200 GPUs globally
Templar II: 8B parameter model (in progress)
Templar III: 70B parameter model—the largest pre-training run in the decentralized space to date
TAO Incentives
Templar receives ~4% of daily Bittensor emissions, putting it in the top six of the network's 128 subnets. Rewards are split 41% to miners, 41% to validators/stakers, and 18% to subnet owners.
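The reward arithmetic is straightforward to work through. The figure of 7,200 TAO per day is an illustrative assumption (historically ~1 TAO per ~12-second block, before any halving); only the 4% subnet share and the 41/41/18 split come from the text.

```python
# Assumed network-wide daily emissions (illustrative, pre-halving rate)
DAILY_TAO = 7200
daily_subnet_tao = 0.04 * DAILY_TAO       # ~4% flows to the Templar subnet

split = {"miners": 0.41, "validators": 0.41, "owners": 0.18}
shares = {role: round(frac * daily_subnet_tao, 1)
          for role, frac in split.items()}
print(shares)  # roughly 288 TAO/day split 41/41/18
```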
Key Takeaways
Live proof-of-concepts are no longer hypothetical — Networks are coordinating hundreds of GPUs to train mid-sized models in real time
Model sizes are climbing — From single-digit billion to 40-70B parameter models in one year
Post-training is a growing focus — RL workflows demand less bandwidth, making them well-suited for decentralized execution
Stacks are converging — Projects combine bandwidth-aware optimizers, compute exchanges, and coordination layers into complete pipelines
Sentiment is shifting — Recognition is growing that scalable decentralized training may be possible
Key Risks
Hardware optimization is a moving target — NVIDIA's Blackwell GPUs posted 2-2.6x faster training than the previous generation. Decentralized networks must keep pace.
Incumbents have open-sourced models — Releases like Llama blur the distinction between open and closed development
Talent acquisition remains difficult — Projects can't match the compensation packages of leading AI labs
Regulatory headwinds — Permissionless training raises safety concerns that could invite scrutiny
Incentives lag technical innovation — Most verification and reward mechanisms remain experimental
Distribution and monetization — Even if technical problems are solved, getting models adopted and generating revenue are separate challenges
Reality Check
The gap to frontier scale is real: competing with centralized labs that train trillion-parameter models will require further breakthroughs in communication efficiency, verification, and incentive design.
The Thesis
The core premise remains unchanged: crypto provides AI with a permissionless, trustless, and composable coordination layer. The challenge is proving that decentralized approaches can deliver practical advantages over centralized counterparts.
If even one project demonstrates that openness translates into faster iteration, novel architectures, or more inclusive governance, it would mark a breakthrough moment for both crypto and AI. The road ahead is long, but the core ingredients for success are now firmly on the table.
Disclaimer: This is educational content about emerging technology and protocols, not investment advice. The decentralized AI training space is evolving rapidly. Always do your own research and consider significant technical and economic uncertainties.