vSAN / VergeFS: Software-Defined Storage
What is vSAN / VergeFS?
vSAN (Virtual Storage Area Network), also known as VergeFS, is the software-defined distributed storage system built into every VergeOS deployment. It pools the physical (or virtual) drives across all storage-participating nodes into a single, shared storage resource for the entire system.
There is no external SAN, NAS, or third-party storage software required. vSAN is integrated directly into the VergeOS kernel and operates at the block level, providing storage for all VM disks, snapshots, ISO images, and system metadata.
Key characteristics:
- Block-level architecture — VM disks are divided into blocks, each identified by a cryptographic hash
- Distributed across nodes — Data blocks are spread across all storage-participating nodes in the cluster
- Tiered storage — Up to 6 tiers (0–5) let you match media type to workload requirements
- Inline deduplication — Hash-based block identification enables automatic deduplication across all tiers
- Self-healing — Automatic failure detection, failover, and data rebuild without manual intervention
The Tier System
VergeOS vSAN organizes drives into tiers numbered 0 through 5. Each tier is designed for a different class of storage media and workload profile. During installation, each physical drive is assigned to a specific tier, and that assignment determines how the drive is used by the system.
Tier 0: Metadata
- Hardware: High-endurance NVMe SSDs
- Purpose: Stores the vSAN filesystem index and internal metadata exclusively
- Key requirement: Every node that participates in storage must have at least one tier-0 drive
- Best practice: Use the highest-endurance NVMe drives available; maintain at least 10% free space
Tier 0 is not used for workload data. It holds the hash map that tracks every data block’s location, redundancy state, and version. Because every read and write operation begins with a metadata lookup, tier-0 performance directly impacts overall system responsiveness.
Tiers 1–5: Workload Data
| Tier | Hardware | Purpose | Typical Use Cases |
|---|---|---|---|
| Tier 1 | High-endurance NVMe SSDs | Write-intensive workloads | High-performance databases, transaction logs |
| Tier 2 | Mid-range SSDs | Balanced read/write workloads | General-purpose VMs, mixed applications, dev environments |
| Tier 3 | Read-optimized SSDs | Read-intensive workloads | Content delivery, application repos, reference data |
| Tier 4 | High-capacity HDDs | Less frequently accessed data | File servers, backup targets |
| Tier 5 | Archival-grade HDDs | Cold storage and long-term retention | Compliance archives, backup archives |
Not every deployment uses all five workload tiers. A common production configuration might use only tier 1 (NVMe for performance-sensitive workloads) and tier 4 (HDD for capacity). The Terraform playground uses tier 0 and tier 1 only.
How Data is Distributed
vSAN uses a hash-based distribution algorithm to spread data blocks across all nodes in the cluster. Here is how it works:
Block Creation and Hashing
- When a VM writes data, vSAN divides the write into data blocks
- Each block is assigned a cryptographic hash that serves as its unique identifier
- The hash determines the block’s storage location and enables deduplication — if two blocks produce the same hash, only one copy is stored
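The hashing-and-deduplication behavior described above can be sketched in Python. This is a toy model with an assumed 4 KiB block size and SHA-256 hashing; vSAN's actual block size and hash algorithm are internal details:

```python
import hashlib

BLOCK_SIZE = 4096  # assumed illustrative block size, not vSAN's actual value

class BlockStore:
    """Toy content-addressed store: each unique block hash is stored once."""

    def __init__(self):
        self.blocks = {}    # hash -> block data (one physical copy per hash)
        self.disk_map = {}  # (disk_id, offset) -> hash (the metadata index)

    def write(self, disk_id, offset, data):
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            h = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(h, block)         # dedup: store only if new
            self.disk_map[(disk_id, offset + i)] = h

store = BlockStore()
payload = b"x" * BLOCK_SIZE
store.write("vm1-disk0", 0, payload)
store.write("vm2-disk0", 0, payload)  # identical data from a second VM
assert len(store.blocks) == 1         # one physical copy serves both disks
```

Both disks keep their own metadata entry, but the identical block is stored once — the same mechanism that gives vSAN inline deduplication.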
Cross-Node Distribution
Data blocks are distributed across multiple nodes in the cluster rather than stored on a single node. This design provides:
- Balanced performance — I/O load is spread across all storage-participating nodes
- Fault tolerance — No single node holds all copies of any dataset
- Efficient scaling — Adding a node automatically expands the storage pool and triggers rebalancing
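A minimal sketch of hash-driven placement, assuming a simplified modulo scheme (the actual vSAN placement algorithm is not described here — this only illustrates how a hash can deterministically pick distinct nodes for each copy):

```python
import hashlib

def place_copies(block_hash: str, nodes: list, copies: int = 2) -> list:
    """Derive primary and redundant locations from the block hash alone."""
    start = int(block_hash, 16) % len(nodes)
    # copies go to distinct nodes so no single node holds every copy
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]

nodes = ["node1", "node2", "node3"]
h = hashlib.sha256(b"example block").hexdigest()
primary, redundant = place_copies(h, nodes)
assert primary != redundant  # fault tolerance: copies land on different nodes
```

Because placement is a pure function of the hash and the node list, any node can locate a block without consulting a central coordinator.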
Read and Write Paths
Reads:
- The system looks up the block’s location via the tier-0 hash map
- Reads prioritize the primary copy for efficiency
- If the VM is running on the same node as a redundant copy, vSAN reads the local copy to minimize network traffic
- If the primary copy is slow or unresponsive, vSAN automatically fails over to the redundant copy
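The read-path priorities above — local copy first, then primary, then failover to the redundant copy — can be sketched as a simple selection function (a toy model; names like `read_block` are illustrative, not vSAN APIs):

```python
def read_block(block_hash, metadata, local_node, healthy_nodes):
    """Choose which copy to read: local first, then primary, then failover."""
    copies = metadata[block_hash]  # [primary_node, redundant_node]
    for node in copies:
        if node == local_node and node in healthy_nodes:
            return node            # local copy: no network traffic
    for node in copies:
        if node in healthy_nodes:
            return node            # primary first; redundant copy on failover
    raise IOError("no healthy copy available")

metadata = {"abc123": ["node1", "node2"]}
assert read_block("abc123", metadata, "node2", {"node1", "node2"}) == "node2"
assert read_block("abc123", metadata, "node3", {"node2"}) == "node2"  # failover
```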
Writes:
- New blocks are hashed and placed on the optimal node
- Both the primary and redundant copies are written simultaneously
- The write is acknowledged only after both copies are confirmed
- The tier-0 metadata is updated to track the new block’s location
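The write path can be sketched the same way — synchronous replication to both targets, acknowledgment only when every copy confirms, then the metadata update (a toy model; `send` stands in for whatever transport vSAN actually uses):

```python
import hashlib

def write_block(block, target_nodes, metadata, send):
    """Write all copies synchronously; ack only after every copy confirms."""
    h = hashlib.sha256(block).hexdigest()
    acks = [send(node, h, block) for node in target_nodes]  # both copies at once
    if not all(acks):
        raise IOError("a copy failed to commit; write not acknowledged")
    metadata[h] = list(target_nodes)  # tier-0 map records the new locations
    return h

metadata = {}
h = write_block(b"payload", ["node1", "node2"], metadata, lambda n, h, b: True)
assert metadata[h] == ["node1", "node2"]
```

Updating the metadata map only after all copies commit means a failed write never leaves the tier-0 index pointing at a block that does not fully exist.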
Redundancy and Self-Healing
vSAN maintains multiple copies of every data block to protect against hardware failures. The redundancy level is configured at the system level and applies per tier.
Redundancy Levels
| Feature | N+1 (RF2) — Default | N+2 (RF3) |
|---|---|---|
| Copies of data | 2 | 3 |
| Simultaneous failures tolerated | 1 node | 2 nodes |
| Minimum controller nodes | 2 | 3 |
| Recommended nodes | 3 | 5 |
| Storage overhead (before dedup) | ~2× | ~3× |
- N+1 (RF2) is the default and is suitable for most production environments
- N+2 (RF3) is available for ultra-critical workloads or remote sites where hardware replacement is slow
- Redundancy level is typically set during installation and applies system-wide
- A failure only affects the tier where the failed drives reside — other tiers remain fully operational
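The overhead column in the table follows directly from the copy count: before deduplication, usable capacity is roughly raw capacity divided by the number of copies. A quick sanity check:

```python
def usable_tb(raw_tb: float, copies: int) -> float:
    """Pre-dedup usable capacity: every block is stored `copies` times."""
    return raw_tb / copies

assert usable_tb(100, 2) == 50.0            # N+1 (RF2): ~2x overhead
assert round(usable_tb(100, 3), 1) == 33.3  # N+2 (RF3): ~3x overhead
```

Deduplication claws some of this back in practice, which is why the table hedges the overhead figures with "before dedup".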
Self-Healing Process
When a node or drive fails, vSAN automatically begins recovery:
- Detection — vSAN detects the failure automatically
- Failover — Reads and writes are redirected to redundant copies with no VM downtime
- Rebuild — Missing data blocks are re-replicated to remaining healthy nodes
- Restoration — Full redundancy is restored without manual intervention
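The rebuild step can be sketched as a scan for under-replicated blocks: every block that had a copy on the failed node gets a fresh copy on a healthy node (a toy model of the recovery loop, not vSAN's actual implementation):

```python
def rebuild(metadata, failed_node, healthy_nodes, copies=2):
    """Re-replicate every block that lost a copy on the failed node."""
    for block_hash, locations in metadata.items():
        if failed_node in locations:
            locations.remove(failed_node)        # failover already redirected I/O
            spares = [n for n in healthy_nodes if n not in locations]
            while len(locations) < copies and spares:
                locations.append(spares.pop(0))  # copy block to a healthy node

metadata = {"h1": ["node1", "node2"], "h2": ["node2", "node3"]}
rebuild(metadata, "node1", ["node2", "node3"])
assert metadata["h1"] == ["node2", "node3"]  # full redundancy restored
```

Blocks that never touched the failed node (like `h2` above) are left alone, which is why a failure only triggers rebuild traffic for the affected data.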
Drive Assignment in Practice
During VergeOS installation, each physical drive is assigned to a specific vSAN tier. The installer uses the YC_DRIVE_LIST and YC_VSAN_TIER_LIST variables (whether set interactively or via a seed file) to map drives to tiers.
Assignment Rules
- Every storage-participating node needs at least one tier-0 drive for metadata
- Drives within the same tier should be of similar type and performance characteristics
- When scaling up (adding drives), add the same drives to every node in the cluster to maintain balanced distribution
- When scaling out (adding nodes), new nodes must match the existing cluster’s drive configuration
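The assignment rules above lend themselves to a simple pre-flight check. This sketch is hypothetical — `validate_layout` and the layout shape are invented for illustration, not part of the VergeOS installer:

```python
def validate_layout(layout):
    """Enforce the rules: tier 0 on every node, identical tiers across nodes."""
    shapes = set()
    for node, drives in layout.items():
        tiers = tuple(sorted(tier for _drive, tier in drives))
        if 0 not in tiers:
            raise ValueError(f"{node} has no tier-0 metadata drive")
        shapes.add(tiers)
    if len(shapes) > 1:
        raise ValueError("nodes have mismatched drive/tier configurations")

validate_layout({
    "node1": [("nvme0n1", 0), ("nvme1n1", 1)],
    "node2": [("nvme0n1", 0), ("nvme1n1", 1)],
})  # passes: both nodes match and each carries a tier-0 drive
```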
Example: 2-Node HCI Configuration
In the Terraform playground’s simplest deployment, each controller node has:
| Drive | Tier | Purpose |
|---|---|---|
| 1× NVMe (small) | Tier 0 | Metadata — vSAN hash map and filesystem index |
| 1× NVMe (large) | Tier 1 | Workload data — VM disks, snapshots, ISOs |
Both nodes contribute their drives to the same vSAN pool. With N+1 redundancy (default), every block written to tier 1 on node 1 has a redundant copy on node 2, and vice versa.
Additional vSAN Features
Section titled “Additional vSAN Features”Inline Deduplication
Because every data block is identified by its cryptographic hash, vSAN automatically detects duplicate blocks. If two VMs (or two regions within the same VM disk) write identical data, only one copy of that block is stored. This operates inline — during the write path — with no separate deduplication job or schedule.
Encryption
vSAN supports AES-256 encryption at rest, configured during initial installation. Encryption keys can be stored on USB drives (plugged into the first two controller nodes) or entered manually at boot time. All data across all tiers is encrypted transparently.
Snapshots and Clones
vSAN’s block-level architecture enables space-efficient snapshots — a snapshot records the hash map state at a point in time rather than copying data blocks. Clones similarly reference existing blocks, only consuming additional space when data diverges.
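This snapshot mechanism can be illustrated with the hash-map model: a snapshot is a copy of the map, and blocks are shared until a later write makes the live disk diverge (a toy sketch, not vSAN's actual snapshot format):

```python
def snapshot(disk_map):
    """A snapshot copies the hash map, not the data blocks themselves."""
    return dict(disk_map)  # block hashes are shared with the live disk

live = {("vm1", 0): "hash_a", ("vm1", 4096): "hash_b"}
snap = snapshot(live)
live[("vm1", 0)] = "hash_c"          # a new write diverges the live disk
assert snap[("vm1", 0)] == "hash_a"  # snapshot still points at the old block
```

Because the snapshot is only a map, it costs almost nothing to take; space is consumed later, one diverged block at a time.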
Key Takeaways
| Concept | Summary |
|---|---|
| vSAN / VergeFS | Built-in distributed storage — no external SAN/NAS required |
| Tier 0 | Metadata only (NVMe). Required on every storage node. |
| Tiers 1–5 | Workload data, from high-performance NVMe to archival HDD |
| Data distribution | Hash-based, spread across all storage nodes |
| Redundancy | N+1 (2 copies, default) or N+2 (3 copies) — system-wide per tier |
| Self-healing | Automatic failover and rebuild on failure |
| Deduplication | Inline, hash-based, across all tiers |
| Compression | Not at rest — only during site-sync replication |
Next Steps
Now that you understand how VergeOS stores data, the next topic covers the network fabric that connects all nodes and carries vSAN replication traffic: Core Fabric & Networking →