
Clusters & Node Types

A cluster in VergeOS is a logical grouping of nodes with the same hardware characteristics, forming a resource pool presented as usable assets in the VergeOS user interface. Clusters enable efficient management, scaling, and high availability for virtualized workloads.

Every VergeOS system starts with at least one cluster — the initial two controller nodes form the first cluster during installation. From there, you can add nodes to the existing cluster or create additional clusters with different roles and hardware profiles.

Clusters serve several purposes:

  • Resource pooling — Nodes in a cluster share compute and/or storage resources, presented as a unified pool
  • Workload placement — VMs and tenants are assigned to specific clusters (with optional failover clusters), ensuring workloads run on appropriate hardware
  • Hardware optimization — Different clusters can have different hardware profiles: high-memory nodes for databases, GPU-equipped nodes for rendering, NVMe-dense nodes for storage-intensive workloads
  • Independent scaling — Add capacity to specific resource pools without affecting others

VergeOS supports three distinct cluster types that can be mixed and matched within a single system:

| Cluster Type | Provides | vSAN Participation | Typical Use Case |
| --- | --- | --- | --- |
| Combined (HCI) | Compute + Storage | Yes — nodes contribute storage disks to vSAN tiers | General-purpose workloads, small-to-medium deployments |
| Storage-Only | Storage only | Yes — nodes contribute storage only | Dedicated storage expansion in UCI architectures |
| Compute-Only | Compute only | No — boot-only or PXE boot | High-compute workloads (ML, rendering, data analytics) |
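The two axes in the table above (what a cluster provides, and whether it participates in the vSAN) can be captured in a small data model. This is a purely illustrative Python sketch; the class and field names are not a VergeOS API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ClusterType:
    """Hypothetical model of the three VergeOS cluster types."""
    name: str
    provides_compute: bool
    provides_storage: bool
    joins_vsan: bool

# The three cluster types from the table above
COMBINED_HCI = ClusterType("Combined (HCI)", True, True, True)
STORAGE_ONLY = ClusterType("Storage-Only", False, True, True)
COMPUTE_ONLY = ClusterType("Compute-Only", True, False, False)
```

Note that compute-only is the one type that stays out of the vSAN entirely, which is why those nodes can boot from a single local disk or PXE.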

Every physical or virtual server in a VergeOS system is a node. Nodes differ in how they join the system, what role they play, and which cluster they belong to. VergeOS defines four node types:

Controller Nodes

The first two nodes in every VergeOS system are controller nodes. They are special because:

  • Node 1 creates a brand-new VergeOS system. It initializes the vSAN, creates the first cluster, and runs post-install configuration (network setup, cluster creation for additional node types, etc.)
  • Node 2 joins the system created by Node 1 as the second controller, providing redundancy for all system management functions

Controller nodes always belong to Cluster 1. In an HCI topology, they provide both compute and storage. In a UCI topology, they manage the system but delegate storage and compute to dedicated clusters.

The first cluster must include at least two nodes with Tier 0 storage (metadata drives) — this is a hard requirement because Tier 0 holds the vSAN filesystem index and must be redundant.
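That hard requirement can be expressed as a simple pre-flight check. A hypothetical validation sketch — the node representation (a dict with a `tiers` set) is assumed for illustration and is not a VergeOS data structure:

```python
def validate_first_cluster(nodes):
    """Check the hard requirement: the first cluster needs at least
    two nodes carrying Tier 0 (metadata) storage, because Tier 0
    holds the vSAN filesystem index and must be redundant."""
    tier0_nodes = [n for n in nodes if 0 in n["tiers"]]
    if len(tier0_nodes) < 2:
        raise ValueError("first cluster requires >= 2 nodes with Tier 0 storage")
    return True
```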

Scale-Out Nodes

Scale-out nodes expand an existing HCI cluster by adding more compute and storage capacity. Key characteristics:

  • Identical hardware to the controller nodes in the cluster they join (same CPU generation, similar storage layout, matching NIC configuration)
  • Join the existing cluster automatically via network auto-detection — the node discovers the VergeOS system on the core fabric and joins without manual cluster assignment
  • Disks integrate seamlessly into the existing vSAN tiers
  • Contribute both compute (run VMs) and storage (vSAN participation)

Scale-out nodes are the simplest way to grow an HCI deployment — add a node and the cluster’s compute and storage capacity increases proportionally.

Storage-Only Nodes

Storage-only nodes are dedicated exclusively to expanding vSAN capacity. They:

  • Contribute disks to vSAN tiers but do not run VM workloads
  • Belong to a storage-only cluster (e.g., Cluster 2)
  • Require creating the storage cluster in the VergeOS UI before adding the first storage node
  • Are used in UCI architectures where storage and compute scale independently

Compute-Only Nodes

Compute-only nodes provide processing power without participating in vSAN storage. They:

  • Run VM workloads but have no local vSAN storage (boot-only disk or PXE boot)
  • Belong to a compute-only cluster (e.g., Cluster 3)
  • Require creating the compute cluster in the VergeOS UI before adding the first compute node
  • Access storage over the core fabric from nodes in HCI or storage-only clusters

Compute-only nodes are ideal for workloads that need high CPU/RAM/GPU density without proportional storage growth — machine learning, rendering, data analytics, or VDI.

| Node Type | Role | Cluster | vSAN | Runs VMs | Join Method |
| --- | --- | --- | --- | --- | --- |
| Controller (Node 1) | Creates new system | Cluster 1 | Yes (Tier 0 + workload tiers) | Yes (HCI) or No (UCI) | New system creation |
| Controller (Node 2) | Joins as redundant controller | Cluster 1 | Yes (Tier 0 + workload tiers) | Yes (HCI) or No (UCI) | Joins Cluster 1 |
| Scale-out | Adds HCI capacity | Cluster 1 | Yes (workload tiers) | Yes | Auto-detect on core fabric |
| Storage-only | Dedicated storage expansion | Cluster 2+ | Yes (workload tiers) | No | Joins designated storage cluster |
| Compute-only | Dedicated compute expansion | Cluster 2+ | No (boot-only / PXE) | Yes | Joins designated compute cluster |

The node joining process follows a strict sequence. Key rules for node joining:

  1. Node 1 must complete installation before Node 2 can join — Node 2 needs an existing system to connect to
  2. Nodes join sequentially within a cluster — Node 3 after Node 2, Node 4 after Node 3, etc. — to prevent race conditions during cluster membership changes
  3. Storage clusters must exist before storage nodes can join — create the cluster in the VergeOS UI first
  4. Compute clusters must exist before compute nodes can join — same prerequisite
  5. If deploying both storage and compute clusters, storage nodes should be added first so compute nodes can immediately access vSAN storage

Clusters are numbered starting from 1 and can be renamed in the VergeOS UI:

| Cluster Number | Default Role | Typical Name |
| --- | --- | --- |
| Cluster 1 | HCI (controllers + optional scale-out) | “HCI”, “Default”, or “Controllers” |
| Cluster 2 | Storage-only (if UCI) or Compute-only (if hybrid) | “Storage” or “Compute” |
| Cluster 3 | Compute-only (in full UCI with 3 clusters) | “Compute” |

In a full UCI deployment with 3 clusters:

  • Cluster 1: Controllers (system management, Tier 0 metadata)
  • Cluster 2: Storage nodes (all vSAN workload storage)
  • Cluster 3: Compute nodes (all VM execution)

Minimum Requirements and High Availability

| Requirement | Detail |
| --- | --- |
| Minimum nodes per system | 2 (one controller pair) |
| Minimum nodes per cluster | 2 (for redundancy during maintenance or failure) |
| Controller nodes | Exactly 2 per system — must have Tier 0 storage for vSAN metadata |
| HA behavior | If one node fails, its workloads migrate to the surviving node(s) in the same cluster |
| Maintenance mode | Nodes can be placed in maintenance mode; workloads are live-migrated to other nodes in the cluster before maintenance begins |
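The maintenance-mode behavior can be sketched as a drain operation that moves every workload off the target node before maintenance begins. The dict-based node structure and the least-loaded placement policy are assumptions for illustration, not VergeOS internals:

```python
def drain_for_maintenance(node, cluster):
    """Live-migrate every workload on `node` to the other nodes in
    the same cluster, as described in the HA table above."""
    survivors = [n for n in cluster if n is not node]
    if not survivors:
        raise RuntimeError("cannot drain the only node in a cluster")
    for vm in list(node["vms"]):
        target = min(survivors, key=lambda n: len(n["vms"]))  # least-loaded
        node["vms"].remove(vm)
        target["vms"].append(vm)
    return node
```

This is also why the two-node minimum per cluster matters: with a single node there is nowhere to migrate workloads during maintenance or failure.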

VergeOS systems scale from a minimum 2-node HCI cluster to large multi-cluster deployments with 200+ nodes. The scaling strategy depends on your architecture:

In an HCI architecture, add scale-out nodes to Cluster 1. Each node adds both compute and storage proportionally.

Best for: Balanced growth where compute and storage needs increase together.

In a UCI or hybrid architecture, add nodes to specific clusters based on which resource is the bottleneck:

  • Need more storage? Add nodes to the storage cluster
  • Need more compute? Add nodes to the compute cluster
  • Need more of both? Add to both clusters independently

Best for: Workloads with unbalanced resource demands (e.g., heavy storage with light compute, or GPU-dense compute with modest storage).

  • Hardware consistency within clusters — Use the same hardware specs for all nodes in a cluster. Mixing different hardware within a cluster can cause performance and reliability issues.
  • Plan for N+1 redundancy — Size each cluster so that losing one node still leaves enough capacity for all workloads
  • Monitor before scaling — Use VergeOS dashboard metrics (CPU utilization, RAM usage, vSAN capacity) to identify which resource needs expansion
  • Scale without downtime — New nodes can be added to a running system without interrupting existing workloads
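The N+1 guideline above reduces to a simple capacity check: remove the largest node on paper and verify that the remaining nodes still fit every workload. A minimal sketch, RAM-only for brevity; a real plan would apply the same check to CPU and vSAN capacity:

```python
def n_plus_1_ok(node_ram_gb, total_workload_ram_gb):
    """Return True if the cluster can lose its largest node and the
    surviving nodes still hold every workload (RAM only)."""
    surviving = sum(node_ram_gb) - max(node_ram_gb)
    return total_workload_ram_gb <= surviving
```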

The Terraform playground demonstrates four common topologies that map to real-world deployment patterns:

| Topology | Nodes | Clusters | Architecture | When to Use |
| --- | --- | --- | --- | --- |
| 2-Node HCI | 2 controllers | 1 (HCI) | HCI | Small sites, edge, PoC, basic evaluation |
| HCI + Scale-Out | 2 controllers + N scale-out | 1 (HCI) | HCI | Growing HCI deployments needing balanced scaling |
| Hybrid (2 clusters) | 2 controllers + N compute | 2 (HCI + Compute) | Hybrid | Compute-heavy workloads with modest storage |
| UCI (3 clusters) | 2 controllers + N storage + M compute | 3 (Controller + Storage + Compute) | UCI | Large deployments needing independent compute/storage scaling |

Summary

| Concept | Summary |
| --- | --- |
| Cluster | Logical grouping of nodes with same hardware, forming a resource pool |
| Three cluster types | HCI (compute + storage), Storage-only, Compute-only — mixable within one system |
| Four node types | Controller, Scale-out, Storage-only, Compute-only — each with a specific role and join method |
| Minimum 2 nodes | Per cluster for redundancy; controllers require Tier 0 storage |
| Sequential joining | Nodes join one at a time to prevent race conditions |
| Hardware consistency | All nodes in a cluster should have matching hardware specifications |
| Independent scaling | UCI architecture allows adding compute or storage capacity independently |
| 200+ nodes | Systems scale from 2-node HCI to large multi-cluster deployments |

You now understand how VergeOS organizes nodes into clusters and how different node types serve different roles. In the hands-on lab, you will explore these concepts using the Terraform playground: Lab: Architecture Exploration →