Lab: Explore the Architecture

In this lab, you will explore the VergeOS Terraform Playground — an open-source project that deploys virtual VergeOS systems using Terraform. By reading the code and documentation, you will reinforce the architecture concepts covered in this module: core fabric networking, vSAN storage tiers, cluster organization, and HCI vs UCI topologies.

Objectives

  • Part 1: Read the playground’s architecture documentation and Terraform code to identify how VergeOS concepts map to infrastructure-as-code
  • Part 2: Given a customer scenario, recommend and diagram a deployment topology
  • Part 3: Compare the four example deployment configurations and analyze their differences

Prerequisites

  • A GitHub account (to clone the repository)
  • Git installed on your workstation
  • A text editor or IDE (VS Code recommended)
  • No VergeOS system access is required; this lab is a reading and design exercise

Estimated time: 30 minutes


Part 1: Read the Architecture Documentation and Code

In this section, you will clone the Terraform playground repository and trace how VergeOS architecture concepts are expressed in infrastructure-as-code.

  1. Clone the repository

    git clone https://github.com/verge-io/vergeos-terraform-playground.git
    cd vergeos-terraform-playground
  2. Read the architecture documentation

    Open docs/architecture.md and read through the entire document. As you read, identify the answers to these questions:

    • What are the four deployment scenarios supported by the playground?
    • What is an install seed file and how does it enable unattended installation?
    • What is the minimum deployment size?
  3. Examine the deployment scenario diagrams

    Open docs/deployment-scenarios.md and study the Mermaid topology diagrams for each scenario. For each one, note:

    • How many nodes are involved
    • How many clusters are created
    • Which node types appear (controller, scale-out, storage, compute)
    • How all nodes connect to the core fabric and external network
  4. Trace the core fabric in Terraform

    Open main.tf (the root module) and find the two core fabric network resources. Answer these questions:

    • What are the resources named? (core_fabric_1 and core_fabric_2)
    • What MTU is configured? (9142 — jumbo frames for vSAN replication)
    • Is DHCP enabled on these networks? (No — dhcp_enabled = false)
    • What ipaddress_type is set? (none — these are Layer 2 transports)

    # You should find resources like this in main.tf:
    resource "vergeio_network" "core_fabric_1" {
      name           = "${var.system_name}-core-fabric-1"
      enabled        = true
      dhcp_enabled   = false
      on_power_loss  = "power_on"
      mtu            = 9142
      ipaddress_type = "none"
    }
  5. Examine how Node 1 differs from Node 2

    Open modules/controllers/main.tf and compare verge_node_1 and verge_node_2. Key differences to identify:

    • Cloud-init template — Node 1 uses user-data-node1.yaml (creates a new system with YC_VSAN_NEW=1). Node 2 uses user-data-node2.yaml (joins the existing system with YC_VSAN_NEW=0).
    • Post-install API setup — Node 1’s cloud-init includes a script that configures update sources, enables SSH, and optionally creates storage/compute clusters via the VergeOS API. Node 2 has no post-install script.
    • Dependency chain — Node 2 has a depends_on reference to Node 1, ensuring the system is fully initialized before the second controller attempts to join.

    Both nodes share the same VM structure: Linux OS family, nested virtualization enabled, three virtio NICs (external, core fabric 1, core fabric 2), CD-ROM with the VergeOS ISO, and a cloud-init nocloud datasource.
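
    The differences above can be summarized in a small, self-contained sketch. It uses Terraform's built-in terraform_data resource (Terraform 1.4 or later) as a stand-in for the real VM resources, so the local map, its keys, and the resource bodies are illustrative assumptions rather than the playground's actual code. Only the template names, the YC_VSAN_NEW values, and the depends_on relationship come from the lab text.

    ```hcl
    # Hypothetical stand-in for modules/controllers/main.tf.
    # terraform_data is used here only so the sketch runs without a provider.
    locals {
      nodes = {
        node1 = { template = "user-data-node1.yaml", yc_vsan_new = 1 } # creates the system
        node2 = { template = "user-data-node2.yaml", yc_vsan_new = 0 } # joins the system
      }
    }

    resource "terraform_data" "verge_node_1" {
      input = local.nodes.node1
    }

    resource "terraform_data" "verge_node_2" {
      input = local.nodes.node2

      # Node 2 is not created until Node 1 has been created.
      depends_on = [terraform_data.verge_node_1]
    }
    ```

    Running terraform plan on this sketch should show verge_node_2 ordered after verge_node_1, which is the same ordering guarantee the lab asks you to find in the real module.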

  6. Answer the comprehension questions

    Write your answers to the following (or discuss with your training partner):

    | # | Question | Expected Answer |
    |---|----------|-----------------|
    | 1 | Why does the core fabric use two separate switches? | Redundancy: if one switch or path fails, the other maintains inter-node connectivity |
    | 2 | Why is DHCP disabled on the core fabric networks? | Core fabric uses static IP addressing; the VergeOS installer configures addresses via the install seed |
    | 3 | Why must Node 2 wait for Node 1 to complete before starting? | Node 1 creates the VergeOS system; Node 2 needs an existing system to join |
    | 4 | What traffic types flow over the core fabric? | vSAN replication, cluster coordination, VM live migration, control plane communication |
    | 5 | Why is quantity_tier_1_disks set to 0 for controllers when storage nodes are enabled? | In UCI mode, dedicated storage nodes provide all tier-1 capacity; controllers only need tier-0 for metadata |

Part 2: Recommend a Deployment Topology

Now apply what you have learned. Given a customer scenario, recommend a deployment topology and justify your decision.

Midwest Manufacturing Co. is migrating from a VMware vSphere environment with 3 ESXi hosts. They currently run 50 VMs (mix of Windows and Linux), have ~10 TB of usable storage, and expect moderate growth over the next 2 years. They have a small IT team (2 people) and want to minimize operational complexity. Budget is constrained.

  1. Choose HCI or UCI

    Based on the customer profile, which deployment model do you recommend? Consider:

    • Team size — A 2-person IT team favors simplicity
    • Growth pattern — “Moderate growth” suggests balanced compute/storage scaling
    • Budget — HCI requires fewer total nodes than UCI for the same capacity
    • Current environment — 3 ESXi hosts maps well to a small HCI cluster
  2. Determine the node count and layout

    Sketch or describe your proposed topology:

    • How many controller nodes? (Minimum 2 for HA)
    • Do you need scale-out nodes? (Consider: 50 VMs on 2 nodes may be tight; 2 scale-out nodes give headroom)
    • How many clusters? (1 for HCI)
    • What about storage capacity? (10 TB usable means ~20 TB raw with replication across nodes)
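
    The storage arithmetic in the last bullet can be written out as a short sketch. The replica count of 2 and the node count of 4 are assumptions chosen to match the hints above, not values read from the playground, and the sketch ignores growth headroom.

    ```hcl
    # Sizing sketch for the Midwest Manufacturing scenario.
    # All local names are hypothetical and exist only for this example.
    locals {
      usable_tb     = 10 # customer's current usable storage
      replica_count = 2  # assumption: each block is stored twice across nodes
      node_count    = 4  # assumption: 2 controllers + 2 scale-out nodes

      raw_tb          = local.usable_tb * local.replica_count # 10 * 2 = 20
      raw_tb_per_node = local.raw_tb / local.node_count       # 20 / 4 = 5
    }

    output "raw_tb" {
      value = local.raw_tb
    }

    output "raw_tb_per_node" {
      value = local.raw_tb_per_node
    }
    ```

    Changing node_count or usable_tb shows how much raw capacity each node would need to carry before any allowance for the expected two years of growth.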

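    A reasonable design: one HCI cluster with 2 controller nodes and 2 scale-out nodes (4 nodes total), providing roughly 20 TB of raw capacity.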

  3. Map to a playground example

    Which Terraform playground example file most closely matches your design?


Part 3: Compare the Example Configurations

Compare all four example .tfvars files in the examples/ directory. Open each file, identify its configuration values, and record your findings in the comparison table below.

File: examples/2-node-hci.tfvars

  • Scenario: 2-Node HCI (Single Cluster)
  • Total nodes: 2
  • Clusters: 1
  • Node types: 2 controllers (storage + compute)
  • Toggle variables: None (all defaults)
  • Tier-1 disks on controllers: Yes (2 × 1000 GB each)
  • Best for: Basic testing, evaluation, smallest possible deployment

Complete this table as you review each file:

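| Attribute | 2-Node HCI | 4-Node HCI | Hybrid 2-Cluster | UCI 3-Cluster |
|-----------|------------|------------|------------------|---------------|
| Total nodes | 2 | 4 | 4 | 6 |
| Clusters | 1 | 1 | 2 | 3 |
| Controller nodes | 2 | 2 | 2 | 2 |
| Scale-out nodes | 0 | 2 | 0 | 0 |
| Storage-only nodes | 0 | 0 | 0 | 2 |
| Compute-only nodes | 0 | 0 | 2 | 2 |
| Controllers have tier-1 disks? | Yes | Yes | Yes | No |
| Storage scaling | Add HCI nodes | Add HCI nodes | Add controllers | Add storage nodes |
| Compute scaling | Add HCI nodes | Add HCI nodes | Add compute nodes | Add compute nodes |
| Complexity | Low | Low | Medium | High |
| Ideal use case | Small / eval | Mid-size balanced | Compute burst | Large / independent scaling |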

After completing the table, consider these questions:

  1. Why do controllers in the UCI scenario have zero tier-1 disks?

    In UCI, dedicated storage nodes provide all workload storage. Controllers only need tier-0 disks for vSAN metadata. This is visible in main.tf where quantity_tier_1_disks is conditionally set to 0 when create_storage_nodes = true.
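
    A conditional like the one described might look roughly like the sketch below. The variable names quantity_tier_1_disks and create_storage_nodes come from the lab text; the defaults, the local name, and the output are assumptions added so the fragment stands on its own, and the playground's real main.tf may be structured differently.

    ```hcl
    variable "create_storage_nodes" {
      type    = bool
      default = false # assumption: dedicated storage nodes are off by default
    }

    variable "quantity_tier_1_disks" {
      type    = number
      default = 2 # assumption, based on the 2-node HCI example's two tier-1 disks
    }

    locals {
      # Hypothetical local: controllers carry no tier-1 disks once
      # dedicated storage nodes provide the workload storage.
      controller_tier_1_disks = var.create_storage_nodes ? 0 : var.quantity_tier_1_disks
    }

    output "controller_tier_1_disks" {
      value = local.controller_tier_1_disks
    }
    ```

    With the defaults the output is 2; passing -var="create_storage_nodes=true" switches it to 0.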

  2. What is the dependency chain when both storage and compute nodes are enabled?

    Controllers → Storage nodes → Compute nodes. The compute module has an explicit depends_on to the storage module, ensuring the storage cluster exists before compute nodes attempt to join. This mirrors how VergeOS cluster creation works: storage must be available before compute workloads can run.
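
    The chain can be sketched with Terraform's built-in terraform_data resource (Terraform 1.4 or later) standing in for the real modules. The create_compute_nodes toggle and all resource names here are hypothetical; the lab text confirms only create_storage_nodes and the explicit depends_on from compute to storage.

    ```hcl
    variable "create_storage_nodes" {
      type    = bool
      default = true
    }

    variable "create_compute_nodes" { # hypothetical toggle name
      type    = bool
      default = true
    }

    # Stand-in for the controllers module.
    resource "terraform_data" "controllers" {
      input = "controllers ready"
    }

    # Stand-in for the storage module: created only when enabled.
    resource "terraform_data" "storage_nodes" {
      count      = var.create_storage_nodes ? 1 : 0
      input      = "storage cluster ready"
      depends_on = [terraform_data.controllers]
    }

    # Stand-in for the compute module: waits for storage, which in turn
    # waits for the controllers.
    resource "terraform_data" "compute_nodes" {
      count      = var.create_compute_nodes ? 1 : 0
      input      = "compute cluster ready"
      depends_on = [terraform_data.storage_nodes]
    }
    ```

    Because depends_on is transitive, compute still waits for the controllers even though it names only the storage stand-in.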

  3. How would you modify the 4-node HCI example to support 6 HCI nodes?

    Change quantity_scale_out_nodes from 2 to 4. The Terraform module creates additional scale-out nodes sequentially, each joining the same HCI cluster. No additional toggle variables are needed.
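
    As a sketch, the change amounts to a one-line override in a copy of the 4-node HCI example file. Only quantity_scale_out_nodes is taken from the lab text; every other value in the file is assumed to stay as it is.

    ```hcl
    # Override in a copy of the 4-node HCI example .tfvars file.
    # 2 controllers + 4 scale-out nodes = 6 HCI nodes in one cluster.
    quantity_scale_out_nodes = 4
    ```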


After completing this lab, you should be able to:

  • ✅ Navigate the VergeOS Terraform playground and understand its structure
  • ✅ Identify how core fabric networks, vSAN storage tiers, and node types are expressed in Terraform
  • ✅ Explain the differences between the four deployment scenarios (2-node HCI, 4-node HCI, hybrid 2-cluster, UCI 3-cluster)
  • ✅ Recommend an appropriate VergeOS topology for a given customer scenario
  • ✅ Trace the dependency chain from controllers through optional node types

Proceed to Module 2: Sizing & Design to learn how to translate customer requirements into specific hardware configurations and deployment plans.