Lab: HCI + Compute Deployment
Objective
Section titled “Objective”Deploy VergeOS in an HCI + Dedicated Compute configuration using the Terraform playground. You will provision a two-cluster topology — an HCI foundation cluster (controller + storage) and a dedicated compute-only cluster — then configure workload placement across clusters, validate independent compute scaling, and compare the operational model to pure HCI deployments.
Prerequisites
Section titled “Prerequisites”- Completed all prior modules (1–9)
- Completed the HCI Deployment Lab and UCI Deployment Lab
- Access to the vergeos-terraform-playground repository (cloned locally)
- Terraform CLI installed and configured
- A VergeOS environment or lab that supports nested deployments
- Familiarity with Terraform basics (init, plan, apply)
Difficulty
Section titled “Difficulty”Intermediate — Requires understanding of VergeOS cluster architecture, multi-cluster networking, and basic Terraform usage
Estimated Time
Section titled “Estimated Time”1.5 hours
Background: HCI + Dedicated Compute Architecture
Section titled “Background: HCI + Dedicated Compute Architecture”Before starting the lab, review the two-cluster model that defines HCI + Dedicated Compute:
Key principles:
- Cluster 1 (HCI) always includes Nodes 1 & 2 with controllers and Tier 0 storage. Optional Nodes 3–4 add storage and (optionally) compute capacity.
- Cluster 2 (Compute Only) contains nodes dedicated entirely to running workloads — no storage overhead, maximum resources for VMs.
- The Compute toggle on the HCI cluster controls whether HCI nodes can also run workloads alongside storage/control functions.
- All compute-node storage I/O traverses the core network to the HCI cluster, making inter-cluster bandwidth critical.
Part 1: Review the HCI + Compute Topology
Section titled “Part 1: Review the HCI + Compute Topology”Understand the configuration before deploying.
- In the terraform playground repository, navigate to the
examples/directory and identify the HCI + Compute.tfvarsfile (look for files referencing “hci-compute” or “hybrid” topologies) - Examine the variables and identify:
- How many clusters are defined and their roles (HCI vs compute-only)
- Node count and assignments per cluster
- The Compute toggle setting on the HCI cluster — is it enabled or disabled?
- Storage tier configuration (Tier 0 for metadata on controller nodes, Tier 1 for workload data)
- Network configuration for inter-cluster communication
- Compare this
.tfvarsfile with the pure HCI configurations from the previous lab. Note the structural differences:- Additional cluster definition for compute-only nodes
- Storage tier assignment — compute-only nodes have no storage tiers
- Network bandwidth requirements between clusters
- Review the deployment scenario documentation (
docs/deployment-scenarios.md) for the HCI + Compute section
Part 2: Deploy the HCI + Compute Topology
Section titled “Part 2: Deploy the HCI + Compute Topology”Provision the two-cluster environment.
- Run
terraform initto initialize the provider (if not already done) - Run
terraform plan -var-file=<hci-compute>.tfvarsand carefully review the planned resources:- Verify two separate clusters will be created
- Confirm node assignments match the expected topology
- Check that storage tiers are only assigned to HCI cluster nodes
- Validate network interfaces are configured for inter-cluster communication
- Run
terraform apply -var-file=<hci-compute>.tfvarsto deploy - Log into the VergeOS UI and verify the deployment:
- Clusters: Both clusters appear — one labeled HCI, one labeled Compute
- Nodes: Each node is assigned to its correct cluster
- Storage: vSAN storage pools exist only on the HCI cluster; compute-only nodes show no storage
- Networking: Core fabric network connects both clusters; inter-cluster connectivity is established
- Controllers: Controller VMs are running on Nodes 1 and 2 in the HCI cluster
Part 3: Configure Workload Placement
Section titled “Part 3: Configure Workload Placement”Practice placing workloads across the two-cluster topology.
-
Create a VM on the compute-only cluster:
- In the VergeOS UI, create a new VM and select the compute-only cluster for placement
- Assign CPU and memory resources
- Attach a virtual disk — note that the storage is provisioned from the HCI cluster’s vSAN even though the VM runs on a compute-only node
- Start the VM and verify it boots successfully
-
Create a VM on the HCI cluster (if Compute is enabled):
- Create a second VM, this time placing it on the HCI cluster
- Compare the resource availability between the two clusters
- Note the difference: HCI nodes share resources between storage/control and compute, while compute-only nodes dedicate all resources to workloads
-
Test VM migration between clusters:
- Attempt to migrate a VM from the compute-only cluster to the HCI cluster (and vice versa)
- Observe how VergeOS handles cross-cluster migration
- Document any constraints or considerations for cross-cluster workload movement
-
Monitor inter-cluster I/O:
- Open the VergeOS dashboard and navigate to network monitoring
- Observe the storage I/O traffic flowing from compute-only nodes to the HCI cluster
- Note the bandwidth utilization on the core network — this is why inter-cluster bandwidth planning is critical
Part 4: Validate Independent Compute Scaling
Section titled “Part 4: Validate Independent Compute Scaling”Demonstrate the scaling advantage of the HCI + Compute model.
-
Review compute-only cluster capacity:
- In the VergeOS UI, check the total CPU and memory available on the compute-only cluster
- Compare this to the HCI cluster’s available compute resources (after storage/control overhead)
- Document the effective compute capacity difference
-
Simulate a scale-out scenario:
- Review the
.tfvarsfile and identify how to add additional compute-only nodes - Modify the node count for the compute-only cluster (e.g., add 1–2 more nodes)
- Run
terraform planto preview the change — note that only compute nodes are added; storage is unaffected - Apply the change and verify the new nodes join the compute-only cluster
- Confirm that the HCI cluster is completely unchanged — no rebalancing, no storage disruption
- Review the
-
Compare scaling models:
Scaling Action Pure HCI HCI + Compute Add compute capacity Must add full HCI node (with storage) Add lightweight compute-only node Add storage capacity Add HCI node or expand existing disks Add node to HCI cluster only Scale independently ❌ Compute and storage coupled ✅ Compute scales independently Hardware flexibility All nodes need storage-class hardware Compute nodes optimized for workloads Operational complexity Simple — single cluster Moderate — two clusters, inter-cluster networking
Part 5: Explore the Compute Toggle
Section titled “Part 5: Explore the Compute Toggle”Understand the impact of the HCI cluster’s Compute setting.
-
Check the current Compute toggle state:
- In the VergeOS UI, navigate to the HCI cluster settings
- Identify whether the Compute toggle is currently enabled or disabled
- If enabled, note which workloads (if any) are running on HCI nodes
-
Understand the two modes:
Setting Behavior Best For Compute Enabled HCI nodes run workloads alongside storage/control Smaller deployments where maximizing utilization is preferred Compute Disabled HCI cluster dedicated to storage and control only Performance-sensitive environments where storage/compute isolation is preferred -
Document your recommendation:
- Based on the current deployment size, which Compute toggle setting would you recommend?
- What factors would cause you to change the setting?
- Note: Changing the Compute toggle may require a rolling restart of nodes in the HCI cluster — review the impact with VergeOS support before making this change in production
Part 6: Design Decision Exercise
Section titled “Part 6: Design Decision Exercise”Apply what you’ve learned to a real-world scenario.
-
Scenario: A customer currently runs a 4-node VergeOS HCI cluster. They need to add 50 new VMs for a development environment but don’t need additional storage. Their current storage utilization is only 40%, but CPU is at 75%.
-
Evaluate the options:
- Option A: Add 2 more HCI nodes (6-node HCI cluster)
- Option B: Add a 2-node compute-only cluster (4-node HCI + 2-node compute)
- Option C: Migrate to full UCI architecture
-
For each option, document:
- Hardware cost implications
- Operational complexity change
- Network requirements
- Future scalability path
- Your recommendation with justification
-
Bonus: Identify which terraform playground example most closely matches Option B, and list the
.tfvarsmodifications needed to match the customer’s requirements
Cleanup
Section titled “Cleanup”When finished with the lab:
- Remove all test VMs created during the lab
- Run
terraform destroyto tear down the entire HCI + Compute deployment - Verify all resources have been cleaned up in the VergeOS UI
Verification
Section titled “Verification”Your HCI + Compute deployment lab is complete when you can answer yes to all of the following:
- Successfully deployed a two-cluster HCI + Compute topology via Terraform
- Verified cluster roles (HCI vs compute-only), storage allocation, and networking in the VergeOS UI
- Created VMs on the compute-only cluster and confirmed storage was served from the HCI cluster
- Monitored inter-cluster storage I/O traffic on the core network
- Successfully scaled the compute-only cluster independently (added nodes without affecting storage)
- Documented the Compute toggle behavior and your recommendation for the deployment
- Completed the design decision exercise comparing HCI, HCI+Compute, and UCI options
- Cleaned up all lab resources with
terraform destroy