Common Issues & Resolutions

Troubleshooting Quick-Reference

This page compiles the most common issues encountered by VergeOS administrators, organized by subsystem. Each section includes symptoms, root causes, and step-by-step resolution procedures.

VM Network Connectivity

Network connectivity problems are the most common support topic. Before diving in, verify whether other VMs in the same environment can reach the internet. If none can, the issue is likely upstream of VergeOS (switch, firewall, ISP). If other VMs work fine, the problem is almost always a configuration miss on the affected VM.

Missing NIC Configuration

Symptom: VM boots but has no network interfaces visible in the guest OS.

Resolution:

Open the VM dashboard and check the NICs section
If no NIC is listed, click Add NIC
Select the correct network and set the interface type to VirtIO (recommended) or E1000 for legacy compatibility
Power-cycle the VM so that the new NIC appears in the guest OS

Wrong Network Assignment

Symptom: VM has a NIC but cannot reach other VMs or the internet.

Resolution:

Navigate to the VM dashboard → NICs
Verify the NIC status is Up
Confirm the Network column shows the correct network — compare against a working VM in the same environment
If incorrect, edit the NIC and reassign it to the proper network
Power-cycle the VM

Missing VirtIO Drivers

Symptom: Windows VM shows no network adapter in Device Manager, even though a NIC is configured in VergeOS.

Resolution:

Verify a NIC exists in the VM’s NICs section within VergeOS
Connect to the VM via the Remote Console
Install the VirtIO drivers from the guest agent ISO — refer to the VergeOS documentation on VM Guest Agent for download and installation steps
After driver installation, Windows will detect the network adapter automatically

Improper Guest IP Configuration

Symptom: NIC is present and drivers are installed, but the VM still cannot reach the network.

Resolution:

Inside the guest OS, verify the network adapter is detected and enabled
For DHCP: ensure the network has a DHCP service running (check Networks → [Network] → DHCP)
For static IP: confirm the IP address, subnet mask, gateway, and DNS settings match the network design
Use the Network Diagnostics tool (ping, ARP scan) from the VergeOS network context to verify Layer 2 connectivity
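
The static IP check above can be partly automated. The following is a minimal sketch using only Python's standard `ipaddress` module; the function name `check_static_config` and the sample addresses are illustrative assumptions, not part of VergeOS. It only checks that the values are internally consistent and cannot confirm that they match your network design.

```python
import ipaddress

def check_static_config(address: str, netmask: str, gateway: str) -> list[str]:
    """Return a list of problems found in a guest's static IPv4 settings."""
    problems = []
    # strict=False accepts a host address and derives the enclosing network.
    network = ipaddress.ip_network(f"{address}/{netmask}", strict=False)
    host = ipaddress.ip_address(address)
    gw = ipaddress.ip_address(gateway)

    if gw not in network:
        problems.append(f"gateway {gw} is outside {network}")
    if host == gw:
        problems.append("address and gateway are identical")
    # /31 and /32 networks have no separate network/broadcast addresses.
    if network.prefixlen < 31 and host in (network.network_address,
                                           network.broadcast_address):
        problems.append(f"{host} is the network or broadcast address of {network}")
    return problems

# Hypothetical values for illustration only.
print(check_static_config("192.168.10.50", "255.255.255.0", "192.168.10.1"))
print(check_static_config("192.168.10.50", "255.255.255.0", "192.168.1.1"))
```

An empty list means the address, mask, and gateway agree with each other; DNS settings still need to be verified by hand.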

Guest Memory Reporting

Administrators migrating from VMware or Nutanix often notice that VergeOS reports higher memory usage than they expect. This is by design — not a problem.

Allocated vs. Active Memory

Symptom: VergeOS shows a VM using 8 GB of RAM, but the guest OS task manager shows only 2 GB in use.

Explanation: VergeOS displays allocated memory — the physical RAM reserved on the host for that VM. When you assign 8 GB to a VM, the hypervisor immediately reserves 8 GB of physical memory, regardless of what the guest is actively consuming. This is the real resource commitment on the host.

No Memory Ballooning

Unlike VMware (which uses balloon drivers to reclaim unused guest memory) or Nutanix AHV (which supports dynamic memory management), VergeOS intentionally does not use memory ballooning. This design choice provides:

Predictable performance — no balloon driver overhead or surprise memory pressure
Simplified capacity planning — allocated = committed; no guessing about overcommit ratios
Enhanced reliability — no risk of balloon-induced OOM conditions inside guests
Accurate migration sizing — what you allocate is what you need on the destination host

Capacity Planning Best Practices

Metric	Where to Check	What It Means
VM Allocated RAM	VM Dashboard	Physical RAM reserved for this VM
Guest Active RAM	Inside guest OS (Task Manager / `free -h`)	What the guest is actually using
Node Available RAM	Node Dashboard → Memory	How much host RAM remains unallocated
Cluster Target Max RAM %	System → Settings → Advanced	Threshold for VM placement decisions
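
Because allocated memory equals committed memory, the arithmetic behind the table is a simple sum. The sketch below is an illustration under stated assumptions: the function name, the sample figures, and the way the target percentage is compared are hypothetical, host overhead is ignored, and the exact placement logic VergeOS applies to Cluster Target Max RAM % is not described here.

```python
def node_memory_commitment(node_ram_gb: float,
                           vm_allocations_gb: list[float],
                           target_max_pct: float) -> dict:
    """Summarize how much of a node's RAM is committed to VMs."""
    committed = sum(vm_allocations_gb)           # allocated = committed
    committed_pct = 100.0 * committed / node_ram_gb
    return {
        "committed_gb": committed,
        "available_gb": node_ram_gb - committed,
        "committed_pct": round(committed_pct, 1),
        # Assumption: the target is treated as a simple upper bound.
        "over_target": committed_pct > target_max_pct,
    }

# Hypothetical node with 256 GB of RAM, four VMs, and an 80% target.
summary = node_memory_commitment(256, [8, 16, 64, 32], 80)
print(summary)
```

Guest active RAM does not appear in the calculation at all, which is the point: sizing decisions follow what is allocated, not what the guest reports as in use.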

SEL Noise (False-Positive IPMI Logs)

Some server hardware generates repetitive, benign IPMI log entries that fill the System Event Log (SEL) and trigger unnecessary alerts. The most common offender is the “Get SEL Info command failed” message.

Understanding the SEL

The System Event Log is stored in hardware (on the BMC/IPMI controller) with limited capacity. Once full, new events cannot be recorded until the log is cleared. The node dashboard shows SEL capacity as a percentage bar.

Filtering SEL Noise via API

To suppress false-positive messages without losing real hardware alerts:

Navigate to System → API Documentation
Find the settings table and expand it
Click the POST option and enter this body:

{
  "key": "syslog_regex_list",
  "value": "2E2A4765742053454C20496E666F20636F6D6D616E64206661696C65642E",
  "default_value": "",
  "description": "Hex encoded lines of regular expressions to filter out of syslog"
}

Click Execute

The value is the hex encoding of the regex .*Get SEL Info command failed. To filter additional messages, join the patterns with | into a single regex, then hex-encode the whole expression with any hex encoding tool.
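
If no hex encoding tool is at hand, the conversion can be done locally. This is a minimal sketch using only the Python standard library; the helper names `encode_pattern` and `decode_value` are illustrative, and the uppercase output simply mirrors the value shown above (whether lowercase hex is also accepted is not confirmed here).

```python
def encode_pattern(pattern: str) -> str:
    """Hex-encode a regex string, uppercase to match the value shown above."""
    return pattern.encode("utf-8").hex().upper()

def decode_value(hex_value: str) -> str:
    """Decode an existing setting value back into a readable regex."""
    return bytes.fromhex(hex_value).decode("utf-8")

value = encode_pattern(".*Get SEL Info command failed.")
print(value)
print(decode_value(value))
```

Decoding an existing value before overwriting it is a quick way to confirm which messages are currently being filtered.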

Example — filtering two patterns:

The regex (Get SEL Info command failed|Unable to send command: Device or resource busy) encodes to:

284765742053454C20496E666F20636F6D6D616E64206661696C65647C556E61626C6520746F2073656E6420636F6D6D616E643A20446576696365206F72207265736F75726365206275737929
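
A combined filter can be built and sanity-checked before it is posted. The sketch below uses only the Python standard library; `build_filter` and the sample log lines are hypothetical, and using `re.search` assumes the filter matches anywhere in a line, which is not confirmed for the VergeOS syslog filter.

```python
import re

def build_filter(patterns: list[str]) -> tuple[str, str]:
    """Join patterns with | and return (regex, uppercase hex encoding)."""
    regex = "(" + "|".join(patterns) + ")"
    re.compile(regex)  # raises re.error early if the expression is invalid
    return regex, regex.encode("utf-8").hex().upper()

regex, hex_value = build_filter([
    "Get SEL Info command failed",
    "Unable to send command: Device or resource busy",
])
print(hex_value)

# Hypothetical log lines: the first two should be filtered, the last kept.
samples = [
    "ipmi: Get SEL Info command failed",
    "ipmi: Unable to send command: Device or resource busy",
    "ipmi: Fan 3 lower critical going low",
]
for line in samples:
    print(bool(re.search(regex, line)), line)
```

Testing against a line that must not be filtered, such as a real hardware alert, helps catch a pattern that is too broad.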

Restarting the IPMI Service

After applying the filter, restart log capture on each affected node:

Option A — Via UI:

Navigate to Infrastructure → Nodes → [Node]
Edit the node, uncheck “Capture System Logs”, submit
Wait 15 seconds
Edit the node again, re-enable “Capture System Logs”

Option B — Via SSH:

sudo systemctl restart openipmi

Clearing a Full SEL

If the SEL is already full:

Navigate to Infrastructure → Nodes → [Node]
Click Clear SEL in the left menu
Confirm with Yes

NAS Share Issues

Windows: Unable to Connect to CIFS Shares

Symptom: Windows 10/11 clients cannot access CIFS shares, receiving “access denied” or “cannot connect” errors even with correct credentials.

Root Cause: Recent Windows releases disable insecure guest logons for SMB connections by default.

Resolution — Enable Insecure Guest Logons:

Press Win + R, type gpedit.msc, press Enter
Navigate to: Computer Configuration → Administrative Templates → Network → Lanman Workstation
Locate Enable insecure guest logons → Right-click → Edit
Select Enabled → Click OK
Restart the Windows device

macOS: Connection Failures or Poor Performance

Symptom: macOS Finder cannot connect to CIFS shares, connections drop intermittently, or transfers are unusably slow.

Resolution — Force SMB3 via nsmb.conf:

Open Terminal and create or edit the SMB configuration:

sudo nano /etc/nsmb.conf

Add the following:

[default]
smb_neg=smb3_only
signing_required=no

Clear the macOS SMB cache:

rm -rf ~/Library/Caches/com.apple.finder
killall Finder

Reconnect to the share
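
To confirm that the file holds the intended settings, its contents can be parsed with Python's standard `configparser`. This is a minimal sketch: the helper name `check_nsmb` is an assumption, and it only compares the two keys shown above rather than validating the whole file.

```python
import configparser

EXPECTED = {"smb_neg": "smb3_only", "signing_required": "no"}

def check_nsmb(text: str) -> dict[str, str]:
    """Return the keys whose values differ from EXPECTED, with what was found."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # Lowercase "default" is an ordinary section for configparser.
    section = parser["default"] if parser.has_section("default") else {}
    return {
        key: section.get(key, "<missing>")
        for key, wanted in EXPECTED.items()
        if section.get(key) != wanted
    }

sample = "[default]\nsmb_neg=smb3_only\nsigning_required=no\n"
print(check_nsmb(sample))
# On a Mac, pass the real file contents instead:
#   from pathlib import Path
#   print(check_nsmb(Path("/etc/nsmb.conf").read_text()))
```

An empty result means both settings are present with the expected values.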

Samba Fruit Module: For full macOS compatibility (Spotlight indexing, Time Machine support, resource forks), enable the Samba fruit VFS module in the NAS volume advanced settings. This enables native Apple SMB extensions.

Permission Denied Errors

Symptom: Users receive “Access Denied” when browsing or opening files on a share, even though they can see the share name.

Resolution checklist:

Valid Users list: Navigate to NAS → Shares → [Share] and confirm the user or group is in the valid users list
Browseable setting: Ensure the share is set to browseable if users need to discover it
Force User / Force Group: If configured, verify the forced user/group has read/write permissions on the underlying volume
NAS service restart: After changing permissions, restart the NAS service so the changes take effect

Slow CIFS Performance

Symptom: File transfers over CIFS are significantly slower than expected.

Resolution:

SMB protocol version: Under NAS → Volumes → [Volume] → Advanced Configuration, verify the minimum SMB protocol version. A minimum that is too low (SMB1) lets clients negotiate the slower legacy protocol
Network path: Use Network Diagnostics (ping, traceroute) to check latency between the client subnet and the NAS network
Connection load: Use NAS Diagnostics → Samba Status to check active connections and identify overloaded shares
NAS resources: Check CPU and memory allocation for the NAS service — under-provisioned NAS VMs will bottleneck throughput

Installation Troubleshooting

Boot Issues

Symptom: Node fails to boot from the VergeOS USB installer.

Resolution:

Verify BIOS/UEFI boot settings match the installation media type (UEFI recommended)
Test the USB media on a known-working system to rule out a bad drive
Confirm hardware compatibility: the CPU must be 64-bit and support hardware virtualization (VT-x/AMD-V)
Disable Secure Boot in BIOS if the installer fails to load

Network Configuration Mismatches

Symptom: Installation completes but the node cannot communicate with other nodes or the network.

Resolution:

During installation, stop immediately if any detected IP or interface does not match your network design
Verify VLAN configurations match the switch port settings
Check physical cable connections — the installer auto-detects interfaces; mismatched cabling leads to wrong interface assignments
Confirm IP addressing does not conflict with existing devices on the network

Storage Controller JBOD Mode

Symptom: VergeOS installer does not detect all expected drives.

Resolution:

VergeOS requires drives to be presented as individual disks (JBOD/passthrough mode), not as RAID arrays
Enter the storage controller BIOS (e.g., PERC, MegaRAID) and configure each drive as a JBOD volume or as its own single-drive RAID-0
Some controllers require firmware updates to support JBOD mode — consult the hardware vendor documentation

Secondary Node Join Failures

Symptom: Secondary controller or compute node fails to join the existing cluster.

Resolution:

Verify you selected “No” when asked if this is a new install (for secondary nodes)
Confirm you entered the admin credentials from the primary controller correctly
Ensure both nodes are on the same network and can reach each other (check switch port VLAN assignments)
Match encryption settings from the primary controller exactly
Match drive tier assignments from the primary controller
If the secondary boots but is not visible in the primary UI, check the core fabric network configuration and verify switch connectivity between nodes

Storage Issues

vSAN Degraded State

Symptom: Dashboard shows a vSAN tier in “degraded” or “not redundant” status.

Explanation: A degraded state means one or more drives in a tier have failed or are unavailable, but the vSAN is still operational. Data remains accessible because VergeOS maintains redundancy across nodes.

Resolution:

Navigate to System → vSAN → Drives to identify the failed drive(s)
Check the drive’s SMART data via Node Diagnostics → S.M.A.R.T. Diagnostic Test
If a physical replacement is needed, use Node Diagnostics → LED Control to light up the drive bay for identification
Contact Verge support for drive replacement guidance — the vSAN will automatically rebuild redundancy once a replacement drive is added

Drive Rebuild Times

Understanding expectations: Rebuild times depend on the amount of data on the tier and the I/O capacity of the remaining drives. During a rebuild:

The system remains fully operational
Write performance may be slightly reduced
Monitor progress via the vSAN dashboard’s tier progress indicator (100% = complete)
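
For a rough sense of scale, rebuild time can be approximated as data volume divided by sustained throughput. The sketch below is back-of-the-envelope arithmetic only: the function name and both input figures are assumptions, it is not a VergeOS formula, and real rebuilds share I/O with production workloads, so treat the result as an optimistic lower bound.

```python
def estimate_rebuild_hours(data_tb: float, throughput_mb_s: float) -> float:
    """Estimate rebuild duration in hours from data size and sustained rate."""
    data_mb = data_tb * 1_000_000        # decimal units: 1 TB = 1,000,000 MB
    seconds = data_mb / throughput_mb_s
    return seconds / 3600

# Hypothetical example: 4 TB to rebuild at a sustained 500 MB/s.
hours = estimate_rebuild_hours(4, 500)
print(f"{hours:.1f} hours")
```

The tier progress indicator on the vSAN dashboard remains the authoritative measure; an estimate like this is only useful for setting expectations beforehand.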

Capacity Threshold Warnings

Symptom: Dashboard alerts warn about storage capacity approaching limits.

Resolution:

Check tier utilization in System → vSAN — each tier shows used vs. total capacity
Review drive SMART data via Infrastructure → Nodes → [Node] → Diagnostics → S.M.A.R.T. Diagnostic Test to check wear levels and drive health indicators
For immediate relief, identify and remove unnecessary snapshots or unused VM drives
For long-term resolution, add drives or nodes to expand the tier — refer to the vSAN scale-up procedures

Utilization Level	Action Required
< 70%	Normal operation — no action needed
70–85%	Plan capacity expansion; review snapshot retention policies
85–90%	Actively reduce usage or add capacity
> 90%	Critical — prioritize expansion; risk of write failures
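
The thresholds in the table translate directly into a small lookup, which can be handy in monitoring scripts. This is a sketch with assumptions: the function name and return labels are illustrative, and because the table lists 85% in two bands, the code assigns exactly 85% to the stricter one.

```python
def capacity_action(used_pct: float) -> str:
    """Map tier utilization (percent) to the action band from the table."""
    if used_pct < 70:
        return "normal"
    if used_pct < 85:
        return "plan expansion"
    if used_pct <= 90:          # exactly 85% is treated as the stricter band
        return "reduce usage or add capacity"
    return "critical"

for pct in (42, 70, 85, 90, 93.5):
    print(pct, capacity_action(pct))
```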

Troubleshooting Decision Tree

When facing an issue that doesn’t fit the categories above, follow this general workflow:

1. Scope the Problem

Is the issue affecting one VM, one network, one node, or the entire system? Scoping determines which diagnostic tool to start with.

2. Use Component Diagnostics

Start with the component-specific diagnostic tool (Network, Node, NAS, or vSAN Diagnostics) — they run in the correct context automatically.

3. Check System Logs

Review Dashboard logs and system alerts for correlated events. Look for patterns — did multiple alerts fire at the same time?

4. Escalate with Data

If the issue is not resolved, generate a System Diagnostics bundle (System → System Diagnostics) and submit it with your support request.
