Troubleshooting

When Mesh Hypervisor hits a snag, this section helps you diagnose and fix common issues. All steps are run from the central orchestration node’s CLI unless noted. For setup details, see Usage.

Prerequisites

Ensure:

  • Central node is booted (see Installation).
  • You’re logged in (root/toor at the console).

Checking Logs

Start with the logs; they are usually the first clue:

mesh system logview

This opens lnav in /var/log/pxen/, showing DHCP, PXE, HTTP, and service activity. Scroll with the arrow keys, search with / (e.g., /error), and exit with q.
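When an interactive viewer is inconvenient (for example, when capturing output for a bug report), the same directory can be scanned with grep. The following is a minimal sketch, assuming the files under /var/log/pxen/ are plain text; the function name scan_logs and its default pattern are illustrative and not part of the mesh CLI.

```shell
#!/bin/sh
# Print the last matching lines found under a log directory, without lnav.
# Assumption: the log files are plain text. scan_logs is a hypothetical helper.
scan_logs() {
    log_dir="${1:-/var/log/pxen}"   # directory named in this section
    pattern="${2:-error|403}"       # extended regex, matched case-insensitively
    limit="${3:-20}"                # number of matching lines to keep
    # -r: recurse, -i: ignore case, -n: show line numbers, -E: extended regex
    grep -rinE "$pattern" "$log_dir" | tail -n "$limit"
}
```

Calling scan_logs with no arguments searches the default directory; scan_logs /var/log/pxen 'dhcp' 50 would narrow the search to DHCP-related lines.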

Common Log Issues

  • DHCP Requests Missing: No nodes booting—check network cables or PXE settings.
  • HTTP 403 Errors: Permissions issue on /srv/pxen/http/. Run chmod -R a+rX /srv/pxen/http/ so that files are readable and directories stay traversable.
  • Kernel Downloaded, Then Stops: APKOVL fetch failed. Verify that the node’s UUID matches the one in /host0/machines/<folder>/uuid, then check permissions on /srv/pxen/http.
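
For the HTTP 403 case, one way to reset permissions is to handle directories and files separately, since directories need the execute bit to be traversable while served files only need to be readable. This is a sketch under that assumption; fix_http_perms is a hypothetical helper, and the 755/644 modes are a common convention rather than something this guide prescribes.

```shell
#!/bin/sh
# Reset permissions under a web root: directories 755, regular files 644.
# fix_http_perms is a hypothetical helper, not part of the mesh CLI.
fix_http_perms() {
    root="${1:-/srv/pxen/http}"
    # Directories must keep the execute bit so the server can enter them.
    find "$root" -type d -exec chmod 755 {} +
    # Regular files only need to be readable.
    find "$root" -type f -exec chmod 644 {} +
}
```

Run it as root on the central node, then retry the boot and watch the logs for the 403 to clear.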

Node Not Booting

If mesh node info shows no nodes:

  1. Verify PXE: On the node, ensure BIOS/UEFI is set to network boot.
  2. Check Logs: In mesh system logview, look for DHCP leases and kernel downloads.
  3. Test Network: From the central node:
    ping <node-ip>
    
    Find <node-ip> in logs (e.g., DHCP lease). No response? Check cables or switches.
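
If several nodes are affected, the ping step can be looped over a list of addresses. The sketch below assumes the IPs have already been collected from the DHCP leases in the logs; check_nodes and the PING_CMD override are illustrative names, and the addresses in the usage line are placeholders.

```shell
#!/bin/sh
# Ping each address and report reachability; return non-zero if any node fails.
# PING_CMD can be overridden so the loop can be dry-run without network access.
PING_CMD="${PING_CMD:-ping}"

check_nodes() {
    failed=0
    for ip in "$@"; do
        # -c 2: send two probes; -W 2: wait up to two seconds for a reply
        if $PING_CMD -c 2 -W 2 "$ip" >/dev/null 2>&1; then
            echo "$ip reachable"
        else
            echo "$ip unreachable"
            failed=1
        fi
    done
    return "$failed"
}

# Usage (placeholder addresses): check_nodes 10.0.0.11 10.0.0.12
```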

Workload Not Starting

If mesh workload start -n <uuid> -w <name> fails:

  1. Check Logs: Run mesh system logview—look for QEMU or KVM errors.
  2. Verify Config: Ensure /var/pxen/monoliths/<name>.conf exists and matches -w <name>—see Configuring Workloads.
  3. Resources: SSH to the node:
    mesh node ctl -n <uuid>
    free -m; cat /proc/cpuinfo
    
    Confirm RAM and CPU suffice (e.g., 500M RAM for qemutest1).
  4. Restart: Retry:
    mesh workload soft-stop -n <uuid> -w <name>
    mesh workload start -n <uuid> -w <name>
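
The RAM check in step 3 can be made less error-prone by reading MemAvailable from /proc/meminfo and comparing it to the workload’s requirement. This is a minimal sketch meant to run on the node itself; the function names are hypothetical, and the 500 figure in the usage line is the qemutest1 example from above.

```shell
#!/bin/sh
# Print available memory in MiB, read from a meminfo-style file.
mem_available_mb() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Succeed if at least REQUIRED_MB MiB are available.
# Both helpers are hypothetical and not part of the mesh CLI.
has_enough_ram() {
    required_mb="$1"
    avail_mb=$(mem_available_mb "${2:-/proc/meminfo}")
    [ -n "$avail_mb" ] && [ "$avail_mb" -ge "$required_mb" ]
}

# Usage: has_enough_ram 500 || echo "not enough free RAM for this workload"
```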
    

Network Issues

If a node’s IP or VXLAN isn’t working:

  1. Check IP: On the node:
    mesh node ctl -n <uuid> "ip addr"
    
    No static IP? Verify interfaces in manifest—see Network Configuration.
  2. VXLAN Bridge: Check bridge existence:
    mesh node ctl -n <uuid> "ip link show br456"
    
    Missing? Ensure /var/pxen/networks/manage.conf is installed.
  3. Ping Test: From the node:
    mesh node ctl -n <uuid> "ping6 -c 4 fd42:2345:1234:9abc::1"
    
    No reply? Check VXLAN config in /host0/network/.

Time Sync Problems

If nodes show as offline in mesh node info:

  1. Check Time: On the node:
    mesh node ctl -n <uuid> "date"
    
    Off by hours? Time sync failed.
  2. Fix Chrony: Ensure ntp_sync group is applied (e.g., in groups file)—see Configuring Nodes.
  3. Restart Chrony: On the node:
    mesh node ctl -n <uuid> "rc-service chronyd restart"
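
To quantify how far a node’s clock has drifted rather than eyeballing the date output, the epoch seconds on both sides can be compared. The sketch below assumes that mesh node ctl prints only the remote command’s output, as in the date example above; clock_drift is a hypothetical helper and the 60-second threshold is an arbitrary choice.

```shell
#!/bin/sh
# Print the absolute difference in seconds between two epoch timestamps.
# The second argument defaults to the local clock.
clock_drift() {
    node_epoch="$1"
    local_epoch="${2:-$(date +%s)}"
    diff=$((node_epoch - local_epoch))
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    echo "$diff"
}

# Example on the central node (assumes ctl returns only the command output):
#   node_now=$(mesh node ctl -n <uuid> "date +%s")
#   [ "$(clock_drift "$node_now")" -gt 60 ] && echo "clock drift too large"
```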
    

Notes

  • Logs are verbose—most errors trace back to permissions, network, or config mismatches.
  • If stuck, rebuild configs with mesh system configure and reboot nodes.
  • For manifest tweaks, see Manifest Files; for node control, see Managing Nodes.

Next, explore Advanced Topics.