Troubleshooting
When Mesh Hypervisor hits a snag, this section helps you diagnose and fix common issues. All steps are run from the central orchestration node’s CLI unless noted. For setup details, see Usage.
Prerequisites
Ensure:
- Central node is booted (see Installation).
- You’re logged in (root/toor at the console).
Checking Logs
Start with logs—they’re your first clue:
mesh system logview
This opens lnav in /var/log/pxen/, showing DHCP, PXE, HTTP, and service activity. Scroll with the arrow keys, filter with / (e.g., /error), and exit with q.
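For a quick non-interactive pass, the same directory can be searched with grep. The following is a minimal sketch, assuming the files under /var/log/pxen/ are plain text. The function name pxen_log_errors is hypothetical and not part of the mesh CLI.

```shell
# Hypothetical helper (not part of the mesh CLI): list log lines that
# contain "error", case-insensitively, prefixed with file name and line number.
# Assumption: the logs are plain-text files.
pxen_log_errors() {
    log_dir="${1:-/var/log/pxen}"
    grep -rin "error" "$log_dir"
}

# Usage:
#   pxen_log_errors             # searches /var/log/pxen
#   pxen_log_errors /tmp/logs   # searches another directory
```

The interactive lnav view remains the better tool for following activity live; this is only a shortcut for scanning what has already been written.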
Common Log Issues
- DHCP Requests Missing: No nodes are booting. Check network cables or PXE settings.
- HTTP 403 Errors: Permissions issue on /srv/pxen/http/. Run:
  chmod -R 644 /srv/pxen/http/*
- Kernel Downloaded, Then Stops: The APKOVL fetch failed. Verify that the UUID matches in /host0/machines/<folder>/uuid, and check permissions on /srv/pxen/http.
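Before changing permissions, it can help to see which entries are actually unreadable. The sketch below assumes the HTTP service reads /srv/pxen/http as a user other than the owner. The function name pxen_http_unreadable is hypothetical.

```shell
# Hypothetical helper: print entries under the HTTP root that other users
# cannot read (files) or cannot read and enter (directories).
# Assumption: the HTTP service accesses these files as a non-owner user.
pxen_http_unreadable() {
    http_root="${1:-/srv/pxen/http}"
    find "$http_root" \( -type f ! -perm -004 \) -o \( -type d ! -perm -005 \)
}

# Usage:
#   pxen_http_unreadable        # empty output means nothing is blocked
```

Directories are tested separately from files because a directory needs the execute (search) bit as well as the read bit before its contents can be served.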
Node Not Booting
If mesh node info shows no nodes:
- Verify PXE: On the node, ensure BIOS/UEFI is set to network boot.
- Check Logs: In mesh system logview, look for DHCP leases and kernel downloads.
- Test Network: From the central node:
  ping <node-ip>
  Find <node-ip> in the logs (e.g., DHCP lease). No response? Check cables or switches.
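When several nodes are affected, the ping test can be repeated over a list of addresses. This is a minimal sketch; the function name check_nodes is hypothetical, and the addresses are ones you have read from the DHCP leases in the logs.

```shell
# Hypothetical helper: ping each address given on the command line and
# report which ones answer. Returns non-zero if any address is silent.
check_nodes() {
    failed=0
    for ip in "$@"; do
        if ping -c 2 "$ip" >/dev/null 2>&1; then
            echo "$ip reachable"
        else
            echo "$ip no response"
            failed=1
        fi
    done
    return "$failed"
}

# Usage:
#   check_nodes <node-ip> <node-ip> ...
```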
Workload Not Starting
If mesh workload start -n <uuid> -w <name> fails:
- Check Logs: Run mesh system logview and look for QEMU or KVM errors.
- Verify Config: Ensure /var/pxen/monoliths/<name>.conf exists and matches -w <name> (see Configuring Workloads).
- Resources: SSH to the node:
  mesh node ctl -n <uuid> "free -m; cat /proc/cpuinfo"
  Confirm RAM and CPU suffice (e.g., 500M RAM for qemutest1). The quotes matter: without them, the local shell splits the line at the semicolon and runs cat /proc/cpuinfo on the central node.
- Restart: Retry:
  mesh workload soft-stop -n <uuid> -w <name>
  mesh workload start -n <uuid> -w <name>
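The config check and the retry can be combined into one step. The sketch below uses only the two mesh workload commands shown in this section; the wrapper name restart_workload and its optional third argument are hypothetical.

```shell
# Hypothetical wrapper: confirm the workload config exists, then soft-stop
# and start the workload. The soft-stop result is not checked, so a workload
# that is not running can still be started; the return status is that of
# the start command.
restart_workload() {
    uuid="$1"
    name="$2"
    conf_dir="${3:-/var/pxen/monoliths}"
    if [ ! -f "$conf_dir/$name.conf" ]; then
        echo "missing config: $conf_dir/$name.conf" >&2
        return 1
    fi
    mesh workload soft-stop -n "$uuid" -w "$name"
    mesh workload start -n "$uuid" -w "$name"
}

# Usage:
#   restart_workload <uuid> <name>
```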
Network Issues
If a node’s IP or VXLAN isn’t working:
- Check IP: On the node:
  mesh node ctl -n <uuid> "ip addr"
  No static IP? Verify interfaces in manifest (see Network Configuration).
- VXLAN Bridge: Check bridge existence:
  mesh node ctl -n <uuid> "ip link show br456"
  Missing? Ensure /var/pxen/networks/manage.conf is installed.
- Ping Test: From the node:
  mesh node ctl -n <uuid> "ping6 -c 4 fd42:2345:1234:9abc::1"
  No reply? Check VXLAN config in /host0/network/.
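The three checks can be run in order, stopping at the first one that fails. This is a sketch under one important assumption: that mesh node ctl returns the exit status of the remote command. If it does not, read the command output by hand instead. The function name check_node_network is hypothetical, and the bridge name and address default to the examples used in this section.

```shell
# Hypothetical helper: run the IP, bridge, and ping checks in order.
# Assumption: `mesh node ctl` passes through the remote command's exit status.
# Return codes: 1 = ip addr failed, 2 = bridge missing, 3 = no ping reply.
check_node_network() {
    uuid="$1"
    bridge="${2:-br456}"
    target="${3:-fd42:2345:1234:9abc::1}"
    mesh node ctl -n "$uuid" "ip addr" \
        || { echo "ip addr failed" >&2; return 1; }
    mesh node ctl -n "$uuid" "ip link show $bridge" \
        || { echo "bridge $bridge missing" >&2; return 2; }
    mesh node ctl -n "$uuid" "ping6 -c 4 $target" \
        || { echo "no reply from $target" >&2; return 3; }
}

# Usage:
#   check_node_network <uuid>
```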
Time Sync Problems
If nodes show as offline in mesh node info:
- Check Time: On the node:
  mesh node ctl -n <uuid> "date"
  Off by hours? Time sync failed.
- Fix Chrony: Ensure the ntp_sync group is applied (e.g., in the groups file); see Configuring Nodes.
- Restart Chrony: On the node:
  mesh node ctl -n <uuid> "rc-service chronyd restart"
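To put a number on the clock difference rather than comparing date output by eye, the node's epoch time can be compared with the central node's. This is a minimal sketch, assuming mesh node ctl prints only the remote command's output. The function name node_clock_drift is hypothetical.

```shell
# Hypothetical helper: print the absolute clock difference, in seconds,
# between this machine and a node.
# Assumption: `mesh node ctl` prints only the remote command's output.
node_clock_drift() {
    uuid="$1"
    remote=$(mesh node ctl -n "$uuid" "date +%s") || return 1
    local_now=$(date +%s)
    drift=$((remote - local_now))
    if [ "$drift" -lt 0 ]; then
        drift=$((0 - drift))
    fi
    echo "$drift"
}

# Usage:
#   node_clock_drift <uuid>
```

A result in the thousands of seconds corresponds to the "off by hours" case described above. The figure includes the round-trip time of the remote command, so small values are not meaningful.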
Notes
- Logs are verbose—most errors trace back to permissions, network, or config mismatches.
- If stuck, rebuild configs with mesh system configure and reboot nodes.
- For manifest tweaks, see Manifest Files; for node control, see Managing Nodes.
Next, explore Advanced Topics.