Build a highly available Proxmox VE cluster consisting of:
* 2 hypervisor nodes
* 1 external QDevice (quorum server)
* shared storage (e.g. NFS)
The goal is a stable cluster with quorum and failover capability.
—
* Node A (PVE)
* Node B (PVE)
* QDevice (separate system, e.g. Raspberry Pi)
* Shared Storage (NFS or equivalent)
—
* Time synchronization (NTP)
* Consistent name resolution (DNS or /etc/hosts)
* Static IP addresses (no changing SLAAC addresses for cluster communication)
* Reliable network connectivity between all systems
* SSH access between nodes and QDevice (key-based)
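Consistent name resolution can be checked before any cluster command is run. The following is a minimal sketch, not part of Proxmox: `check_name_resolution` is a hypothetical helper, and the host names and addresses in the demo are placeholders. By default it uses the system resolver (IPv4 only, via `socket.gethostbyname`); the resolver functions can be swapped out, which the demo uses to stay offline.

```python
import socket
from typing import Callable


def check_name_resolution(
    hostname: str,
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], str] = lambda ip: socket.gethostbyaddr(ip)[0],
) -> bool:
    """Return True if hostname -> IP -> hostname round-trips to the same host."""
    ip = forward(hostname)
    resolved = reverse(ip)
    # Compare only the short host name so "pve-a" matches "pve-a.example.lan".
    return resolved.split(".")[0].lower() == hostname.split(".")[0].lower()


# Offline demo with placeholder records standing in for DNS or /etc/hosts.
forward_table = {"pve-a": "192.0.2.11", "pve-b": "192.0.2.12"}
reverse_table = {"192.0.2.11": "pve-a.example.lan", "192.0.2.12": "old-name.example.lan"}

for name in forward_table:
    ok = check_name_resolution(name, forward_table.__getitem__, reverse_table.__getitem__)
    print(name, "consistent" if ok else "INCONSISTENT")
```

On a real system, calling `check_name_resolution("pve-a")` on each node and on the QDevice uses the live resolver; the result should be the same on all three systems.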
—
* Mount shared storage on both nodes
* Configure storage in Proxmox
* Ensure access from both nodes
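To confirm that the share is really mounted on a node, the contents of `/proc/mounts` can be inspected. This is a minimal sketch under assumptions: the helper names are hypothetical, the storage ID `shared` and the server address are placeholders, and it relies on Proxmox mounting NFS storages under `/mnt/pve/<storage-id>`.

```python
from typing import List


def nfs_mount_points(mounts_text: str) -> List[str]:
    """Return the mount points whose filesystem type is nfs or nfs4."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fstype, options, dump, pass
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            points.append(fields[1])
    return points


def storage_is_mounted(mounts_text: str, storage_id: str) -> bool:
    """Check for an NFS mount at /mnt/pve/<storage_id> (assumed mount location)."""
    return f"/mnt/pve/{storage_id}" in nfs_mount_points(mounts_text)


# Placeholder sample; on a node, read the real file instead:
#   mounts_text = open("/proc/mounts").read()
sample = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "192.0.2.10:/export/pve /mnt/pve/shared nfs4 rw,relatime 0 0\n"
)
print(storage_is_mounted(sample, "shared"))
```

Running the same check on both nodes covers the "access from both nodes" requirement at the mount level; it does not test write permissions.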
—
* Create cluster on Node A
* Join Node B to the cluster
—
* Set up the QDevice system
* Install and start corosync-qnetd
—
* Add QDevice from the cluster
* Initialize certificate-based communication
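The cluster creation, QDevice preparation and QDevice integration steps can be sketched as an ordered command plan. This is a dry-run sketch that only prints commands and executes nothing. `build_cluster_plan` and the labels `node-a`, `node-b`, `qdevice` are hypothetical; the cluster name and addresses are placeholders. The commands follow the `pvecm` and package names as I understand them from the Proxmox VE documentation and should be verified against the installed version.

```python
import shlex
from typing import List, Tuple

Step = Tuple[str, List[str]]  # (system to run on, command as argv)


def build_cluster_plan(cluster_name: str, node_a_ip: str, qdevice_ip: str) -> List[Step]:
    """Return the setup commands in the order they should be run."""
    return [
        ("node-a", ["pvecm", "create", cluster_name]),
        # Run on the joining node, pointing at an existing cluster member.
        ("node-b", ["pvecm", "add", node_a_ip]),
        # The external quorum server needs the qnetd daemon.
        ("qdevice", ["apt", "install", "corosync-qnetd"]),
        # Both cluster nodes need the qdevice client package.
        ("node-a", ["apt", "install", "corosync-qdevice"]),
        ("node-b", ["apt", "install", "corosync-qdevice"]),
        # Run once from one cluster node; this also sets up the certificates.
        ("node-a", ["pvecm", "qdevice", "setup", qdevice_ip]),
        ("node-a", ["pvecm", "status"]),
    ]


for host, argv in build_cluster_plan("demo-cluster", "192.0.2.11", "192.0.2.20"):
    print(f"[{host}] {shlex.join(argv)}")
```

The ordering is the point of the sketch: the `corosync-qdevice` package is installed on both nodes before `pvecm qdevice setup` is called, since a missing package on a node is one of the failure cases listed further down.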
—
* Move or restore VMs to shared storage
* Verify functionality
—
* Define HA groups
* Assign VMs to groups
* Configure failover behavior
—
* Node already contains VMs or old cluster state
* Solution: clean node before joining
—
* Missing packages on nodes (e.g. corosync-qdevice)
* Incomplete initial setup
—
* QDevice logs:
SSL peer cannot verify your certificate
* Cause:
* Solution:
—
* NSS DB not accessible
* Cause:
* Solution:
—
* Root login via password disabled
* Key-based authentication not working
* Solution:
—
* Asymmetric routing
* Wrong interface selection
* Solution:
—
* Inconsistent forward/reverse resolution
* Solution:
—
* Logs show repeated disconnects
* Cause:
—
* Use static IP addresses for cluster communication
* Use dedicated network for Corosync and migration
* Validate shared storage before cluster setup
* Run QDevice on independent infrastructure
* Perform certificate setup cleanly and only once
—
* Cluster with quorum (2 nodes + QDevice)
* HA-capable environment
* VMs can restart on surviving node after failure
—
A 2-node cluster without a QDevice cannot tolerate a node failure: the surviving node holds only 1 of 2 votes and loses quorum. A QDevice is required for stable operation.
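The vote arithmetic behind this can be worked through in a few lines. This is a minimal sketch assuming the default of one vote per node and one vote for the QDevice, with quorum defined as a strict majority of all expected votes. The function names are hypothetical.

```python
def quorum_threshold(total_votes: int) -> int:
    """Smallest number of votes that is a strict majority of total_votes."""
    return total_votes // 2 + 1


def is_quorate(votes_present: int, total_votes: int) -> bool:
    """True if the votes still reachable form a majority."""
    return votes_present >= quorum_threshold(total_votes)


# Two nodes, no QDevice: 2 votes expected, 2 needed. One node down leaves 1.
print(is_quorate(votes_present=1, total_votes=2))  # False

# Two nodes plus QDevice: 3 votes expected, 2 needed. One node down leaves 2.
print(is_quorate(votes_present=2, total_votes=3))  # True
```

The same arithmetic shows why the QDevice should run on independent infrastructure: if it fails together with one node, only 1 of 3 votes remains and the survivor is not quorate.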
—