If you're writing software that talks to vSphere or Proxmox VE, you've hit this wall: there's no easy way to develop against a real cluster on your laptop, and your CI pipeline can't reasonably spin up hardware for every PR. You end up either (a) testing manually against a shared lab cluster that breaks for everyone when one engineer typos a maintenance-mode toggle, or (b) writing only unit tests with hand-crafted JSON fixtures that pass forever while production silently regresses on the wire format.
There's a third option: in-process simulators that speak the real APIs. This post covers the two we use to build OpIntel — vcsim for vSphere and mock-pve-api for Proxmox — including real bugs each one has surfaced and the gotchas you'll trip over.
The short version:

- vSphere: simulator.VPX() from govmomi. Free, fast, in-process, no Docker. Scale up via model.Datacenter/Cluster/Host/Machine.
- Proxmox: ghcr.io/jrjsmrtn/mock-pve-api. Docker image, two default nodes, create VMs/CTs over the API, expect a few endpoint gaps and add fallbacks.
- Gate the integration suites behind -tags=integration so unit tests stay fast.

vcsim is a fully-featured vCenter simulator built into the govmomi Go SDK. It speaks SOAP, hosts a self-signed TLS endpoint, simulates the property collector, and supports power-on, snapshots, vMotion, alarms, performance counters, and most of the vim25 surface area. It's the same library every Go-based vSphere tool already imports, so adding the simulator is just an import of .../simulator.
The minimal setup is a handful of lines:

```go
model := simulator.VPX() // VPX = vCenter; ESX = standalone host
model.Create()
server := model.Service.NewServer()
defer server.Close()
// server.URL is now https://user:pass@127.0.0.1:<random-port>/sdk
```
But simulator.VPX() defaults are tiny (1 DC, 1 cluster, 2 hosts). To populate dashboards or stress-test inventory walks, override the model:
```go
model := simulator.VPX()
model.Datacenter = 5   // 5 datacenters
model.Cluster = 12     // 12 clusters per DC
model.Host = 5         // 5 hosts per cluster
model.Machine = 60     // 60 VMs per cluster
model.Datastore = 5
model.Autostart = true // power on VMs at create time
```
That's 5 DCs × 12 clusters × 60 VMs = 3,600 VMs across 300 hosts, all returning QuickStats and PerfCounter data. On a recent laptop, model creation takes ~5 seconds; the resulting in-memory state happily serves a full 3,600-VM collector cycle in under 15 seconds.
If you don't want to write any glue code, govmomi also ships vcsim as a standalone binary with the same model knobs as CLI flags:
```sh
go install github.com/vmware/govmomi/vcsim@latest
vcsim -dc 5 -cluster 12 -host 5 -vm 60 -autostart -l 0.0.0.0:8989
# → export GOVC_URL=https://user:pass@127.0.0.1:8989/sdk GOVC_INSECURE=true …
```
Or run it without installing:
```sh
go run github.com/vmware/govmomi/vcsim@latest -dc 2 -cluster 4 -vm 40
```
The standalone binary doesn't power on VMs by default. If you want a more realistic mixed running/stopped state for dashboards, write a 30-line Go wrapper around simulator.VPX() that calls vm.PowerOn(ctx) on a percentage of guests after model.Create() — pattern is the same as the inline snippet above.
Operations behave like the real thing: vm.PowerOn(ctx), host.EnterMaintenanceMode(ctx), and the like all return real Task objects you can Wait() on, which is good for testing your task-tracking logic. One gotcha we hit: host.summary.config.product.fullName came back empty, breaking inventory display in any tool that relied on it.

For wiring vcsim into Go tests, two patterns work well:
```go
// Pattern A: per-test simulator (best for isolated unit-style tests)
func TestVMPowerOn(t *testing.T) {
	simulator.Test(func(ctx context.Context, c *vim25.Client) {
		finder := find.NewFinder(c)
		vm, err := finder.VirtualMachine(ctx, "DC0_C0_RP0_VM0")
		if err != nil {
			t.Fatal(err)
		}
		task, err := vm.PowerOn(ctx)
		if err != nil {
			t.Fatal(err)
		}
		if err := task.Wait(ctx); err != nil {
			t.Fatal(err)
		}
	})
}
```
```go
// Pattern B: shared simulator via TestMain (best for integration suites
// that exercise the same large inventory across many tests)
var sharedURL string

func TestMain(m *testing.M) {
	model := simulator.VPX()
	model.Cluster = 8
	if err := model.Create(); err != nil {
		panic(err)
	}
	s := model.Service.NewServer()
	sharedURL = s.URL.String()
	code := m.Run()
	s.Close()
	model.Remove() // deferred cleanup wouldn't run past os.Exit
	os.Exit(code)
}
```
The Proxmox ecosystem doesn't have a govmomi-style first-party simulator, but ghcr.io/jrjsmrtn/mock-pve-api is the de facto community option. It's a Python image that responds to a useful subset of the PVE 8.x REST API: nodes, storage, qemu, lxc, snapshots, migrate, backup jobs, firewall rules, SDN zones, cluster resources.
```sh
docker run --rm -d --name mock-pve -p 8006:8006 ghcr.io/jrjsmrtn/mock-pve-api:latest
curl -sk https://127.0.0.1:8006/api2/json/version
# {"data":{"version":"8.3","release":"8.3","keyboard":"en-us","repoid":"f123456d"}}
```
The mock ships with two nodes (pve-node1, pve-node2) and zero guests, but you can POST /nodes/pve-node1/qemu and POST /nodes/pve-node1/lxc to create VMs and containers in its in-memory state. Snapshots, power ops, migration, backup-create — they all return realistic UPIDs.
One auth quirk: the mock authenticates with a ticket Cookie instead of Authorization: PVEAPIToken=…. Tasks, on the other hand, behave realistically: they come back with endtime/exitstatus set, so your WaitForTask polling logic actually terminates.

When we wired mock-pve-api into the OpIntel test suite, two production bugs surfaced on the first run:
- Wrong token header format. We were sending Authorization: PVE:user@realm!tokenid=secret. Real PVE expects Authorization: PVEAPIToken=user@realm!tokenid=secret. Token auth was completely broken in production; ticket auth happened to work, so nobody noticed.
- Wrong type for nodeStatus.LoadAvg. PVE returns load averages as strings (["0.15", "0.08", "0.01"]); the mock returned floats. Our struct typed it as []float64, so real PVE failed to parse. A custom UnmarshalJSON fixed both shapes.

We later added the same pattern for PBS (an in-process httptest server instead of Docker, since there's no maintained PBS mock) and surfaced an analogous bug in the PBS auth header path.
The gaps to know about:

- /cluster/resources doesn't aggregate guests. Real PVE returns every node + qemu + lxc + storage in one call; the mock only returns nodes and SDN entries. If your collector treats /cluster/resources as the source of truth for inventory, you'll see zero VMs against the mock even after creating them. Fix: fall back to per-node /nodes/{n}/qemu and /nodes/{n}/lxc enumeration when /cluster/resources returns nodes but no guests. (We added this to OpIntel; it doubles as defensive code for pre-7.x PVE.)
- No uptime on guests. The mock omits the uptime field from /status/current, so inventory views that infer power state from uptime show everything as "off" even after POST .../status/start. Workaround: have your collector emit a power_state tag derived from status (running → poweredOn, stopped → poweredOff).
- State is in-memory only, so treat each container start (and each TestMain boot) as a fresh cluster.

Both simulators run easily in CI, but you don't want them in every go test ./.... Gate them:
```go
//go:build integration

package proxmox

import (
	"os"
	"os/exec"
	"testing"
)

func TestMain(m *testing.M) {
	if _, err := exec.LookPath("docker"); err != nil {
		os.Exit(0) // skip the whole suite when docker isn't available
	}
	// start mock-pve-api on a free port, wait for /version,
	// run m.Run(), tear down the container
}
```
make test-proxmox-integration runs them locally; a separate CI job runs them on every PR. The default unit suite stays fast and Docker-free.
Beyond tests, both sims work as drop-in dev infrastructure. A typical stack:
```sh
# Terminal 1 — vSphere
vcsim -dc 5 -cluster 12 -host 5 -vm 60 -autostart -l 0.0.0.0:8989

# Terminal 2 — Proxmox
docker run --rm -p 8006:8006 ghcr.io/jrjsmrtn/mock-pve-api:latest

# Terminal 3 — your app, pointed at both
export VSPHERE_URL=https://127.0.0.1:8989/sdk VSPHERE_USER=user VSPHERE_PASSWORD=pass
export PROXMOX_URL=https://127.0.0.1:8006 PROXMOX_USER=root@pam PROXMOX_PASSWORD=secret
./your-collector
```
Two terminals plus your binary, and you have a multi-DC vSphere cluster plus a two-node Proxmox cluster on localhost. New contributors can be running the full pipeline in a minute, with no VMware ELA, no Proxmox subscription, and no shared lab to break for everyone else.
Sims will never replace having at least one staging cluster. The right mental model: sims catch wire-format and integration bugs; the staging cluster catches behavior bugs. Use both, and don't pretend either one is sufficient on its own.