Installing NUT on ESXi 4.1

The final battle in my quest for properly integrated UPS Monitoring was to get my ESXi 4.1 box into the mix.

Some rummaging around online turned up a NUT Client for ESXi 5 (French), which doesn’t work on ESXi 4. Fortunately the same guy has also done a NUT Client for ESXi 4 (French), which does work but takes some effort.

The tricky part is that the only way to customise an ESXi 4 install seems to be via oem.tgz, whereas ESXi 5 has the concept of “packages”. I broke my ESXi box several times figuring out how to actually do this properly, so I figure I might as well write it up for others (it’s also probably good to have a native English version rather than having to go through the peril of Google Translate).

Enable SSH access on your ESXi box;

  • Login to vSphere
  • Select host
  • Configuration -> Security Profile
  • Firewall -> Properties…
  • Remote Tech Support (SSH) -> Options…
  • Click Start

SSH into your ESXi box and do the following;

~ # cp /bootbank/* /altbootbank/ # not necessary but recommended, backs up your bootbank
~ # cp /bootbank/oem.tgz /bootbank/oem.tgz.orig
~ # cd /tmp
/tmp # mkdir oem
/tmp # cd oem
/tmp/oem # tar -zxf /bootbank/oem.tgz
/tmp/oem # cd ..
/tmp # wget http://rene.margar.fr/downloads/oem.tgz
Connecting to rene.margar.fr (88.166.248.92:80)
oem.tgz              100% |*******************************| 40871  00:00:00 ETA
/tmp # cd oem/
/tmp/oem # tar -zxf ../oem.tgz

Once that’s done you’ve got a “merged” tree of the existing oem.tgz and the NUT oem.tgz. The next step is to edit etc/ups/upsmon.conf to add a MONITOR clause for your UPS, and then update etc/ups/notify.sh if you want email notifications (if you don’t want notifications, just comment out the NOTIFYCMD line in upsmon.conf).
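For reference, a MONITOR clause takes the form "MONITOR system powervalue username password type". Here’s a rough sketch of a helper that prints one for a slave client. The function name, the username "esxi" and the password "secret" are made up for illustration; the real credentials have to match whatever you put in upsd.users on your NUT server.

```shell
# Hypothetical helper: print a MONITOR clause for upsmon.conf.
# Arguments: <ups@host> <username> <password>
# The power value is fixed at 1 and the type at "slave", matching the
# upsmon output shown later in this post.
monitor_line() {
    printf 'MONITOR %s 1 %s %s slave\n' "$1" "$2" "$3"
}

# Example usage (run from /tmp/oem; credentials are placeholders):
#   monitor_line compaq3kva@lodestone esxi secret >> etc/ups/upsmon.conf
```

Printing the line rather than editing the file in place keeps it easy to eyeball before it goes anywhere near the config.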

Once you’re happy with the config you need to re-pack the oem.tgz. The trick here (it took several attempts and one ESXi rebuild to get this right…) is that your .tgz must NOT have a leading ./ in the paths. The way to do this is pretty simple: list the top-level entries explicitly rather than archiving “.”;

/tmp/oem # tar -zcf /bootbank/oem.tgz bin/ etc/ lib/ oem.txt sbin/ usr/ var
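Before rebooting it’s probably worth checking the archive for the leading ./ problem. The following is a minimal sketch, assuming the tar on your box supports listing with -t (I’d expect it to, but treat that as an assumption); check_oem_paths is a hypothetical helper name, not something that ships with ESXi or NUT.

```shell
# Hypothetical helper: succeed only if no member of the archive has a
# path starting with "./".
check_oem_paths() {
    archive="$1"
    # List the archive first so a tar failure is not mistaken for "clean".
    listing=$(tar -tzf "$archive" 2>/dev/null) || {
        echo "ERROR: cannot read $archive" >&2
        return 2
    }
    if printf '%s\n' "$listing" | grep -q '^\./'; then
        echo "BAD: $archive contains paths with a leading ./" >&2
        return 1
    fi
    echo "OK: $archive has no leading ./ paths"
}

# Example usage:
#   check_oem_paths /bootbank/oem.tgz
```

If it reports BAD, re-pack from inside /tmp/oem with the explicit directory list rather than “.”.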

Now reboot, and if all’s well ESXi will boot up happily. If it doesn’t, consult the troubleshooting section below.

Once ESXi has booted up, SSH back in and run;

 ~ # upsmon
Network UPS Tools upsmon 2.4.3
UPS: compaq3kva@lodestone (slave) (power value 1)
Using power down flag file /etc/killpower

If you’re seeing that then it looks like you’re all good. Now, to test the shutdown script, issue the following;

 ~ # upsmon -c fsd

ESXi should shut itself down shortly thereafter.

If that all works then you’re done.

Troubleshooting
If your ESXi box pink screens, you probably cocked up the oem.tgz. At the loading screen press “Shift+O” and give the additional option “noOem”; this will prevent ESXi from loading the oem.tgz, and then you can SSH back in and try again.

If the above doesn’t work then you’ve REALLY screwed something up (like, say, when I DELETED the oem.tgz file… Protip: DON’T DO THAT). For that, “Shift+R” will switch you over to the “altbootbank” (actually it swaps the contents of bootbank and altbootbank), which will revert you to the original state (assuming you did the backup as suggested in the commands above). SSH in and try again.

If that doesn’t work you can boot the install CD, switch to a console with “Alt-F1” and copy the oem.tgz.orig file back to oem.tgz.

If that doesn’t work then you’ve done something REALLY special and I can’t help 😉

Morgan / 2016-11-09 / Uncategorized / 0 Comments

Adventures in UPS Management

Always one for “if it’s worth doing, it’s worth overdoing”, I use a rather oldschool 2RU rackmount 3kVA Compaq UPS (R3000h, OEM badged PowerWare 5119) for my network core, NAS, ESXi and workstation.

It was a cheap buy at ~$100 sans batteries, so I just threw four 7.2AH SLA batteries into it and called it good (I also have a much larger UPS, but it’s in need of a new battery pack, and given that its internal pack is *20* 7.2AH SLAs, it’s rather $$$ to replace).

For the first time in two and a half years since we bought this place we had a blackout yesterday morning…

The “pleasure” of having to go and manually shutdown all my machines at 05:30 was enough to FINALLY spur me into getting off my arse and actually setting up monitoring for the UPS.

The first challenge was building a cable that ACTUALLY BLOODY WORKED with this UPS (pro-tip; DO NOT plug a straight-through serial cable into a Compaq R3000h, the UPS will IMMEDIATELY and without warning power down…)

After various false starts, I FINALLY got a working cable up and running (found several different pinouts floating around online), the UPS speaks UPSCode2 and the cable to talk to it is as follows;

[Image: r3000h-cable]

Oddball comms parameters too: 1200 baud, 8-N-1, software flow control (XON/XOFF). If you’re interested, check out the UPSCode2 Concept Description, which includes all of the protocol info.

Spent some time arguing with Powerware LanSafe trying to get it to play ball, eventually gave up and went digging for other solutions. Initially I looked at apcupsd because I knew it existed, but that doesn’t support UPSCode2. Some further digging turned up Network UPS Tools, which fit the bill; as a bonus there are clients available for *nix, Windows and ESXi.

Now to choose a UPS server. I decided the best option was my NAS, which is a FreeBSD box and, as an added bonus, has a real, legit serial port onboard. I hooked the UPS up there and installed NUT from ports.

nut.conf;

MODE=netserver

ups.conf;

[compaq3kva]
 manufacturer=Compaq
 driver=upscode2
 port=/dev/ttyu0

Added a couple of LISTEN clauses to upsd.conf;

LISTEN 127.0.0.1 3493
LISTEN 192.168.0.120 3493

Added a couple of users to upsd.users.
Et voilà;

# upsc compaq3kva@lodestone
battery.capacity.nominal: 17.00
battery.charge: 95.0
battery.runtime: 2280
battery.voltage: 55.20
battery.voltage.maximum: 56.00
battery.voltage.minimum: 40.00
battery.voltage.nominal: 48.00
device.mfr: Compaq
device.model: UPS 3000 VA FW -0033
...
ups.mfr: Compaq
ups.model: UPS 3000 VA FW -0033
ups.power.nominal: 3000.00
ups.status: OL

So we’re all good there.
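For anyone wondering what the upsd.users entries mentioned above look like, here’s a rough sketch of a helper that prints one stanza. The helper name, the user name "esxi" and the password "secret" are placeholders I’ve made up for illustration; the "upsmon slave" line is what lets a remote upsmon log in as a slave.

```shell
# Hypothetical helper: print an upsd.users stanza.
# Arguments: <username> <password> <role>, where role is "master" or "slave".
upsd_user_stanza() {
    printf '[%s]\n\tpassword = %s\n\tupsmon %s\n' "$1" "$2" "$3"
}

# Example usage (placeholder credentials; append the output to upsd.users):
#   upsd_user_stanza esxi secret slave
```

Whatever name and password you pick here are what the clients need in their MONITOR clause.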

A few tweaks to upsmon.conf (mainly switching out -h for -p in SHUTDOWNCMD), and a quick test, and we’re golden.
Also enabled CGI on the webserver on that box to allow me to monitor the UPS;

[Image: UPS Stats from NUT]

I’ll probably enable SSL at some point but for the moment it works.
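As for the SHUTDOWNCMD tweak mentioned above, it can be sketched as a small filter. This assumes the stock line in your upsmon.conf reads something like SHUTDOWNCMD "/sbin/shutdown -h +0" (an assumption; check your own file), and use_poweroff is a hypothetical name. On FreeBSD, -p powers the box off rather than just halting it.

```shell
# Hypothetical filter: rewrite "-h" to "-p" on SHUTDOWNCMD lines only,
# reading upsmon.conf on stdin and writing the result to stdout.
use_poweroff() {
    sed '/^SHUTDOWNCMD/s/ -h / -p /'
}

# Example usage (write to a new file and review it before swapping it in):
#   use_poweroff < upsmon.conf > upsmon.conf.new
```

Every other line passes through untouched, so diffing the two files afterwards should show exactly one change.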

For my firewall box (pfSense) it was very straightforward to integrate, basically just installed the nut package and pointed it back at the NAS box.

Getting the ESXi integration to work was a bit of a pain in the arse, so I’ll write that up separately.

Next cab off the rank will be integrating my Windows box, which will be a little more involved because I also want to suspend any VMware VMs that happen to be running on it prior to shutdown (either some PowerShell + VMware vmrun voodoo, or I might write a quick’n’dirty app to do that), so that’ll get a separate write-up too.

Oh, and one more thing, had to add the following to /etc/devfs.conf to ensure that NUT had access to the serial port;

own    ttyu0    root:uucp
perm   ttyu0    0660

All in all it was a relatively painless process, the biggest issue being getting the right damn cable made…

Morgan / 2016-11-08 / Uncategorized / 0 Comments