As a follow-on to “Creating a Datastore on your ESXi USB Boot Drive”, I’m now going to go through the process of setting up a “self-hosted” filer to support the actual VMs. I’m mainly doing this to save me having to either remember what I did or figure it out again next time ;)
This is for my home lab, whereas the “USB Boot Drive” process was developed for another box. In this case I’m fortunate enough to have spare controllers and bays in my home box (that SC836 I picked up on eBay), so I’m running off a 60GB SSD, which makes it less of a pain in the arse, but the same process applies in either case.
I’m using Alpine Linux for this because it’s pretty lightweight and so far has treated me well (it was also the only distro I tried in which the drivers for the storage controller in the first box worked 100% every time). That being said this process will work on the distribution of your choice with minor modifications (package names may be different and Alpine uses OpenRC).
The first thing you need to do is to create a new VM on your “boot” datastore, and ideally pass through your storage controller to it (though if you must you can pass individual disks through). Then install Alpine (or your preferred distro) on it. I’m not going to bore you with the details, partly because I use a bootserver to install everything so your process will probably differ, but mostly because I’m lazy…
The ZFS Bit
As noted above, I’m using Alpine here, first up we’ll install our ZFS packages;
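Something along these lines should do it. Treat it as a sketch: `zfs` is the userland package, and `zfs-lts` is my assumption for the kernel module package matching a stock LTS kernel, so pick whichever one matches your kernel flavour.

```shell
# Refresh the package index, then install the ZFS userland tools
# plus the kernel module package for the running kernel flavour.
# ASSUMPTION: zfs-lts matches a stock "lts" kernel; check `uname -r`
# and swap in the matching package (e.g. zfs-virt) if yours differs.
apk update
apk add zfs zfs-lts
```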
Load the ZFS module at boot (and load it now);
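A minimal sketch, assuming the stock Alpine arrangement where the modules listed in /etc/modules are loaded at boot:

```shell
# Have the module loaded on every boot...
echo zfs >> /etc/modules
# ...and load it right now so we don't need to reboot first.
modprobe zfs
```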
Enable and start the ZFS-Import and ZFS-Mount services;
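Roughly like so with OpenRC. The service names are the ones shipped by the Alpine zfs package as far as I know; the runlevel is an assumption (with no runlevel given, rc-update uses “default”, and some guides put these in an earlier runlevel instead).

```shell
# Enable the import and mount services at boot.
# ASSUMPTION: no runlevel argument, so they land in "default".
rc-update add zfs-import
rc-update add zfs-mount

# Start them now.
rc-service zfs-import start
rc-service zfs-mount start
```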
And now configure your zpool as your fancy takes you (in this case I’m using two-way mirrors across four disks, which is essentially RAID10, i.e. a stripe of mirrors);
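For a striped pair of two-way mirrors that would look something like the following. The device names are hypothetical, so substitute your own (persistent names under /dev/disk/by-id are generally the safer choice).

```shell
# Two mirror vdevs; ZFS stripes across them (RAID10-style).
# ASSUMPTION: /dev/sda../dev/sdd are the four data disks.
zpool create VMStore mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd

# Check the layout and health of the new pool.
zpool status VMStore
```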
If all’s well now you should have the VMStore pool mounted at /VMStore and it should get automagically imported and mounted at reboot (feel free to reboot to make sure).
The iSCSI Bit
We’re going to use targetcli here. I don’t particularly LIKE targetcli: it’s very dependency heavy and requires dbus (ewwww), so it’s likely I’ll rebuild this box using FreeBSD (unfortunately not an option for the original box because the FreeBSD drivers for the storage controller are even worse than the Linux drivers)…
First thing you’ll need to do is enable the “community” repository: edit /etc/apk/repositories and uncomment the “community” repository line (the first one, usually on the third line; you probably don’t want “edge”).
Once that’s done you can install targetcli (I’m also including some dependencies here which for some reason aren’t picked up by the package);
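As a sketch, with only the main package shown. I’m assuming the package is simply called `targetcli` in the community repository; the extra dependencies mentioned above aren’t listed here, so add whatever targetcli complains about on your install.

```shell
# Pick up the newly enabled "community" repository.
apk update

# ASSUMPTION: package name is "targetcli"; any extra dependencies
# your install turns out to need go on the same line.
apk add targetcli
```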
At this point you should probably also install open-vm-tools so your host can gracefully shut down your filer.
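That one’s a single package on Alpine:

```shell
# VMware guest tools, so ESXi can ask the filer to shut down cleanly.
apk add open-vm-tools
```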
Then enable and start the necessary services;
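A minimal sketch of that step. The service names here (dbus, targetcli, open-vm-tools) are my assumption of what the Alpine packages install, so check /etc/init.d if one of them isn’t found.

```shell
# Enable each service at boot and start it now.
# ASSUMPTION: these are the init script names on your install.
for svc in dbus targetcli open-vm-tools; do
    rc-update add "$svc"
    rc-service "$svc" start
done
```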
Don’t worry about the two messages for the moment.
In theory you ought to be able to create your backing file from within targetcli but it’s never worked for me, so instead I manually create a sparse file of the appropriate size using dd (1800G here). If you’re not dedicating a filesystem to the backing store (and you should), you’ll be better off creating a solid file, as a sparse file will get fragmented (for solid, lose the count= parameter and set your block size to something sensible like bs=32k or bs=1M);
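Here’s the sparse-file trick wrapped in a small helper as a sketch. The function name and the path in the usage comment are hypothetical; the dd idiom itself is the usual one, where count=0 copies nothing and seek= pushes the end of the file out to the requested size.

```shell
# make_sparse PATH SIZE
# Create PATH as a sparse file with an apparent size of SIZE.
# count=0 copies no blocks; seek= (in units of bs=1 byte) moves the
# end of the file out to SIZE, so nothing is allocated up front.
make_sparse() {
    dd if=/dev/zero of="$1" bs=1 count=0 seek="$2" 2>/dev/null
}

# Hypothetical usage on the filer (the path is an assumption):
#   make_sparse /VMStore/iscsi/VMBucket.img 1800G
```

Comparing `ls -l` (apparent size) with `du -h` (allocated size) on the result is a quick way to confirm the file really is sparse.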
Now we’ll go ahead and configure targetcli, first step is to configure your backing store;
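As a one-liner from the system shell it looks roughly like this (the same `create` command works from inside the targetcli shell after `cd /backstores/fileio`). The file path is an assumption, so point it at wherever you created your backing file.

```shell
# Register the backing file as a fileio backstore named VMBucket.
# ASSUMPTION: the backing file lives at /VMStore/iscsi/VMBucket.img.
targetcli /backstores/fileio create VMBucket /VMStore/iscsi/VMBucket.img
```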
VMBucket is the symbolic name of this backstore within targetcli; it’s largely arbitrary.
Some notes on iSCSI target naming
According to the RFC it’s supposed to be iqn.YYYY-MM.tld.domain[.subdomain]:<unique name>
Where tld.domain[.subdomain] is a domain in reverse order and YYYY-MM is a year and month during which you (the naming authority) owned that domain. For my money this is stupidly arbitrary and I can see no sane reason for it, BUT some stuff actually validates against this ridiculous pattern so you’ll want to make yours at least look like that.
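If you want a quick sanity check that a name at least looks the part, a loose shell test like this will do. It’s a sketch of the shape described above, not a full implementation of the RFC’s rules, and the function name is made up for illustration.

```shell
# is_iqn_like NAME
# Succeeds if NAME looks like iqn.YYYY-MM.tld.domain[.sub][:unique].
# Loose check only: lowercase reversed domain with at least two
# labels, a month of 01-12, and an optional ":<unique name>" suffix.
is_iqn_like() {
    printf '%s\n' "$1" |
        grep -Eq '^iqn\.[0-9]{4}-(0[1-9]|1[0-2])\.[a-z0-9-]+(\.[a-z0-9-]+)+(:.+)?$'
}

# Hypothetical usage:
#   is_iqn_like "iqn.2019-06.com.example:vmbucket" && echo "looks fine"
```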
Now let’s go and actually configure it.
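Creating the target is a single command. The IQN below is a made-up example on a documentation domain, so use your own.

```shell
# ASSUMPTION: example IQN; substitute one based on your own domain.
IQN=iqn.2019-06.com.example.filer:vmbucket

# Create the target; targetcli sets up tpg1 underneath it.
targetcli /iscsi create "$IQN"
```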
If you only want it to listen on specific IP addresses you can traverse down to /iscsi/<your target>/tpg1/portals, remove the 0.0.0.0 portal, and add one on your preferred IP.
Now to create a LUN on your target;
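A sketch of swapping the portal, assuming the wildcard portal exists on the default port 3260. Both the IQN and the 192.0.2.10 address are placeholders.

```shell
# ASSUMPTION: example IQN and listen address; substitute your own.
IQN=iqn.2019-06.com.example.filer:vmbucket

# Drop the listen-on-everything portal...
targetcli "/iscsi/$IQN/tpg1/portals" delete 0.0.0.0 3260
# ...and listen on one specific address instead.
targetcli "/iscsi/$IQN/tpg1/portals" create 192.0.2.10 3260
```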
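Something like this, assuming the backstore is called VMBucket as above and using a placeholder IQN:

```shell
# ASSUMPTION: example IQN; substitute your own.
IQN=iqn.2019-06.com.example.filer:vmbucket

# Export the VMBucket fileio backstore as a LUN on the target.
targetcli "/iscsi/$IQN/tpg1/luns" create /backstores/fileio/VMBucket
```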
And finally you need to configure ACLs. Here I’m just allowing full access without authentication; you probably ought to consider how to do this “properly” within your environment;
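One way to get that wide-open behaviour is targetcli’s “demo mode” attributes on the TPG, sketched here with a placeholder IQN. This lets any initiator that can reach the portal read and write the LUN, so only do it on a network you trust.

```shell
# ASSUMPTION: example IQN; substitute your own.
IQN=iqn.2019-06.com.example.filer:vmbucket

# No CHAP, auto-generate ACLs for any initiator, and allow writes.
targetcli "/iscsi/$IQN/tpg1" set attribute \
    authentication=0 \
    generate_node_acls=1 \
    cache_dynamic_acls=1 \
    demo_mode_write_protect=0

# Write the running configuration out so it survives a reboot.
targetcli saveconfig
```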
For sanity checking purposes here’s the output from ls in my targetcli config;
You should now have an iSCSI target which will persist after a reboot.
The ESXi Bit
Now to actually configure that iSCSI target on ESXi. Sadly this is not as easy as it ought to be: it seems that in at least ESXi 6.7.0u2 (though I feel like I’ve seen the same issue on earlier versions) you can’t fully configure the Software iSCSI adapter through the Web UI (once the adapter is enabled, if you select it again in the Web UI you’ll get an unrecoverable exception pop up), so we have to resort to esxcli…
Now if all’s well, looking in your Web UI you should see your new iSCSI device under “Storage->Devices”.
[Screenshot: ESXi Storage]
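The esxcli side goes roughly like this, run on the ESXi host over SSH or from the ESXi shell. The adapter name (vmhba64) and the filer address are assumptions, so take the real adapter name from the `adapter list` output.

```shell
# Enable the software iSCSI adapter.
esxcli iscsi software set --enabled=true

# Find the name of the software iSCSI adapter (vmhbaNN).
esxcli iscsi adapter list

# ASSUMPTION: adapter is vmhba64 and the filer is at 192.0.2.10:3260.
# Point dynamic (SendTargets) discovery at the filer.
esxcli iscsi adapter discovery sendtarget add --adapter=vmhba64 --address=192.0.2.10:3260

# Re-run discovery and rescan the adapter for new devices.
esxcli iscsi adapter discovery rediscover --adapter=vmhba64
esxcli storage core adapter rescan --adapter=vmhba64
```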
The “Status” of “Normal, Degraded” is due to the iSCSI target not being multi-pathed, feel free to make it multi-pathed but since all the networking is internal to the box I don’t really see the point…
Proceed to create your Datastore as you would for any local device and you’re done…
Well, almost: in the current state autoboot WON’T WORK (well, at least for any box after the filer). This article’s getting a bit long, so I’ll discuss that in the next post.
Addendum
At some point along the line it seems some incompatibility/breakage has been introduced into the targetcli codebase (or perhaps just the Alpine version). It manifests as a failure to restore configuration at startup or when using restoreconfig from the targetcli shell. If you’re impacted you’ll get this error (and targetcli will whinge at startup):
To fix it you can apply the following patch (note this is for Python 3.8; the same mods should work with different versions, assuming of course they have the issue):