Compact Server Definitions with Archiso

Code: https://git.henryhoff.org/archiso/

Full System Backups are Wasteful

Shortly after my servers were functional, I decided that they should be easily reproducible so I could switch hosting services or bring them back up after drive failures. I initially kept my servers reproducible by making full system backups, but they take up more space than other types of records. Smaller records are possible because some state on servers can be represented in a more efficient way. For example, a package could be represented by its name and version rather than its contents. For most servers, the following state can be represented more efficiently:

Package contents
Disk formatting
Initramfs
Bootloader

NixOS was one system that managed state with a small set of configuration files. I liked the design of NixOS, but the implementation was time-consuming and complicated to deal with. Docker could make reproducible images from a Dockerfile, but I didn't want to be stuck with it just for its build system. I decided to use archiso because it's minimal and simple to use. Archiso uses the following process to build a bootable ISO file:

Make a new root filesystem
Install packages into the root filesystem from packages.x86_64
Copy the directory structure and files from airootfs/ into the root filesystem
Compress the root filesystem into a read-only disk image with squashfs or erofs
Make a new bootable ISO 9660 filesystem
Install a bootloader into the ISO filesystem
Copy the read-only root filesystem image into the ISO filesystem
Add code to the initramfs to:
1. Mount the read-only root filesystem
2. Mount overlayfs on top of the root filesystem
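
The last step of the list above can be sketched as the mount commands an initramfs hook might run. This is only an illustration, not the code Archiso ships: the paths /run/lower, /run/scratch and /new_root and the function name are hypothetical. The overlay's upper and work directories sit on a tmpfs, which is what makes the writable layer non-persistent.

```python
def overlay_mount_commands(image, lower="/run/lower",
                           scratch="/run/scratch", merged="/new_root"):
    """Return the commands that stack a throwaway overlay on a read-only image.

    All default paths are hypothetical placeholders.
    """
    upper = f"{scratch}/upper"
    work = f"{scratch}/work"
    overlay_opts = f"lowerdir={lower},upperdir={upper},workdir={work}"
    return [
        # 1. Mount the read-only root filesystem image.
        ["mount", "-o", "loop,ro", image, lower],
        # 2. Back the writable layer with RAM so it vanishes on reboot.
        ["mount", "-t", "tmpfs", "tmpfs", scratch],
        # upperdir and workdir must live on the same filesystem.
        ["mkdir", "-p", upper, work],
        # 3. Combine both layers into the root the system will switch to.
        ["mount", "-t", "overlay", "overlay", "-o", overlay_opts, merged],
    ]


if __name__ == "__main__":
    for command in overlay_mount_commands("/run/boot/airootfs.sfs"):
        print(" ".join(command))
```

The function only builds the command lists, so it can be inspected without root privileges.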

Using a non-persistent overlayfs over root requires me to explicitly define what state to keep. This also prevents the server's state from deviating too far from the original generated state. Archiso was practical, but full determinism was not one of its goals, so I modified and extended it. At some point I might write my own version of Archiso.

Pinning Package Versions

NixOS has a robust system to control package versions and avoid breakage. The experimental Nix Flakes system combines fine-grained package control with an easy way to control the versions of most packages at once. The main file flake.nix defines a list of packages to install and optionally pins package sources to specific revisions. Packages that are not pinned to revisions in the main file get pinned in flake.lock. The pins for most packages can be updated at once with nix flake lock --update-input nixpkgs, which refreshes the nixpkgs input in flake.lock, and rolled back with a version control system.

The Arch Linux Archive makes it possible to emulate these careful updates. The archive stores the state of package repositories for each day in the last two years. Most packages are pinned by directing pacman to a package repository from a specific date. I update packages by advancing the date from which pacman reads a repository, and I roll back packages by moving the date back.
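
As a minimal sketch of date pinning, the function below builds the mirrorlist line that points pacman at one day's snapshot of the archive. The function name and the example date are hypothetical; the URL layout is the one the Arch Linux Archive uses for its daily repository snapshots.

```python
from datetime import date

# Base URL of the Arch Linux Archive's daily repository snapshots.
ARCHIVE_REPOS = "https://archive.archlinux.org/repos"


def archive_server_line(snapshot: date) -> str:
    """Return a pacman mirrorlist line pinned to one day's snapshot.

    $repo and $arch are left literal for pacman to expand.
    """
    return f"Server = {ARCHIVE_REPOS}/{snapshot:%Y/%m/%d}/$repo/os/$arch"


if __name__ == "__main__":
    # Hypothetical pin date: advancing it updates, moving it back rolls back.
    print(archive_server_line(date(2023, 1, 15)))
```

Keeping the date in one place under version control means an update or a rollback is a one-line diff.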

Sometimes this system is not sufficient to make sure all packages work. Packages can have bugs, or, more commonly, they can be incompatible with new versions of their dependencies. I pin individual problematic packages by adding their names and versions to pinned.lock. A script downloads any versioned packages listed in pinned.lock from the Arch Linux Archive and places them in a package database. When my system adds packages to a server image, pacman first looks in this database for package pins.
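
A rough sketch of how individual pins could be resolved to download URLs follows. The pinned.lock format assumed here (one "name version" pair per line, with # comments) is a guess for illustration and may not match the real file. The default architecture and file extension are also assumptions, since some packages are built for "any" and older ones use a different compression suffix.

```python
# Base URL of the per-package section of the Arch Linux Archive.
ARCHIVE_PACKAGES = "https://archive.archlinux.org/packages"


def parse_pins(text: str) -> dict[str, str]:
    """Parse a hypothetical pinned.lock: one 'name version' pair per line."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, version = line.split()
        pins[name] = version
    return pins


def archive_package_url(name: str, version: str,
                        arch: str = "x86_64", ext: str = "pkg.tar.zst") -> str:
    """Build the archive URL for one package file.

    The archive groups packages by the first character of their name.
    """
    return f"{ARCHIVE_PACKAGES}/{name[0]}/{name}/{name}-{version}-{arch}.{ext}"


if __name__ == "__main__":
    example = "# hypothetical pins\nopenssl 3.0.7-4\n"
    for pkg, ver in parse_pins(example).items():
        print(archive_package_url(pkg, ver))
```

The downloaded files could then be added to a local repository that is listed before the dated repositories in pacman's configuration, so that pacman finds the pins first.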

Ensuring Records are Reproducible

Full system backups take up more space than the descriptions that Archiso builds from, but they reliably represent a working state of the server. Without care, my system will produce different images at different times or fail to build images at all. The easiest way to make sure a new image is functionally identical to an old one is to make sure they're completely identical by comparing hashes. Variation in build dependency versions causes the main differences in server images, so I build my images in a clean chroot with pinned packages. I set the SOURCE_DATE_EPOCH environment variable to the timestamp of the latest git commit so most programs don't add nondeterministic timestamps to their output. The last known source of variability is in pacman recording the time it installed each package. I am waiting for the pacman maintainers to decide whether they will respect SOURCE_DATE_EPOCH or make me take up the issue with the Archiso maintainers.
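
The two checks described above can be sketched as follows: deriving SOURCE_DATE_EPOCH from the latest commit and comparing two built images by hash. The function names are hypothetical, and the actual build command is left out because it depends on the modified Archiso scripts.

```python
import hashlib
import os
import subprocess


def latest_commit_epoch(repo_dir: str) -> str:
    """Return the committer timestamp of HEAD as a Unix epoch string."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", "-1", "--format=%ct"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_env(epoch: str) -> dict[str, str]:
    """Copy the current environment and pin SOURCE_DATE_EPOCH in the copy."""
    env = dict(os.environ)
    env["SOURCE_DATE_EPOCH"] = epoch
    return env


def sha256_of(path: str) -> str:
    """Hash a file in 1 MiB chunks so large images don't need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def images_identical(first: str, second: str) -> bool:
    """Two builds count as reproducible only if their hashes match exactly."""
    return sha256_of(first) == sha256_of(second)
```

A build wrapper could pass build_env(latest_commit_epoch(".")) as the env argument of subprocess.run, build twice, and fail if images_identical returns False.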

Building and Installing Server Images

The archiso.sh script modifies ISO files from Archiso into complete server images. First, the script converts disk images from ISO 9660 filesystems to ext4 filesystems and reinstalls syslinux. The script also makes directories in the root of the filesystem to store log files and operating data from the server. The server's fstab file includes the following mappings to make some directories persistent:

Runtime directory	Directory on ext4 filesystem
/var/spool/mail	/state/mail
/var/git	/state/git
/var/html	/state/html
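
One way to express the mappings in the table above is as bind-mount entries in fstab. That the real fstab uses bind mounts is an assumption, as is the mount point of the ext4 filesystem (/run/persist below is a placeholder). The sketch only generates the lines.

```python
# Runtime directory -> directory on the ext4 filesystem (from the table).
PERSISTENT_DIRS = {
    "/var/spool/mail": "/state/mail",
    "/var/git": "/state/git",
    "/var/html": "/state/html",
}


def fstab_bind_lines(mappings: dict[str, str], ext4_mount: str = "") -> list[str]:
    """Return one fstab bind-mount line per persistent directory.

    ext4_mount is the (assumed) path where the ext4 filesystem is mounted.
    """
    lines = []
    for runtime_dir, stored_dir in mappings.items():
        source = ext4_mount.rstrip("/") + stored_dir
        # fstab fields: source, target, type, options, dump, pass
        lines.append(f"{source} {runtime_dir} none bind 0 0")
    return lines


if __name__ == "__main__":
    print("\n".join(fstab_bind_lines(PERSISTENT_DIRS, "/run/persist")))
```

Anything not listed stays on the overlay and is discarded at reboot.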

For lack of any strong opinions about the update process, I referred to NixOS's use of generations to design my own update process. The command nixos-rebuild encapsulates the system configuration into a generation that can be loaded during the boot process. NixOS users usually keep old generations around so they always have at least one generation that is known to work. This makes it easy to roll back servers without rebuilding generations.

My system keeps old server images on the ext4 filesystem in /imgs/. The initramfs uses a symbolic link to determine which server image to load. However, this system is not robust because if I can't get a shell in the default server image, I won't be able to change the symlink and reboot. On personal computers, a boot menu is an ideal place to select a different image, but on my servers I can't access boot menus. Some bootloaders might provide the equivalent of a boot menu over a serial console. For now, I will hope that Archiso does not create such a broken image that this would be necessary.
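
Switching the default image can at least be made atomic, so an interrupted update never leaves the directory without a valid link. The sketch below assumes the link is called "current" and lives next to the images, which may not match the real layout. It does not solve the problem of an image that boots but gives no shell.

```python
import os


def select_image(imgs_dir: str, image_name: str, link_name: str = "current") -> None:
    """Atomically point the default-image symlink at image_name.

    link_name is a hypothetical name for the symlink the initramfs reads.
    """
    target = os.path.join(imgs_dir, image_name)
    if not os.path.exists(target):
        raise FileNotFoundError(target)

    link = os.path.join(imgs_dir, link_name)
    temporary = link + ".tmp"
    if os.path.lexists(temporary):
        os.remove(temporary)  # clean up after an earlier interrupted switch

    # Create the new link beside the old one, then rename it into place.
    # rename() replaces the old link in a single step on POSIX systems.
    os.symlink(image_name, temporary)
    os.replace(temporary, link)


if __name__ == "__main__":
    # Hypothetical image name; rolling back is the same call with an older one.
    select_image("/imgs", "server-2023-01-15.img")
```

The link target is relative, so it stays valid wherever the ext4 filesystem is mounted.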

Next Steps

Compiling packages

I should consider the security improvements I get from compiling code myself, and why it is or isn't easy to keep it automatic and deterministic. I would like to avoid bootstrapping GCC.

Using ZFS or a similar filesystem

State should be backed up, and with filesystem snapshots nothing breaks when files change during a backup. ZFS also supports compression, deduplication, encryption, etc. Maybe some of these features would be useful.

Improve Performance

Compressed root filesystems are okay, but I have to rebuild them every time a file changes. Maybe ZFS could help with this. I could also make changes by directly modifying the root filesystem of the last build instead of rebuilding the whole image. I would have to deviate from Archiso.

Add ARM support

All the decent cheap servers have ARM processors, but they all seem to have a different nonstandard way to boot into a working system. I would have to decide on what hardware my main server should run on, then figure out some code to install a system specifically for that hardware. The following are my current candidates:

Raspberry Pi 4

I already own one
No SATA or PCIe, only SD cards
There is a mandatory proprietary bootloader
Arch Linux ARM provides no details about the installation process besides "extract this tarball into root"
The boot process should still be pretty simple to figure out

Jetson TX2

I already own one
The boot process is a mess
The hardware is much faster than the other candidates
The power draw is higher

Pine64 Quartz64

Open hardware, open software
I would have to buy one for about $60
The boot process is simple and pretty well documented
The company seems to have good intentions; I wouldn't mind supporting it

Using it for my servers

I have a bunch of configuration files to port over, but that should be pretty easy.