Create Reproducible Builds of Development VMs with VirtualBox and Packer

Virtual machines have long been a popular choice for setting up local development environments, particularly for dev-prod parity of complex applications with one or more datastores and sometimes dozens of microservices. VirtualBox is one of the most popular VM providers: it is free, open source, and has strong cross-platform support for Linux, Windows, and macOS (at least on Intel CPUs; there is no M1 support at the time of this writing, and unfortunately it seems doubtful that there ever will be).

HashiCorp's Packer is a useful tool for defining reproducible builds for machine images in code, either JSON or the domain-specific HashiCorp Configuration Language (HCL). Packer has support for dozens of builders that allow you to define how to build your image in multiple VM image formats or cloud providers from one source file. This post will walk through an example of building a VirtualBox image of Fedora Server from scratch, using the Packer VirtualBox builder.

Prerequisites

Download and install VirtualBox and Packer for your machine. For VirtualBox, this can vary depending on your operating system, so follow the official guides for your host OS. As previously mentioned, if you're on an Apple Silicon/M1 machine, you're unfortunately out of luck for using VirtualBox as of this writing. Currently, using Packer to build images of other operating systems on M1 hardware requires the pro version of Parallels, which costs $99/year.

Packer, being written in Go, is a single binary executable that should Just Work™.

Workspace Setup

I typically set up a Packer project with one directory per build template. Packer allows you to source variables from a separate file to keep things tidy or split builds for different environments. Packer will generate a packer_cache folder and a user-configurable output folder in the directory in which it was run, so it's good to keep those separate per build as well (and be sure to add them to any .gitignore file for a version-controlled project!).

Create a packer folder in the root of a project, with at least a README or any other documentation, plus any common configuration files or scripts, then a templates subfolder with folders for all the different builds inside of that, e.g.

├── packer
│   ├── README.md
│   ├── Makefile
│   └── templates
│       ├── fedora-server
│       │   ├── fedora34.pkrvars.hcl
│       │   ├── fedora-server.pkr.hcl
│       │   └── http
│       │       └── ks-fedora.cfg
│       ├── fedora-coreos
│       │   └── fedora-coreos.pkr.hcl
│       ├── postgres
│       │   ├── pg13.pkrvars.hcl
│       │   ├── pg12.pkrvars.hcl
│       │   └── postgres.pkr.hcl
│       └── solr
│           ├── solr8.8.pkrvars.hcl
│           └── solr.pkr.hcl
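
If you'd rather script the layout than create it by hand, here is a minimal sketch that scaffolds just the fedora-server part of the tree and ignores Packer's generated folders. It assumes you run it from the project root, and that the output folder is named build (the default of the build_directory variable used later in this post); adjust both to taste.

```shell
# Scaffold the fedora-server template folder, including the http directory
# that will hold the Kickstart file.
mkdir -p packer/templates/fedora-server/http

# Create empty placeholder files to fill in later.
touch packer/README.md
touch packer/templates/fedora-server/fedora-server.pkr.hcl
touch packer/templates/fedora-server/fedora34.pkrvars.hcl
touch packer/templates/fedora-server/http/ks-fedora.cfg

# Keep Packer's generated folders out of version control.
# "build/" is an assumption based on the build_directory default ("./build").
printf '%s\n' 'packer_cache/' 'build/' >> .gitignore
```

The trailing slashes in the .gitignore patterns match those folders at any depth, so one entry covers every template directory.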

Packer Templates

As mentioned, defining your Packer builds as code can be done in JSON or HCL. If you're starting a brand new project, you should definitely opt for HCL. Not having to hand-write JSON would be a good-enough reason for me, but HCL is generally cleaner, allows comments, can do variable substitution in strings (among other rudimentary programming conventions like loops and conditionals) and is used across the HashiCorp suite of tools like Terraform.

Let's look at what an HCL file for a Fedora Server VirtualBox build could look like:

fedora-server.pkr.hcl


variable "build_directory" {
    default = "./build"
}

variable "boot_wait" {
    default = "10s"
}

variable "cpus" {
    default = 1
}

variable "disk_size" {
    default = 50000
}

variable "headless" {
    default = false
}

variable "http_directory" {
    default = "./http"
}

variable "iso_url" {
    type = string
}

variable "iso_checksum" {
    type = string
}

variable "kickstart_file" {
    type = string
}

variable "memory" {
    default = 1024
}

variable "template" {
    type = string
}

variable "username" {
    default = "vagrant"
}

variable "vm_name" {
    type = string
}

variable "provider_name" {
    default = "virtualbox"
}

variable "ssh_timeout" {
    default = "45m"
}

source "virtualbox-iso" "fedora_server" {

    boot_command = [
        "<up><tab><wait>",
        " inst.ks=http://{{ .HTTPIP }}:{{ .HTTPPort }}/${var.kickstart_file}<enter>",
    ]

    boot_wait            = var.boot_wait
    cpus                 = var.cpus
    disk_size            = var.disk_size
    guest_os_type        = "Fedora_64"
    hard_drive_interface = "sata"
    headless             = var.headless
    http_directory       = var.http_directory
    iso_url              = var.iso_url
    iso_checksum         = var.iso_checksum
    memory               = var.memory
    output_directory     = "${var.build_directory}/packer-${var.template}-${var.provider_name}"
    shutdown_command     = "echo '${var.username}' | sudo -S shutdown -P now"
    ssh_timeout          = var.ssh_timeout
    ssh_username         = var.username
    ssh_password         = var.username
    vm_name              = var.vm_name
}

build {
    sources = ["sources.virtualbox-iso.fedora_server"]
}

All HCL variables must be declared, along with their type or default value. The default type is string, so you could just have an empty object for string variables with no default, but I always err on the side of being explicit over implicit. The actual values for the variables can be passed in one of several ways when invoking the packer command: from a separate variable file (specified via -var-file=/path/to/file.pkrvars.hcl) or one by one with -var key=value. You could also specify them as environment variables, prefixed with PKR_VAR_. For example, to specify the value of the kickstart_file variable you could set an environment variable of PKR_VAR_kickstart_file=/path/to/file. This is one way to set sensitive variables like passwords and API keys.
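
As a rough sketch of the environment-variable route combined with a variable file: the snippet below exports one value, shows what Packer would see, and then runs packer validate (which accepts the same -var and -var-file flags as packer build). The guard around the packer call is my own addition so the snippet is harmless to paste on a machine without Packer or outside the template directory.

```shell
# Supply vm_name through the environment instead of -var.
export PKR_VAR_vm_name="fedora-34-x86_64"

# Show every PKR_VAR_ value currently exported.
env | grep '^PKR_VAR_'

# Check the template with the remaining values coming from the variable file.
# Skipped when Packer is not installed or the template is not in this directory.
if command -v packer >/dev/null 2>&1 && [ -f fedora-server.pkr.hcl ]; then
    packer validate -var-file=fedora34.pkrvars.hcl fedora-server.pkr.hcl
fi
```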

The source block configures your builder of choice. The keys and values will depend on the builder as listed in the Packer docs. Regardless of builder, variables can be referred to via var.variable_name. Variables can be used in strings with the ${var.variable_name} syntax, and the entire string must be quoted in that case, as in the output_directory and shutdown_command values above.

The build block tells Packer how to build the image when you run packer build, mostly by passing an array of your source blocks. The build block is also where you could specify one or more provisioners. This is typically where I would write an Ansible playbook to provision the image, but in this example we're going to use a Kickstart file to build a base image to use for all subsequent Ansible-provisioned images.

Kickstart

The Red Hat family of operating systems (including Fedora) can perform automatic unattended installations via Kickstart. Basically, you can create a configuration file that can script things like disk partitioning, networking setup, user creation, package installations, etc., without human input. Here's an example:

ks-fedora.cfg

lang en_US.UTF-8
keyboard us
rootpw --lock

url --mirrorlist=http://mirrors.fedoraproject.org/mirrorlist?repo=fedora-$releasever&arch=$basearch

network --bootproto=dhcp
firewall --disabled
selinux --permissive

timezone UTC

text
skipx

bootloader --location=mbr --append="net.ifnames=0 biosdevname=0"
clearpart --all
zerombr
part biosboot --size=1 --fstype=biosboot
part /boot --size=500 --fstype=xfs
part / --grow --fstype=xfs

firstboot --disabled
reboot --eject

user --name=vagrant --password=vagrant

%packages --excludedocs

@core
bzip2
curl
deltarpm
kernel-devel
kernel-headers
make
net-tools
nfs-utils
rsync
sudo
tar
wget
-plymouth
-plymouth-core-libs
-fedora-release-notes
-mcelog
-smartmontools
-usbutils
-man-pages

%end

%post

echo 'Defaults:vagrant !requiretty' > /etc/sudoers.d/vagrant
echo '%vagrant ALL=(ALL) NOPASSWD: ALL' >> /etc/sudoers.d/vagrant
chmod 440 /etc/sudoers.d/vagrant

update-crypto-policies --set LEGACY

systemctl enable sshd.service

%end

I won't go into the nitty-gritty of everything here (consult the Red Hat docs for the syntax reference) but basically this creates a vagrant user with sudo privileges, installs some packages, and makes the OS accessible over SSH so that Packer can communicate with it while creating the image.

To get VirtualBox to use this Kickstart file when booting the OS, we can leverage a nifty feature of Packer builders: they can serve content (like our Kickstart file) from an HTTP server that Packer starts up when running the build. The directory to serve is set by the http_directory variable in our fedora-server.pkr.hcl file, and it is where we place the ks-fedora.cfg Kickstart file (refer to the Workspace Setup section above). The second line of the boot_command setting in fedora-server.pkr.hcl tells the Fedora installer to fetch the Kickstart file from that HTTP server.

Building the Image

I previously mentioned that Packer allows you to source variables from a separate file. We can do this for our Fedora 34 build for variables that are specific to that version of Fedora:

fedora34.pkrvars.hcl

iso_url="https://download.fedoraproject.org/pub/fedora/linux/releases/34/Server/x86_64/iso/Fedora-Server-netinst-x86_64-34-1.2.iso"
iso_checksum="sha256:e1a38b9faa62f793ad4561b308c31f32876cfaaee94457a7a9108aaddaeec406"
kickstart_file="ks-fedora.cfg"
template="fedora-34-x86_64"

All of these variables are declared in our fedora-server.pkr.hcl file but we're defining them in a separate file for Fedora 34. Obviously the next version of Fedora will have a different iso_url and iso_checksum, and we may want to use a different kickstart_file depending on circumstances.
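
When you bump to a new ISO you'll need a matching iso_checksum line. Here's a small sketch of a helper (pkr_checksum is a name I made up) that prints the line in the sha256: form used above. It assumes the ISO is already downloaded and that sha256sum from GNU coreutils is available; on macOS you would likely swap in shasum -a 256.

```shell
# Print an iso_checksum line for the ISO given as the first argument.
pkr_checksum() {
    # sha256sum prints "<hash>  <file>"; keep only the hash.
    printf 'iso_checksum="sha256:%s"\n' "$(sha256sum "$1" | cut -d' ' -f1)"
}

# Example, assuming the ISO from iso_url sits in the current directory.
iso_file="Fedora-Server-netinst-x86_64-34-1.2.iso"
if [ -f "$iso_file" ]; then
    pkr_checksum "$iso_file"
fi
```

This only tells you what you downloaded, so compare the result against the checksum Fedora publishes for the release before pasting it into a variable file.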

Now you should have everything you need to build a base Fedora Server image using packer build, such as:

$ packer build --force \
    -var-file=fedora34.pkrvars.hcl \
    -var 'vm_name=fedora-34-x86_64' \
    -var 'provider_name=virtualbox' \
    fedora-server.pkr.hcl

This will kick off the build process: Packer downloads the Fedora ISO, launches VirtualBox (you should see a VirtualBox window pop up showing the console output of the automated install), and packages the build as VirtualBox .ovf and .vmdk files (in the build directory). You should be able to start this VM with VirtualBox and (depending on your VirtualBox networking setup) SSH into it with vagrant/vagrant.
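
If you prefer the command line to the VirtualBox GUI, something like the following sketch should import and boot the result. The path is pieced together from the output_directory expression and the variable values used in this post, and I'm assuming the imported VM keeps the fedora-34-x86_64 name; check your build folder if yours differs.

```shell
# Rebuild the appliance path from the values passed to packer build.
build_dir="./build"
template="fedora-34-x86_64"
provider="virtualbox"
vm_name="fedora-34-x86_64"
ovf="${build_dir}/packer-${template}-${provider}/${vm_name}.ovf"
echo "Expecting appliance at: ${ovf}"

# Import and start the VM only when VirtualBox and the build output exist.
if command -v VBoxManage >/dev/null 2>&1 && [ -f "${ovf}" ]; then
    VBoxManage import "${ovf}"
    VBoxManage startvm "${vm_name}" --type headless
fi
```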

Speaking of Vagrant, you could use a Packer post-processor to create a Vagrant box out of this VirtualBox build. Add the post-processor to the build block in fedora-server.pkr.hcl:

build {
    sources = ["sources.virtualbox-iso.fedora_server"]

    post-processor "vagrant" {
      keep_input_artifact = true
    }

}

And re-run the packer build command from above. You should now have a packer_fedora_server_virtualbox.box file in the same directory. You could add this box to Vagrant with:

$ vagrant box add --name my-fedora-box packer_fedora_server_virtualbox.box

You could name it anything you want with --name, but whatever name you choose will now be available to use as the config.vm.box name in your Vagrantfile.

That's a wrap for this introductory post. I plan to add more about creating Ansible playbooks to create more boxes out of this base box and orchestrating them together in your local environment as well as creating and using production builds for various cloud providers.


2021-06-14T13:58:07-05:00