Create Reproducible Builds of Development VMs with VirtualBox and Packer
Virtual machines have long been a popular choice for setting up local development environments, particularly for dev-prod parity of complex applications with one or more datastores and sometimes dozens of microservices. VirtualBox has long been one of the most popular VM providers: it is free, open-source, and has strong cross-platform support for Linux, Windows, and macOS (at least on Intel CPUs; there is no M1 support at the time of this writing, and unfortunately it seems doubtful that there ever will be).
HashiCorp's Packer is a useful tool for defining reproducible builds for machine images in code, either JSON or the domain-specific HashiCorp Configuration Language (HCL). Packer has support for dozens of builders that allow you to define how to build your image in multiple VM image formats or cloud providers from one source file. This post will walk through an example of building a VirtualBox image of Fedora Server from scratch, using the Packer VirtualBox builder.
Prerequisites
Download and install VirtualBox and Packer for your machine. For VirtualBox, this can vary depending on your operating system, so follow the official guides for your host OS. As previously mentioned, if you're on an Apple Silicon/M1 machine, you're unfortunately out of luck for using VirtualBox as of this writing. Currently, using Packer to build images of other operating systems on M1 hardware requires the pro version of Parallels, which costs $99/year.
Packer, being written in Go, is a single binary executable that should Just Work™.
Workspace Setup
I typically set up a Packer project with one directory per build template. Packer allows you to source variables from a separate file to keep things tidy or split builds for different environments. Packer will generate a packer_cache folder and a user-configurable output folder in the directory in which it was run, so it's good to keep those separate per build as well (and be sure to add them to any .gitignore file for a version-controlled project!).
Create a packer folder in the root of a project, with at least a README or any other documentation, plus any common configuration files or scripts, then a templates subfolder with folders for all the different builds inside of that, e.g.:
├── packer
│   ├── README.md
│   ├── Makefile
│   └── templates
│       ├── fedora-server
│       │   ├── fedora34.pkrvars.hcl
│       │   ├── fedora-server.pkr.hcl
│       │   └── http
│       │       └── ks-fedora.cfg
│       ├── fedora-coreos
│       │   └── fedora-coreos.pkr.hcl
│       ├── postgres
│       │   ├── pg13.pkrvars.hcl
│       │   ├── pg12.pkrvars.hcl
│       │   └── postgres.pkr.hcl
│       └── solr
│           ├── solr8.8.pkrvars.hcl
│           └── solr.pkr.hcl
Packer Templates
As mentioned, defining your Packer builds as code can be done in JSON or HCL. If you're starting a brand new project, you should definitely opt for HCL. Not having to hand-write JSON would be a good-enough reason for me, but HCL is generally cleaner, allows comments, can do variable substitution in strings (among other rudimentary programming conventions like loops and conditionals) and is used across the HashiCorp suite of tools like Terraform.
Let's look at what an HCL file for a Fedora Server VirtualBox build could look like:
fedora-server.pkr.hcl
variable "build_directory" {
  default = "./build"
}

variable "boot_wait" {
  default = "10s"
}

variable "cpus" {
  default = 1
}

variable "disk_size" {
  default = 50000
}

variable "headless" {
  default = false
}

variable "http_directory" {
  default = "./http"
}

variable "iso_url" {
  type = string
}

variable "iso_checksum" {
  type = string
}

variable "kickstart_file" {
  type = string
}

variable "memory" {
  default = 1024
}

variable "template" {
  type = string
}

variable "username" {
  default = "vagrant"
}

variable "vm_name" {
  type = string
}

variable "provider_name" {
  default = "virtualbox"
}

variable "ssh_timeout" {
  default = "45m"
}

source "virtualbox-iso" "fedora_server" {
  boot_command = [
    "<up><tab><wait>",
    " inst.ks=http://{{ .HTTPIP }}:{{ .HTTPPort }}/${var.kickstart_file}<enter>",
  ]
  boot_wait            = var.boot_wait
  cpus                 = var.cpus
  disk_size            = var.disk_size
  guest_os_type        = "Fedora_64"
  hard_drive_interface = "sata"
  headless             = var.headless
  http_directory       = var.http_directory
  iso_url              = var.iso_url
  iso_checksum         = var.iso_checksum
  memory               = var.memory
  output_directory     = "${var.build_directory}/packer-${var.template}-${var.provider_name}"
  shutdown_command     = "echo '${var.username}' | sudo -S shutdown -P now"
  ssh_timeout          = var.ssh_timeout
  ssh_username         = var.username
  ssh_password         = var.username
  vm_name              = var.vm_name
}

build {
  sources = ["sources.virtualbox-iso.fedora_server"]
}
All HCL variables must be declared, along with their type or default value. The default type is string, so you could just have an empty block for string variables with no default, but I always err on the side of being explicit over implicit. The actual values for the variables can be passed in one of several ways when invoking the packer command: from a separate variable file (specified via -var-file=/path/to/file.pkrvars.hcl), one-by-one with -var key=value, or as environment variables prefixed with PKR_VAR_. For example, to specify the value of the kickstart_file variable you could set an environment variable of PKR_VAR_kickstart_file=/path/to/file. This is one way to set sensitive variables like passwords and API keys.
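To make the "explicit over implicit" point concrete, here is a minimal sketch of more fully specified declarations. The ssh_password variable is hypothetical (the template above simply reuses username as the password), and the sensitive argument assumes a reasonably recent Packer release:

```hcl
# Sketch only: an explicit type and description document intent.
variable "memory" {
  type        = number
  default     = 1024
  description = "RAM for the build VM, in MB."
}

# Hypothetical variable, not part of the template above.
variable "ssh_password" {
  type      = string
  sensitive = true # asks Packer to obfuscate the value in its output
  # No default on purpose: supply it via PKR_VAR_ssh_password,
  # -var, or a -var-file at build time.
}
```

Leaving out the default turns the variable into a required input, so a build that forgets to supply it should fail early instead of silently using a placeholder.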
The source block configures your builder of choice. The keys and values depend on the builder, as listed in the Packer docs. Regardless of builder, variables can be referred to via var.variable_name. Variables can also be interpolated into strings with the ${var.variable_name} syntax, in which case the entire string must be quoted, as in the output_directory and shutdown_command values above.
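If the same interpolated string is needed in more than one place, one option is to compute it once in a locals block and refer to it as local.name. The fragment below is a sketch meant to slot into the template above; the local name output_directory is my own choice, not something the original template defines:

```hcl
# Sketch only: "output_directory" is a hypothetical local name.
locals {
  # Built once from the input variables declared in the template.
  output_directory = "${var.build_directory}/packer-${var.template}-${var.provider_name}"
}

source "virtualbox-iso" "fedora_server" {
  # ... all other arguments as in the template above ...
  output_directory = local.output_directory
}
```

Unlike variables, locals cannot be overridden from the command line, which makes them a reasonable home for derived values.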
The build block tells Packer how to build the image when you run packer build, mostly by passing an array of your source blocks. The build block is also where you could specify one or more provisioners. This is typically where I would write an Ansible playbook to provision the image, but in this example we're going to use a Kickstart file to build a base image to use for all subsequent Ansible-provisioned images.
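For reference, a provisioner is just a nested block inside build. The sketch below uses Packer's shell provisioner with a single illustrative command; the dnf upgrade step is an assumption for demonstration and is not part of this post's build:

```hcl
build {
  sources = ["sources.virtualbox-iso.fedora_server"]

  # Runs inside the guest over SSH once the OS install has finished.
  # The command is illustrative only; swap in whatever setup you need.
  provisioner "shell" {
    inline = [
      "sudo dnf -y upgrade",
    ]
  }
}
```

Provisioners run in the order they are declared, so several smaller blocks can be chained where a full playbook would be overkill.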
Kickstart
The Red Hat family of operating systems (including Fedora) can perform automatic unattended installations via Kickstart. Basically, you can create a configuration file that can script things like disk partitioning, networking setup, user creation, package installations, etc., without human input. Here's an example:
ks-fedora.cfg
lang en_US.UTF-8
keyboard us
rootpw --lock
url --mirrorlist=http://mirrors.fedoraproject.org/mirrorlist?repo=fedora-$releasever&arch=$basearch
network --bootproto=dhcp
firewall --disabled
selinux --permissive
timezone UTC
text
skipx
bootloader --location=mbr --append="net.ifnames=0 biosdevname=0"
clearpart --all
zerombr
part biosboot --size=1 --fstype=biosboot
part /boot --size=500 --fstype=xfs
part / --grow --fstype=xfs
firstboot --disabled
reboot --eject
user --name=vagrant --password=vagrant
%packages --excludedocs
@core
bzip2
curl
deltarpm
kernel-devel
kernel-headers
make
net-tools
nfs-utils
rsync
sudo
tar
wget
-plymouth
-plymouth-core-libs
-fedora-release-notes
-mcelog
-smartmontools
-usbutils
-man-pages
%end
%post
echo 'Defaults:vagrant !requiretty' > /etc/sudoers.d/vagrant
echo '%vagrant ALL=(ALL) NOPASSWD: ALL' >> /etc/sudoers.d/vagrant
chmod 440 /etc/sudoers.d/vagrant
update-crypto-policies --set LEGACY
systemctl enable sshd.service
%end
I won't go into the nitty-gritty of everything here (consult the Red Hat docs for the syntax reference), but basically this creates a vagrant user with sudo privileges, installs some packages, and makes the OS accessible over SSH so that Packer can communicate with it while creating the image.
To get VirtualBox to use this Kickstart file when booting the OS, we can leverage a nifty feature of Packer builders: they can serve content (like our Kickstart file) from an HTTP server that Packer starts up when running the build. This is the http_directory variable in our fedora-server.pkr.hcl file, which points to the folder where we place the ks-fedora.cfg Kickstart file (refer to the Workspace Setup section above). The second line of the boot_command setting in our fedora-server.pkr.hcl tells the Fedora installer to fetch the Kickstart file over HTTP from Packer.
Building the Image
I previously mentioned that Packer allows you to source variables from a separate file. We can do this for our Fedora 34 build for variables that are specific to that version of Fedora:
fedora34.pkrvars.hcl
iso_url="https://download.fedoraproject.org/pub/fedora/linux/releases/34/Server/x86_64/iso/Fedora-Server-netinst-x86_64-34-1.2.iso"
iso_checksum="sha256:e1a38b9faa62f793ad4561b308c31f32876cfaaee94457a7a9108aaddaeec406"
kickstart_file="ks-fedora.cfg"
template="fedora-34-x86_64"
All of these variables are declared in our fedora-server.pkr.hcl file, but we're defining them in a separate file for Fedora 34. Obviously the next version of Fedora will have a different iso_url and iso_checksum, and we may want to use a different kickstart_file depending on circumstances.
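The same mechanism works for settings that have nothing to do with the Fedora version. As a sketch, a second variables file could override the hardware defaults declared in the template; the file name dev-large.pkrvars.hcl and the values are made up for illustration:

```hcl
# dev-large.pkrvars.hcl (hypothetical file name and values)
# Overrides defaults declared in fedora-server.pkr.hcl.
cpus     = 2
memory   = 4096
headless = true
```

The -var-file flag can be repeated, so a file like this can be passed alongside fedora34.pkrvars.hcl in the same packer build invocation.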
Now you should have everything you need to build a base Fedora Server image using packer build, such as:
$ packer build --force \
    -var-file=fedora34.pkrvars.hcl \
    -var 'vm_name=fedora-34-x86_64' \
    -var 'provider_name=virtualbox' \
    fedora-server.pkr.hcl
This will kick off the build process: Packer downloads the Fedora ISO, launches VirtualBox (you should see a VirtualBox window pop up showing the console output of the automated install), and packages the build as VirtualBox .ovf and .vmdk files (in the build directory). You should be able to start this VM with VirtualBox and (depending on your VirtualBox networking setup) SSH into it with vagrant/vagrant.
Speaking of Vagrant, you could use a Packer post-processor to create a Vagrant box out of this VirtualBox build. Add the post-processor to the build block in fedora-server.pkr.hcl:
build {
  sources = ["sources.virtualbox-iso.fedora_server"]

  post-processor "vagrant" {
    keep_input_artifact = true
  }
}
Then re-run the packer build command from above. You should now have a packer_fedora_server_virtualbox.box file in the same directory. You could add this box to Vagrant with:
$ vagrant box add --name my-fedora-box packer_fedora_server_virtualbox.box
You could name it anything you want with --name, but whatever name you choose will then be available to use as the config.vm.box name in your Vagrantfile.
That's a wrap for this introductory post. I plan to add more about creating Ansible playbooks to create more boxes out of this base box and orchestrating them together in your local environment as well as creating and using production builds for various cloud providers.