Fakecracker: NetBSD as a Function Based MicroVM

In November 2018, AWS published an open source tool called Firecracker: a virtual machine monitor relying on KVM that boots a small Linux kernel with a minimal device model standing in for a full Qemu. What amazed me was the speed at which the virtual machine would fire up and run the service. The whole process is comparable to a container, but safer: it shares neither the kernel nor any other resource with the host, since it is a separate, dedicated virtual machine.
If you want to learn more about Firecracker’s internals, here’s a very well-written article.

I liked this idea, and I thought NetBSD would be a perfect match for this kind of target, as the kernel and the entire OS can be stripped down easily. I know this because in 2016 I wrote a wannabe-container project called sailor, whose goal is to create container-like chroots that run services “à la” Docker. I use this project for my own needs, and the very website you are reading right now is actually running on a sailor ship.

This is fun and all, but a chroot alone doesn’t really provide security, and the Firecracker approach seemed about right for this matter. Yet the overall NetBSD boot process was a bit too long for my taste.

So how exactly can we significantly improve NetBSD’s boot speed? Well, there are two major wins:

  1. Reduce the number of kernel features
  2. Bypass the bootloader

The first point is pretty easy to accomplish; here’s a minimal kernel config:

include "arch/i386/conf/std.i386"

makeoptions	COPTS="-Os"
makeoptions	USE_SSP="no"

maxusers	8		# estimated number of users

options 	CONSDEVNAME="\"com\""
options		MULTIBOOT
options 	RTC_OFFSET=0	# hardware clock is this many mins. west of GMT
options 	PIPE_SOCKETPAIR	# smaller, but slower pipe(2)
include 	"conf/compat_netbsd60.config"
file-system	FFS		# UFS
file-system	KERNFS		# /kern
options 	FFS_NO_SNAPSHOT	# No FFS snapshot support
options 	INET		# IP + ICMP + TCP + UDP
config		netbsd	root on ? type ?
options 	WSEMUL_VT100		# VT100 / VT220 emulation
options 	WSDISPLAY_COMPAT_SYSCONS	# emulate some ioctls

pci*	at mainbus? bus ?
isa0	at mainbus?
pcdisplay0	at isa?			# CGA, MDA, EGA, HGA
wsdisplay*	at pcdisplay? console ?
com0	at isa? port 0x3f8 irq 4	# Standard PC serial ports
cinclude "arch/i386/conf/GENERIC.local"
pseudo-device	fss			# file system snapshot device
pseudo-device	vnd			# disk-like interface to files
pseudo-device	bpfilter		# Berkeley packet filter
pseudo-device	loop			# network loopback
pseudo-device	tun			# network tunneling over tty
pseudo-device	pty			# pseudo-terminals
pseudo-device	clockctl		# user control of clock subsystem
pseudo-device	wsmux			# mouse & keyboard multiplexor
pseudo-device	ksyms

virtio* at pci? dev ? function ?        # Virtio PCI device
ld* at virtio?                          # Virtio disk device
vioif* at virtio?                       # Virtio network device

Note that we absolutely need the virtio drivers, as VirtIO is the fastest way of handling devices in a virtualized environment.
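
Since a kernel missing any of these three lines would boot but find neither its disk nor its network card, a quick sanity check on the config file can save a rebuild. Here is a minimal sketch; `has_virtio_devices` is a hypothetical helper name of mine, not part of NetBSD's tooling, and it only looks for the three attachment lines shown above.

```shell
# has_virtio_devices CONFIG_FILE
# Hypothetical helper: succeeds only if the kernel config attaches
# the virtio bus, the ld disk and the vioif network interface.
has_virtio_devices() {
    conf=$1
    grep -Eq '^virtio\*[[:space:]]+at[[:space:]]+pci' "$conf" &&
    grep -Eq '^ld\*[[:space:]]+at[[:space:]]+virtio' "$conf" &&
    grep -Eq '^vioif\*[[:space:]]+at[[:space:]]+virtio' "$conf"
}
```

For example, `has_virtio_devices sys/arch/i386/conf/FIRECRACKER && echo ok`, assuming the config above was saved under that name.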

Now for the second point, which is not as simple as it seems. The kernel must be loadable directly, i.e. without a boot loader. In NetBSD, this is possible on the i386 port thanks to the MULTIBOOT kernel option; unfortunately, as of today (18/06/2020), this option is not supported by the amd64 port (though it is in the works). So the rest of this article assumes we’re working with i386, which is a bit frustrating yet not really a problem, as almost all packages are available for this platform as well.
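
To get a feel for what MULTIBOOT changes, you can look for the Multiboot header in a kernel image. The Multiboot (version 1) specification requires the magic value 0x1BADB002 to appear, 4-byte aligned, within the first 8192 bytes of the file. The sketch below relies on that rule only; `has_multiboot_header` is a hypothetical helper name, and it assumes a little-endian host such as x86 so that `od` prints the word in the expected order.

```shell
# has_multiboot_header KERNEL_IMAGE
# Hypothetical helper: succeeds if the Multiboot v1 magic (0x1BADB002)
# shows up as an aligned 32-bit word in the first 8192 bytes.
# Assumes a little-endian host, so od prints the word as "1badb002".
has_multiboot_header() {
    head -c 8192 "$1" | od -An -v -tx4 | grep -q '1badb002'
}
```

Running it against a kernel built with and without the option should be enough to see the difference.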

I won’t cover how to prepare the environment for cross-compiling the tools and kernel; instead, here’s a very good tutorial on the subject. Note that you can cross-compile the i386 NetBSD kernel on a 64-bit Linux system, which is actually what I do. Nevertheless, you will need an i386 NetBSD machine (possibly a virtual one) in order to create the root disk used for the service we want to run.

TL;DR: here’s the command I use to build my kernel (don’t forget you’ll have to build the tools first!):

$ ./build.sh -u -j 5 -U -m i386 kernel=FIRECRACKER
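
For completeness, the two steps can be chained in a small function. This is only a sketch: it assumes you run it from the top of the NetBSD source tree and that the config above was saved as sys/arch/i386/conf/FIRECRACKER. The `RUN` variable and the function name are my own additions, handy for printing the commands instead of running them.

```shell
# RUN is a hypothetical knob: leave it empty to run the commands,
# or set RUN=echo to only print them.
RUN=${RUN:-}

# Build the cross toolchain first, then the kernel, stopping at the
# first failure. Flags are the same as in the command above.
build_microvm_kernel() {
    $RUN ./build.sh -u -j 5 -U -m i386 tools &&
    $RUN ./build.sh -u -j 5 -U -m i386 kernel=FIRECRACKER
}
```

Calling `build_microvm_kernel` after `RUN=echo` shows both build.sh invocations without touching the tree.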

You can check that the kernel boots correctly like this:

$ sudo kvm -kernel /home/imil/mnt/hd/src/NetBSD/src.git/sys/arch/i386/compile/obj/FIRECRACKER/netbsd -nographic

The kernel will ask for the root filesystem’s location, which we don’t have yet. Exit the -nographic mode by pressing Ctrl+a, then x.

Now for the actual service to start: we’ll set up an nginx web server that starts without the rc.d framework, again to gain precious milliseconds. To create our root filesystem, we will use sailor. Again, I will not cover sailor’s usage, as its documentation says it all.

⚠ the following must be done on an i386 NetBSD host ⚠

First, we create a ship definition file:

$ cat examples/test.conf
shipbins="/bin/sh /sbin/init /usr/bin/printf /sbin/mount /sbin/mount_ffs /bin/ls /sbin/mknod /sbin/ifconfig /usr/bin/nc /usr/bin/tail"

run_at_build="printf 'creating devices\\n'"
run_at_build="cd /dev && sh MAKEDEV all_md"

Then we create a custom rc file which will be interpreted after the MicroVM calls init:

$ cat ships/fakecracker/etc/rc

export HOME=/
export PATH=/sbin:/bin:/usr/sbin:/usr/bin
umask 022

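mount -a
ifconfig vioif0 up
# give loopback its address; without rc.d nothing else will
ifconfig lo0 127.0.0.1 up
printf "\nstarting nginx.. "
# assumes nginx comes from pkgsrc, hence the /usr/pkg prefix
/usr/pkg/sbin/nginx
echo "done"
printf "\nTesting web server:\n"
printf "HEAD / HTTP/1.0\r\n\r\n"|nc -n 127.0.0.1 80
tail -f /var/log/nginx/access.log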

And a minimal fstab for the mount -a command to succeed:

$ cat ships/fakecracker/etc/fstab
/dev/ld0a / ffs rw 1 1

Then we create the ship:

$ sudo -E ./sailor.sh build examples/test.conf

Update: I’ve been told about the makefs(8) utility, which makes the following steps much easier: instead of vndconfig / newfs / mount / rsync, simply run makefs <image-file> <directory> (thanks mlelstv).
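
As a rough sketch of what that could look like for our ship: `-t ffs` selects the file system type and `-s` sets the image size, both documented in makefs(8). The 100m size simply mirrors the hand-made image, and the wrapper name and `RUN` variable are hypothetical conveniences of mine. I have not checked whether the resulting image needs anything more to be found as ld0a.

```shell
# RUN is a hypothetical knob: set RUN=echo to print the command only.
RUN=${RUN:-}

# make_root_image IMAGE_FILE DIRECTORY
# Hypothetical wrapper around makefs(8): builds a 100 MB FFS image
# from the content of DIRECTORY.
make_root_image() {
    image=$1
    dir=$2
    $RUN makefs -t ffs -s 100m "$image" "$dir"
}
```

For our ship, that would be `make_root_image root.img fakecracker`.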

For the record, here are the detailed steps for creating an image by hand anyway:

Create an empty image:

$ dd if=/dev/zero of=root.img bs=1m count=100

Make it available as a block device and create a filesystem in it:

$ sudo vndconfig vnd0 root.img
$ sudo newfs /dev/vnd0a
$ sudo mount /dev/vnd0a /mnt

Now simply rsync the ship to the newly created image:

$ sudo rsync -av fakecracker/ /mnt/

Unmount it, unbind the image from the block device, and copy it to the kvm host:

$ sudo umount /mnt
$ sudo vnconfig -u vnd0
$ scp root.img

Now fire up the MicroVM with the root location and network properties:

$ sudo kvm -kernel /home/imil/mnt/hd/src/NetBSD/src.git/sys/arch/i386/compile/obj/FIRECRACKER/netbsd -append "root=ld0a" -nographic -drive file=root.img,if=virtio -netdev type=tap,id=net0 -device virtio-net-pci,netdev=net0
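
Since that command line is a mouthful, it may be worth wrapping. The sketch below uses exactly the same options as the command above, with the kernel and image paths passed as arguments. The function name and the `RUN` variable are hypothetical additions of mine.

```shell
# RUN is a hypothetical knob: set RUN=echo to print the command only.
RUN=${RUN:-}

# start_microvm KERNEL IMAGE
# Boots KERNEL directly (no boot loader), with IMAGE as a virtio disk
# holding the root filesystem and a tap-backed virtio network card.
start_microvm() {
    kernel=$1
    image=$2
    $RUN sudo kvm -kernel "$kernel" -append "root=ld0a" -nographic \
        -drive "file=$image,if=virtio" \
        -netdev type=tap,id=net0 \
        -device virtio-net-pci,netdev=net0
}
```

The root=ld0a argument matches the fstab entry of the ship: the virtio disk shows up as ld0 in the guest thanks to the ld* at virtio? line of the kernel config.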

And voilà!

2023 update: check out https://gitlab.com/iMil/mksmolnb for an automated, more mature version of this experiment.