It’s Not Unusual To Port the Linux Vector Packet Processor (VPP) to FreeBSD


The Vector Packet Processor (VPP) is a framework for moving packets around at high rates. Its core concept is handling packets in groups known as “vectors,” which allows for the native use of vector processor instructions for packet classification and processing in different CPU architectures — currently amd64 and arm64.
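
To make the idea concrete, here is a small illustrative sketch (not VPP's actual code) of why vectors help: a classification pass touches a whole batch of packets in one tight loop, which stays hot in the instruction cache and which the compiler can auto-vectorize.

```c
/*
 * Illustrative sketch only, not VPP code: classify a whole vector of
 * packets in one pass. Assumes each buffer points at an IP header.
 */
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint8_t *data;       /* start of the packet's IP header */
  uint32_t next_index; /* next processing node for this packet */
} pkt_t;

static void
classify_vector (pkt_t *pkts, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      /* The IP version lives in the top nibble of the first byte. */
      uint8_t version = pkts[i].data[0] >> 4;
      pkts[i].next_index = (version == 4) ? 0 /* ip4 */ : 1 /* other */;
    }
}

int
main (void)
{
  uint8_t v4_hdr[20] = { 0x45 };      /* version 4, IHL 5 */
  pkt_t pkts[1] = { { v4_hdr, 0 } };

  classify_vector (pkts, 1);
  return pkts[0].next_index;          /* 0: routed to the ip4 node */
}
```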

VPP can process packets at incredibly high rates and competes with many dedicated forwarding appliances. This is achieved using userspace networking that bypasses the host’s normal network stack.

Since VPP runs in userspace, it is highly flexible and extensible, allowing quick modifications and changes independent of the host operating system stack.

VPP was originally developed for Linux and had never been ported to any other operating system; the closest prior effort was the ability to build documentation packages on macOS.

This article describes the porting of VPP to FreeBSD and working with the upstream VPP project to include FreeBSD as a supported target.

The TL;DR

Many code changes were needed to make VPP and all of its components and dependencies work on FreeBSD. Here is a summary of what was done:

VPP

  • Tidied up the build system to be explicit about tools.
  • Triggered updates to VPP infrastructure to be better able to support multiple architectures.
  • Fixed Python tests to include required libraries.

DPDK

  • Fixed support for VMXNET3 on FreeBSD.
  • Created a new contiguous memory interface that can support multiple instances (WIP).
  • Documented use of Intel E810 (ice) devices with FreeBSD.

FreeBSD

  • netmap: Fixed a long-standing double-free issue with generic interfaces.
  • netmap: Fixed an early free issue when using TCP and generic interfaces.
  • ice: Added support to disable VLAN tag stripping.

VPP Software Overview

VPP is composed of a few components:

  • VPP core
  • Plug-ins
  • Packet interfaces
  • VPP API

The VPP core implements a complete userspace IP stack, including transport layer protocols such as UDP and TCP. It provides an abstraction over core operating system services for the other components, implements polling on sockets, and handles allocating memory and loading plug-ins. The core framework also provides instrumentation around most of the calls in its interface, enabling very low-cost accounting and straightforward dissection of performance issues.

Plug-ins implement all of the features beyond the core IP services offered by VPP (even though many of VPP’s core features are implemented as statically compiled-in plug-ins). Plug-ins can hook into the processing graph at any point and implement any layer of the stack, from the lowest packet input up to application services. For example, VPP plug-ins range from a raw packet interface using af_packet on Linux, through network layer services such as lldp, all the way up to application protocols such as an http performance test server.
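
As a flavor of how a plug-in announces itself, here is a minimal registration sketch modeled on VPP’s sample plug-in; it compiles within a VPP source tree. The macros are VPP’s own, while the description string here is illustrative.

```c
/* Minimal VPP plug-in registration, modeled on VPP's sample plug-in.
 * The loader discovers this structure when the shared library is
 * loaded at start-up. */
#include <vlib/vlib.h>
#include <vnet/plugin/plugin.h>
#include <vpp/app/version.h>

VLIB_PLUGIN_REGISTER () = {
  .version = VPP_BUILD_VER,
  .description = "Example plug-in",   /* illustrative text */
};
```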

Packet interfaces are a special example of plug-ins in VPP: special not because they are mechanically different, but because they received most of the development work in this porting effort. Packet interfaces can be created on top of anything.

Core to VPP is the Data Plane Development Kit (DPDK), a userspace packet framework for interacting with PCI devices at high speeds. VPP was originally developed on top of DPDK. VPP can also be built on more regular interfaces for a host OS. BPF-like devices are available on Linux via af_packet, and VPP offers its own memif interface for interfacing with other instances of VPP on the same host via shared memory. Integrations with more usual native software interfaces such as TUN and TAP are available.

The final piece that makes VPP so powerful is its software API. A JSON-based API exposes all of the operations available through the VPP command-line interface. This API also underpins a robust testing framework for VPP written in Python, which runs as part of the project’s continuous integration.

Previous Porting Attempts

Since its first release, VPP has been compelling software to port to FreeBSD. I am aware of two prior porting attempts: one by George Neville-Neil (GNN) and another, building on his work, by Nanoteq.

GNN’s port never reached a working state, but Nanoteq was able to run some performance experiments. In correspondence, they said they had been blocked by FreeBSD’s lack of a direct VFIO-style interface to PCIe devices, which they required for a VPN offload engine.

My port integrated changes from GNN and Nanoteq’s ports, mostly as I tried to meet my first goal of getting VPP to build on FreeBSD.

The Porting Process

I started working on porting VPP to FreeBSD at the end of November 2023. I was aware of the two previous attempts to port VPP to FreeBSD. At the start of the project, I had one initial question to answer: “Do I try to update the last attempt, or do I just wade in and start working on getting VPP to build?”

The Nanoteq port was last updated in 2021, and since that last commit, there have been 2,500 commits to VPP. It could be that the Nanoteq changes would be simple to rebase, but in my experience, the world doesn’t work that way. If the rebase proved difficult, I would be in a tricky situation trying to merge patches into a code base I didn’t understand.

I decided instead to take the measured risk and wade into getting VPP to build on FreeBSD directly, using the previous port attempts only as a reference. I would have to understand the code base anyway to get VPP up to the expected performance, so rather than flailing around rebasing changes without understanding how things were implemented, it made sense to start again.

I decided that the following steps would make up the core of the port to FreeBSD:

  • Get VPP to build on FreeBSD.
  • Get VPP to run on FreeBSD.
  • Get VPP on FreeBSD to move packets.
  • Start investigating the performance of VPP on FreeBSD (to expose bugs!)

Getting VPP To Build on FreeBSD

VPP is a complex piece of software; all those features in plug-ins require a build system that can understand and integrate software from many different sources. VPP uses multiple build systems to create the final binaries:

  • GMake
  • Bash
  • CMake
  • Ninja
  • Meson
  • Python

The main interface to the VPP build system is gmake. At the root of the VPP project is a core makefile. This offers an interface for building and running VPP targets, running tests, creating the documentation, and building the packages distributed through fd.io.

The makefile calls out to Bash in many places to perform tasks that are difficult to write in GNU make. Subpackages are collected up by the external.mk makefile and their dependent build systems are used. A key external dependency here is DPDK, which uses Ninja and Meson for building.

Plug-ins, the core infrastructure, and the API components are built using CMake. CMake works recursively over the tree and collects components and plug-ins for the build. These are finally assembled into the core tools, vpp and vppctl (for running and managing the main VPP process), and several shared libraries. Each plug-in is provided as a shared library, making it easy to disable unneeded plug-ins through VPP configuration, as shown below.
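
For example, a plug-in can be switched off in VPP’s startup configuration; a stanza along these lines (the dpdk_plugin.so name is just an example) leaves the defaults enabled and disables one plug-in:

```
plugins {
  plugin default { enable }
  plugin dpdk_plugin.so { disable }
}
```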

The main VPP makefile makes it straightforward to do VPP development; the main commands are:

$ make build

$ make debug

$ make run

The makefile is self-documenting, and once it runs, it provides many options to control what is built and to perform other tasks.

My first stop in the porting process was to address the many GNU-isms that sneak into a project that only runs on Linux, starting with explicitly using GNU make for all of the build commands:

$ gmake build

My initial build patches mostly dealt with GNU-isms in how tools are selected. When you only run on Linux, many things are always in the same places. It is always safe to assume that “make” is GNU make, that “sh” is bash, and that bash can be found at /bin/bash. I updated the VPP build system to be explicit about what it uses: MAKE now inherits from the invoking make command everywhere it can, bash is named explicitly (with the build system erroring out if it isn’t found), and all shell scripts use env to locate bash rather than hard coding its path.

The next step was managing libraries, headers, and includes. Some things are in different places on FreeBSD (perhaps because it is funny to be different), but VPP also pulls in many headers with an explicit Linux path, such as linux/tap.h.

I went through a process of evaluating each include and determining whether it was Linux-specific or whether there was a FreeBSD equivalent. I then wrapped the includes in platform guards (#ifdef __linux__ / #ifdef __FreeBSD__) so that the correct files are included on each platform, as in the sketch below.
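
A typical change looks like the following sketch; the TUN headers are one example of the pattern:

```c
/* Sketch of the platform include-guard pattern used throughout the
 * port: pull in the Linux header on Linux and the FreeBSD
 * equivalent on FreeBSD. */
#ifdef __linux__
#include <linux/if_tun.h>
#elif defined(__FreeBSD__)
#include <net/if_tun.h>
#endif
```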

In some places, plug-ins could only be built on Linux; the af_packet plug-in mentioned earlier, which provides a BPF-style interface to raw sockets, simply has no counterpart on FreeBSD. I excluded many plug-ins early in the process to get an initial build going, but only a few plug-ins are marked as Linux-only in the final port. These are either not portable to FreeBSD or would require significant work to build an equivalent interface.

I excluded some pieces of VPP infrastructure from building on FreeBSD; for others, I implemented placeholder functions that simply return an error when used.
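
Such a placeholder can be as small as the sketch below; the function name is hypothetical, while clib_error_return is the error helper that VPP code uses.

```c
#include <vppinfra/error.h>

#ifdef __FreeBSD__
/* Hypothetical placeholder: fail loudly with a clean error instead
 * of silently misbehaving when a Linux-only feature is exercised. */
static clib_error_t *
example_linux_only_feature_enable (void)
{
  return clib_error_return (0, "not yet supported on FreeBSD");
}
#endif
```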

sysfs

A major difference between FreeBSD and Linux is how each operating system makes OS-specific information available. FreeBSD uses the sysctl interface to expose operating system internals. Drivers and kernel subsystems use this common interface, enabling discovery of system parameters and run-time configuration of drivers and subsystems.

Linux uses the sysfs virtual file system for similar purposes, and VPP, as Linux-native software that had never been ported, naturally relies on it. Sysfs is used to discover and configure huge pages, list PCIe devices, and map the physical addresses of memory regions.

Both sysctl and sysfs are dynamic run-time interfaces to the operating system internals. Sysfs is incredibly flexible. The information is provided as files in a directory structure, which makes it easy to add new sysfs values and relatively straightforward for applications to interface with.

For a porter of software, this is a nightmare. Sysfs accesses are just file accesses. On Linux, many of the sysfs files might not be present on a system, making errors reading a value a common well-handled case. VPP provided an abstraction over sysfs operations, but the VPP code used the sysfs interface freely in many places. As files may not exist on certain Linux builds, failing sysfs reads were commonly silent errors.
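
The contrast shows up even in a tiny standalone example. This sketch reads the online CPU information through each system’s native interface; note how on Linux a missing value is indistinguishable from any other failed file open:

```c
#include <stdio.h>
#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

int
main (void)
{
#ifdef __FreeBSD__
  /* FreeBSD: a typed, name-based kernel interface; an unknown
   * name fails explicitly with ENOENT. */
  int ncpu;
  size_t len = sizeof (ncpu);
  if (sysctlbyname ("hw.ncpu", &ncpu, &len, NULL, 0) == 0)
    printf ("%d CPUs\n", ncpu);
#else
  /* Linux: the same information comes from a virtual file, so a
   * missing value is just a failed open(2) like any other file. */
  char buf[64];
  FILE *f = fopen ("/sys/devices/system/cpu/online", "r");
  if (f && fgets (buf, sizeof (buf), f))
    printf ("online CPUs: %s", buf);
  if (f)
    fclose (f);
#endif
  return 0;
}
```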

Gradually, through testing the port, each call into sysfs was given a matching call into FreeBSD’s sysctl interface or a similar API. As part of the upstreaming process, a VPP contributor helped develop a more abstract API, and now the VPP code is much cleaner to read on both platforms.

First Build

All of the above was a continual loop of:

$ make build

... wait ...

errors

$ fix

$ make build

... wait ...

errors

I was reasonably quickly able to build VPP. I made consistent progress fixing GNU-isms and mapping Linux header files until I finally completed the build process.

Once I got the first build, I kicked off VPP. I got to enter the fun process of dealing with segmentation faults and other crashes — until, finally, I got to debug why my plug-ins list only included the core plug-ins. (It was VPP using a Linux-only method to locate the VPP binary and the plug-in path when using its ELF loader.)

With VPP running, the next step was to get some packets into the processor.

Packets To Process

VPP offers several interfaces for dealing with networks; the first one used in the VPP progressive tutorial is the BPF-style AF_PACKET interface. I spent some time looking at what was available and where my attention would be best placed. In the past, VPP supported netmap, a userspace networking library that hooks into operating system network device drivers, but this interface had been moved into the project’s “attic” source directory: development had not kept up, and the VPP plug-in API had diverged without a maintainer updating the plug-in.

DPDK is a core interface for VPP, and it is the default used if you just run make run from the VPP source tree. DPDK uses kernel modules to gain direct access to devices over PCIe; the interfaces are “stolen” from the host. DPDK provides its own device drivers that handle all configuration and packet collection, feeding packets directly into the DPDK application.

DPDK was a priority to support with the VPP FreeBSD port. Thankfully, DPDK is already ported to FreeBSD, and during this process, it saw some good updates that should keep it in line with upstream releases.

DPDK

The first stop for getting a functioning packet interface for VPP on FreeBSD was DPDK. DPDK is a userspace framework for high-performance networks; it runs PCIe drivers in userspace via a direct access interface to the kernel.
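
For a sense of the framework’s shape, here is a minimal DPDK program (a sketch; EAL options are passed on the command line): it initializes the Environment Abstraction Layer and counts the ports its drivers have claimed.

```c
/* Minimal DPDK program: bring up the Environment Abstraction Layer
 * and report how many Ethernet ports DPDK's drivers have bound. */
#include <stdio.h>
#include <rte_eal.h>
#include <rte_ethdev.h>

int
main (int argc, char **argv)
{
  if (rte_eal_init (argc, argv) < 0)
    {
      fprintf (stderr, "EAL init failed\n");
      return 1;
    }
  printf ("%u DPDK ports available\n",
          (unsigned) rte_eth_dev_count_avail ());
  rte_eal_cleanup ();
  return 0;
}
```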

DPDK already has a FreeBSD port, and it has needed minimal changes relative to upstream. I had to bypass a NUMA check in the VPP build infrastructure to get VPP with DPDK to build on FreeBSD; otherwise, the VPP changes for DPDK on FreeBSD were small. They were the usual header mashing to get things to line up, plus disabling some paths, such as tweaking the pipe size, that make sense on Linux but not on FreeBSD.

With those changes in place, I could build the DPDK plug-in, but it would fail to start. At this point, I had to implement the basics of detecting and enumerating PCIe devices for the VPP infrastructure on FreeBSD. There are strange differences in how values for PCIe devices are defined between FreeBSD and Linux; I was surprised that values drawn from the PCIe standard were not used in a standard way.
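
On FreeBSD, PCIe discovery goes through the /dev/pci character device and the PCIOCGETCONF ioctl described in pci(4), rather than a sysfs tree. A sketch of the enumeration loop looks roughly like this:

```c
/* Sketch: list PCI devices on FreeBSD via the PCIOCGETCONF ioctl
 * on /dev/pci (see pci(4)). Error handling is abbreviated. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/pciio.h>
#include <unistd.h>

int
main (void)
{
  struct pci_conf devs[32];
  struct pci_conf_io pc = {
    .match_buf_len = sizeof (devs),
    .matches = devs,
  };
  int fd = open ("/dev/pci", O_RDONLY);

  if (fd < 0)
    return 1;
  do
    {
      if (ioctl (fd, PCIOCGETCONF, &pc) == -1)
        break;
      for (unsigned i = 0; i < pc.num_matches; i++)
        printf ("%04x:%02x:%02x.%x vendor=%04x device=%04x\n",
                devs[i].pc_sel.pc_domain, devs[i].pc_sel.pc_bus,
                devs[i].pc_sel.pc_dev, devs[i].pc_sel.pc_func,
                devs[i].pc_vendor, devs[i].pc_device);
    }
  while (pc.status == PCI_GETCONF_MORE_DEVS);
  close (fd);
  return 0;
}
```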

With these changes in place, I could connect to interfaces but couldn’t send any packets. They just didn’t leave the device. I stopped at this point and tried to get netmap working (see below). The eventual fix was to the memory management interface that VPP uses.

VPP does a lot of memory management on its own; this is a requirement for doing vectorized packet processing very quickly and for integrating with DPDK. Following packets in GDB, I determined that they made it into the virtio virtual queue mechanism in the DPDK device driver, but the VPP packet queues and the device packet queues were never married. They should have been the same (i.e., the same memory), but they weren’t.

One way that VPP tries to move packets quickly is by minimizing copies. The ideal number of copies is 0 or, if you have to do something useful with the packet, 1. The packet appears in the device driver, and at some later point, it is copied to another interface and sent; all operations processing the packet should ideally deal with references. VPP creates a block of memory for DPDK to use and passes it into DPDK on start-up, but it wasn’t clear how this should map to FreeBSD. After much fighting, I realized a sysfs issue was biting me.

Linux can discover a page’s physical mapping by reading a sysfs file. That file doesn’t exist on FreeBSD, and the mapping silently fell back to a default. Once I pinned this down, I was able to ask for help finding a corresponding FreeBSD interface, and Mark Johnston pointed me to the right code in FreeBSD (mem_extract, documented in mem(4)). With a patch to VPP to use this interface for DPDK on FreeBSD, I could move packets with DPDK through my virtio test interface.
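
A sketch of that FreeBSD interface follows; the header, structure, and constant names are taken from my reading of mem(4), so treat them as assumptions rather than a definitive implementation:

```c
/* Sketch: translate a virtual address to a physical address on
 * FreeBSD with the MEM_EXTRACT_PADDR ioctl on /dev/mem; see
 * mem(4). Names here follow the manual page as I understand it. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/memrange.h>
#include <unistd.h>

static int
virt2phys (void *va, uint64_t *pa)
{
  struct mem_extract me = { .me_vaddr = (uintptr_t) va };
  int fd = open ("/dev/mem", O_RDONLY);

  if (fd < 0)
    return -1;
  if (ioctl (fd, MEM_EXTRACT_PADDR, &me) == -1)
    {
      close (fd);
      return -1;
    }
  close (fd);
  if (me.me_state != ME_STATE_MAPPED)
    return -1; /* page not resident */
  *pa = me.me_paddr;
  return 0;
}

int
main (void)
{
  static char page[4096];
  uint64_t pa;

  /* Requires read access to /dev/mem (typically root). */
  if (virt2phys (page, &pa) == 0)
    printf ("physical address: 0x%jx\n", (uintmax_t) pa);
  return 0;
}
```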

netmap

When I hit a wall with the DPDK memory issue, I turned to look at using netmap with VPP on FreeBSD. Netmap is an interface for userspace networking. Unlike DPDK, it hooks the kernel’s device drivers to receive and send packets rather than implementing the device drivers in userspace.

Netmap catches the device driver’s receive path at the earliest possible opportunity. Rather than processing the packet, it feeds it to userspace via the netmap interface. Netmap can be supported in one of three ways: directly by the network interface device driver, for “free” by iflib device drivers, or through an emulation layer called “netmap generic mode.” Netmap ships as a default-enabled feature in FreeBSD, and in the past, ports of it to Linux and Windows have been available.
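
The userspace side of netmap is compact. A minimal receive loop using the helper API from net/netmap_user.h looks roughly like this sketch (em0 is an example interface name):

```c
/* Sketch: receive packets with netmap's helper API
 * (nm_open/nm_nextpkt from net/netmap_user.h). */
#define NETMAP_WITH_LIBS
#include <net/netmap_user.h>
#include <poll.h>
#include <stdio.h>

int
main (void)
{
  struct nm_desc *d = nm_open ("netmap:em0", NULL, 0, NULL);
  if (d == NULL)
    return 1;

  struct pollfd pfd = { .fd = NETMAP_FD (d), .events = POLLIN };

  /* Poll a few times and drain whatever the kernel hands us. */
  for (int i = 0; i < 10; i++)
    {
      struct nm_pkthdr hdr;
      unsigned char *buf;

      poll (&pfd, 1, 1000);
      while ((buf = nm_nextpkt (d, &hdr)) != NULL)
        printf ("received %u bytes\n", hdr.len);
    }
  nm_close (d);
  return 0;
}
```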

VPP had a netmap implementation, but over time it had aged out and was not used much, so it wasn’t maintained. I started by copying the netmap files out of the attic source directory and moving them into a new plug-in location. The plug-in API has changed over the last few years, including since Nanoteq’s attempted FreeBSD port. I tidied these files up and was quickly able to get netmap processing packets, with VPP acting as a router on top of netmap.

With further testing, I encountered stalls with netmap, and I could not use epair interfaces with netmap without quickly getting kernel panics.

Epair has been an issue with netmap for a long time. I looked into this interface in the past, but I didn’t have the time to fix the core infrastructure then. Epair is a virtual wire, similar to Linux’s veth interface. When a virtual interface is created, you get two devices: an “A” and a “B” link. Epair works by keeping a queue of packets for each interface and occasionally passing them between the two sides of the pair based on a kernel task. After a few megabytes of traffic through an epair interface with VPP, I experienced kernel panics on my test machine. I figured out that I was encountering two distinct types of crashes.

Virtual interfaces such as epair don’t have direct netmap support and instead use generic mode, which emulates the receive buffers that exist on a real network card using a queue mechanism. After much debugging, I determined that the netmap queue mechanism interacted badly with the epair queue mechanism, leading to packets being freed twice: not because epair was freeing the packets, but because the epair queue mechanism was chaining packets together. When netmap came to free packets from its own packet queue, it could free a packet twice, once as it should and once more if the packet’s mbuf was part of a chain.

Netmap generic mode keeps a pointer to the packet queue in the packet itself. VPP was able to trigger a case where a TCP packet would go into the socket buffer queue and, through normal processing, have its pointer to the netmap reference removed, leading to an early free.

I fixed both issues in netmap generic mode, enabling netmap/VPP to work well in my test cases. This has the excellent side benefit of allowing FreeBSD epair-based tests to be written for netmap code.

Using netmap in anger like this has exposed other issues, including how iflib drivers process VLAN tags. I am currently working with other developers in the FreeBSD project to find a suitable approach for netmap and iflib that enables performant VLAN handling with VPP and netmap.

The Patch Queue

When I decided to wade into the port rather than try to update the previous attempts, I made two decisions: “Just start the work” and “Move fast and tidy up later.” I wanted to make progress quickly. I wasn’t aware of anything that might make a port of VPP to FreeBSD impossible; nothing on the feature list looked like a major blocker.

I also didn’t know if a port was possible.

I prioritized getting VPP to build, run and then move packets, and I deliberately left upstreaming as a future task.

As you develop a project more, you improve your understanding of its style and how changes should look. You also learn more about the code and discover interfaces that are not always easy to find at first glance.

Moving quickly, I could make satisfying concrete steps forward, but I was also tearing down the walls to hang a picture. There was a lot of cleaning up to do at the end.

Once I was reasonably happy with my progress, I took my nightmare WIP branch and broke it down into a logical set of small commits: the sort of small, reviewable chunks I would want to see as the maintainer of a project accepting diffs from a stranger.

Once I had tidied this up, I had a long patch queue to upstream to VPP: roughly 100 changes extracted from my branch. There are a couple of DPDK and FreeBSD changes among them, but most add FreeBSD support to VPP and fix things along the way.

This port was an incredible amount of work — beyond the scale of a task someone would take on for fun in their free time. Upstreaming the changes is going very smoothly, but it has taken several months to get the bulk of the changes through review and into VPP.

This work would not have been possible without the generous support of the FreeBSD Foundation and RG Nets, its co-sponsor.
