ZFS Without a Server Using the Nvidia BlueField-2 DPU

Hacker News - Sat May 14 01:48

AIC JBOX J5010 02 NVIDIA BlueField 2 DPU With Three SSDs 2

Today we are going to take a look at something a long time coming: running a ZFS server entirely from a PCIe card, the NVIDIA BlueField-2 DPU. This piece has actually been split into two based on our recent Ethernet SSDs Hands-on with the Kioxia EM6 NVMeoF SSD piece. In that piece, we looked at SSDs that did not require an x86 server. Here, we are going to show another option for that, and hopefully start some discussion on the topic as we go through the basics of how this works. What we realized after the last piece is that this is a new enough concept that we could do a better job explaining how things happen, and why. So we are splitting this into two. In this piece, we are going to use NVMe SSDs and ZFS to help folks understand exactly what is going on here, keeping it as close to a NAS model as we possibly can before moving to a higher-level demo.

Video Version and Background

Here is the video version where you can see the setup running, along with some additional screen captures and a voiceover.

As always, we suggest opening this in a separate tab, browser, or app for a better viewing experience.

For some background:

These pieces mostly have videos as well, and they explain what a DPU is and why it is different from traditional NICs. The AWS Nitro was really the first to implement this concept, and we recently discussed the market in the AMD-Pensando video.

The short version, however, is that a DPU is a processor that combines a few key characteristics. Among them are:

  • High-speed networking connectivity (usually multiple 100Gbps-200Gbps interfaces in this generation)
  • High-speed packet processing with specific acceleration and often programmable logic (P4/ P4-like is common)
  • A CPU core complex (often Arm or MIPS based in this generation)
  • Memory controllers (commonly DDR4 but we also see HBM and DDR5 support)
  • Accelerators (often for crypto or storage offload)
  • PCIe Gen4 lanes (run as either root or endpoints)
  • Security and management features (offering a hardware root of trust as an example)
  • Runs its own OS separate from a host system (commonly Linux, but the subject of VMware Project Monterey ESXi on Arm as another example)
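Since the DPU runs its own OS, it is managed like any other small Arm server. As a minimal sketch, assuming the card's out-of-band 1GbE management port has picked up a DHCP address (the address and username here are examples, not defaults from this setup):

```shell
# The DPU runs its own Ubuntu, so we log in to the card itself,
# not a host server. Substitute your DPU's actual address and user.
ssh ubuntu@192.168.1.50

# Once logged in, confirm we are on the Arm complex, not an x86 host:
uname -m          # expect aarch64
lsb_release -ds   # the card's own distribution, e.g. Ubuntu 20.04 LTS
```

NVIDIA also exposes a console and a virtual network interface over the RSHIM interface when a host is present, but with no host server in this setup, the management port or the high-speed ports are how we reach the card.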
STH Elements Of A DPU Q2 2021

Most of the coverage of DPUs to date has focused on DPUs installed in traditional x86 systems. The vision for DPUs goes well beyond that, and we will touch on that later. Still, it is correct to think of a DPU as a standalone processor that can connect to a host server or devices via PCIe and network fabric.

In this piece, we are going to show what happens when you do not use a host server, and run the DPU as a standalone mini-server by creating a ZFS-based solution with the card.

NVIDIA BlueField-2 DPU and Hardware Needed

For this, we are using the NVIDIA BlueField-2 DPU. Here is the overview slide.

NVIDIA BlueField 2 DPU Overview

These are interesting because they are basically a ConnectX-6 NIC, a PCIe switch, and an Arm processor complex on the card.

NVIDIA BlueField 2 DPU Block Diagram

The Arm CPU complex has 16GB of single-channel DDR4 memory. NVIDIA has 32GB versions, but we have not been able to get one. We can see the ConnectX-6 NICs on the PCIe bus here. There is also eMMC storage that is running our Ubuntu 20.04 LTS OS.

NVIDIA BlueField 2 DPU Lstopo

There are eight Arm Cortex-A72 cores, with 1MB of L2 cache shared between each pair of cores and a 6MB L3 cache.

NVIDIA BlueField 2 DPU Lscpu Output

Here is one of the BF2M516A cards that we have with dual 100GbE ports. We will note that these are efficiency-focused cores, not high-performance parts.

NVIDIA BlueField-2 DPU 2x 100GbE

In the Building the Ultimate x86 and Arm Cluster-in-a-Box piece, we showed that each of these DPUs is an individual node running its own OS, leading to a really interesting little cluster.

We are exploiting that property by utilizing the cards in the AIC JBOX that we recently reviewed.

AIC JBOX J5010 02 NVIDIA BlueField 2 DPU With Three SSDs 2

The AIC JBOX is a PCIe Gen4 chassis that has two Broadcom PCIe switches and is solely meant to connect PCIe devices, power them, and cool them. There is no x86 processor in the system, so we can use the BlueField-2 DPU as a PCIe root and then use NVMe SSDs in the chassis to connect to the BlueField-2 DPU.
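Because the DPU is the PCIe root in this topology, the SSDs in the chassis enumerate as ordinary local NVMe devices on the card's own Ubuntu install. A quick sketch of how one would verify that from the DPU (standard Linux tools, nothing BlueField-specific):

```shell
# Run on the BlueField-2's Ubuntu. The NVMe controllers sit behind the
# JBOX's Broadcom PCIe switches, but appear as normal PCIe devices:
lspci | grep -i 'non-volatile'

# And as normal block devices we can later put into a pool:
lsblk -d -o NAME,SIZE,MODEL
```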

AIC JBOX J5010 02 NVIDIA BlueField 2 DPU With Three SSDs 1

With the hardware platform set, we are able to start setting up the system with ZFS.
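As a minimal sketch of that setup, here is the general shape of building a ZFS pool on the card; the device names come from a hypothetical three-SSD layout and will differ on other systems, and the pool and dataset names are examples rather than what we used:

```shell
# Install ZFS on the DPU's Ubuntu (ZFS builds fine on arm64):
sudo apt install -y zfsutils-linux

# Three SSDs in a raidz1 vdev gives us one drive of parity:
sudo zpool create tank raidz1 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1

# Create a dataset and share it over NFS, NAS-style:
sudo zfs create tank/share
sudo zfs set sharenfs=on tank/share

# Sanity check the pool:
sudo zpool status tank
```

From there, clients on the 100GbE network can mount the share just as they would from a conventional x86 NAS, which is the point of the exercise.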