AArch64 Bitfield Move (BFM) Instruction


The sixth part of my series on decoding UTF-8 was a pretty uninteresting one: in it, I described a basic non-validating UTF-8 decoder. When examining the generated assembly, I saw three different instructions for shifting bits: LSL, BFI and BFXIL. As it turns out, BFI and BFXIL are both aliases of the same instruction: BFM - bitfield move. I decided to learn more about BFM and share my findings here[1].

A little history first (shocking, I know). Arm32 had the following instructions for moving bitfields: BFI, BFC, UBFX, and SBFX. Unlike their Arm64 namesakes, they are instructions in their own right and not aliases of other instructions. 64-bit Arm introduces three separate but related instructions: BFM, UBFM and SBFM. A number of aliases that are easier to use are defined in terms of these three basic instructions.

If I really knew anything about chip design, I would probably speculate that all three core bitfield move instructions share most of their implementation, but as a software person, I’ll leave the implementation details to the knowledgeable folks.

To keep the post relatively short, I’ll focus on the BFM instruction. Once we understand how it works, the rest of them will be easy enough to understand from the documentation.

The BFM instruction copies a bitfield (a collection of adjacent bits within a machine word) from the source register to a position in the destination register without changing any other bits.

According to the official documentation:

Bitfield Move copies any number of low-order bits from a source register into the same number of adjacent bits at any position in the destination register, leaving other bits unchanged.

BFM Rd, Rn, #immr, #imms

Where:

  • Rd is the destination register (Xd for the 64-bit operation, Wd for the 32-bit variant).

  • Rn is the source register (Xn for the 64-bit operation, Wn for the 32-bit variant).

  • immr is the right rotate amount.

  • imms is the leftmost bit number to be moved from the source.

The operation is described as a sequence of a rotation, masking and insertion, but I find the following explanation easier to understand:

Depending on the values of immr and imms, the instruction does one of two things:

  • if immr <= imms, copies a bitfield [immr : imms] of the source register to the least significant bits of the destination register: [0 : imms - immr]

  • if immr > imms, copies a bitfield [0 : imms] of the source register to [regsize − immr : (regsize − immr) + imms] of the destination register.

where regsize is the destination register size of 32 or 64 bits.
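The two cases translate almost directly into C. What follows is a minimal sketch of the 32-bit form, not Arm's official pseudocode: the function names bfm32 and low_mask are made up for illustration, and the model simply returns the new value of the destination register.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low n bits set (1 <= n <= 32). */
uint32_t low_mask(unsigned n)
{
    return n >= 32 ? UINT32_MAX : ((UINT32_C(1) << n) - 1);
}

/* Hypothetical model of "BFM Wd, Wn, #immr, #imms".
   dst is the old value of Wd, src is Wn; the return value is the new Wd. */
uint32_t bfm32(uint32_t dst, uint32_t src, unsigned immr, unsigned imms)
{
    assert(immr < 32 && imms < 32);

    if (immr <= imms) {
        /* Source bits [immr : imms] go to destination bits [0 : imms - immr]. */
        uint32_t mask = low_mask(imms - immr + 1);
        return (dst & ~mask) | ((src >> immr) & mask);
    }

    /* Source bits [0 : imms] go to destination bits starting at 32 - immr. */
    unsigned lsb = 32 - immr;
    uint32_t mask = low_mask(imms + 1) << lsb;
    return (dst & ~mask) | ((src << lsb) & mask);
}
```

With this model, bfm32(0x55667788, 0xEEFF0011, 8, 23) takes the first branch and yields 0x5566FF00, while bfm32(0x55667788, 0xEEFF0011, 20, 7) takes the second and yields 0x55611788.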

Let’s look at examples of these two cases[2]:

// ------------------------------------------------------------
// Load W0 = 0x55667788
// Binary: 0101 0101 0110 0110 0111 0111 1000 1000
// ------------------------------------------------------------
movz    w0, #0x7788
movk    w0, #0x5566, lsl #16

// Load W1 = 0xEEFF0011
// Binary: 1110 1110 1111 1111 0000 0000 0001 0001
movz    w1, #0x0011
movk    w1, #0xEEFF, lsl #16

// ------------------------------------------------------------
// Copy bits [8:23] of W1 into low bits of W0
// Bits 8:23 of W1 are: 1111 1111 0000 0000 (0xFF00)
// Expected result: 0101 0101 0110 0110 1111 1111 0000 0000
// ------------------------------------------------------------
bfm     w0, w1, #8, #23

// Result: w0 = 0x5566FF00

Here, immr=8 and imms=23. Since immr <= imms and imms - immr = 15, we:

  1. Take bits [8:23] of W1: 1110 1110 [1111 1111 0000 0000] 0001 0001

  2. Copy them into bits [0:15] of W0: 0101 0101 0110 0110 [1111 1111 0000 0000]

When disassembled, the BFM instruction in this example appears under its preferred alias: bfxil w0, w1, #8, #16.

Now an example of the second case, where immr > imms:

// ------------------------------------------------------------
// Load W0 = 0x55667788
// Binary: 0101 0101 0110 0110 0111 0111 1000 1000
// ------------------------------------------------------------
movz    w0, #0x7788
movk    w0, #0x5566, lsl #16

// Load W1 = 0xEEFF0011
// Binary: 1110 1110 1111 1111 0000 0000 0001 0001
movz    w1, #0x0011
movk    w1, #0xEEFF, lsl #16

// ------------------------------------------------------------
// Copy bits [0:7] of W1 to bits [12:19] of W0
// Bits 0:7 of W1 are: 0001 0001 (0x11)
// Expected result: 0101 0101 0110 0001 0001 0111 1000 1000
// ------------------------------------------------------------
bfm     w0, w1, #20, #7
// Result: w0 = 0x55611788

Here, immr=20, imms=7 and regsize=32, so regsize − immr = 12 and (regsize − immr) + imms = 19. Since immr > imms, we:

  1. Take bits [0:7] of W1: 1110 1110 1111 1111 0000 0000 [0001 0001]

  2. Copy them into bits [12:19] of W0: 0101 0101 0110 [0001 0001] 0111 1000 1000
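For the curious, the instruction words for both examples can be put together by hand. The sketch below assumes the A64 bitfield encoding layout as I understand it (sf, opc, the fixed bits 100110, N, immr, imms, Rn, Rd, from the top bit down); encode_bfm32 is a hypothetical helper name, and only the 32-bit form (sf=0, N=0) is covered.

```c
#include <assert.h>
#include <stdint.h>

/* Encode the 32-bit form "BFM Wd, Wn, #immr, #imms" as an instruction word.
   Assumed layout: sf=0, opc=01, fixed bits 100110, N=0 in the top nine bits. */
uint32_t encode_bfm32(unsigned rd, unsigned rn, unsigned immr, unsigned imms)
{
    assert(rd < 32 && rn < 32 && immr < 32 && imms < 32);

    return UINT32_C(0x33000000)     /* sf, opc, fixed bits and N */
         | ((uint32_t)immr << 16)   /* immr occupies bits 16 to 21 */
         | ((uint32_t)imms << 10)   /* imms occupies bits 10 to 15 */
         | ((uint32_t)rn << 5)      /* Rn occupies bits 5 to 9 */
         | (uint32_t)rd;            /* Rd occupies bits 0 to 4 */
}
```

Under that layout, encode_bfm32(0, 1, 8, 23) gives 0x33085C20 for bfm w0, w1, #8, #23, and encode_bfm32(0, 1, 20, 7) gives 0x33141C20 for bfm w0, w1, #20, #7.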

The following instructions are aliases of BFM:

  • BFC - Bitfield Clear. Clears a selected bitfield in the destination register.

  • BFI - Bitfield Insert. Inserts the low bits of the source register into a bitfield of the destination register.

  • BFXIL - Bitfield Extract and Insert Low. Extracts a bitfield from the source register and writes it into the low bits of the destination register.
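To make the relationship concrete, here is a rough C sketch of how the three aliases reduce to BFM operands in the 32-bit case. The helper names (bfm32, bfi32, bfxil32, bfc32) are invented for this post; the operand mappings follow the alias definitions in the Arm documentation as I read them, with lsb and width being the usual alias operands.

```c
#include <assert.h>
#include <stdint.h>

uint32_t low_mask(unsigned n)
{
    return n >= 32 ? UINT32_MAX : ((UINT32_C(1) << n) - 1);
}

/* Hypothetical model of the 32-bit BFM; returns the new destination value. */
uint32_t bfm32(uint32_t dst, uint32_t src, unsigned immr, unsigned imms)
{
    assert(immr < 32 && imms < 32);
    if (immr <= imms) {
        uint32_t mask = low_mask(imms - immr + 1);
        return (dst & ~mask) | ((src >> immr) & mask);
    }
    unsigned lsb = 32 - immr;
    uint32_t mask = low_mask(imms + 1) << lsb;
    return (dst & ~mask) | ((src << lsb) & mask);
}

/* BFI Wd, Wn, #lsb, #width  ==  BFM Wd, Wn, #(-lsb MOD 32), #(width - 1) */
uint32_t bfi32(uint32_t dst, uint32_t src, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    return bfm32(dst, src, (32 - lsb) % 32, width - 1);
}

/* BFXIL Wd, Wn, #lsb, #width  ==  BFM Wd, Wn, #lsb, #(lsb + width - 1) */
uint32_t bfxil32(uint32_t dst, uint32_t src, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    return bfm32(dst, src, lsb, lsb + width - 1);
}

/* BFC Wd, #lsb, #width is BFI with the zero register as the source. */
uint32_t bfc32(uint32_t dst, unsigned lsb, unsigned width)
{
    return bfi32(dst, 0, lsb, width);
}
```

Seen this way, the two earlier examples are bfxil w0, w1, #8, #16 and bfi w0, w1, #12, #8 in disguise: bfxil32(0x55667788, 0xEEFF0011, 8, 16) returns 0x5566FF00 and bfi32(0x55667788, 0xEEFF0011, 12, 8) returns 0x55611788.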

As we have seen, BFM copies a bitfield to the destination register, but leaves the other bits unchanged. UBFM works exactly the same as BFM, but destination bits outside the bitfield are set to zero.

UBFM has quite a few aliases, including the familiar immediate shifts:

  • LSL (immediate) - Logical Shift Left (immediate). Shifts the source value left by an immediate amount and fills the new low bits with zeros.

  • LSR (immediate) - Logical Shift Right (immediate). Shifts the source value right by an immediate amount and fills the new high bits with zeros.

  • UBFIZ - Unsigned Bitfield Insert in Zeros. Extracts a bitfield from the low bits of the source register and inserts it into the destination at a specified position, with all other bits set to zero.

  • UBFX - Unsigned Bitfield Extract. Extracts a bitfield from the source register, puts it in the low bits of the destination register, and zero‑extends the result.

  • UXTB - Unsigned Extend Byte. Zero‑extends the low 8 bits of the source register and writes the result to the destination register.

  • UXTH - Unsigned Extend Halfword. Zero‑extends the low 16 bits of the source register and writes the result to the destination register.
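The same kind of model works for UBFM: it is the BFM rule applied to a destination that starts out as all zeros. The sketch below is again only an illustration for the 32-bit form, with made-up helper names; the alias-to-operand mappings are the ones I took from the Arm documentation.

```c
#include <assert.h>
#include <stdint.h>

uint32_t low_mask(unsigned n)
{
    return n >= 32 ? UINT32_MAX : ((UINT32_C(1) << n) - 1);
}

/* Hypothetical model of "UBFM Wd, Wn, #immr, #imms". */
uint32_t ubfm32(uint32_t src, unsigned immr, unsigned imms)
{
    assert(immr < 32 && imms < 32);
    if (immr <= imms)
        return (src >> immr) & low_mask(imms - immr + 1);
    return (src & low_mask(imms + 1)) << (32 - immr);
}

/* LSL Wd, Wn, #shift  ==  UBFM Wd, Wn, #(-shift MOD 32), #(31 - shift) */
uint32_t lsl32(uint32_t x, unsigned shift)
{
    assert(shift < 32);
    return ubfm32(x, (32 - shift) % 32, 31 - shift);
}

/* LSR Wd, Wn, #shift  ==  UBFM Wd, Wn, #shift, #31 */
uint32_t lsr32(uint32_t x, unsigned shift)
{
    assert(shift < 32);
    return ubfm32(x, shift, 31);
}

/* UBFX Wd, Wn, #lsb, #width  ==  UBFM Wd, Wn, #lsb, #(lsb + width - 1) */
uint32_t ubfx32(uint32_t x, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    return ubfm32(x, lsb, lsb + width - 1);
}

/* UBFIZ Wd, Wn, #lsb, #width  ==  UBFM Wd, Wn, #(-lsb MOD 32), #(width - 1) */
uint32_t ubfiz32(uint32_t x, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    return ubfm32(x, (32 - lsb) % 32, width - 1);
}

/* UXTB and UXTH are UBFM with immr = 0 and imms = 7 or 15. */
uint32_t uxtb32(uint32_t x) { return ubfm32(x, 0, 7); }
uint32_t uxth32(uint32_t x) { return ubfm32(x, 0, 15); }
```

For instance, lsl32(0x55667788, 4) returns 0x56677880 and lsr32(0xEEFF0011, 8) returns 0x00EEFF00, exactly what plain shifts would give.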

Unsurprisingly, SBFM works almost exactly the same as the other two bitfield move instructions. The difference is that it sets the destination bits below the bitfield to zero and the bits above the bitfield to the value of the most significant bit of the bitfield; in other words, it sign-extends the bitfield.

The following instructions are aliases of SBFM:

  • ASR (immediate) - Arithmetic Shift Right (immediate). Shifts the source value right by an immediate amount and fills the high bits with the sign bit (sign-extends it).

  • SBFIZ - Signed Bitfield Insert in Zeros. Extracts a bitfield from the low bits of the source register, inserts it into the destination, clears the bits below it, and sign‑extends above it.

  • SBFX - Signed Bitfield Extract. Extracts a bitfield from the source register, places it in the low bits of the destination register, and sign‑extends the result.

  • SXTB - Signed Extend Byte. Extracts an 8-bit value from a register, sign-extends it, and writes the result to the destination register.

  • SXTH - Sign Extend Halfword. Extracts a 16-bit value, sign-extends it, and writes the result to the destination register.

  • SXTW - Sign Extend Word. Sign-extends a 32-bit word to 64 bits and writes the result to the destination register.
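SBFM can be modeled the same way, with a sign-extension step added. As before, this is only a sketch of the 32-bit form under my reading of the documentation, and every function name here is hypothetical. SXTW is left out because it only exists in the 64-bit form.

```c
#include <assert.h>
#include <stdint.h>

uint32_t low_mask(unsigned n)
{
    return n >= 32 ? UINT32_MAX : ((UINT32_C(1) << n) - 1);
}

/* Sign-extend the low `width` bits of field to 32 bits (1 <= width <= 32). */
uint32_t sign_extend(uint32_t field, unsigned width)
{
    uint32_t sign = UINT32_C(1) << (width - 1);
    field &= low_mask(width);
    return (field ^ sign) - sign;   /* unsigned arithmetic wraps modulo 2^32 */
}

/* Hypothetical model of "SBFM Wd, Wn, #immr, #imms". */
uint32_t sbfm32(uint32_t src, unsigned immr, unsigned imms)
{
    assert(immr < 32 && imms < 32);
    if (immr <= imms)
        return sign_extend(src >> immr, imms - immr + 1);
    /* Bits below the field become zero, bits above it copies of its top bit. */
    return sign_extend(src, imms + 1) << (32 - immr);
}

/* ASR Wd, Wn, #shift  ==  SBFM Wd, Wn, #shift, #31 */
uint32_t asr32(uint32_t x, unsigned shift)
{
    assert(shift < 32);
    return sbfm32(x, shift, 31);
}

/* SBFX Wd, Wn, #lsb, #width  ==  SBFM Wd, Wn, #lsb, #(lsb + width - 1) */
uint32_t sbfx32(uint32_t x, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    return sbfm32(x, lsb, lsb + width - 1);
}

/* SBFIZ Wd, Wn, #lsb, #width  ==  SBFM Wd, Wn, #(-lsb MOD 32), #(width - 1) */
uint32_t sbfiz32(uint32_t x, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    return sbfm32(x, (32 - lsb) % 32, width - 1);
}

/* SXTB and SXTH are SBFM with immr = 0 and imms = 7 or 15. */
uint32_t sxtb32(uint32_t x) { return sbfm32(x, 0, 7); }
uint32_t sxth32(uint32_t x) { return sbfm32(x, 0, 15); }
```

As a quick check, asr32(0xEEFF0011, 8) returns 0xFFEEFF00 and sxtb32(0x55667788) returns 0xFFFFFF88, because in both cases the top bit of the selected bitfield is set.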