BREX Syntax Reference - Binary Range Expression Language

Binary Range Expression (BREX) is a domain-specific language (DSL) for extracting sub-ranges from binary data by position, by condition, or both, with support for Type-Length-Value (TLV) searches. Although it was designed for network packet analysis and processing, BREX can also be used for security scanning, file identification, and other binary data manipulation tasks.

Note: BREX expressions are used in various Proxylity UDP Gateway destination configurations, including SQS and IoT Data destinations, to extract specific data from UDP packet payloads.

Basic Memory Access

Simple Byte Access

[5] - Returns byte at offset 5
[2, 4] - Returns 4 bytes starting at offset 2
[0, 10] - Returns first 10 bytes

Range Access with Colons

[2:6] - Returns 4 bytes starting at offset 2 up to but not including byte 6
[0:10] - Returns first 10 bytes
[2:] - Returns all bytes starting with the byte at offset 2
[:] - Returns all bytes
[] - Returns an empty range
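
For readers who think in terms of array slicing, the access forms above map closely onto Python byte slices. The sketch below is only an illustration of the described semantics, not BREX itself; the payload buffer and the helper names at, take, and span are hypothetical.

```python
from typing import Optional

# Illustrative Python equivalents of the BREX access forms (not BREX itself).
payload = bytes(range(16))  # hypothetical 16-byte buffer: 0x00..0x0f

def at(data: bytes, offset: int) -> bytes:
    """[offset]: the single byte at `offset`, as a one-byte range."""
    return data[offset:offset + 1]

def take(data: bytes, offset: int, length: int) -> bytes:
    """[offset, length]: `length` bytes starting at `offset`."""
    return data[offset:offset + length]

def span(data: bytes, start: int = 0, end: Optional[int] = None) -> bytes:
    """[start:end]: bytes from `start` up to but not including `end`."""
    return data[start:end]

print(at(payload, 5).hex())       # [5]    -> 05
print(take(payload, 2, 4).hex())  # [2, 4] -> 02030405
print(span(payload, 2, 6).hex())  # [2:6]  -> 02030405
print(len(span(payload, 2)))      # [2:]   -> 14 bytes remain
```

Note that Python slices silently clamp to the end of the buffer, which may not match how BREX treats out-of-bounds access.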

Offset Expressions

[u8[0] + 1] - Read byte at (value of byte 0) + 1
[u16le[0] * 2, 4] - Read 4 bytes at ((16-bit value at byte 0) * 2)
[u8[0] + 1, 10] | [5] - Read 10 bytes from the calculated offset, then get byte 5 of that result
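
To make the offset arithmetic concrete, here is a minimal Python sketch of the first two expressions above. The payload contents and the u8 and u16le helper functions are assumptions made for illustration; they only mimic the behavior described in this section.

```python
def u8(data: bytes, offset: int) -> int:
    """u8[offset]: unsigned 8-bit value at `offset`."""
    return data[offset]

def u16le(data: bytes, offset: int) -> int:
    """u16le[offset]: unsigned 16-bit little-endian value at `offset`."""
    return int.from_bytes(data[offset:offset + 2], "little")

# Hypothetical payload whose first bytes are used to compute offsets.
payload = bytes([0x03, 0x00, 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF, 0x11, 0x22])

# [u8[0] + 1]: byte 0 is 3, so read the byte at offset 4.
offset = u8(payload, 0) + 1
first = payload[offset:offset + 1]

# [u16le[0] * 2, 4]: the 16-bit value at 0 is 3, so read 4 bytes at offset 6.
start = u16le(payload, 0) * 2
second = payload[start:start + 4]

print(first.hex())   # cc
print(second.hex())  # eeff1122
```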

Expression Chaining

Expression chaining allows you to pass the result of one operation as input to the next operation, creating powerful data processing pipelines. BREX uses the pipe operator | to chain expressions explicitly.

The Pipe Operator

The pipe operator | takes the result from the left expression and makes it the data context for the right expression:

[10:20] | [5] - Get bytes 10-19, then get byte 5 of that result
[0:100] | [50:60] - Get bytes 0-99, then get bytes 50-59 of that result
[0:100] | [50:60] | [0] - Multi-step: get range, then sub-range, then first byte

Left-to-Right Data Flow

Chaining evaluates from left to right, where each operation receives the output of the previous operation:

[20:30] | [2:8] | [1]
Step-by-step evaluation:
[20:30] → produces 10 bytes (original buffer bytes 20-29)
| [2:8] → produces 6 bytes (bytes 2-7 of the 10-byte result)
| [1] → produces 1 byte (byte 1 of the 6-byte result)
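
The same step-by-step evaluation can be modeled in Python by treating each stage as a function and folding the buffer through the stages from left to right. This is a sketch of the data flow only; the rng, idx, and pipe helpers are hypothetical names, not part of BREX.

```python
from functools import reduce

def rng(start, end):
    """Stage equivalent to [start:end]."""
    return lambda data: data[start:end]

def idx(offset):
    """Stage equivalent to [offset], kept as a one-byte range."""
    return lambda data: data[offset:offset + 1]

def pipe(data, *stages):
    """Apply stages left to right; each stage sees only the previous result."""
    return reduce(lambda acc, stage: stage(acc), stages, data)

buffer = bytes(range(40))  # hypothetical buffer where byte n has value n

# [20:30] | [2:8] | [1]
result = pipe(buffer, rng(20, 30), rng(2, 8), idx(1))
print(result.hex())  # 17, i.e. original buffer byte 23
```

Because every stage re-bases its offsets on the previous result, offset 1 in the last stage refers to original byte 23 (20 + 2 + 1).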

When to Use Chaining

Chaining is particularly useful for:

  • Structured data parsing: Extracting parts of TLV records, headers, etc.
  • Multi-step filtering: Progressive narrowing of data ranges
  • Protocol decoding: Following offset chains and nested structures
  • Data validation: Extracting and checking multiple related fields

[0:20] | [16:] | u32le[0:] - Get header, skip to timestamp field, read as uint32

Data Types

Integer interpretations of byte sequences can be specified for indexes, lengths, and condition operands.

8-bit Types

u8[0] - Unsigned 8-bit integer at offset 0
i8[0] - Signed 8-bit integer at offset 0

16-bit Types

u16le[0:] - Unsigned 16-bit little-endian
u16be[0:] - Unsigned 16-bit big-endian
i16le[0:] - Signed 16-bit little-endian
i16be[0:] - Signed 16-bit big-endian

32-bit Types

u32le[0:] - Unsigned 32-bit little-endian
u32be[0:] - Unsigned 32-bit big-endian
i32le[0:] - Signed 32-bit little-endian
i32be[0:] - Signed 32-bit big-endian

64-bit Types

u64le[0:] - Unsigned 64-bit little-endian
u64be[0:] - Unsigned 64-bit big-endian
i64le[0:] - Signed 64-bit little-endian
i64be[0:] - Signed 64-bit big-endian

Endianness Behavior

The endianness specified in the type is preserved throughout expression evaluation:

Given data: byte[] { 0x01, 0x02 }

u16be[0:] == 258 - true ((0x01 << 8) + 0x02 = 258)
u16le[0:] == 513 - true (0x01 + (0x02 << 8) = 513)
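
The two interpretations can be checked independently with Python's built-in int.from_bytes. The second buffer, used to show the signed variants, is an added illustration and does not come from the examples above.

```python
data = bytes([0x01, 0x02])

# Unsigned 16-bit reads of the same two bytes.
as_be = int.from_bytes(data, "big")     # like u16be[0:]
as_le = int.from_bytes(data, "little")  # like u16le[0:]
print(as_be, as_le)  # 258 513

# Signed reads depend on byte order too (extra illustrative buffer).
neg = bytes([0xFF, 0xFE])
signed_be = int.from_bytes(neg, "big", signed=True)     # like i16be[0:]
signed_le = int.from_bytes(neg, "little", signed=True)  # like i16le[0:]
print(signed_be, signed_le)  # -2 -257
```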

Dynamic Indexing

Single-Level Indexing

[u8[0]] - Read byte at offset specified by byte 0
[u16le[2:], u8[0]] - Read (byte 0) bytes at offset (16-bit value at 2)

Multi-Level Indexing (Pointer Following)

[u8[u8[0]]] - Three-level pointer following
[u8[u16le[0:]]] - Mixed types in pointer chain
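
Pointer following is easier to trace with concrete bytes. The following Python sketch walks both expressions above one read at a time; the two payloads are hypothetical and were chosen so that every intermediate offset stays in bounds.

```python
# [u8[u8[0]]]: each read supplies the offset for the next one.
payload = bytes([0x02, 0x09, 0x04, 0x00, 0x7F])
inner = payload[0]                    # u8[0]     -> 2
middle = payload[inner]               # u8[2]     -> 4
result = payload[middle:middle + 1]   # [4]       -> 0x7f
print(inner, middle, result.hex())    # 2 4 7f

# [u8[u16le[0:]]]: a 16-bit pointer followed by an 8-bit pointer.
mixed = bytes([0x03, 0x00, 0xAA, 0x05, 0xBB, 0x42])
ptr16 = int.from_bytes(mixed[0:2], "little")  # u16le[0:] -> 3
ptr8 = mixed[ptr16]                           # u8[3]     -> 5
target = mixed[ptr8:ptr8 + 1]                 # [5]       -> 0x42
print(ptr16, ptr8, target.hex())              # 3 5 42
```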

Indexing with Arithmetic

[u8[0] + 4] - Read at (byte 0 value) + 4
[u8[0] * 2, u8[1]] - Dynamic offset and length

Arithmetic Operations

Basic Operations

[u8[0] + u8[1]] - Add two byte values
u16le[0:] - 100 - Subtract constant
u8[0] * 4 - Multiply by constant
u32le[0:] / u16le[4:] - Divide two values
u8[0] % 16 - Modulo operation

Type Mixing

All arithmetic operations promote their operands to the widest type involved:

u8[0] + u16le[1] - Result is 16-bit
u16le[0:] + u32be[2:] - Result is 32-bit

Conditional Expressions

Ternary Operator

u8[0] > 10 ? [1, u8[0]] : [0, 1] - condition ? true_expression : false_expression
u8[0] == 1 ? [1, 4] : u8[0] == 2 ? [5, 8] : [0, 1] - Nested conditions

Comparison Operators

u8[0] == 42 - Equality
u8[0] != 0 - Inequality
u16le[0:] > 1000 - Greater than
u16le[0:] >= 1000 - Greater than or equal
u8[0] < 128 - Less than
u8[0] <= 127 - Less than or equal

Null Coalescing

u8[100] ?? [0, 1] - Return [0,1] if offset 100 is out of bounds
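
A rough Python analogy for this fallback behavior is shown below. It assumes that an out-of-bounds read produces a null-like result, modeled here as None; the u8 and coalesce helpers and the payload are hypothetical.

```python
from typing import Optional

def u8(data: bytes, offset: int) -> Optional[int]:
    """u8[offset], with None standing in for an out-of-bounds read (assumption)."""
    if 0 <= offset < len(data):
        return data[offset]
    return None

def coalesce(value, fallback):
    """value ?? fallback: use `fallback` only when `value` is None."""
    return fallback if value is None else value

payload = bytes([0x2A, 0x10, 0x20])  # hypothetical 3-byte payload

# u8[100] ?? [0, 1]: offset 100 is out of bounds, so fall back to the first byte.
missing = coalesce(u8(payload, 100), payload[0:1])
present = coalesce(u8(payload, 1), payload[0:1])
print(missing.hex(), present)  # 2a 16
```

The check is against None rather than truthiness, so a legitimate value of 0 is kept and does not trigger the fallback.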

Range Bitwise Operations

When applied to ranges, &&, ||, and ^^ perform byte-wise bitwise operations:

[0:2] && [2:4] - Bitwise AND between two ranges
[0:2] || [2:4] - Bitwise OR between two ranges
[0:2] ^^ [2:4] - Bitwise XOR between two ranges
[0] && [1] - Single byte AND operation
!(u8[0] > 128) - Logical NOT (for conditions)

Note: For mismatched range lengths, operations use truncation (only overlapping byte positions are used).
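
The byte-wise behavior, including truncation to the overlapping positions, can be sketched in Python with zip, which stops at the shorter input. The function names and sample bytes are illustrative assumptions.

```python
def range_and(a: bytes, b: bytes) -> bytes:
    """Byte-wise AND over the overlapping positions, like [..] && [..]."""
    return bytes(x & y for x, y in zip(a, b))

def range_or(a: bytes, b: bytes) -> bytes:
    """Byte-wise OR over the overlapping positions, like [..] || [..]."""
    return bytes(x | y for x, y in zip(a, b))

def range_xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR over the overlapping positions, like [..] ^^ [..]."""
    return bytes(x ^ y for x, y in zip(a, b))

payload = bytes([0xF0, 0x0F, 0x3C, 0x3C])  # hypothetical data
left, right = payload[0:2], payload[2:4]   # [0:2] and [2:4]

print(range_and(left, right).hex())  # 300c
print(range_or(left, right).hex())   # fc3f
print(range_xor(left, right).hex())  # cc33

# Mismatched lengths: only the first byte overlaps.
print(range_xor(bytes([0xFF, 0x00, 0xAA]), bytes([0x0F])).hex())  # f0
```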

Concise Syntax

BREX provides several shorthand forms to make common expressions more concise.

Implicit Chaining

Instead of explicit pipe operators, you can chain operations by placing brackets directly adjacent:

Verbose: [10:20] | [5]
Concise: [10:20][5]

Verbose: [20:, [0]==31] | [2:]
Concise: [20:, [0]==31][2:]

Multi-step verbose: [20:, [0]==31] | [2:] | [0:4]
Multi-step concise: [20:, [0]==31][2:][0:4]

Concise Search Conditions

Simple equality conditions can omit the explicit field reference:

Verbose: [20:, [0]==31]
Concise: [20:, ==31] (assumes [0]==31)

Verbose: [20:, [0]==0x42]
Concise: [20:, ==0x42] (assumes [0]==0x42)

When to Use Concise vs Verbose

Use verbose syntax when:

  • Learning BREX for the first time
  • Code needs to be very clear and self-documenting
  • Working with complex multi-step operations
  • Collaboration requires maximum readability

Use concise syntax when:

  • Writing quick scripts or prototypes
  • Familiar with BREX patterns
  • Space/brevity is important
  • Working with well-understood data formats

Operator Precedence

From highest to lowest precedence:

  1. Primary expressions: [offset], type[offset], literals, parentheses
  2. Chained indexing: expr[range], expr[offset] (left-associative)
  3. Unary: !, - (unary minus), @
  4. Multiplicative: *, /, %
  5. Additive: +, -
  6. Comparison: <, <=, >, >=
  7. Equality: ==, !=
  8. Pipe: | (expression chaining, left-associative)
  9. Range Bitwise XOR: ^^
  10. Range Bitwise AND: &&
  11. Range Bitwise OR: ||
  12. Null coalescing: ??
  13. Ternary conditional: ? :

Precedence Examples

u8[0] + u8[1] * 2 is equivalent to u8[0] + (u8[1] * 2)
[0] && [1] || [2] is equivalent to ([0] && [1]) || [2]
u8[0] == 1 ? 2 : 3 + 4 is equivalent to u8[0] == 1 ? 2 : (3 + 4)
[20:, ==31] | [2:] | [0] is equivalent to ([20:, ==31] | [2:]) | [0]

Best Practices

1. Use Appropriate Types

Good: Explicit about endianness
u16le[0:] + u16le[2:]

Avoid: Relying on implicit typing unnecessarily
u16le[0:] + [2:] (works but may not be intended)

2. Bounds Checking

Good: Safe access with fallback
u8[1] >= 6 ? u32le[u8[2]:] : 0

Also Good: Use null coalescing
u32le[u8[2]:] ?? 0

3. Readable Complex Expressions

Good: Clear intent
u8[0] == 0x04 ? [1, u8[0]] : [0, 1]

Consider breaking complex expressions into steps:
[20:, [0]==31][2:] - First find TLV, then extract value
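
The find-then-extract pattern can be sketched in Python under several assumptions that this reference does not spell out: records sit back to back, each record is a 1-byte type followed by a 1-byte value length and then the value, and the search returns the first whole record whose type byte matches. The find_tlv function and the packet contents are hypothetical.

```python
from typing import Optional

def find_tlv(data: bytes, start: int, wanted_type: int) -> Optional[bytes]:
    """Return the first record at or after `start` whose type byte matches.

    Assumed layout (not confirmed by the reference): type, length, value.
    """
    pos = start
    while pos + 2 <= len(data):
        rec_type, rec_len = data[pos], data[pos + 1]
        end = pos + 2 + rec_len
        if end > len(data):
            return None  # truncated record, stop searching
        if rec_type == wanted_type:
            return data[pos:end]
        pos = end  # skip to the next record
    return None

# Hypothetical packet: 20-byte header, then a type-7 and a type-31 record.
packet = bytes(20) + bytes([7, 1, 0xAA]) + bytes([31, 3, 0x01, 0x02, 0x03])

record = find_tlv(packet, 20, 31)                   # step 1: find the TLV
value = record[2:] if record is not None else None  # step 2: drop type and length
print(record.hex(), value.hex())  # 1f03010203 010203
```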

4. Error Handling

  • Out of Bounds Access: Use null coalescing operator ?? to provide fallback values
  • Type Conversion Errors: Ensure sufficient bytes exist for multi-byte value reads
  • Division by Zero: Validate denominators before division operations