When LLVM's Optimizer Breaks Your eBPF Program


I spent 3 hours debugging why my eBPF program wasn’t compiling. The error? A call to built-in function ‘memset’ is not supported.

The confusing part? I was using __builtin_memset like you’re supposed to.


    #include "vmlinux.h"
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_core_read.h>

    #define PATH_MAX 1024

    struct path_key {
        char container_path[PATH_MAX]; // 1024 bytes
        char directory_path[PATH_MAX]; // 1024 bytes
    };

    struct {
        __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
        __uint(max_entries, 1);
        __type(key, u32);
        __type(value, struct path_key);
    } temp_map SEC(".maps");

    SEC("kprobe/vfs_open_broken")
    int broken_no_barrier(struct pt_regs *ctx)
    {
        u32 idx = 0;
        struct path_key *out = bpf_map_lookup_elem(&temp_map, &idx);
        if (!out)
            return 0;

        __builtin_memset(out->container_path, 0, sizeof(out->container_path));
        __builtin_memset(out->directory_path, 0, sizeof(out->directory_path));

        const char *path = "/some/path";
        bpf_probe_read_kernel_str(out->directory_path,
                                  sizeof(out->directory_path), path);
        return 0;
    }

    char LICENSE[] SEC("license") = "GPL";

When I commented out ONE of the memsets, it compiled fine. Both together? Failed.

So I started looking at what LLVM was generating and dumped the LLVM IR:

call void @llvm.memset.p0.i64(ptr noundef nonnull align 1 dereferenceable(2048) %3, i8 0, i64 2048, i1 false), !dbg !131

2048 bytes. LLVM merged my two 1K memsets into a single 2K operation.

  • What I wrote: memset(path1, 0, 1024)

  • What I wrote: memset(path2, 0, 1024)

  • What LLVM emitted: memset(path1, 0, 2048) <- Wait, what?
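Why is LLVM allowed to do this? The two arrays sit back to back in struct path_key with no padding between them, so zeroing both is the same as zeroing one contiguous 2048-byte range. A small sketch in plain C (ordinary host code, not part of the BPF program) that checks this layout at compile time:

```c
#include <stddef.h> /* offsetof */

#undef PATH_MAX       /* a system header may already define it */
#define PATH_MAX 1024 /* same value as in the program above */

struct path_key {
    char container_path[PATH_MAX];
    char directory_path[PATH_MAX];
};

/* directory_path starts exactly where container_path ends ... */
_Static_assert(offsetof(struct path_key, directory_path) == PATH_MAX,
               "no padding between the two arrays");

/* ... so the two arrays together form one 2048-byte block. */
_Static_assert(sizeof(struct path_key) == 2 * PATH_MAX,
               "struct is one contiguous 2048-byte range");
```

If either assertion failed, the build would stop. Both hold here, which is why the optimizer can treat the two back-to-back memsets as one.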

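Note that this error comes from the compiler, not from the kernel verifier. The BPF target has no memset library function to call, so LLVM has to expand every memset inline into plain store instructions. When the expansion would need more stores than the BPF backend allows, it gives up and reports the call as unsupported.

In my build, 1024 bytes went through and 2048 did not. The exact cutoff may depend on the LLVM version and on the alignment it can prove, so treat 1024 as what worked here, not as a guarantee.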

If you see this pattern:

- Individual memsets work

- Combined memsets fail

- Error message: “A call to built-in function ‘memset’ is not supported”

insert an asm volatile memory barrier after each memset:

    asm volatile("" ::: "memory");

This prevents the LLVM optimizer from merging the two memsets into one.

The asm volatile memory barrier tells LLVM: “Hey, don’t get clever here. Keep these memsets separate.”

It’s a compiler fence, not a CPU barrier: it emits no instructions, so it costs nothing at runtime!
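Here is a minimal sketch of how the fix might look if you pull the zeroing into a helper. The names compiler_barrier() and clear_path_key() are hypothetical, made up for illustration; only the struct and the barrier itself come from the program above. It is written as plain C with no BPF headers, so the same lines should compile for the BPF target or on the host:

```c
#undef PATH_MAX       /* a system header may already define it */
#define PATH_MAX 1024 /* same value as in the program above */

struct path_key {
    char container_path[PATH_MAX];
    char directory_path[PATH_MAX];
};

/* Compiler-only fence: emits no instructions, but the optimizer must assume
 * memory may have been touched here, so it cannot combine the writes on
 * either side. __asm__ __volatile__ is the same construct as asm volatile,
 * spelled so it also builds in strict -std modes. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical helper: zero both paths as two separate 1024-byte memsets. */
static inline __attribute__((always_inline))
void clear_path_key(struct path_key *out)
{
    __builtin_memset(out->container_path, 0, sizeof(out->container_path));
    compiler_barrier(); /* keeps the first memset from merging with the second */
    __builtin_memset(out->directory_path, 0, sizeof(out->directory_path));
    compiler_barrier();
}
```

In the kprobe above, the two __builtin_memset lines would then become a single clear_path_key(out) call. It is worth dumping the IR again afterwards to confirm you now see two 1024-byte memsets.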
