After reading man bpf and a few other sources of documentation, I was under the impression that a map can only be created by a user process. However, the following small program seems to magically create a BPF map:
struct bpf_map_def SEC("maps") my_map = {
    .type = BPF_MAP_TYPE_ARRAY,
    .key_size = sizeof(u32),
    .value_size = sizeof(long),
    .max_entries = 10,
};

SEC("sockops")
int my_prog(struct bpf_sock_ops *skops)
{
    u32 key = 1;
    long *value;
    ...
    value = bpf_map_lookup_elem(&my_map, &key);
    ...
    return 1;
}
So I load the program with the kernel's tools/bpf/bpftool and also verify that the program is loaded:
$ bpftool prog show
1: sock_ops name my_prog tag f3a3583cdd82ae8d
loaded_at Jan 02/18:46 uid 0
xlated 728B not jited memlock 4096B
$ bpftool map show
1: array name my_map flags 0x0
key 4B value 8B max_entries 10 memlock 4096B
Of course the map is empty. However, removing bpf_map_lookup_elem
from the program results in no map being created.
UPDATE
I debugged it with strace and found that in both cases, i.e. with bpf_map_lookup_elem and without it, bpftool does invoke bpf(BPF_MAP_CREATE, ...) and it apparently succeeds. Then, with bpf_map_lookup_elem left out, I ran strace on bpftool map show: bpf(BPF_MAP_GET_NEXT_ID, ..) immediately returns ENOENT, so it never gets to dump a map. So obviously something is not completing the map creation.
So I wonder if this is expected behavior?
Thanks.
As explained by antiduh, and confirmed by your strace checks, bpftool is the user space program creating the maps in this case. It calls the function bpf_prog_load() from libbpf (under tools/lib/bpf/), which in turn ends up performing the syscall. Then the program is pinned at the desired location (under a bpf virtual file system mount point), so that it is not unloaded when bpftool returns. Maps are not pinned.

Regarding map creation, the magic bits also take place in libbpf. When bpf_prog_load() is called, libbpf receives the name of the object file as an argument. bpftool does not ask to load this specific program or that specific map; instead, it provides the object file and libbpf has to deal with it. So the functions in libbpf parse this ELF object file, and eventually find a number of sections corresponding to maps and programs. Then libbpf tries to load the first program.

Loading this program includes the following steps: start by creating all the maps found in the object file, then perform map relocation (i.e. associate map indexes with the eBPF instructions that use them), and at last load the program instructions.
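To make the relocation step a bit more concrete, here is a minimal sketch in plain C. It is a simplified model, not libbpf's actual code: `struct insn`, `struct map_reloc` and `relocate_maps()` are hypothetical names made up for illustration, and the instruction struct only loosely mirrors the kernel's `struct bpf_insn` (the real one packs both registers into a single byte). The idea it shows, as far as I know, matches what the loader does: once the maps exist, each instruction that refers to a map gets the map's file descriptor written into its immediate field and is marked with the pseudo source register value `BPF_PSEUDO_MAP_FD`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct bpf_insn (hypothetical layout). */
struct insn {
    uint8_t code;    /* opcode */
    uint8_t dst_reg; /* destination register */
    uint8_t src_reg; /* source register */
    int16_t off;     /* signed offset */
    int32_t imm;     /* signed immediate */
};

#define OP_LD_IMM64   0x18 /* BPF_LD | BPF_IMM | BPF_DW */
#define PSEUDO_MAP_FD 1    /* same value as BPF_PSEUDO_MAP_FD */

/* Hypothetical relocation record: instruction insn_idx refers to map map_idx. */
struct map_reloc {
    size_t insn_idx;
    size_t map_idx;
};

/*
 * Patch every relocated instruction so that it carries the fd of the map it
 * refers to. Returns 0 on success, or -1 if a record is out of range or does
 * not point at a 64-bit immediate load.
 */
static int relocate_maps(struct insn *insns, size_t n_insns,
                         const struct map_reloc *relocs, size_t n_relocs,
                         const int *map_fds, size_t n_maps)
{
    assert(insns != NULL && map_fds != NULL);

    for (size_t i = 0; i < n_relocs; i++) {
        const struct map_reloc *r = &relocs[i];

        if (r->insn_idx >= n_insns || r->map_idx >= n_maps)
            return -1;

        struct insn *in = &insns[r->insn_idx];
        if (in->code != OP_LD_IMM64)
            return -1;

        in->src_reg = PSEUDO_MAP_FD;   /* "imm is a map fd", not a plain constant */
        in->imm = map_fds[r->map_idx]; /* fd obtained when the map was created */
    }
    return 0;
}
```

In this model the map file descriptors must already exist before relocation runs, which is why map creation comes first and loading the instructions comes last.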
So regarding your question: in both cases, with and without bpf_map_lookup_elem(), maps are created with a bpf(BPF_MAP_CREATE, ...) syscall. After that, relocation happens, and program instructions are adapted to point, if needed, to the newly created maps. Then once all steps are finished and the program is loaded, bpftool exits. The eBPF program should be pinned, and still loaded in the kernel. As far as I understand, if it does use the maps (if bpf_map_lookup_elem() was used), then the maps are still referenced by a loaded program, and are kept in the kernel. On the other hand, if the program does not use the maps, then there is nothing more to hold them back, so the maps are destroyed when the file descriptors held by bpftool are closed, when bpftool returns.

So in the end, when bpftool has completed, you have a map loaded in the kernel if the program uses it, but no map if no program relies on it. Sounds like expected behaviour in my opinion; but please do ping one way or another if you experience strange things with bpftool, I'm one of the guys working on the utility. One last generic observation: maps can also be pinned and remain in the kernel even if no program uses them, should one need to keep them around.

You're completely right: user programs are the ones that invoke the bpf system call in order to load eBPF programs and create eBPF maps.

And you did just that: your bpftool program is the user process that is invoking the bpf syscall, and thus is the user process that is creating the eBPF map.

BPF programs don't have to be unloaded when the user program that loaded them quits; bpftool likely uses this mechanism.
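The lifetime rule described in both answers can be illustrated with a small toy model in plain C. This is only a sketch of the reference-counting idea, under the assumption that a map stays alive as long as something (an open file descriptor, a loaded program that uses it, or a pin) still refers to it. All names here (`toy_map`, `toy_map_get`, `map_survives_loader_exit`, ...) are hypothetical and nothing in it talks to the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kernel map object: a reference count and a liveness flag. */
struct toy_map {
    int refcnt;
    bool alive;
};

/* Models bpf(BPF_MAP_CREATE, ...): the returned fd holds the first reference. */
static void toy_map_create(struct toy_map *m)
{
    m->refcnt = 1;
    m->alive = true;
}

/* Take one more reference (a program that uses the map, or a pin). */
static void toy_map_get(struct toy_map *m)
{
    assert(m->alive);
    m->refcnt++;
}

/* Drop one reference (fd closed, program unloaded, pin removed). */
static void toy_map_put(struct toy_map *m)
{
    assert(m->alive && m->refcnt > 0);
    if (--m->refcnt == 0)
        m->alive = false; /* nothing refers to the map any more */
}

/*
 * Replay the scenario from the question: the loader creates the map, loads
 * and pins a program, then exits. Returns true if the map outlives the loader.
 */
static bool map_survives_loader_exit(bool prog_uses_map)
{
    struct toy_map m;

    toy_map_create(&m);  /* loader holds an fd for the map */
    if (prog_uses_map)
        toy_map_get(&m); /* the loaded program holds its own reference */
    toy_map_put(&m);     /* loader exits, its fd is closed */
    return m.alive;
}
```

In this model, `map_survives_loader_exit(true)` corresponds to the program that calls bpf_map_lookup_elem() and `map_survives_loader_exit(false)` to the one that does not. Pinning the map itself would be one more `toy_map_get()`, which is why a pinned map can remain even when no program uses it.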
Some relevant bits from the man page to connect the dots: