From 748e4f3d702def1a4bff191e0cf93b6a05340f01 Mon Sep 17 00:00:00 2001 From: hc <hc@nodka.com> Date: Fri, 10 May 2024 07:41:34 +0000 Subject: [PATCH] add gpio led uart --- kernel/Documentation/bpf/bpf_design_QA.rst | 105 ++++++++++++++++++++++++++++++++++------------------ 1 files changed, 69 insertions(+), 36 deletions(-) diff --git a/kernel/Documentation/bpf/bpf_design_QA.rst b/kernel/Documentation/bpf/bpf_design_QA.rst index 6780a6d..2df7b06 100644 --- a/kernel/Documentation/bpf/bpf_design_QA.rst +++ b/kernel/Documentation/bpf/bpf_design_QA.rst @@ -36,27 +36,27 @@ defines calling convention that is compatible with C calling convention of the linux kernel on those architectures. -Q: can multiple return values be supported in the future? +Q: Can multiple return values be supported in the future? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A: NO. BPF allows only register R0 to be used as return value. -Q: can more than 5 function arguments be supported in the future? +Q: Can more than 5 function arguments be supported in the future? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A: NO. BPF calling convention only allows registers R1-R5 to be used as arguments. BPF is not a standalone instruction set. (unlike x64 ISA that allows msft, cdecl and other conventions) -Q: can BPF programs access instruction pointer or return address? +Q: Can BPF programs access instruction pointer or return address? ----------------------------------------------------------------- A: NO. -Q: can BPF programs access stack pointer ? +Q: Can BPF programs access stack pointer ? ------------------------------------------ A: NO. Only frame pointer (register R10) is accessible. From compiler point of view it's necessary to have stack pointer. -For example LLVM defines register R11 as stack pointer in its +For example, LLVM defines register R11 as stack pointer in its BPF backend, but it makes sure that generated code never uses it. Q: Does C-calling convention diminishes possible use cases? @@ -66,8 +66,8 @@ BPF design forces addition of major functionality in the form of kernel helper functions and kernel objects like BPF maps with seamless interoperability between them. It lets kernel call into -BPF programs and programs call kernel helpers with zero overhead. -As all of them were native C code. That is particularly the case +BPF programs and programs call kernel helpers with zero overhead, +as all of them were native C code. That is particularly the case for JITed BPF programs that are indistinguishable from native kernel C code. @@ -75,9 +75,9 @@ ------------------------------------------------------------------------ A: Soft yes. -At least for now until BPF core has support for +At least for now, until BPF core has support for bpf-to-bpf calls, indirect calls, loops, global variables, -jump tables, read only sections and all other normal constructs +jump tables, read-only sections, and all other normal constructs that C code can produce. Q: Can loops be supported in a safe way? @@ -85,8 +85,33 @@ A: It's not clear yet. BPF developers are trying to find a way to -support bounded loops where the verifier can guarantee that -the program terminates in less than 4096 instructions. +support bounded loops. + +Q: What are the verifier limits? +-------------------------------- +A: The only limit known to the user space is BPF_MAXINSNS (4096). +It's the maximum number of instructions that the unprivileged bpf +program can have. The verifier has various internal limits. +Like the maximum number of instructions that can be explored during +program analysis. Currently, that limit is set to 1 million. +Which essentially means that the largest program can consist +of 1 million NOP instructions. There is a limit to the maximum number +of subsequent branches, a limit to the number of nested bpf-to-bpf +calls, a limit to the number of the verifier states per instruction, +a limit to the number of maps used by the program. +All these limits can be hit with a sufficiently complex program. +There are also non-numerical limits that can cause the program +to be rejected. The verifier used to recognize only pointer + constant +expressions. Now it can recognize pointer + bounded_register. +bpf_lookup_map_elem(key) had a requirement that 'key' must be +a pointer to the stack. Now, 'key' can be a pointer to map value. +The verifier is steadily getting 'smarter'. The limits are +being removed. The only way to know that the program is going to +be accepted by the verifier is to try to load it. +The bpf development process guarantees that the future kernel +versions will accept all bpf programs that were accepted by +the earlier versions. + Instruction level questions --------------------------- @@ -109,16 +134,16 @@ A: This was necessary to avoid introducing flags into ISA which are impossible to make generic and efficient across CPU architectures. -Q: why BPF_DIV instruction doesn't map to x64 div? +Q: Why BPF_DIV instruction doesn't map to x64 div? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A: Because if we picked one-to-one relationship to x64 it would have made it more complicated to support on arm64 and other archs. Also it needs div-by-zero runtime check. -Q: why there is no BPF_SDIV for signed divide operation? +Q: Why there is no BPF_SDIV for signed divide operation? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A: Because it would be rarely used. llvm errors in such case and -prints a suggestion to use unsigned divide instead +prints a suggestion to use unsigned divide instead. Q: Why BPF has implicit prologue and epilogue? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -147,22 +172,41 @@ CPU architectures and 32-bit HW accelerators. Can true 32-bit registers be added to BPF in the future? -A: NO. The first thing to improve performance on 32-bit archs is to teach -LLVM to generate code that uses 32-bit subregisters. Then second step -is to teach verifier to mark operations where zero-ing upper bits -is unnecessary. Then JITs can take advantage of those markings and -drastically reduce size of generated code and improve performance. +A: NO. + +But some optimizations on zero-ing the upper 32 bits for BPF registers are +available, and can be leveraged to improve the performance of JITed BPF +programs for 32-bit architectures. + +Starting with version 7, LLVM is able to generate instructions that operate +on 32-bit subregisters, provided the option -mattr=+alu32 is passed for +compiling a program. Furthermore, the verifier can now mark the +instructions for which zero-ing the upper bits of the destination register +is required, and insert an explicit zero-extension (zext) instruction +(a mov32 variant). This means that for architectures without zext hardware +support, the JIT back-ends do not need to clear the upper bits for +subregisters written by alu32 instructions or narrow loads. Instead, the +back-ends simply need to support code generation for that mov32 variant, +and to overwrite bpf_jit_needs_zext() to make it return "true" (in order to +enable zext insertion in the verifier). + +Note that it is possible for a JIT back-end to have partial hardware +support for zext. In that case, if verifier zext insertion is enabled, +it could lead to the insertion of unnecessary zext instructions. Such +instructions could be removed by creating a simple peephole inside the JIT +back-end: if one instruction has hardware support for zext and if the next +instruction is an explicit zext, then the latter can be skipped when doing +the code generation. Q: Does BPF have a stable ABI? ------------------------------ A: YES. BPF instructions, arguments to BPF programs, set of helper functions and their arguments, recognized return codes are all part -of ABI. However when tracing programs are using bpf_probe_read() helper -to walk kernel internal datastructures and compile with kernel -internal headers these accesses can and will break with newer -kernels. The union bpf_attr -> kern_version is checked at load time -to prevent accidentally loading kprobe-based bpf programs written -for a different kernel. Networking programs don't do kern_version check. +of ABI. However there is one specific exception to tracing programs +which are using helpers like bpf_probe_read() to walk kernel internal +data structures and compile with kernel internal headers. Both of these +kernel internals are subject to change and can break with newer kernels +such that the program needs to be adapted accordingly. Q: How much stack space a BPF program uses? ------------------------------------------- @@ -201,17 +245,6 @@ program is loaded the kernel will print warning message, so this helper is only useful for experiments and prototypes. Tracing BPF programs are root only. - -Q: bpf_trace_printk() helper warning ------------------------------------- -Q: When bpf_trace_printk() helper is used the kernel prints nasty -warning message. Why is that? - -A: This is done to nudge program authors into better interfaces when -programs need to pass data to user space. Like bpf_perf_event_output() -can be used to efficiently stream data via perf ring buffer. -BPF maps can be used for asynchronous data sharing between kernel -and user space. bpf_trace_printk() should only be used for debugging. Q: New functionality via kernel modules? ---------------------------------------- -- Gitblit v1.6.2