Dynamically program the kernel for efficient networking, observability, tracing, and security
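- Extends kernel functionality without modifying the kernel source code
- eBPF programs are verified and restricted both when they are loaded and while they run
- Verification ensures that the process loading the eBPF program has permission to do so, that the program cannot crash the kernel, and that it contains no infinite loops
- Instructor's note: the eBPF virtual machine has security issues
- Runs in kernel space, so no context switch is needed
- Supports JIT compilation, so it is fast
- Event-driven: eBPF programs respond to kernel and user-space events and do not consume system resources continuously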
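To make the "no infinite loops" check a bit more concrete, here is a toy sketch, not the kernel's algorithm. It assumes the crudest possible rule: reject any jump whose target is at or before the jump itself. The function name has_backward_jump is made up for illustration. The real verifier explores program paths and, on newer kernels, accepts loops it can prove are bounded.

```c
#include <linux/bpf.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy check (hypothetical helper, not a kernel API): returns true if any
 * jump instruction targets itself or an earlier instruction.
 * A jump at index i lands on i + 1 + off, so off < 0 means "not forward".
 * Only the 64-bit BPF_JMP class is inspected to keep the sketch short.
 */
bool has_backward_jump(const struct bpf_insn *insns, size_t cnt)
{
    for (size_t i = 0; i < cnt; i++) {
        unsigned int cls = BPF_CLASS(insns[i].code);
        unsigned int op = BPF_OP(insns[i].code);

        if (cls != BPF_JMP)
            continue;               /* not a jump-class instruction */
        if (op == BPF_CALL || op == BPF_EXIT)
            continue;               /* calls and exit carry no jump offset */
        if (insns[i].off < 0)
            return true;            /* back edge: possible loop */
    }
    return false;
}
```

A program consisting of a single "jump to myself" instruction (BPF_JMP | BPF_JA with off = -1) is flagged, while a straight-line "r0 = 0; exit" program passes.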
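A new kernel model: a user's application can execute in kernel mode and user mode at the same time. The user-mode part accesses system resources through traditional syscalls, while the kernel-mode part interacts with the rest of the system through BPF helper calls (provided by the kernel).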
Hook
- Pre-defined hooks include system calls, function entry/exit, kernel tracepoints, network events, and several others.
- If a predefined hook does not exist for a particular need, it is possible to create a kernel probe (kprobe) or user probe (uprobe) to attach eBPF programs almost anywhere in kernel or user applications.
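How it works
An eBPF program is loaded into the Linux kernel with the bpf system call. This is typically done through one of the available eBPF libraries (libbpf).
- libbpf is a thin wrapper around bpf() and the other raw system calls. It provides functions for loading bytecode into the kernel, along with other key functions.
- A typical libbpf-based eBPF program consists of two files, *_kern.c and *_user.c. *_kern.c contains the in-kernel attach points and their handler functions. *_user.c contains the user-space code, which injects the kernel-side code and handles the various tasks of interacting with the user.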
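To show what a library such as libbpf is wrapping, here is a minimal sketch that calls bpf() directly, without libbpf. It assumes a Linux system with the kernel UAPI headers installed. The names minimal_prog and load_minimal_prog are made up for illustration, and the choice of BPF_PROG_TYPE_SOCKET_FILTER is only an assumption to keep the example small. Real programs are normally compiled from C to bytecode rather than written out by hand.

```c
#define _GNU_SOURCE
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Smallest useful program, written directly as bytecode: "r0 = 0; exit". */
const struct bpf_insn minimal_prog[] = {
    { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0, .imm = 0 },
    { .code = BPF_JMP | BPF_EXIT },
};

/*
 * Hypothetical helper: asks the kernel to verify and load minimal_prog.
 * Returns a file descriptor for the loaded program, or -1 with errno set
 * (for example EPERM when the caller lacks the required privileges).
 */
int load_minimal_prog(void)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));     /* unused fields must be zero */
    attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
    attr.insns = (__u64)(unsigned long)minimal_prog;
    attr.insn_cnt = sizeof(minimal_prog) / sizeof(minimal_prog[0]);
    attr.license = (__u64)(unsigned long)"GPL";

    /* glibc has no bpf() wrapper, so go through syscall(2). */
    return (int)syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
}
```

Whether the load succeeds depends on the privileges of the calling process and on the kernel configuration, so the caller should check the return value and errno.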
Hardening
How security is enforced in practice
- Program execution protection: The kernel memory holding an eBPF program is protected and made read-only. If an attempt is made to modify the eBPF program for any reason, whether through a kernel bug or malicious manipulation, the kernel crashes instead of continuing to execute the corrupted or manipulated program.
- Mitigation against Spectre: Under speculation, CPUs may mispredict branches and leave observable side effects that could be extracted through a side channel. To name a few examples: eBPF programs mask memory accesses so that accesses made under transient instructions are redirected to controlled areas; the verifier also follows program paths that are reachable only under speculative execution; and the JIT compiler emits Retpolines when tail calls cannot be converted to direct calls.
- This defends against side-channel attacks, such as branch-prediction, data-bus, and cache side channels
- Constant blinding: All constants in the code are blinded to prevent JIT spraying attacks. This prevents attackers from injecting executable code as constants, which, in the presence of another kernel bug, could allow an attacker to jump into the memory section of the eBPF program and execute code.
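The idea behind constant blinding can be modeled in a few lines of user-space C. This is a sketch under the assumption that blinding is done by XOR-ing each immediate with a fresh random key, so that the attacker-chosen value never appears verbatim in the generated machine code. The function names blind_imm and unblind_imm are hypothetical and are not kernel symbols.

```c
#include <stdint.h>

/*
 * What the JIT would embed in the instruction stream instead of the
 * original immediate: the constant XOR-ed with a random key.
 */
uint32_t blind_imm(uint32_t imm, uint32_t rnd)
{
    return imm ^ rnd;
}

/*
 * What the emitted instructions compute at run time: XOR-ing with the
 * same key again restores the original constant in a register.
 */
uint32_t unblind_imm(uint32_t blinded, uint32_t rnd)
{
    return blinded ^ rnd;
}
```

With a non-zero key, a constant such as 0x90909090 is stored as a different bit pattern, and the original value only exists in a register after the run-time XOR.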
Memory access
eBPF programs cannot access arbitrary kernel memory directly. Data and data structures that lie outside the context of the program must be accessed via eBPF helpers. This guarantees consistent data access and makes any such access subject to the privileges of the eBPF program. As a result, an eBPF program cannot arbitrarily modify data structures in the kernel.
Toolchains
bcc: Python
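The BPF programs that run in the kernel are written in C. Everything else, including compiling, parsing, and loading, can be handled by BCC.
Portability is poor. A BCC-based eBPF program has to be compiled every time it runs, and compilation requires the user to set up the relevant header files and their implementations. It is better to pick a Compile Once, Run Everywhere (CO-RE) tool, such as libbpf-bootstrap.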
bpftrace: C
GO/C++
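tracepoint and kprobe
- Static vs. dynamic: A tracepoint is static. It must be explicitly defined in the kernel source code and is then compiled into the kernel. A kprobe is dynamic. Probes can be inserted at and removed from specific addresses in kernel functions at run time, without recompiling the kernel.
- Performance overhead: The overhead of a tracepoint is relatively low, because it is static and generated at compile time. The overhead of a kprobe is slightly higher, because it is dynamic and requires tracing and modifying code paths at run time.
- Use cases: A tracepoint is typically used to monitor specific kernel events, such as system calls, interrupt handling, and driver events, for performance analysis and debugging. A kprobe is typically used to monitor and trace kernel function calls and variable modifications, for finer-grained analysis, troubleshooting, or performance optimization.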
Hello World
- URL:/article/09719f03-1f2f-4015-be36-c990f7d09dc6
- Copyright:All articles in this blog, except for special statements, adopt BY-NC-SA agreement. Please indicate the source!