System tracing provider for tracing via native probes (BPF) #1288
base: master
Conversation
This looks amazing @lc525! 🚀
I'm conscious this is still a WIP draft, but I've added a few comments and some questions below. Happy to chat about how you foresee this being used!
mlserver/types/tracepoints.py
Outdated
ArgTypeMap._prototypes = {
    Tracepoint.model_load_begin: ArgTypeMap.model_args,
    Tracepoint.model_load_end: ArgTypeMap.model_args,
    Tracepoint.model_reload_begin: ArgTypeMap.model_args_reload,
    Tracepoint.model_reload_end: ArgTypeMap.model_args,
    Tracepoint.model_unload: ArgTypeMap.model_args,
    Tracepoint.inference_enqueue_req: ArgTypeMap.queue_args,
    Tracepoint.inference_begin: ArgTypeMap.infer_args,
    Tracepoint.inference_end: ArgTypeMap.infer_args,
}
Thinking out loud here, but the tracepoints and args seem very coupled to the implementation - would it be possible to make those more generic?
My main worry would be whether a small change to any of those implementations (e.g. changing the args for inference) could then cause side effects around system tracing.
In a way, they do end up quite coupled indeed. However, the prototypes essentially become the public interface that external probes use to run their code. Any changes to arguments, besides simply adding new ones at the end of the list, would require changes to whatever external BPF probes people have written against the interface. So we would need to stabilise them at some point, after we write more actual probing code.
The idea is to pass to probes the minimal information that is required for meaningful tracing applications. Hopefully, this information remains relatively stable even as MLServer evolves. For example, the actual inference functions might change without us changing the data passed for tracing via the same Tracepoints.
That being said, the `_prototypes` defined here do not end up being "enforced". Say we add a new argument to the `inference_begin` tracepoint without changing any of the `tp_*` functions in the provider. All existing code will continue to work, because the function for triggering the tracepoint (`Probe.fire`) receives a variable number of arguments. We can then selectively call the tracepoint with additional arguments by directly using `SystemTracingProvider.__call__(tracepoint, *args)` (I don't necessarily advocate that this is a sane thing to do, as the code might become confusing in the long term with regard to which arguments should be passed to which tracepoints).
We can discuss further, perhaps there is a better design to be found here.
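(For illustration, a minimal sketch of the calling convention described above; the class and method names follow the discussion, but the bodies are hypothetical and not this PR's actual implementation.)

```python
# Hypothetical sketch of the dispatch described above. Probe.fire accepts
# a variable number of arguments, so the _prototypes are not enforced at
# call time: extra arguments are simply forwarded to the native probe.
from typing import Any, Dict


class SystemTracingProvider:
    def __init__(self) -> None:
        self._probes: Dict[Any, Any] = {}  # Tracepoint -> native probe

    def __call__(self, tracepoint: Any, *args: Any) -> None:
        # No signature check against ArgTypeMap._prototypes here; whatever
        # the caller passes is handed to Probe.fire as-is.
        self._probes[tracepoint].fire(*args)
```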
Thanks for adding that extra context.
Do you have an example of how it looks on the "other end", i.e. where those probes are used?
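(For illustration, attaching a probe on the "other end" with bcc's Python frontend might look roughly like the sketch below; the tracepoint name, argument layout, and PID handling are assumptions, not taken from this PR.)

```python
# Rough sketch: attach a BPF program to a USDT tracepoint exposed by a
# running MLServer process, using bcc. Assumes a tracepoint named
# "inference_begin" exists in the target process.
import sys
from bcc import BPF, USDT

bpf_text = """
#include <uapi/linux/ptrace.h>

int on_inference_begin(struct pt_regs *ctx) {
    bpf_trace_printk("inference_begin fired\\n");
    return 0;
}
"""

pid = int(sys.argv[1])  # PID of the running MLServer process
usdt = USDT(pid=pid)
usdt.enable_probe(probe="inference_begin", fn_name="on_inference_begin")

b = BPF(text=bpf_text, usdt_contexts=[usdt])
b.trace_print()  # stream bpf_trace_printk output until interrupted
```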
Dockerfile
Outdated
@@ -107,7 +169,12 @@ RUN . $CONDA_PATH/etc/profile.d/conda.sh && \
      pip install $_wheel --constraint ./dist/constraints.txt; \
    done \
  fi && \
  pip install $(ls "./dist/mlserver-"*.whl) --constraint ./dist/constraints.txt && \
  if [[ ${#_extras[@]} -gt 0 ]]; then \
    extras_list=$(IFS=, ; echo "${_extras[@]}") && \
Following the same reasoning as the other comment, unless we have some reason to avoid it I'd install all the extras in the Dockerfile - that way the seldonio/mlserver image is always ready as-is to enable tracepoints.
Happy to do that if the decision (given the above comments) is to install all extras. I don't know if pip has a simple way of doing that here; I might still have to build a list.
We can also see if Poetry supports an `all` extras target? Otherwise we can always add one.
(and/or just hardcode a `pip install mlserver[extras1, extras2, etc.]`)
Poetry currently supports --all-extras on install but not export (this is tracked in python-poetry/poetry-plugin-export#45). I've added an `everything` extra. I've intentionally avoided the name `all`, as that might be reserved in the future by poetry/pip, but we can move to it once there is enough support.
TBH I would just use `all`, as it's more standard. If Poetry decides to always add one automatically, we can just remove it and use Poetry's built-in target.
- Allows MLServer to expose native tracepoints that fire on application events; BPF/SystemTap probes can be attached at runtime to those tracepoints, for performing measurements or for linking application activity to OS behaviour (resource consumption and contention, performance variations)
- Near-zero overhead when not in use or when external probes are not attached
- The exposed tracepoints are configured via MLServer settings
- Adds (optional) dependencies on:
  - linux-usdt/python-stapsdt
  - linux-usdt/libstapsdt (native library)
Also update license files to contain dependencies for the `tracepoints` extra
- Only allow tracepoints to be enabled/disabled as a whole, rather than having per-tracepoint options
- Refactor code so that all tracepoint/provider code resides in the same module (this has been facilitated by the simplified settings)
- Update code for Python 3.8+ (use explicit Union types rather than `|`; see the note below)
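(For context on the last point: the `X | Y` union syntax in annotations only works at runtime from Python 3.10 onwards, so code targeting Python 3.8 needs `typing.Union`. A small illustrative example, with a hypothetical function name:)

```python
from typing import Optional, Union

# Python 3.10+ only:
#   def load(uri: str | None) -> dict | None: ...

# Python 3.8-compatible equivalent:
def load(uri: Optional[str]) -> Union[dict, None]:
    ...
```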
This PR introduces a new tracing provider, enabling dynamic attachment of native (BPF, SystemTap) probes at runtime.
The goal of the provider is to allow correlating MLServer-specific events with operating system behaviour (system load, performance, resource usage, etc.).
Approach
The provider exposes a number of tracepoints (hook functions without any code attached), which can be triggered (fired) when particular application-level events happen (e.g. a model gets loaded/unloaded, an inference request joins a queue, etc.).
At runtime, external probes (BPF programs, SystemTap scripts) can be attached to those tracepoints to perform tracing actions (measurements, in-kernel data aggregation, correlating MLServer context with OS context, etc.).
The underlying implementation creates and dynamically links a native shared library where the tracepoint hooks exist as functions containing just a couple of nop instructions. When external probes are attached, the code of those tracepoints is modified at runtime to jump into the tracing code.
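(To make the mechanism concrete, here is a minimal sketch using the python-stapsdt library this PR depends on; the provider/tracepoint names and argument types are illustrative assumptions, not the PR's actual ones.)

```python
import stapsdt

# Create a provider and declare a tracepoint; nothing exists at the
# native level until provider.load() is called.
provider = stapsdt.Provider("mlserver")
probe = provider.add_probe(
    "inference_begin", stapsdt.ArgTypes.uint64, stapsdt.ArgTypes.int32
)

# Generates the shared library containing the nop-based hooks and
# dynamically links it into the process.
provider.load()

# Fired at the relevant application event; near-free unless an external
# BPF/SystemTap probe is attached to the tracepoint.
probe.fire(1234, 42)
```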
Features
- Tracepoints can be enabled via the `tracepoints` extra

Introduced dependencies
- When the `tracepoints` extra is enabled, MLServer will require additional dependencies, as follows:
  - linux-usdt/python-stapsdt
  - linux-usdt/libstapsdt (native library; depends on libelf)

TODOs
- Example probes (e.g. using bpftrace)