author | version |
---|---|
gk | 20230212 |
[TOC]
You have a bunch of data, possibly streaming...
id,first_name,last_name,email,gender,ip_address
1,Rufe,Morstatt,[email protected],Male,216.70.69.120
2,Kaela,Scott,[email protected],Female,73.248.145.44,2
(...)
... and you need to filter. For now, let's say we already have them as a list of dicts.
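As a minimal sketch of how such rows could end up as a list of dicts, the standard library's `csv.DictReader` is enough. The sample rows and email addresses below are hypothetical placeholders in the same column layout, and `load_users` is a helper name made up for this illustration:

```python
import csv
import io

# Hypothetical sample in the column layout shown above; the email
# addresses are placeholders, not the original data.
RAW = """id,first_name,last_name,email,gender,ip_address
1,Rufe,Morstatt,rufe@example.de,Male,216.70.69.120
2,Kaela,Scott,kaela@example.com,Female,73.248.145.44
"""


def load_users(stream):
    """Read CSV rows into a list of dicts keyed by the header line."""
    return list(csv.DictReader(stream))


users = load_users(io.StringIO(RAW))
```

For truly streaming input you would probably iterate over the reader directly instead of materializing the whole list.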
You can do it imperatively:
foo_users = [
    u
    for u in users
    if (u['gender'] == 'Male' or u['last_name'] == 'Scott') and '@' in u['email']
]
or you have this module assemble a condition function from a declaration like:
from pycond import make_filter
cond = 'email contains .de and gender eq Male or last_name eq Scott'
is_foo = make_filter(cond) # the built filter function is first
and then apply as often as you need, against varying state / facts / models (...):
foo_users = filter(is_foo, users)
with roughly comparable performance (within a factor of 2-3) to the handcrafted Python.
In real life, performance is often better than with imperative code, due to pycond's lazy evaluation feature.
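To illustrate why lazy evaluation saves work, here is a small sketch using Python's own short-circuiting `or`. It is not pycond's internal implementation, which this section does not show. The `evaluate` helper and the sample records are assumptions made up for the illustration:

```python
def evaluate(record):
    """Evaluate `gender == Male or last_name == Scott` and report
    which keys actually had to be looked up."""
    seen = []

    def lookup(key):
        # Record every access so the amount of work becomes visible.
        seen.append(key)
        return record[key]

    # `or` short-circuits: the right-hand side is only evaluated
    # when the left-hand side is false.
    matched = lookup("gender") == "Male" or lookup("last_name") == "Scott"
    return matched, seen


first = evaluate({"gender": "Male", "last_name": "Morstatt"})
second = evaluate({"gender": "Female", "last_name": "Scott"})
```

For the first record only `gender` is looked up, while the second needs both keys. When lookups are expensive (for example, computed or fetched values), skipping them is where the gain comes from.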
When developers can decide for themselves which filters to apply to the data, they will certainly
use Python's excellent expressive possibilities directly, e.g. through list comprehensions
as shown above.
But what if the filtering conditions are based on decisions outside of the program's
control? For example, they may come from an end user who reaches the program over the network and sends them in some serialized form, which is rarely directly evaluable Python.
This is the main use case for this module.
But why yet another tool for such a standard job?
There is a list of great tools and frameworks in which condition parsing is a (small) part, e.g. pyke or durable, many in the Django world, and SQL statement parsers.
1. I just needed a very slim tool that only parses conditions into functions, but does so in a transparent and customizable way.

   pycond allows you to customize

   - the list of condition operators
   - the list of combination operators
   - the general behavior of condition operators via global or condition-local wrappers
   - their names
   - the tokenizer
   - the value lookup function

   and ships as a zero-dependency single module.

   All evaluation is done via partials and not lambdas, i.e. operations can be introspected and debugged very simply, through breakpoints or custom logging wrappers around operators or lookups.

2. Simplicity of the grammar: easy to type directly and readable by non-programmers, but also synthesizable from structured data, e.g. from a web framework.

3. Performance: good enough to have "pyconditions" used within stream filters.
   With the current feature set we are sometimes a factor of 2-3 slower but (due to lazy evaluation) often faster than handcrafted list comprehensions.
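The remark in point 1 about partials versus lambdas can be sketched with `functools.partial` alone. This is an illustration of the general technique under stated assumptions, not pycond's actual code: the names `check`, `OPS`, `make_condition`, `any_of` and `all_of` are hypothetical, and so is the way the three conditions are grouped.

```python
import operator
from functools import partial


def check(op, key, value, state):
    """Apply a binary operator to state[key] and a fixed value."""
    return op(state[key], value)


def contains(haystack, needle):
    return needle in haystack


# Hypothetical operator table; replacing entries here is one way
# a list of condition operators could be customized.
OPS = {"eq": operator.eq, "contains": contains}


def make_condition(key, op_name, value):
    # A partial keeps the operator, key and value as inspectable attributes.
    return partial(check, OPS[op_name], key, value)


def any_of(conds, state):
    # The generator makes this lazy: evaluation stops at the first match.
    return any(c(state) for c in conds)


def all_of(conds, state):
    # Lazy as well: evaluation stops at the first failing condition.
    return all(c(state) for c in conds)


is_male = make_condition("gender", "eq", "Male")
is_scott = make_condition("last_name", "eq", "Scott")
has_de = make_condition("email", "contains", ".de")
is_foo = partial(all_of, [has_de, partial(any_of, [is_male, is_scott])])
```

Unlike a lambda, each partial can be inspected in a debugger or log line: `is_male.func` is `check` and `is_male.args` holds the operator, key and value it was built from.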