Uproot 3 → Uproot 4 migration guide
Everyone is welcome to contribute!
(I'll write that documentation, but anything you want to make sure gets into such a page, dump it here!)
In uproot3, the length (f["/a/path"].__len__(), or len(f["/a/path"])) corresponded to the actual length of the underlying data (the number of entries). In uproot4, the length is the number of sub-entries (e.g. subbranches), similar to a Python dict, where len() returns the number of keys. To get the number of entries, use the .num_entries attribute.
See a more detailed explanation in https://github.com/scikit-hep/uproot4/issues/191#issuecomment-726889311
from skhep_testdata import data_path
import uproot as uproot3
import uproot4
branch = "E/Evt"
f3 = uproot3.open(data_path("uproot-issue431b.root"))
f3[branch]
# <TBranchElement b'Evt' at 0x00010f15f5e0>
len(f3[branch])
# 10
f3[branch]["id"].array()
# array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=int32)
f4 = uproot4.open(data_path("uproot-issue431b.root"))
f4[branch]
# <TBranchElement 'Evt' (22 subbranches) at 0x00010f37a2b0>
len(f4[branch])
# 22
len(f4[branch].keys()) # note that keys() also lists nested subbranches
# 118
f4[branch].keys()
# ['AAObject', 'AAObject/TObject', 'AAObject/TObject/fUniqueID', ...,
# 'mc_trks/mc_trks.hit_ids', 'mc_trks/mc_trks.error_matrix', ...]
len([k for k in f4[branch].keys() if "/" not in k]) # only top-level subbranches
# 22
f4[branch].num_entries
# 10
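For code that has to run against both versions during a migration, a small helper can hide the difference. The following is a minimal sketch, not part of either library: entry_count and the two Fake... classes are hypothetical names, and the classes are stand-ins that only mimic the len() and num_entries behaviour described above.

```python
def entry_count(obj):
    """Return the number of entries for an uproot3- or uproot4-style object."""
    # uproot4-style objects expose the entry count as an attribute.
    num_entries = getattr(obj, "num_entries", None)
    if num_entries is not None:
        return num_entries
    # Assumption: without num_entries we are on uproot3, where len()
    # is the number of entries.
    return len(obj)


# Stand-in classes (NOT real uproot objects) mimicking the two behaviours.
class FakeUproot3Branch:
    def __len__(self):
        return 10  # uproot3: len() is the number of entries


class FakeUproot4Branch:
    num_entries = 10

    def __len__(self):
        return 22  # uproot4: len() is the number of subbranches


print(entry_count(FakeUproot3Branch()))
print(entry_count(FakeUproot4Branch()))
```

Both calls report 10 entries, even though len() on the uproot4-style object would return 22.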
TODO: The caching has also changed in uproot4.
Whenever uproot's automatic deserialisation fails for whatever reason, a custom interpretation comes to the rescue. The concept remains the same, but a few details have changed. Here is an example of how a custom jagged array interpretation was used in uproot3:
import uproot  # version 3
from skhep_testdata import data_path
f = uproot.open(data_path("uproot-issue124.root"))
tree = f["KM3NET_EVENT"]
snapshot_hits = tree["snapshotHits"].array(
    uproot.asjagged(
        uproot.astable(
            uproot.asdtype(
                [
                    ("dom_id", ">i4"),
                    ("channel_id", "u1"),
                    ("time", "<u4"),
                    ("tot", "u1"),
                ]
            )
        ),
        skipbytes=10,
    )
)
This will return a JaggedArray:
>>> snapshot_hits
<JaggedArray [[<Row 0> <Row 1> <Row 2> ... <Row 50> <Row 51> <Row 52>] ... [<Row 849> <Row 850> <Row 851> ... <Row 887> <Row 888> <Row 889>] [<Row 890> <Row 891> <Row 892> ... <Row 920> <Row 921> <Row 922>]] at 0x7f9b8e6c89d0>
>>> snapshot_hits.dom_id
<JaggedArray [[808432835 808432835 808432835 ... 809526097 809526097 809526097] ... [808432835 808488997 808488997 ... 809526097 809526097 809544061] [808432835 808432835 808432835 ... 809526097 809526097 809544061]] at 0x7f9bc99b05e0>
In uproot4, the same custom interpretation is written as follows:
import uproot4 as uproot  # version 4
from skhep_testdata import data_path
f = uproot.open(data_path("uproot-issue124.root"))
tree = f["KM3NET_EVENT"]
# The module is imported under the name "uproot", so it must be referenced by that name.
snapshot_hits = tree["snapshotHits"].array(
    uproot.interpretation.jagged.AsJagged(
        uproot.interpretation.numerical.AsDtype(
            [
                ("dom_id", ">i4"),
                ("channel_id", "u1"),
                ("time", "<u4"),
                ("tot", "u1"),
            ]
        ),
        header_bytes=10,
    )
)
This returns an awkward1.Array:
>>> snapshot_hits
<Array [[{dom_id: 808432835, ... tot: 30}]] type='23 * var * {"dom_id": int32, "...'>
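To make the structured dtype in the examples above more concrete, here is a standard-library-only sketch of what it describes: after a 10-byte header, each hit is a 10-byte record (4 + 1 + 4 + 1). This is only an illustration of the byte layout, not how uproot decodes the data internally. decode_hits is a hypothetical helper, and the payload is made up, not read from uproot-issue124.root.

```python
import struct

HEADER_BYTES = 10  # same value as skipbytes / header_bytes in the examples
RECORD_BYTES = 4 + 1 + 4 + 1  # dom_id + channel_id + time + tot


def decode_hits(payload):
    """Decode one entry's raw bytes into a list of hit dicts."""
    body = payload[HEADER_BYTES:]  # skip the per-entry header
    hits = []
    for offset in range(0, len(body) - RECORD_BYTES + 1, RECORD_BYTES):
        (dom_id,) = struct.unpack_from(">i", body, offset)    # ">i4": big-endian int32
        channel_id = body[offset + 4]                         # "u1": one unsigned byte
        (time,) = struct.unpack_from("<I", body, offset + 5)  # "<u4": little-endian uint32
        tot = body[offset + 9]                                # "u1": one unsigned byte
        hits.append(
            {"dom_id": dom_id, "channel_id": channel_id, "time": time, "tot": tot}
        )
    return hits


# Made-up payload: a zeroed 10-byte header followed by a single hit.
payload = (
    bytes(HEADER_BYTES)
    + struct.pack(">i", 808432835)
    + bytes([3])
    + struct.pack("<I", 1000)
    + bytes([30])
)
print(decode_hits(payload))
```

Note that the fields mix byte orders (big-endian dom_id, little-endian time), which is why each field is unpacked separately here instead of with a single struct format string.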