How to handle XRootD error: [ERROR] Operation expired #355
-
Interestingly, I can't reproduce it on lxplus, but I can when reading the data from outside of CERN. Now to figure out why...
-
This isn't really an uproot problem (ROOT also had the same issue), but I'll summarise how I debugged it in case others run into the same issue.
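For anyone hitting this, one knob worth trying is the client-side timeout. A minimal sketch, assuming environment-variable configuration and illustrative values (not necessarily what resolved this case; the URL and tree name are placeholders):

import os

# Standard XRootD client settings; raising them gives slow wide-area
# transfers more time before the client declares "Operation expired".
os.environ["XRD_REQUESTTIMEOUT"] = "600"
os.environ["XRD_STREAMTIMEOUT"] = "600"

import uproot  # imported after setting the env vars so the client picks them up

# uproot also has its own per-file timeout option (in seconds).
with uproot.open("root://host.example//path/to/file.root", timeout=600) as f:
    arrays = f["DecayTree"].arrays(["px", "py"])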
-
That's good news. Thanks for taking care of it @chrisburr!
-
Thanks, @chrisburr! To make sure your solution remains visible, I'll convert this into a discussion.
-
Advance apologies as this is not an 'answer', but I'm adding a similar problem I have encountered with more recent versions of uproot: I am also seeing [ERROR] Operation expired. I'm reading chunks from trees across many files, streaming them from CERN to my institute, then processing with dask. When I use the default XRootD source handler, the error appears intermittently. (LHCb-related: these ntuples come from the analysis production.)
When I have some more time to debug this, I'll try to run it through again with MultithreadedXRootDSource. The loading code I'm using:

import uproot
from dask import delayed
import dask.dataframe as dd

def get_file_handler(filename):
    xrootd_src = filename.startswith("root://")
    if not xrootd_src:
        # Otherwise the memory maps overload the available Vmem.
        return {"file_handler": uproot.MultithreadedFileSource}
    # Uncomment below to use MultithreadedXRootDSource:
    # return {"xrootd_handler": uproot.source.xrootd.MultithreadedXRootDSource}
    return {}

def load_root_file(tree, columns, total_file_parts=4):
    def get_tree_part(nentries, part, total_parts):
        # Split [0, nentries) into total_parts contiguous ranges;
        # the last part absorbs any rounding remainder.
        part_len = nentries / total_parts
        entry_start = part * part_len
        entry_stop = (
            nentries if (total_parts - 1) == part else int(entry_start + part_len)
        )
        return int(entry_start), entry_stop

    def root_loader(filename, file_part):
        with uproot.open(filename, **get_file_handler(filename)) as rf:
            treeo = rf[tree]
            estart, estop = get_tree_part(
                treeo.num_entries, file_part, total_file_parts
            )
            df = treeo.arrays(
                columns, entry_start=estart, entry_stop=estop, library="pd"
            )
            assert (estop - estart) == df.shape[0]
            return df

    return root_loader

def load_root_as_dask_df(filenames, tree, columns, total_file_parts=1):
    # get_dtypes_root (not shown) builds an empty DataFrame with the right
    # column dtypes to serve as the dask meta.
    meta = get_dtypes_root(filenames[0], tree, columns)
    loader = load_root_file(tree, columns, total_file_parts=total_file_parts)
    dfs = []
    for fn in filenames:
        for part in range(total_file_parts):
            dfs += [delayed(loader, pure=True, traverse=False)(fn, part)]
    df = dd.from_delayed(dfs, meta=meta)
    return df[columns]
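
For reference, a sketch of how this might be called (file URLs, tree name, and branches are placeholders):

filenames = [f"root://host.example//path/to/ntuple_{i}.root" for i in range(10)]
ddf = load_root_as_dask_df(filenames, "DecayTree", ["px", "py", "pz"], total_file_parts=4)
print(ddf["px"].mean().compute())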
-
I just wanted to report that we are currently having this issue with uproot 4.1.5. We are using uproot.iterate to loop over the files in chunks of defined sizes. I can try to whip together a MWE if needed. I did try setting the xrootd_handler to MultithreadedXRootDSource in iterate, but this didn't improve our situation (jobs were hanging and were eventually killed after the first iteration). @jpivarski, is there a possibility that this needs to be upgraded from a discussion back to an issue?
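A minimal sketch of the pattern, assuming the xrootd_handler option (file URLs, tree name, branches, and step size are placeholders):

import uproot

files = [
    "root://host.example//path/to/file1.root:DecayTree",
    "root://host.example//path/to/file2.root:DecayTree",
]  # placeholder URLs

for chunk in uproot.iterate(
    files,
    ["px", "py"],
    step_size="100 MB",
    xrootd_handler=uproot.MultithreadedXRootDSource,
):
    ...  # process each chunk of arrays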
-
Hi all, I am suffering from the same issue of [ERROR] Operation expired.
-
When reading many files via XRootD, we see intermittent and seemingly non-deterministic "Operation expired" errors from XRootD. This might be an XRootD or an EOS issue, but I filed it here since we are experiencing it when using Uproot. I'd be interested to hear if you have ideas about how to troubleshoot this.
MWE to reproduce follows. Unfortunately, it requires access to LHCb files on EOS.
It is not a problem with expired credentials, etc. - just rerunning the script will cause it to fail on a different file or not at all.
The issue occurs on multiple computers with very different setups.
uproot 4.0.7
xrootd 5.1.1
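A rough sketch of the failing access pattern (URLs, tree name, and branches are placeholders for the real LHCb files):

import uproot

# Placeholder URLs; the real files live under LHCb's EOS area.
filenames = [
    f"root://eoslhcb.cern.ch//eos/lhcb/path/to/file_{i}.root" for i in range(50)
]

for fn in filenames:
    with uproot.open(fn) as f:
        # Intermittently raises OSError: [ERROR] Operation expired,
        # on a different file (or not at all) each run.
        arrays = f["DecayTree"].arrays(["px", "py"])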