-
This comment started off as an explanation of why this would be hard to do, but along the way, I found that the job was already half-done for some reason. There wasn't much left to be done, so writing TGraph is a plausible feature request.

Reading and writing are very different. Even before #309, Uproot could read TGraphs because the TStreamerInfo in the ROOT file specifies how the sequence of bytes is to be interpreted as a data structure with attributes; #309 added methods that interpret those attributes in a Pythonic way. To write a data type, the data-attributes-to-bytes layout has to be known to Uproot by explicit implementation, not discovered from a file. We'd also need to add TStreamerInfo for the TGraph class, its superclasses, and any attributes that are class types. We've done this for TObjString, all of the histogram types, TTree and TBranch, and all of the TLeaf types, so that we can write strings, histograms, and TTrees.

I was going to point out where these things would need to be added for TGraph, and interestingly, some of them are already there. We've included a copy of TGraph's TStreamerInfo:

uproot5/src/uproot/models/TGraph.py, lines 22 to 27 in 1f0e557

(we do this as raw bytes, without interpreting the content of those bytes, for simplicity; that's why it's a raw dump, copied from a ROOT file). However, the serialization is not finished:

uproot5/src/uproot/models/TGraph.py, lines 337 to 345 in 1f0e557

Using your example-objects.root to see what classes would need to be writable to write a TGraph, it looks like the only new thing that we haven't implemented yet is TGraph itself (not any hidden superclasses or nested classes):

>>> f = uproot.open("example-objects.root")
>>> f.file.show_streamers("TGraph")
TArrayF (v1): TArray (v1)
fArray: float* (TStreamerBasicPointer)
THashList (v0): TList (v5)
TArray (v1)
fN: int (TStreamerBasicType)
TArrayD (v1): TArray (v1)
fArray: double* (TStreamerBasicPointer)
TAttAxis (v4)
fNdivisions: int (TStreamerBasicType)
fAxisColor: short (TStreamerBasicType)
fLabelColor: short (TStreamerBasicType)
fLabelFont: short (TStreamerBasicType)
fLabelOffset: float (TStreamerBasicType)
fLabelSize: float (TStreamerBasicType)
fTickLength: float (TStreamerBasicType)
fTitleOffset: float (TStreamerBasicType)
fTitleSize: float (TStreamerBasicType)
fTitleColor: short (TStreamerBasicType)
fTitleFont: short (TStreamerBasicType)
TAxis (v10): TNamed (v1), TAttAxis (v4)
fNbins: int (TStreamerBasicType)
fXmin: double (TStreamerBasicType)
fXmax: double (TStreamerBasicType)
fXbins: TArrayD (TStreamerObjectAny)
fFirst: int (TStreamerBasicType)
fLast: int (TStreamerBasicType)
fBits2: unsigned short (TStreamerBasicType)
fTimeDisplay: bool (TStreamerBasicType)
fTimeFormat: TString (TStreamerString)
fLabels: THashList* (TStreamerObjectPointer)
fModLabs: TList* (TStreamerObjectPointer)
TH1 (v8): TNamed (v1), TAttLine (v2), TAttFill (v2), TAttMarker (v2)
fNcells: int (TStreamerBasicType)
fXaxis: TAxis (TStreamerObject)
fYaxis: TAxis (TStreamerObject)
fZaxis: TAxis (TStreamerObject)
fBarOffset: short (TStreamerBasicType)
fBarWidth: short (TStreamerBasicType)
fEntries: double (TStreamerBasicType)
fTsumw: double (TStreamerBasicType)
fTsumw2: double (TStreamerBasicType)
fTsumwx: double (TStreamerBasicType)
fTsumwx2: double (TStreamerBasicType)
fMaximum: double (TStreamerBasicType)
fMinimum: double (TStreamerBasicType)
fNormFactor: double (TStreamerBasicType)
fContour: TArrayD (TStreamerObjectAny)
fSumw2: TArrayD (TStreamerObjectAny)
fOption: TString (TStreamerString)
fFunctions: TList* (TStreamerObjectPointer)
fBufferSize: int (TStreamerBasicType)
fBuffer: double* (TStreamerBasicPointer)
fBinStatErrOpt: TH1::EBinErrorOpt (TStreamerBasicType)
fStatOverflows: TH1::EStatOverflows (TStreamerBasicType)
TH1F (v3): TH1 (v8), TArrayF (v1)
TCollection (v3): TObject (v1)
fName: TString (TStreamerString)
fSize: int (TStreamerBasicType)
TSeqCollection (v0): TCollection (v3)
TList (v5): TSeqCollection (v0)
TAttMarker (v2)
fMarkerColor: short (TStreamerBasicType)
fMarkerStyle: short (TStreamerBasicType)
fMarkerSize: float (TStreamerBasicType)
TAttFill (v2)
fFillColor: short (TStreamerBasicType)
fFillStyle: short (TStreamerBasicType)
TAttLine (v2)
fLineColor: short (TStreamerBasicType)
fLineStyle: short (TStreamerBasicType)
fLineWidth: short (TStreamerBasicType)
TString (v2)
TObject (v1)
fUniqueID: unsigned int (TStreamerBasicType)
fBits: unsigned int (TStreamerBasicType)
TNamed (v1): TObject (v1)
fName: TString (TStreamerString)
fTitle: TString (TStreamerString)
TGraph (v4): TNamed (v1), TAttLine (v2), TAttFill (v2), TAttMarker (v2)
fNpoints: int (TStreamerBasicType)
fX: double* (TStreamerBasicPointer)
fY: double* (TStreamerBasicPointer)
fFunctions: TList* (TStreamerObjectPointer)
fHistogram: TH1F* (TStreamerObjectPointer)
fMinimum: double (TStreamerBasicType)
fMaximum: double (TStreamerBasicType)

All of the classes that TGraph depends on are histogram-related, which are already implemented in Uproot, and anyway they wouldn't matter if we only write TGraph without an attached
It could, perhaps, be a single struct.pack command. I put a
The fields are
Using

>>> [int(x) for x in np.array([1.1, 2.2, 3.3, 4.4, 5.5], dtype='>f8').tobytes()]
[63, 241, 153, 153, 153, 153, 153, 154, 64, 1, 153, 153, 153, 153, 153, 154, 64, 10, 102, 102, 102, 102, 102, 102, 64, 17, 153, 153, 153, 153, 153, 154, 64, 22, 0, 0, 0, 0, 0, 0]

to find where
is all of the superclasses, which the

>>> expected_bytes = np.array([0, 0, 0, 5, 1, 63, 241, 153, 153, 153, 153, 153, 154, 64, 1, 153, 153, 153, 153, 153, 154, 64, 10, 102, 102, 102, 102, 102, 102, 64, 17, 153, 153, 153, 153, 153, 154, 64, 22, 0, 0, 0, 0, 0, 0, 1, 63, 240, 0, 0, 0, 0, 0, 0, 192, 0, 0, 0, 0, 0, 0, 0, 64, 8, 0, 0, 0, 0, 0, 0, 192, 16, 0, 0, 0, 0, 0, 0, 64, 20, 0, 0, 0, 0, 0, 0, 64, 0, 0, 31, 255, 255, 255, 255, 84, 76, 105, 115, 116, 0, 64, 0, 0, 17, 0, 5, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 192, 145, 92, 0, 0, 0, 0, 0, 192, 145, 92, 0, 0, 0, 0, 0], np.uint8)
>>> expected_bytes.tobytes()
b'\x00\x00\x00\x05\x01?\xf1\x99\x99\x99\x99\x99\x9a@\x01\x99\x99\x99\x99\x99\x9a@\nffffff@\x11\x99\x99\x99\x99\x99\x9a@\x16\x00\x00\x00\x00\x00\x00\x01?\xf0\x00\x00\x00\x00\x00\x00\xc0\x00\x00\x00\x00\x00\x00\x00@\x08\x00\x00\x00\x00\x00\x00\xc0\x10\x00\x00\x00\x00\x00\x00@\x14\x00\x00\x00\x00\x00\x00@\x00\x00\x1f\xff\xff\xff\xffTList\x00@\x00\x00\x11\x00\x05\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc0\x91\\\x00\x00\x00\x00\x00\xc0\x91\\\x00\x00\x00\x00\x00'
>>>
>>> attempt = struct.pack(">i", 5) + b"\x01" + np.array([1.1, 2.2, 3.3, 4.4, 5.5], dtype=">f8").tobytes() + b"\x01" + np.array([1., -2., 3., -4., 5.], dtype=">f8").tobytes() + b"@\x00\x00\x1f\xff\xff\xff\xffTList\x00@\x00\x00\x11\x00\x05\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" + struct.pack(">d", -1111.0) + struct.pack(">d", -1111.0)
>>> attempt
b'\x00\x00\x00\x05\x01?\xf1\x99\x99\x99\x99\x99\x9a@\x01\x99\x99\x99\x99\x99\x9a@\nffffff@\x11\x99\x99\x99\x99\x99\x9a@\x16\x00\x00\x00\x00\x00\x00\x01?\xf0\x00\x00\x00\x00\x00\x00\xc0\x00\x00\x00\x00\x00\x00\x00@\x08\x00\x00\x00\x00\x00\x00\xc0\x10\x00\x00\x00\x00\x00\x00@\x14\x00\x00\x00\x00\x00\x00@\x00\x00\x1f\xff\xff\xff\xffTList\x00@\x00\x00\x11\x00\x05\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc0\x91\\\x00\x00\x00\x00\x00\xc0\x91\\\x00\x00\x00\x00\x00'
>>> attempt == expected_bytes.tobytes()
True

So I tried an implementation:

--- a/src/uproot/models/TGraph.py
+++ b/src/uproot/models/TGraph.py
@@ -338,7 +338,16 @@ class Model_TGraph_v4(uproot.behaviors.TGraph.TGraph, uproot.model.VersionedMode
         where = len(out)
         for x in self._bases:
             x._serialize(out, True, name, tobject_flags)
-        raise NotImplementedError("FIXME")
+        out.extend([
+            struct.pack(">i", self._members["fNpoints"]),
+            b"\x01",
+            self._members["fX"].astype(">f8").tobytes(),
+            b"\x01",
+            self._members["fY"].astype(">f8").tobytes(),
+            b"@\x00\x00\x1f\xff\xff\xff\xffTList\x00@\x00\x00\x11\x00\x05\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+            struct.pack(">d", self._members["fMinimum"]),
+            struct.pack(">d", self._members["fMaximum"]),
+        ])
         if header:
             num_bytes = sum(len(x) for x in out[where:])
             version = 4

and it can convert the TGraph into a PyROOT object, which is a demonstration that serialization works (since that's how we convert objects between Uproot and PyROOT):

>>> f = uproot.open("/tmp/example-objects.root")
>>> g = f["tgraph"]
>>> g2 = uproot.to_pyroot(g)
>>> type(g2)
<class cppyy.gbl.TGraph at 0x600946035810>
>>> g2.GetName()
'tgraph'
>>> g2.GetTitle()
'title'
>>> np.asarray(g2.GetX())
array([1.1, 2.2, 3.3, 4.4, 5.5])
>>> np.asarray(g2.GetY())
array([ 1., -2., 3., -4., 5.])
>>> g2.GetMinimum()
-1111.0
>>> g2.GetMaximum()
-1111.0

The above does not test whether the streamers are being handled correctly, but it appears as though they have a correct implementation already. That would just need to be tested.

src/uproot/writing/identify.py would need an entry point for syntax like output_file["name"] = XYZ to work (and as discussed on #1144, we'd have to decide what an appropriate Python object XYZ is, to be interpreted as a TGraph).

This started out as a comment saying why this project would be hard, but since half of it was already implemented, it wouldn't take much to finish the implementation. Could you take over from here? Do you have enough context from what I've started? I can help, and I can give you permissions to edit the
-
Also, work on this would help me understand the issues in #1135 (comment).
-
@jpivarski where would you prefer to discuss which Python object should be interpreted as a TGraph: here or in PR #1144 (comment)?

I see another point: some of the proposals may confuse users. Let's assume we go with DataFrame with

The basic data behind TGraph (x and y) looks like a basic ingredient for a TTree as well. "Write by assignment" doesn't give many options for specifying which type should be saved in the ROOT file. Maybe we could require some other ingredient that makes the data type easier to deduce, for example a DataFrame attribute that is unique to TGraph and doesn't make sense for a TTree?
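
To make the "extra ingredient" idea concrete, here is a minimal sketch in which a dedicated wrapper type, rather than a DataFrame attribute, marks x/y data as a graph, while a plain dict of columns is still treated as a TTree. `GraphData` and `identify_kind` are hypothetical names invented for this illustration; neither exists in Uproot, and the real dispatch in src/uproot/writing/identify.py may look quite different.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GraphData:
    """Hypothetical marker type: x/y points meant to be written as a TGraph."""

    x: Sequence[float]
    y: Sequence[float]
    title: str = ""

    def __post_init__(self):
        if len(self.x) != len(self.y):
            raise ValueError("x and y must have the same length")


def identify_kind(obj):
    """Hypothetical dispatch: decide which ROOT type an assigned object maps to."""
    if isinstance(obj, GraphData):
        return "TGraph"  # unambiguous: the wrapper only makes sense for graphs
    if isinstance(obj, dict):
        return "TTree"   # a plain mapping of columns stays a TTree
    raise TypeError(f"cannot identify a ROOT type for {type(obj).__name__}")
```

With something like this, `output_file["name"] = GraphData(x, y)` and `output_file["name"] = {"x": x, "y": y}` would not be confused, at the cost of asking users to import one extra type.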
-
I would like to perform a similar operation to the one here, but using Uproot:
uproot5/dev/example-objects.py
Line 157 in 1f0e557
Namely, to create a TGraph and save it in a ROOT file:
I saw some discussion in #309, but it seemed to be related to reading