fqns/schema thoughts #272
Comments
FWIW I still don't really get why …
Apologies for not getting to this earlier, but in general, your summary sounds about right. Hence, just a few points:

I'm using the term "schema" in vaguely the same sense as in "database schema" or "XML schema", that is, to unambiguously refer to the metadata that defines the shape and meaning of the parameters and results – the types, rather than the values taken by part of a scan/… (or, to be precise, as you surmise, the description detailing those types). As for "schemata", in actuality, I was just using it as the plural of "schema", but it might well be that it always ends up referring to a mapping of a name to the respective schema.

The summary of the "why" also sounds correct. As for "(This one I'm particularly sketchy on...) exportability in the argument interface": yes, the argument editor (i.e. the experiment submission UI) is the main client of the parameter schema information. The way ndscan parameters integrate with the default ARTIQ experiment scheduling infrastructure (which is entirely unchanged) is that the argument editor plugin receives all the schemata in a PYON argument named …

As for the fact that everything is a nested dictionary of mostly strings, you suspect correctly that the main reason here is that the data needs to be conveyed both between argument editor and experiment, and then between experiment and applets (PYON) / results files (HDF5). As you correctly point out, a way to reparse these dictionaries into a richly typed form (i.e. Python objects with docstrings, …) could be useful for code readability/explorability. Back when writing the bulk of this, I chose to have …
I agree that an explicit description would be good to have, whether in the form of docstring'd code, or just a textual description of the JSON contents. If it is any consolation, I was quite careful to stick to consistent terminology and not to use the same word in more than one meaning throughout, though. Regarding …
the main reason here is that while the 1:1 mapping between FQNs and Python classes is straightforward and clear, sometimes being able to parametrise a type is helpful – for instance, to have multiple variants of a class for a number of different transition parameters. Like with C++ templates, those variants aren't actually related from the perspective of the type system (here: they just have unrelated FQNs). The concept of templates/generic types doesn't really exist in Python (at least not before the recent …).

In this (the current) design, the way to pass extra arguments is simply to provide extra values later (through some other method, or just by setting a public property in the subfragment). It has to be that way round – rather than providing the parameters later – as the FQN is used in …

However, I've been somewhat on the fence about this design ever since I added it in the first place. There are two alternatives:

One is to explicitly require the user to actually create different Python types. In a sense, this would be the lowest-overhead option, as it doesn't introduce any new concepts. If there are enough of them that you don't want to write them all out by hand, making sure they end up with different names would require some fairly esoteric code such as this, though (which could be addressed through documentation, I suppose):

```python
class Foo(Fragment):
    def build_fragment(self):
        self.setattr_fragment("freq", FloatParam, "Frequency",
                              f"dataset('{self.name}.freq', 1e6)")

def make_foo(variant):
    class FooInstance(Foo):
        name = variant
    FooInstance.__qualname__ = f"Foo_{variant}"
    return FooInstance

class FooUser(ExpFragment):
    def build_fragment(self):
        self.setattr_fragment("foo_stretch", make_foo("stretch"))
        self.setattr_fragment("foo_clock", make_foo("clock"))
```

Another would be to add an explicit mechanism for specifying "template values" to …
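To make the "esoteric code" pitfall concrete, here is the same class-factory pattern as a self-contained sketch in plain Python (no ndscan imports; `Fragment` replaced by a trivial stand-in), showing why the explicit `__qualname__` assignment matters for any qualname-derived FQN:

```python
# Stand-alone demonstration (plain Python, no ndscan) of the class-factory
# pattern above: each variant must be given a distinct __qualname__ by hand,
# otherwise a qualname-derived FQN would collide across variants.
class Foo:  # stand-in for the real Fragment base class
    name = None

def make_foo(variant):
    class FooInstance(Foo):
        name = variant
    FooInstance.__qualname__ = f"Foo_{variant}"
    return FooInstance

stretch = make_foo("stretch")
clock = make_foo("clock")

# Distinct qualified names, hence distinct FQNs:
assert stretch.__qualname__ == "Foo_stretch"
assert clock.__qualname__ == "Foo_clock"
assert stretch is not clock

# Without the explicit assignment, both classes would share the generated
# qualname "make_foo.<locals>.FooInstance" and be indistinguishable by name.
```

This is exactly the kind of boilerplate the documentation would need to spell out if FQN mangling were removed.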
@pmldrmota: Any thoughts on explicitly requiring users to make new classes for different variants like in the example (for the different transitions, …)? Just removing the build_fragment argument -> FQN mangling kind of seems the way to go for me, as it strictly removes complexity from ndscan – users just have to make sure every class (as defined by a unique …
I suppose we could also have a …
On the other hand, we need to make sure that there are good diagnostics for the schema mismatch case if we do remove the FQN mangling entirely, as one advantage of the current design is that it is easy to "accidentally" do the right thing without even thinking about the fact that there is only a single schema per FQN, as the most natural way to get values that would change e.g. some parameter defaults would precisely be from …
I agree with this. A couple of disadvantages: fqns can get pretty messy and often have a trailing …

The case of having a parameter which does not affect the schema is not that rare. Currently, if one includes it in build_fragment, some things break in a pretty non-obvious way. For generic code which can take a class as a parameter, the FQNs get particularly bad, and some things can easily break due to the way ARTIQ hacks the import mechanism during the first call to …
Aah, right, I had forgotten about the plugin. That makes more sense.
Shall we close this brain dump issue now and open new issues if there are any specific things we want to discuss further?
@dnadlinger one of the biggest challenges I still find with …

Assuming we can get upstream pyon to (finally) accept a couple of basic data types like …

Edit: the specific thing I was struggling with was …
Closing this issue as it doesn't feel actionable. |
Retrospective from having spent some time a while back thinking about the issue I highlighted above about confusion over types / interfaces. See also #308.

I got into this because I wanted to add custom online fits (see #309). This meant being able to tell the applet what fitting code I wanted to run. I toyed around with extending the schema to do this (e.g. specifying a …). That bit worked out nicely. Where I ran into real trouble was trying to hook this into …

Right now though, things are very entangled in a way that makes abstracting (or understanding at all) the commonalities really difficult. To illustrate the point, here is how things currently stand with regard to plot annotations. As I understand it (but I may have got a few things wrong), things are as follows: on the experiment side we have …; on the applet side things are a bit more complicated. The basic dataflow, as far as I understand it (which is still pretty limited despite having spent a while looking at the code), is: …

The thing I struggled with here was how piecemeal the whole thing is. I had expected that the scan model would be something static that is known once the experiment is built, and that we could serialise / deserialise it in one go. Instead there is a lot going on here, and it's hard to follow how all the bits are intended to fit together / what's driving the complexity.

Anyway... back to annotations... AFAICT, at present … The challenge here is that there isn't really an interface for how …
ndscan is an awesome addition to artiq. I've been trying to get my head around a few aspects of the design and can't yet tell if I'm hitting against rough edges of an MVP or careful design choices made for reasons I haven't understood. I'm afraid this post is a bit of a brain dump rather than a fully coherent set of questions, but hopefully it's useful for other people looking at the code (even if it's just pointing out that I've got completely the wrong end of the stick). @dnadlinger and others, any comments / corrections really welcome!
Glossary
- schema: string representation of an object (fragment, scan parameter, result channel, etc.). This ends up being roughly a JSON-ified dictionary.
- fqn: fully-qualified name of an object, as a string
- schemata: dictionary of schema with fqns as keys
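As a purely illustrative sketch of the glossary terms (the keys and field names below are invented for illustration; they are not ndscan's actual schema format), a schemata mapping might look like:

```python
import json

# Hypothetical schemata: a mapping from fqn to the schema dict describing
# that object. All field names here are made up for illustration only.
schemata = {
    "my_exp.Foo.freq": {
        "type": "float",
        "description": "Frequency",
        "default": "dataset('freq', 1e6)",
    },
    "my_exp.Foo.readout": {
        "type": "result_channel",
        "description": "Readout counts",
    },
}

# Because everything is nested dicts of (mostly) strings, the whole
# mapping survives a JSON round-trip unchanged - which is what lets it
# travel over PYON arguments and into HDF5 results files.
assert json.loads(json.dumps(schemata)) == schemata
```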
Why do we have schema?
There is a really nice discussion in the design retrospective which I found really helpful. My current understanding is:

- `hdf5` results file, e.g. `ndscan.show` allows us to easily reproduce the default plots for an experiment. This feature requires us to convert the scan to something that can fit inside the hdf5 file
- `fqn`s to allow the users to find parameters

Bits I've found confusing...
I found (still find) the parts of ndscan which involve schema hard to get my head around. Trying to boil this down to something concrete, I think it's the following (c.f. the design retrospective):
Obscuring the ontology
As a newcomer to the codebase, I found myself thinking "what on earth is this object and where does it come from". With all the string representations of objects, often referred to by rather generic names ("schema" is used in the codebase to refer to schema for multiple different types of objects) and sparse doc strings, understanding a piece of code often turned into a rather non-local problem.
I found myself asking "what kind of object is this the schema for?" and not being able to answer that without digging through the code to find where it was called from, often having to search through a few layers to get an answer. This felt particularly bad on the applet side where everything is a dictionary and it took me quite a while to trace those through the dataset broadcast and back into the experiment code. It's also not easy to do this by searching the codebase because of the use of generic terminology and lack of concrete types.
To some degree ndscan gets away with this because the ontology is pretty simple, but it still felt like a major point of friction – even for daft things like figuring out the convention that "channel" is synonymous with "result channel".
No clear finalization step
If I've followed the design correctly, the scan structure is communicated to the applet in the `TopLevelRunner.run` method (via `_broadcast_metadata`). To avoid a messy synchronization problem, the scan needs to be "finalized" before then. When I first came to the codebase, I assumed this would be done in a single finalization step, which freezes the scan structure and creates the schema representation.

In fact, creating the schema is spread across at least a couple of places. Some is done at run time (either inside `TopLevelRunner._broadcast_metadata` (`ndscan/ndscan/experiment/entry_point.py`, line 445 in a1bb524) or `scan_runner.describe_scan` (`ndscan/ndscan/experiment/subscan.py`, line 92 in 9d46a98)), some in the `ArgumentInterface`. Overall, I still find it hard to trace through where various bits of the schema are created and why they are done there (c.f. #269). To what extent would it be possible / desirable to have a single function that recursively does the entire thing?

Right now I find it really hard to follow how the information in the schema is used, and what the impacts of changing something at various parts of the scan lifecycle are. (Semi-related: why is the scan stored in a PYON argument rather than a dataset?)
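As a thought experiment on the "single recursive function" idea, here is a minimal sketch (all class and attribute names are invented for illustration; this is not ndscan's actual structure):

```python
# Hypothetical, minimal fragment tree. Names (Param, Frag, describe_tree)
# are invented stand-ins, not ndscan API.
class Param:
    def __init__(self, description):
        self.description = description

    def describe(self):
        return {"description": self.description}

class Frag:
    def __init__(self, fqn, params=None, subfragments=None):
        self.fqn = fqn
        self.params = params or {}
        self.subfragments = subfragments or []

def describe_tree(fragment):
    """Single finalization step: build the entire schema in one recursive pass,
    so the result can be serialised/deserialised in one go."""
    return {
        "fqn": fragment.fqn,
        "params": {n: p.describe() for n, p in fragment.params.items()},
        "subfragments": [describe_tree(f) for f in fragment.subfragments],
    }

child = Frag("exp.Child", params={"freq": Param("Frequency")})
root = Frag("exp.Root", subfragments=[child])
schema = describe_tree(root)
assert schema["subfragments"][0]["params"]["freq"] == {"description": "Frequency"}
```

The point of the sketch is only that, once the tree is frozen, the whole schema can fall out of one traversal rather than being assembled piecemeal at several points in the scan lifecycle.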
No decode / encode methods
Parameters and result channels all have a `describe` method which produces a schema for the object. This is nice, as it gives one a clear place to understand what's in the schema (although, arguably, it might be better to have a base class which provides this functionality, since all parameters have the same schema structure). However, on the applet side, the schemata are used instead of the objects. I wonder if it would be worth having a `decode` (bikeshed name) method that reconstructs the objects from the schema?

But other parts of the ontology are converted in a more ad hoc fashion (e.g. is there a reason that `ScanAxis` doesn't have a `describe` method? `ndscan/ndscan/experiment/scan_runner.py`, line 373 in 9d46a98)
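To illustrate the suggestion, here is a hedged sketch of a symmetric describe/decode pair (the class, its fields, and the `decode` classmethod are all hypothetical, not the actual ndscan API):

```python
from dataclasses import dataclass, asdict

@dataclass
class FloatParamSchema:
    """Hypothetical typed counterpart of a float-parameter schema dict."""
    fqn: str
    description: str
    default: str

    def describe(self) -> dict:
        # Serialise to the plain-dict form that travels over PYON/HDF5.
        return {"type": "float", **asdict(self)}

    @classmethod
    def decode(cls, schema: dict) -> "FloatParamSchema":
        # Applet side: reconstruct the typed object from the dict, so
        # downstream code works with attributes and docstrings, not keys.
        assert schema["type"] == "float"
        return cls(fqn=schema["fqn"], description=schema["description"],
                   default=schema["default"])

p = FloatParamSchema("my_exp.Foo.freq", "Frequency", "dataset('freq', 1e6)")
assert FloatParamSchema.decode(p.describe()) == p  # lossless round-trip
```

Pairing every `describe` with a `decode` like this would also give the codebase concrete types to search for, addressing the "what is this the schema for?" friction above.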