-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: jsonasobj2 vendoring #361
base: main
Are you sure you want to change the base?
Conversation
Any idea how to run the CI locally ? |
Can we add deprecation warnings to all the jsonasobj2 behavior while we do this? We want to say "hey if your code relies on using the special jsonasobj stuff, these will soon become just regular dataclasses, see xyz page in the docs for an upgrade guide" with plenty of lead time. That should be visible even if people don't regenerate models during the deprecation window if they don't have an upper bound on linkml-runtime in CI. I think we would want that on the dunder methods in the jsonobj class, as well as in every function like |
@sneakers-the-rat like that 185e6ed ? I used
|
Haha oh boy, ya that might be too verbose. maybe we emit one deprecation warning when importing that module? |
(few hours of sleep later) fixed in 73effc9 by using
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all this is great and it is exactly what needed to be done to move us forward here, so thank you for that!
I AM SORRY for my hassle here, but it appears this would break existing generated modules, see comments for details. I proposed a few options but we can discuss further and I'll help out implementing them since this is proving to be a lot. we may want to pull into a develop branch and do this in a few chunks :)
@@ -8,6 +8,7 @@ MODEL_DIR = ../linkml-model/linkml_model/ | |||
|
|||
update_model: | |||
cp -pr $(MODEL_DIR)/* linkml_runtime/linkml_model | |||
sed -i 's/from jsonasobj2/from linkml_runtime.utils.jsonasobj2/g' linkml_runtime/linkml_model/*.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haha i am assuming this is a "temporary fix until we update PythonGen" thing right? we should remember to write that down somewhere or else merge this at the same time as we make that small change in pythongen upstream? wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, it's "temporary".
I'll add a comment that it should be deleted after PythonGen is updated.
Worst case, the command has no effect if we forget to remove it immediately.
@@ -29,3 +29,13 @@ It also includes the [SchemaView](https://linkml.io/linkml/developers/manipulati | |||
## Notebooks | |||
|
|||
See the [notebooks](https://github.com/linkml/linkml-runtime/tree/main/notebooks) folder for examples | |||
|
|||
## Development |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotta love turning issue discussion into ambient docs fixes, thanks for this :)
hide = ['_if_missing', '_root'] | ||
|
||
# https://github.com/linkml/linkml/issues/1966 | ||
warnings.warn("This module is going to be replaced by native Python dataclasses.", PendingDeprecationWarning) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I AM SORRY to keep belaboring this, and i know this was literally my proposal and you've done it twice already, but just to do a final double check - does this do what we want it do?
So the squirrely case that we want to protect with lots of people using generated code that might stay static for a long time: say someone has ancient dataclass models that import jsonasobj2
but they update linkml-runtime
(or don't have an upper bound & just get latest). That generated code isn't really subject to updates the same way normal dependencies are (although working on project config that will make that possible). It's made a bit more complicated by how the generated code is a mix of behavior from the parent classes and from code in the module (ie. the as_dict
calls are in the generated model code, not in YamlRoot.__init__()
).
What this does correctly: the package is imported, and the module will definitely be imported, so the warning would still be emitted though even if one was using old models, so it accomplishes the goal of letting people know they need to update their models (i think we need a little more information here but i'll put that in separate comment).
Using the metamodel as an example, JsonObj
is imported but not used, but as_dict
is used commonly. jsonasobj2.as_dict
checks for being an instance of jsonasobj2.JsonObj
, so it will just return the object unchanged - i think this is what Chris was talking about in the issue. i think it would be pretty broken - in both directions, our linkml_runtime.utils.jsonasobj2
functions being used on jsonasobj2
objects and vice versa wouldn't work.
We need a specific regression test for this in any case: validating that we can use some frozen kitchen sink models with our functions and etc. Honestly what we might want to do is just make a test data directory that has frozen copies of all the generated models and have a global autouse
parameterized fixture that runs the tests twice against current and old models - the linkml-runtime tests are pretty quick iirc.
maybe that's overkill, but what i'd like to get a sense of is
a) is it actually broken or am i wrong (i haven't tested locally, will do) and
b) how broken is it.
If it's mostly fine, we can fix it with smaller tweaks, but if it completely breaks everything, we would want to actively handle it being in the environment. some initial ideas:
-
We could not immediately remove the jsonasobj2 dependency, but use this module as a passthrough and do stuff like have our
JsonObj
inherit fromjsonasobj2
, or override the__instancecheck__
method and swap in the matching function/etc. fromjsonasobj2
with a warning or smth. edit: actually would it work if we just overrode__instancecheck__
and just returnedTrue
if we were checking againstjsonasobj2.JsonObj
? i would do it but it's getting late and i still have some other work to do -
Since we have control of the
jsonasobj2
repo, we could also update the instance checks there to check if we're a linkml object, emit a warning, and proceed as if it was ajsonasobj2
object. -
One last idea would be to do some aggressive monkeypatching and force replace
jsonasobj2
. This is pretty cursed, but also might be the most complete (as in would allow the versions between now and removal to be used with most existing generated models) one while moving fastest on deprecation? we can't guarantee thatlinkml-runtime
is imported beforejsonasobj2
. in fact if there isn't some code that importslinkml-runtime
for other reasons before importing a generated module, we are guaranteed to be imported after it. If i'm not mistaken we will need to go and find every other namespace that could have importedjsonasobj2
and patch it there. from the importlib docs:Other references to the old objects (such as names external to the module) are not rebound to refer to the new objects and must be updated in each namespace where they occur if that is desired.
For generated modules, that's straightforward, we just need to ascend into the containing scope and patch it there, but say someone has some other module in a downstream package that imports
jsonasobj2
and uses its functions/objects but notlinkml-runtime
, and a third module then imports the second module along with a generated module - either before or after. Thenjsonasobj2
wouldn't be in our call tree anymore, so we would need to inspect the scope of all the other imported modules too which would be either way too expensive or impossible to do. anyway there are a few more things we'd need to do but you get my point it would be hard to do and probably fragile.
anyway first thing's first, let's write some tests to see if there even is a problem here, but i am thinking we might need to hold onto jsonasobj2 as a dependency and patch it in here in some way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
confirmed this does break generated modules, e.g. I replaced the from linkml_runtime.utils.jsonasobj2
import with from jsonasobj2
and i'm getting the expected error, getting returned the original JsonObj, and this does prevent the __init__
method from running.
FAILED tests/test_issues/test_linkml_runtime_issue_1317.py::TestRemoteModularSchemaView::test_imported_classes_present - TypeError: linkml_runtime.linkml_model.meta.ReachabilityQuery() argument after ** must be a mapping, not JsonObj
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fundamental issue I see, which will appear again, is the cycling dependency:
linkml-runtime -> linkml-model -> linkml -> linkml-runtime
.
Unfortunately, I can't help much more on my side as I'm not able to reproduce the Github CI of the 3 repos locally.
About breaking isinstanceof
call, maybe that's an opportunity for a major release v2.0.0 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes yes that is the problem for within linkml packages, we could have some better sync tooling it is true, but it is resolvable by doing synchronized PRs which isn't too bad. This problem is more about what happens to all the python dataclass modules that have been generated in other projects, which are both independent code but also entangled with linkml-runtime
, so we have to make sure to give lots of runway when making changes in linkml-runtime
so people notice and regenerate their models without them suddenly breaking.
The bug can be reproduced by swapping this line https://github.com/ticapix/linkml-runtime/blob/73effc99c2a2d76cd9c6270ccfc04f64e20c9e80/linkml_runtime/linkml_model/meta.py#L35
to the previous
from jsonasobj2 import JsonObj, as_dict
which simulates what we would expect an old (or current) pythongen output to be, and how it would behave if someone were to install an updated version of linkml-runtime with this PR merged. I used the metamodel there because it's used in like every test but it could be any generated module.
Unfortunately, I can't help much more on my side
Totally fine, I think i know how to fix this (thanks to you having vendored the module, we have new flexibility we didn't before!) and will add a specific regression test for it too.
I'm not able to reproduce the Github CI of the 3 repos locally.
that's a good point, this is pretty hard to do, we do need some metacode for development, regenerating the metamodel and handling sync'd prs is pretty inaccessible
hide = ['_if_missing', '_root'] | ||
|
||
# https://github.com/linkml/linkml/issues/1966 | ||
warnings.warn("This module is going to be replaced by native Python dataclasses.", PendingDeprecationWarning) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
separate note on this, I think we will want to give some instructions on what to do here - we are warning someone that something will happen, but don't tell them what to do about it (so then the only way they can silence the warning is to filter it, but then they might miss if it gets updated with instructions). Are the instructions we want to give here just "replace jsonasobj2 stuff with equivalent dataclasses stuff"? if so i think we'll probably want to write an update guide for the docs.
@cmungall we might want to give a version warning too, like "jsonasobj2 will be removed in x version" so people can either set version caps or know when they need to update by, any input on what that version should be?
As proposed here linkml/linkml#1966