-
-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
farmOS.py revisited: Design/collaboration proposal - hybrid sync/async, pluggable authentication/reactors #31
Comments
@symbioquine great!! Look forward to hearing your thoughts. As I mentioned in that issue, all I've briefly looked into is the httpx library as a potential path to async support. I've just released a new beta version of this library with additional support for OAuth (most everything up to this point has been a beta release for testing in the Aggregator). Once we get OAuth finalized in 7.x-1.4, I think we'll release a v0.1.6 of the library. Perhaps we could start drawing up async support for v0.2.0 and allow for some more breaking changes/improvements. |
@symbioquine Just seeing the updated description! This is great.
I would really love to have some farmOS Data Types! There's some discussion at farmOS/farmOS#231 about adding JSON Schema definitions to an API endpoint. This way, it might be possible to ingest the data type schemas (which might vary from one farmOS server to another) and generate the data models then. This will have to wait until farmOS 2.x, however
👍
👍
This sounds good! I think we will really only need to support OAuth and Session cookie auth, but adding support for Basic would be fairly easy too. However because the modules for Basic Auth aren't included with farmOS core, lets not prioritize that now (although this might be different with D8)
Yea, interested how we can do this! Also curious what are advantages/disadvantages of the different backends. I want to lean towards whichever makes it easiest for this that you bring up -> "How to support sync/async in the same code base" ! Also, @symbioquine are you familiar with the included python ConfigParser library? I've been using this in the latest releases as it makes it a bit easier to provide default configurations for the various OAuth settings. In certain instances, it might also be nice to save different "Profiles" in one config file - I'm curious what your thoughts are with that. This would be related to the |
Thanks for all the feedback @paul121!
Hmmm... there's a tradeoff there. If those types are directly exposed to clients of farmOS.py it won't be abstracting the underlying transport (http/JSON/etc) API anymore. I should update the proposal's design objectives to be clearer on the abstraction point so we can hash this out more fully. My thought was that the types should be part of farmOS.py so that it can abstract between farmOS versions and hide ugly parts of the transport API. e.g. why is the primary id field of an area called "tid"...
Agreed on ignoring Basic Auth for the time being.
Supporting different backends would allow the client to be used directly in existing frameworks/applications. This is especially important for the async cases where it needs to play nicely with whichever event loop is being used to manage the async co-routines. There are probably other differences between various backend options - performance, proxy configurability, TLS ciphers, etc. - but I don't know how important they are.
I have a prototype which already achieves this to some degree although it may require a bit of refining... Planning to add more detail to this proposal in the next couple days which should capture a structural diagram and the high level interfaces involved.
I've worked with a number of different configuration frameworks in past - including a very long time ago with ConfigParser... These days I'm pretty wary of baking "configuration" into APIs - especially early on or at too low of a level.
Don't get me wrong, there are scenarios where externalizing configuration makes sense, I'd just recommend waiting until we have a very distinct group of use-cases which are hard to serve any other way and their criteria upon which the API can be crafted. Even then it is probably wise to make the configuration functionality an optional "convenience" part of the API rather than a core part. |
Hmm yea, I see what you're saying. Perhaps certain aspects like the "tid" could be abstracted, while also pulling the schemas from the farmOS server. A good example for this might be the "Soil Test" log type - this module might not be enabled on the server. Similarly, certain field types on a "log" may be provided by optional modules. Once pulled from the server, maybe these models could still be abstracted?? I've been working a lot with the Pydantic library and would be excited to use that here for data validation! I actually just recently learned about its Settings Management features:
... Which might be an alternative to ConfigParser.. re: an API "configuration"...
Yeah, I agree that the configuration should be an optional "convenience" feature. Supporting Drupal Session Auth & OAuth Auth got to be fairly complicated from a single API level because parameters like A good use case for an API Configuration would be using the library to
Great!! Excited to continue thinking this through! 👍 |
Skimming through this a bit since I saw the topic of JSON Schema was mentioned...
I'm not sure if this speaks to your concern, depending on exactly what you meant by the "transport API", but the approach I've been taking with Field Kit is to take the schema data as a partially applied dependency, which could come from any sort of source, whether it's over the wire via http, or whether that schema is stored on disk, or what have you. Then that spits out some functions for creating and updating logs based on that schema. I've got a gist that demonstrates the basic usage I'm imaging (the actual implementation is still in the works): https://gist.github.com/jgaehring/8799af782404fdf8abbaa116b3e97614 I suppose this is not the purest form of type definition, b/c it still assumes JSON as the data interchange format, but I feel like if we're able to abstract the data model to JSON Schema, that brings us at least a step closer to something that would be truly format agnostic. I'm curious, do you know of any other methods of providing a spec for such data models that would be less coupled to a particular format? In the meantime, I've opened a new issue, farmOS/farmOS#243, specifically for adding JSON Schema to farmOS. I have among my long-term aspirations the dream of creating a local-first implementation of farmOS that could support the farmOS data model while not ever requiring connection to a farmOS server. It's my hope that the |
I want to explore this further in my reply to Jamie's comments, but I'm initially skeptical of the idea of dynamically basing the model types on a schema vended by farmOS at runtime. I think it may be difficult to do that and still provide a useful abstraction to consumers of farmOS.py.
Agreed that shoving all the parameter combinations into a single constructor/factory-method would get really complex - as you noted, I was trying to get around that with a separate static factory-method for each scenario. (I'll add a FAQ entry to the issue description tackling the possible combinatorial explosion of said factory methods.) That Pydantic settings management functionality looks like a good way to provide the configuration convenience methods we touched on above... async def create_session_client(url: str, username: str, password: str):
....
async def create_session_client_from_settings(settings: FarmOSSessionClientSettings):
return create_session_client(**settings.dict())
I'm interested to learn more about this use-case, but my initial reaction is that the actual sensor integration and accompanying environment might be out of scope for farmOS.py. I would imagine the level of support that farmOS.py might need to provide is; async with farmOS.create_asyncio_sensor(url="https://farmos.local", public_key="xxxxxx", private_key="yyyyyy") as sensor:
await sensor.publish(data={"temperature": 76.5, "humidity": 60})
await sensor.publish(data={"temperature": 76.5, "humidity": 60}, datetime=datetime.fromtimestamp(1541519720)) or; sensor_settings = FarmOSSensorSettings.parse_file("/path/to/sensor_settings.json")
async with farmOS.create_asyncio_sensor_from_settings(sensor_settings) as sensor:
... |
I mean the REST/JSON/Drupal-entity API represents one abstraction boundary which aligns with what actually travels over the wire - via the http(s) network transport. In its current incarnation, that "transport" API probably cannot be expected to abstract a wide range of current/future farmOS versions since it is partially an implementation detail of the underlying Drupal framework and likely needs to be able to change in order to allow farmOS itself to grow/evolve. On the other hand, farmOS.py represents an different kind of API boundary which could provide a stable abstraction to its consumers over a (reasonably) wide range of farmOS versions. To put this more concretely, ideally none of the code in farmOS/farmOS-aggregator or symbioquine/farm-os-area-feature-proxy should need to change to support the D8 version (and be backwards compatible with the D7 version) of farmOS if we do this correctly. (Other than making sure to use a new enough version of farmOS.py.)
I like where you're going with this, but to achieve what I've described above, the dependency would need to be a build-time dependency. i.e. model types or code could be generated from various schema at build-time, but unless I'm missing something, I don't think it would work at runtime without pushing version compatibility off on the consumers of farmOS.py.
I'm curious about the motivation for decoupling from JSON? This seems like a perfectly fine hard dependency to me - given the ubiquity of JSON and how well it is supported over a wide range of languages/tools. Do you know of any use-case for farmOS.py which cannot use JSON as a transport/interchange format?
Even in light of what I've said above, I agree that this would a be a useful feature.
I'd like to hear more about this. I've been considering something that seems related - that it would be useful to have a "mock/headless farmOS" reference implementation that just keeps the data in memory and hosts the transport API against which libraries like farmOS.py could be tested. |
@paul121 @jgaehring Also I've updated the issue description to add more examples, and add diagrams showing how the dependencies/methods of the parallel async/sync APIs could be organized. |
Added a references section, diagrams for the top-level module's methods, and a couple more FAQs. Also tagging @mstenta for comment since he was active on symbioquine/farm-os-area-feature-proxy#1 |
@symbioquine Really appreciate all the thought you've put into this!! 💪 Overall, I think this proposal is great and would bring some really great features to the python library! re: recent changes..
I like the idea of exposing pagination via the API. Currently, pagination is "exposed" via the returned response (see #15) and allows the consumer to request specific pages by passing a
all_areas = list(farm.area.iterate_query(filters={'area_type': 'field'})) The current implementation defaults to a "get all" action unless a
👍 👍 Great examples in the FAQ! Another benefit of the I'm still thinking on best ways to support Settings.. it would be great to avoid the second API method. I think think this would work just fine? # Build settings object from env var or read from file
settings = FarmOSOAuthClientSettings()
async def create_oauth_client(**settings):
.... Re: JSON Schema.... I think our general motivation for JSON Schema has been to support record validation in the client libraries. The motivation for this comes from different log types having different fields: (screenshot of This is important because some farmOS servers could have the
From my understanding, consumers will always be responsible for checking the supported record types on a farmOS server they connect to - which is different than "version compatibility" of the farmOS server. What we're trying to achieve is some additional validation in the library that can adapt to the farmOS server on runtime. I think this is possible! Pydantic has a cool feature for Dynamic model creation :
This is really great because the dynamic model can extend a base model. So..
Note, the dynamic These thoughts are pretty conceptual, but I think that it might work. Worst case, the base |
@paul121 Really appreciate all the thoughts/feedback here!
I think we should disallow controlling the pagination via the For example; page = farm.area.query_page(filters={'area_type': 'field'}, page_num=3, max_page_size=10) Calls farmOS as;
Seems like a good simplification as long as we document it well enough. My thought was that the method is very cheap to implement and provides a nice top-level place to put that documentation in the code...
These are good points. I'll need to put a bit more thought into this part...
I think you're right that there are several connected concerns here.
Validation makes total sense, we can't/shouldn't get around the fact that some farmOS installations may support a sub/superset of the fields. Especially when writing data to farmOS, consumers of farmOS.py will necessarily be coupled to the abstract feature set of the specific farmOS installation(s) they are working with. e.g. As you said above, you can't write log records with quantities if the I'm mostly worried about consumers of farmOS.py not being able to predict the structure of the model objects they get back - or the structure of the objects they can create locally. To make this a bit more concrete, as a consumer of data via farmOS.py I would hope I don't have to be defensive about the existence of the For example, instead of writing; for quantity in getattr(log, 'quantities', []):
print(quantity) I should be able to write; for quantity in log.quantities:
print(quantity) If the
Agreed that we should try not to make farmOS.py a barrier to working with any of the data in a reasonably canonical farmOS installation. |
Yea, this is exactly what I'm dealing with in my I guess one way in which we're solving this with Field Kit is that we're mirroring the modularity of the server with similar modularity in the client. As we move further towards the Field Module architecture, this means that Field Kit itself makes fewer and fewer assumptions about the server's configuration, and instead each field module becomes more responsible for declaring its own dependencies explicitly. And again, I'm also allowing for some of that modularity by structuring But that's application code. I'm not sure if it would be possible to take a similar modular approach with the farmOS.py library, but perhaps it's a thought? Also, I'm still not sure about this:
It seems to me that there will always be some details (eg, configuration, modules, dependencies, etc) that need to be shared knowledge in a server-client architecture, and that cannot always be available at build-time. But I also don't see runtime dependencies as a barrier to abstraction. Maybe I'm still missing something though? |
@jgaehring Thanks for the inputs!
Yes, I think this is a key point. Field Kit can safely make certain assumptions and provide behavior which effectively yield an implied interface. The contract for that interface is: "If your farmOS installation behaves like X, these N features will be available in farmOS.app". Users of farmOS.app then presumably dynamically adapt their behavior based on the available features - and (hopefully) that feels natural to them. An API like farmOS.py is most useful though if it can reduce the amount of variables consumers have to deal with - and where that's not possible, provide predicable ways to deal with those variables.
Certain kinds of runtime dependencies are a barrier to abstraction. For example, I'm skeptical that it would be wise to generate the actual model Python classes at runtime since that would create the issue I described above where one has to be defensive when consuming farmOS.py even for the most basic attribute access cases. Whether given modules are actually enabled and whether the corresponding data can ever be returned or saved from/to a particular farmOS installation is definitely only known at runtime. |
Yea, this is nice. More explicit 👍
@symbioquine @jgaehring I think we're finally reaching the nitty-gritty details of this, great discussion 😃 Field Kit is a good example of why these server "dependencies" should be passed through the client to the consumer:
In general, I think farmOS.py should operate similarly. But perhaps there are different use cases. The
For some use cases (I'm hesitant to say most) I think this behavior (always returning an empty list/adding other fields to logs) could be problematic. As a consumer, if you're using log Also, I think its worth noting if a Perhaps a good example is the farmOS-Aggregator which is designed to simply relay the records retrieved by farmOS.py. If farmOS.py added an empty
This seems like a general challenge 3rd parties working with farmOS will have to work around. Most of the time, it can be expected that most of the core modules will be available. But thinking to the future, very custom modules could provide additional fields to farmOS records for specific use cases. The farmOS client libraries should still support these extensions.. but that might require consumers to be "defensive" about the structure of the models that are returned. If the library makes decisions that change the models the farmOS server is actually returning, isn't that a problem? 🤔 Regarding creating records, a quick test found that this request... new_log = farm.log.send({'name': "New test log.", 'type':'farm_soil_test', 'done':1, 'seed_source':'blah'}) gets this response from the server ... Question: are there any use cases where a consumer cannot be defensive? Is that inconvenience worth manipulating the response from the server? Perhaps this could be a configuration for the library, to return "raw" data, or (for lack of a better term) "clean" data? @symbioquine I must admit that I might be stuck in thinking about only a subset of use-cases for farmOS.py. In general I think of two use cases:
I'm glad you created the forum post to collect more thoughts on what is expected of the library! I might list my thoughts on use-cases there. |
Wow @symbioquine this is great! And so dense with ideas and considerations haha! Great job getting into all the details @paul121 and @jgaehring! I can't think of much more to add. There may be a few little details here and there that might need minor correction/clarification, but I think the overall issues remain the same. For example: the quantity module is a hard dependency of many log types, so the quantity field will not be missing on some sites and there on others. BUT, the "equipment used" field and the new "data" field ARE added dynamically depending on whether or not their module is enabled. Now this in itself is up for debate and I may be swayed to make them not optional if that simplifies things... the data field is an easy one... the "equipment used" field would put a hard dependency on the Equipment module, however, which I'm less sure about. As you can see we could probably spin off on lots of different tangents with these thoughts. :-) In general, I love all the goals you laid out @symbioquine - and figuring out the details/considerations to move us towards them, even if that is done iteratively, is worthwhile IMO. |
And of course... we should also consider the changing API options in Drupal 8. We will probably be adding the GraphQL module, in addition to the core JSON:API module, which will open up new possibilities... it would be worth playing around with some of those things early in this process as well. |
We have some funding to make improvements to farmOS.py and farmOS-aggregator over the next few months. I'm tracking these in a v2 milestone: https://github.com/farmOS/farmOS.py/milestone/1 This issue is a little dated now but still has lots of great info and ideas. Some of these things have already been implemented in farmOS/farmOS.py, but certainly not everything (see initial issue description for a nice summary). There a couple ideas from this issue I'd love to pick up in this next around of work:
I think we could close this issue but will leave that for you @symbioquine - I'd prefer to move forward in some smaller issues but want to make sure we don't dismiss any large topics you mentioned here |
I've been digging into possible farmOS.py API structures and (relatively) "recent" improvements in Python's support for async co-routines as prompted by the discussion on symbioquine/farm-os-area-feature-proxy#1.
I plan to use this issue to capture my thoughts and see if we can converge on a design for a next major version of this library that could support a wider range of use-cases more gracefully.
Design goals
API Sketching
Synchronous Usage
Simple login/password session client;
Yields;
FarmOSFarmInfo(name='Test0', url='http://farmos.local', api_version='1.1')
OAuth2 client;
Asynchronous Usage
All API usage should be identical except that;
farmOS.create_*_client
methods are used to create the clientawait
is used for calling all methods which directly return a valueasync with
is used anywhere you would usewith
in synchronous usageasync for
is used when iterating over pages of results - or derivatives thereofAreas
By Id
Yields;
FarmOSArea(area_id='8', area_type='field', name='cactus_field', description='<p>This is the description!</p>\n', flags=['priority', 'monitor', 'review'])
Pagination
or
Yields;
Iteration
Yields;
Creation
Yields;
FarmOSArea(area_id='10', area_type='field', name='neon_field', description='<p>This is the vibrant description!</p>\n', flags=[])
Updates
Yields;
None/throws
Deletion
or
Yields;
None/throws
How to support sync/async in the same code base
The layers that do most of the heavy lifting get written in the async style, but have no concrete external dependencies themselves.
These classes get put into python files named
async_*.py
and the classes are named following the patternAsync*
.Then we generate the non-async versions from that with some code seen below. The top level of the farmOS library then provides static factory methods for wiring together the various permutations of the sync/async client layers.
Structure
Diagram .png files include embedded draw.io source.
Asynchronous
Synchronous
Asynchronous Sensors
Synchronous Sensors
FAQs
It seems like you are trying to "boil the ocean" with this issue... is all this tractable to tackle as a single "issue"?
Maybe not. I expect the API described here to take a lot of work to implement, document, and test thoroughly.
My purpose as mentioned above is to polish a vision for how the farmOS Python API should be. If we can align on that vision, a branch can be used to track the development against that vision until it can be released as a new major version.
Why expose pagination explicitly?
This gives consumers of the farmOS.py library the flexibility to handle large amounts of data in/from farmOS performantly. It would be really challenging or impossible to completely hide the pagination and still provide that flexibility.
Why are there no "get all" or "query all" methods which return collections, only pagination and iteration?
This would unnecessarily increase the "surface area" of the API while encouraging unperformant usage patterns that hold all entities/records in memory.
Consumers which require the entities/pages as collections can easily do so themselves;
Will those static factory methods on the top-level module of farmOS explode combinatorially as the library supports more kinds of transports, auth mechanisms, etc?
Possibly, but this also lets us be opinionated about the supported combinations and provide concrete documentation with embedded code examples for each one.
How would the proposed API support retrieving multiple records by id in a single request - i.e. #14?
Naively, it would seem that this should be supported by the
get_by_id
methods - e.g. onFarmOSTaxonomyClient
- however that would probably be a mistake since that would make the return type different depending whether a single or multiple ids are passed.A better approach would be to support this via the
filters
argument to thequery_page
,iterate_query_pages
,iterate_query
methods. This conveniently fits into both the proposed python API (which already return multiple records for those methods) and the existing Drupal entity "transport" API simply by adding a simple requirement for the handling of thefilters
argument;Thus the example from #14 would be supported by the proposed API as follows;
Calls farmOS as;
Conveniently, this also allows for other kinds of cool queries to execute efficiently from a transport API perspective;
Calls farmOS as;
References
async
/await
keywordsThe text was updated successfully, but these errors were encountered: