Formalize process for code merging into tf.image, tfa.image, and keras-preprocessing #1780
Comments
From the GSoC project description:
What will happen if users start training models and serializing overlapping ops in Addons that already exist, or will soon appear, in other places (developed from scratch or ported from Google's AutoAugment repos)? Do we need to maintain these overlapping ops in Addons for model backward compatibility? Do we need to stop processing image ops PRs for now? What about the morphology custom ops that we already have in the repo?
This is a good topic, so I want to make sure we fully cover it... does the information below cover the issues you're seeing? When a preprocessing layer is serialized, it will contain package information so that the correct layer is re-loaded upon de-serialization: for example, a TFA preprocessing layer would be marked as package `Addons`. If there were underlying custom ops being called by that Python layer, those too would have package information. Our worst case for compatibility would be that a custom op gets called from a preprocessing layer and the exact version of TFA may need to be used upon re-loading. But then, we already don't provide backwards compatibility for custom ops. If the preprocessing layer were to reference a Python op from Addons that had been migrated, then it should alias the new location of that Python op (tf-core most likely).
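To make the serialization point concrete, here is a minimal sketch assuming TF 2.x's `tf.keras.utils.register_keras_serializable` API; the `RandomSolarize` layer is a hypothetical example, not an actual TFA layer:

```python
import tensorflow as tf

# Registering a layer under the "Addons" package: the serialized config
# then identifies it as "Addons>RandomSolarize", so the matching class is
# looked up on de-serialization.
@tf.keras.utils.register_keras_serializable(package="Addons")
class RandomSolarize(tf.keras.layers.Layer):  # hypothetical example layer
    def __init__(self, threshold=128, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold

    def call(self, images):
        # Invert pixels at or above the threshold.
        return tf.where(images < self.threshold, images, 255 - images)

    def get_config(self):
        # Record constructor arguments so the layer round-trips through
        # serialization.
        config = super().get_config()
        config.update({"threshold": self.threshold})
        return config
```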
I suppose that part of the @gabrieldemarmiesse EDIT: some points touched in this ticket were also confusing for students, as you can see at tensorflow/tensorflow#37274
Sorry @seanpmorgan @bhack, apparently I missed this comment! Let me try to answer your question from several perspectives:
Cheers,
We are trying to improve the process at: I think we could solve 1. and 2. if we find a common vision in these templates. On the TF side, I think you just need to add an extra check to the internal and public PR/review triage checklists to verify whether the code is already in SIGs. The main issue with 3 is that without a minimal short-range roadmap of the ops planned for keras-preprocessing by the TF team's development activities and the GSoC student's activities, we are at risk of receiving PRs here in Addons that conflict, duplicate, or overlap. I think nobody here wants to waste free contributors' time, so the best solution is to have a minimal overview of the plan in this area. E.g., if we know that the TF team is not going to add new image operators and we have a public list of ops in the GSoC roadmap, it would be clear to potential contributors what kind of image ops PRs we are looking for here in Addons.
We don't have any short-term plans to add ops. For keras-preprocessing, they are described here and already implemented. What AutoAugment requires (such as solarize and sample pairing) will not be made into core. Francois and I are discussing the roadmap for keras_image, which might include some of these.
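For context, ops like these are small pure-TF functions; a minimal sketch of the two mentioned above (simplified versions, not the exact AutoAugment implementations):

```python
import tensorflow as tf

def solarize(image, threshold=128):
    # Invert pixels at or above the threshold; assumes a uint8 image
    # with values in [0, 255].
    return tf.where(image < threshold, image, 255 - image)

def sample_pairing(image_a, image_b, weight=0.5):
    # Blend two images; a simplified take on SamplePairing.
    blended = ((1.0 - weight) * tf.cast(image_a, tf.float32)
               + weight * tf.cast(image_b, tf.float32))
    return tf.cast(blended, image_a.dtype)
```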
So where is the GSoC student going to create PRs? On keras_image?
@tanzhenyu Thank you for the update, but it is a little unclear to me. Are vision models already being refactored in the Model Garden vision section and related TF Hub activities? So what will we have in keras-cv?
We believe it's best to have each repo do its own thing and do it really well. So Model Garden (IMHO) should handle user-facing, end-to-end modeling workflows that can quickly get deployed, while Keras-CV should handle low-level research needs, providing reusable components for building models. At the end of the day, Model Garden should reuse Keras-CV components to build its models.
OK, this part is clear, and it is what I hoped for: reusable components, not components embedded every time in the model itself (Model Garden).
I'm not so sure about this second part because it could confuse users and conflict with the multiple storage sources for models. I've already seen in the past many ambiguous entries between the TPU repo, TensorFlow Model Garden, Coral, TensorFlow Lite, and so on for model storage/reference. If you start to store overlapping models in Keras and in other places, you risk confusing people a lot.
We don't intend to have overlapping models (in Keras-CV), and frankly, having many models in different places is exactly what is confusing users today, I think?
Is that a question? If it is a question for me: yes, that was one of the confusing topics. At least the scope of TF Hub was to mask these multiple sources behind "reusable parts of machine learning models", and Keras support was one of the first feature requests in TF Hub when it was launched (tensorflow/hub#13).
You clearly understand our plans well :-)
@tanzhenyu Yes, I was talking about transfer learning, but then you need to reuse the utils and related pieces, and generally many of these are embedded in the model code itself. So this was the main issue: all the preprocessing, postprocessing, and glue code was not reusable in transfer learning or "model forks".
Sorry that I don't follow -- can you elaborate on the "main issue" part?
Other than the TF Hub common interface for transfer learning, users want to reuse model components that are generally embedded in the model itself. So I hope that if we have a new model like, e.g., EfficientNet, we can use its augmentation as a general component for other model designs too, and that we have one reliable model source instead of Keras Applications, the TPU repository, the AutoML repository, Model Garden, then keras-cv, and so on.
IIUC -- as of today, Keras Applications is still the only one that provides a full suite of backbones for image classification tasks, and it doesn't have the issue you mentioned.
I guess it doesn't matter too much where it worked and where it didn't. What is important is that it works as a whole in the ecosystem for what we offer/expose to users. I hope we are not going to conflict with, or create too many duplicates of, the model request/contribution policy that we already expose for Model Garden: https://github.com/tensorflow/models/wiki/How-to-contribution. As for the reusable-components part of the topic, my position is at tensorflow/community#223 (you can substitute the Addons target with any SIG). I think it is, in a way, a collective effort of the whole ecosystem not to expose too much team fragmentation (every group working on its own stuff) to the end users.
Per #2156, migrations will now be handled by core TF and Keras team members to make the migration process more manageable.
Currently there are no communication channels or criteria for what is being merged into the different repositories. This creates duplicated code within the ecosystem and wastes contributor time and effort.
As an example, we've tried to extend `ImageProjectiveTransformV2` without realizing that TF-core had already implemented the same functionality: tensorflow/tensorflow@2699281 (a short sketch of calling the core op directly appears at the end of this issue).

@tanzhenyu Could you give us some clarification on what the internal process is for adding components to `tf.image`? Can we publish a roadmap of features that are planned to be added, and can we make verifying against Addons part of the process for adding new things going forward?

@dynamicwebpaige Could you help clarify what's being contributed to Keras in this GSoC project: https://summerofcode.withgoogle.com/projects/#4863446367076352
The auto-augment issue caused us to receive a lot of PRs implementing functionality in Addons that now looks as though it may be going directly to Keras.
CC @karmel for visibility and to see if you have any ideas for how we can formalize this process.
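For reference on the duplication mentioned above, a minimal sketch of invoking the core op directly through `tf.raw_ops`, assuming a TF 2.x build that exposes `ImageProjectiveTransformV2` (the identity transform here simply returns the input):

```python
import tensorflow as tf

images = tf.zeros([1, 64, 64, 3], dtype=tf.float32)
# Each transform is the first 8 entries of a flattened 3x3 projection
# matrix; this one is the identity.
transforms = tf.constant([[1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]])

transformed = tf.raw_ops.ImageProjectiveTransformV2(
    images=images,
    transforms=transforms,
    output_shape=tf.constant([64, 64], dtype=tf.int32),
    interpolation="BILINEAR",
)
```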
Related issues (To be expanded upon because there are many):
#1779
#1126
#1275