WG Data proposal #673
I am very strongly opposed to using the name My proposal for the name is: Where "data" can mean both actual data (spark) and metadata (model registry). We can also split it up in the future, if the members who are maintaining these components diverge. |
very well noted @thesuperzapper , as also marked here: I just wanted to have a branch where to start collecting this kind of feedback in a non-sparse way and also to report back to you and the group on the progress on Tuesday meetings. |
@thesuperzapper how about we make it more explicit |
As it currently stands, this WG does not meet the requirement for diverse leadership given all chairs come from one company (IBM - which owns RedHat). |
@thesuperzapper Andrey is listed as a Chair, he's from Apple |
noticing only now it was not marked as Draft PR despite being my intent:
my sincerest apologies. Marked as Draft PR per the original message in the thread. |
@thesuperzapper Is there a minimum number of companies to compose the chair to make the WG eligible? |
While there is no specific number requirement, the steering committee (currently, @jbottum @james-jwu) must approve the new WG in line with the community's interests. I would expect at least some concern with having 4 leads from one company and only 1 from another. For reference, here is the lifecycle and other info about forming a working group: Also, there are only meant to be 2-3 chairs; some other WGs have more, but in most cases there are 2 active members and we just need to formally clean up the inactive chairs. |
Also, some of the proposed chairs are not even current Kubeflow org members, so are ineligible unless they go through that process first: |
Thank you for the references! Those are valid points though, and I'll see how we can work on the eligibility topic as well as your concerns. |
As Ricardo noted, thanks! Is there guidance on deputies who can keep WG work going during leaves, please? As noted, I will work to account for all the feedback received; thank you, it is very helpful. |
Thank you for starting this @tarilabs! Let's collaborate on this PR for the WG Charter and Name. Please provide your suggestion on how we should name this WG, which will initially have the Spark Operator and Model Registry components. A few initial suggestions if WG Lifecycle is too ambitious:
This is a valid concern @thesuperzapper. We can add folks from the Spark Operator maintainers to this WG |
cc @kubeflow/wg-training-leads |
I would request "WG ML Lifecycle" if the purpose of the group is to house things in the MLOps orbit that don't have a more specific working group yet so they can "incubate". Data Preparation, Feature Store, and Model Registry are 3 recently discussed examples that likely aren't big enough yet to have their own working group. I guess one key aspect here is to consider how new efforts can happen without the overhead of setting up a new working group for each one until it is truly merited and bandwidth is available. Is there an existing process for refactoring a topic out of one working group into a new working group? |
Kubeflow seems to be entering a new growth phase. The community needs a structure to support add-on components (Spark, Ray, Model Registry, Feature Store, etc). We want to encourage contributors and users to meet, discuss, experiment, decide, store code and produce documentation with a goal that integrations will help both Kubeflow and the add-on projects. We need to minimize overhead. We need to set expectations (of support...to/from Kubeflow and for users) especially if we are experimenting and trying to find market acceptance. Most importantly, we need active user participation, comment and leadership. I want to move this forward...I am a +1 to adding a single umbrella WG for all of these projects to get things moving. @james-jwu would you please provide your thoughts |
I think that the name
Also, I am still very against Separately to the discussion around names, I think we should confirm that the maintainers of these various components are actually overlapping, otherwise it will make it difficult for this "mega working group" to function. |
+1 to @thesuperzapper I would suggest voting for |
New commit ae188fe incorporates some feedback received around:
will keep posted during KF Community meeting on any further updates. |
Just so we are clear, I think |
- name: model-registry
  owners:
  - https://raw.githubusercontent.com/kubeflow/model-registry/main/OWNERS
- name: spark-operator
- name: spark-operator
- name: feast
  owners:
  - https://raw.githubusercontent.com/feast-dev/feast/master/OWNERS
- name: spark-operator
I believe I should not add subprojects belonging outside of github.com/kubeflow here, what is the @kubeflow/kubeflow-steering-committee view on this?
Please review the charter. |
Would this working group be relevant for the minio replacement (seaweedfs) as well? I am currently working on a PoC in Kubeflow/manifests. |
address: kubeflow#673 (comment) Signed-off-by: tarilabs <[email protected]>
as suggested. Signed-off-by: tarilabs <[email protected]> Co-authored-by: Francisco Javier Arceo <[email protected]>
I've added all changes pertaining to Feast in a single commit, fa3c318, so that this addition to the WG charter is easier to manage if changes are required or following feedback from the SC. |
Not entirely sure; to me that is more of a "storage"-related concern, while the "data"-related concerns expressed here are more orthogonal to the actual medium.
I'm very happy however to engage in discussions, since "storage" is also a dimension we're exploring for Model Registry (bringing in OCI as first class, but potentially others with an abstraction layer). Let me know your thoughts! |
Thank you for addressing the feedback @tarilabs! Given that we still have a discussion around WG governance and what projects WGs should maintain: #673 (comment), should we include the Feast addition as a separate PR after a follow-up discussion? From my point of view, initially we should just establish the Data WG with 2 Kubeflow components: Spark Operator and Model Registry, and after that we can update the charter to include Feast and other projects that we want to maintain under this WG. Any thoughts @franciscojavierarceo @kubeflow/kubeflow-steering-committee @tarilabs? |
I agree with you @franciscojavierarceo, but should we include Feast in the Data WG once we make Feast part of the Kubeflow core components? |
Per my comment in the Community meeting, I support Feast as part of the WG Data and as a core KF component. I am glad to pursue that path or another, if that cannot be accomplished (as I believe a defined relationship would help both communities). |
@andreyvelich I am okay including Feast before making it a core component. :) |
Then kubeflow/manifests#2826 and kubeflow/pipelines#10998 might be interesting for you. |
@juliusvonkohout This issue is related to Kubeflow Pipelines (e.g. Pipelines WG), isn't it? |
Anyone who needs S3 storage in Kubeflow, but especially pipelines. |
Bumping this PR. What is missing to get this merged? |
I think we need to make a decision on Feast. |
I'm following up on action item: raise WG proposal to Kubeflow per yesterday's Model Registry meeting (recording timestamp).
As discussed in KF community meeting.
Main links:
👉 I'm starting to raise a draft PR in order to "seed/bootstrap" the work in raising the request to form the WG--using a draft PR gives us a branch we can collaborate on between stakeholders @andreyvelich @Tomcli @dhirajsb @rimolive
This also gives us a medium we can keep tabs on, so as to report back on progress during Tuesdays' community plenary meetings, wdyt?