Add optional datadict mapping config to AbstractNode #35

Open
briangrahamww opened this issue Dec 11, 2019 · 0 comments
Labels
enhancement New feature or request

Comments

briangrahamww (Collaborator) commented Dec 11, 2019

@kzecchini brought up an issue where the sklearn model node requires a hardcoded datadict key that is used by the sklearn preprocessing node, but not necessarily by other nodes that may connect to the sklearn model node from upstream. His example is a TF node that creates embeddings, which he wants to cluster using sklearn.

One option is to have a custom key converter / datadict mapping node to rename the datadict key, but creating a whole new node for that seems like overkill. The other option is adding a node config parameter to specify the key to write to the data dict.

We have a lot of redundant code in each of our custom nodes to get upstream data with a specific key and assign it to a variable. We write this code often enough that we should probably abstract this functionality into primrose itself.
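For concreteness, the repeated pattern looks roughly like this inside a custom node's run method (a sketch only; the exact get_upstream_data signature and the "data_train" key are illustrative):

def run(self, data_object):
    # every custom node repeats this: fetch upstream data, then pull out a hardcoded key
    upstream_data = data_object.get_upstream_data(self.instance_name)
    X = upstream_data["data_train"]  # hardcoded key that one particular upstream node happens to write
    ...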

One option:

for_example:
  class: ExampleNode
  upstream_data:
    new_data_key_1: upstream_data_dict_key_1
    new_data_key_2: upstream_data_dict_key_2
    ...

would take data_object.data_dict[upstream_data_dict_key_1] and copy it to data_object.data_dict[new_data_key_1], but in your code you would still have to call get_upstream_data and then use the new key.
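A rough sketch of what that could look like if primrose applied the mapping just before run is called (the method name and the self.node_config attribute below are hypothetical):

def _apply_upstream_data_mapping(self, data_object):
    # hypothetical helper on AbstractNode, invoked before self.run(data_object)
    mapping = self.node_config.get("upstream_data", {})
    for new_key, upstream_key in mapping.items():
        # copy the upstream entry under the key this node expects
        data_object.data_dict[new_key] = data_object.data_dict[upstream_key]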

Option 2:

for_example:
  class: ExampleNode
  upstream_data:
    object_1: upstream_data_dict_key_1
    object_2: upstream_data_dict_key_2
    ...

would assign data_object.data_dict[upstream_data_dict_key_1] to object_1.

The question now is: how does this new object get passed to your custom node the way data_object is?
Maybe add *args to AbstractNode.run, i.e. AbstractNode.run(self, data_object, *args), along with methods on AbstractNode that are run on init?
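A minimal sketch of how that could be wired up, assuming the DAG runner resolves the mapping and passes the results as keyword arguments (node_config and the call site below are hypothetical):

# hypothetical: done by the runner, not by the node author
resolved = {
    local_name: data_object.data_dict[upstream_key]
    for local_name, upstream_key in node.node_config.get("upstream_data", {}).items()
}
node.run(data_object, **resolved)

# the custom node's signature would then look something like:
# def run(self, data_object, object_1=None, object_2=None):
#     ...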

I'm just brainstorming some ideas here. Feel free to weigh in @kzecchini

briangrahamww added the enhancement label Dec 11, 2019
briangrahamww reopened this Mar 5, 2020