Add default way to specify upstream data for this node #69

kzecchini · 2020-03-03T19:50:59Z

Right now, custom nodes each apply their own logic to find upstream data, sometimes by filtering on keys. However, upstream operations might give those keys different names or a different format.

I think there may be a standard way to find upstream data that we could implement in the AbstractNode class. Every node would accept a standard configuration that is used to search for the upstream data the custom node needs. In this way we can specify the potential "data args" to a node.

For example, if I am looking for data_1 but it is keyed to my_upstream_key_1, a configuration can provide this mapping for us. Example:

...
class: MyCustomNode
upstream_data:
  filter_for_key: my_filter
  data_1_key: my_upstream_key_1
  data_2_key: my_upstream_key_2
...
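
A minimal sketch of how such a configuration could be resolved, assuming the upstream data arrives as a flat dict. The function name `resolve_upstream_data` and the `<arg>_key` naming rule are illustrative assumptions, not existing code; `filter_for_key` is skipped because its semantics are not defined here.

```python
from typing import Any, Dict, Mapping

# Assumption: this key's behavior is not specified in the proposal, so it is ignored.
RESERVED_KEYS = {"filter_for_key"}


def resolve_upstream_data(
    upstream: Mapping[str, Any], upstream_data_config: Mapping[str, str]
) -> Dict[str, Any]:
    """Map upstream keys onto the argument names a node expects (hypothetical helper)."""
    resolved: Dict[str, Any] = {}
    for config_key, upstream_key in upstream_data_config.items():
        # Only "<arg>_key" entries describe a remapping.
        if config_key in RESERVED_KEYS or not config_key.endswith("_key"):
            continue
        arg_name = config_key[: -len("_key")]
        if upstream_key not in upstream:
            raise KeyError(f"upstream key {upstream_key!r} not found for {arg_name!r}")
        resolved[arg_name] = upstream[upstream_key]
    return resolved


config = {
    "filter_for_key": "my_filter",
    "data_1_key": "my_upstream_key_1",
    "data_2_key": "my_upstream_key_2",
}
upstream = {"my_upstream_key_1": [1, 2, 3], "my_upstream_key_2": 5, "unrelated": "x"}
data = resolve_upstream_data(upstream, config)
```

Raising on a missing upstream key, rather than returning None, keeps a misconfigured mapping from failing later inside the node.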

Our documentation for each class can include the data which is needed in the data_object, for example:

data_1 (pd.DataFrame): dataframe of training data
data_2 (int): number of cv folds
...

In this way, whenever we search upstream we can always find the data by including an optional remapping.
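
One way the optional remapping might look inside the base class, as a rough sketch. The class below is a stand-in, not the real AbstractNode; the `data_args` attribute and `find_upstream_data` method are hypothetical names, and falling back to the argument's own name when no remapping is given is an assumption.

```python
from typing import Any, Dict, Mapping, Optional, Tuple


class AbstractNode:
    """Stand-in for the real AbstractNode; only the upstream lookup is sketched."""

    # Subclasses list the data arguments they document as required.
    data_args: Tuple[str, ...] = ()

    def __init__(self, upstream_data: Optional[Mapping[str, str]] = None) -> None:
        # Optional remapping taken from the node's configuration.
        self.upstream_data = dict(upstream_data or {})

    def find_upstream_data(self, data_object: Mapping[str, Any]) -> Dict[str, Any]:
        found: Dict[str, Any] = {}
        for arg in self.data_args:
            # Use the remapped key if configured, otherwise the argument name itself.
            key = self.upstream_data.get(f"{arg}_key", arg)
            if key not in data_object:
                raise KeyError(f"no upstream data under {key!r} for {arg!r}")
            found[arg] = data_object[key]
        return found


class MyCustomNode(AbstractNode):
    data_args = ("data_1", "data_2")


# data_1 is remapped; data_2 is found under its default name.
node = MyCustomNode(upstream_data={"data_1_key": "my_upstream_key_1"})
result = node.find_upstream_data({"my_upstream_key_1": "train_df", "data_2": 5})
```

Keeping the lookup in the base class means a custom node only declares the names it needs, and the configuration absorbs any upstream renaming.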


briangrahamww closed this as completed Mar 5, 2020

briangrahamww reopened this Mar 5, 2020
