Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-15991: Infogram pydocs updates #15992

Open
wants to merge 10 commits into
base: rel-3.46.0
Choose a base branch
from
Open

Conversation

shaunyogeshwaran
Copy link
Contributor

@shaunyogeshwaran shaunyogeshwaran commented Dec 28, 2023

fixes #15991
Crafted new examples to showcase the different parameters.

@shaunyogeshwaran shaunyogeshwaran added this to the 3.44.0.4 milestone Dec 28, 2023
@shaunyogeshwaran shaunyogeshwaran self-assigned this Dec 28, 2023
@shaunyogeshwaran shaunyogeshwaran linked an issue Jan 3, 2024 that may be closed by this pull request
@shaunyogeshwaran shaunyogeshwaran changed the base branch from master to rel-3.44.0 January 3, 2024 07:34
@hannah-tillman hannah-tillman changed the title [DOCS] Infogram pydocs updates GH-15991: Infogram pydocs updates Jan 5, 2024
hannah-tillman
hannah-tillman previously approved these changes Jan 12, 2024
@mn-mikke mn-mikke modified the milestones: 3.44.0.4, 3.46.0.2 Mar 4, 2024
tomasfryda
tomasfryda previously approved these changes Mar 5, 2024
Copy link
Contributor

@tomasfryda tomasfryda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thank you @shaunyogeshwaran !

@@ -745,6 +762,23 @@ def total_information_threshold(self):
information is the x-axis of the Core Infogram. Default is -1 which gets set to 0.1.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of scope but I don't understand Default is -1 which gets set to 0.1.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the first line is not correct since the default is -0.1.

A number between 0 and 1 representing a threshold for total information, defaulting to 0.1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wendycwong any ideas about what's going on with the default values here?

(I can update the schema and fix the first line issue)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hannah-tillman I understand it now.

It comes from here:

https://github.com/h2oai/h2o-3/blob/master/h2o-admissibleml/src/main/java/hex/Infogram/Infogram.java#L185-L187

@wendycwong any reason why we not set it directly here?

Nevertheless, its out of scope of this PR. @shaunyogeshwaran.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@valenad1 @hannah-tillman

The value of -1 or -0.1 is used to denote that the user has not set any value. If the user has not set any value, we will set it to a default value of 0.1. There is a reason that the code needs to know if the user set that value. I cannot remember what it is now.

>>> x = train.columns
>>> x.remove(y)
>>> pcols = ["SEX", "MARRIAGE", "AGE"]
>>> ig = H2OInfogram(protected_columns=pcols, data_fraction=0.7)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work the same as split_frame?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -768,6 +802,23 @@ def net_information_threshold(self):
the y-axis of the Core Infogram. Default is -1 which gets set to 0.1.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. First line says number between 0-1,... Default -1 sets to 0.1..

@@ -816,6 +884,23 @@ def safety_index_threshold(self):
gets set to 0.1.

Type: ``float``, defaults to ``-1.0``.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. First line says number between 0-1,... Default -1 sets to 0.1..

@@ -792,6 +843,23 @@ def relevance_index_threshold(self):
which gets set to 0.1.

Type: ``float``, defaults to ``-1.0``.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. First line says number between 0-1,... Default -1 sets to 0.1..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @valenad1 has a good point. We could just set those values to 0.1 as default instead of setting it to -1 and then set it to 0.1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me make a quick change in a new PR.

Copy link
Collaborator

@valenad1 valenad1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the examples works, but I would use different number than default to show how the parameter change the result, and I would rather avoid any unused stuff in the example. So the test should be used for some metric be deleted from example

Copy link
Collaborator

@valenad1 valenad1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @shaunyogeshwaran. Can you please run the all the examples ones again. E.g this will fail for sure.

@valenad1
Copy link
Collaborator

And please merge it to re-3.46.0, the rel-3.44.0 is not used anymore.

@valenad1
Copy link
Collaborator

@wendycwong

import h2o
from h2o.estimators.infogram import H2OInfogram
h2o.init()
f = "https://erin-data.s3.amazonaws.com/admissible/data/taiwan_credit_card_uci.csv"
col_types = {'SEX': "enum", 'MARRIAGE': "enum", 'default_payment_next_month': "enum"}
df = h2o.import_file(path=f, col_types=col_types)
y = "default_payment_next_month"
x = df.columns
x.remove(y)
gbm_params = {'ntrees':3}
pcols = ["SEX", "MARRIAGE", "AGE"]
ig = H2OInfogram(protected_columns=pcols)
ig.train(y=y, x=x, training_frame=df, algorithm_params=gbm_params)
ig.algorithm_params

gives me:

TypeError                                 Traceback (most recent call last)
/var/folders/2z/9gdhqbns0djdj9crb242sgc00000gn/T/ipykernel_9099/4280293040.py in <cell line: 13>()
     11 pcols = ["SEX", "MARRIAGE", "AGE"]
     12 ig = H2OInfogram(protected_columns=pcols)
---> 13 ig.train(y=y, x=x, training_frame=df, algorithm_params=gbm_params)
     14 ig.algorithm_params

~/h2o-3/_valenad-notebooks/../h2o-py/build/main/h2o/estimators/infogram.py in train(self, x, y, training_frame, verbose, **kwargs)
   1202                                                                                    " and <= 1."
   1203 
-> 1204         parms = sup._make_parms(x,y,training_frame, extend_parms_fn = extend_parms, **kwargs)
   1205 
   1206         sup._train(parms, verbose=verbose)

TypeError: _make_parms() got an unexpected keyword argument 'algorithm_params'.

Ticket here.

@valenad1 valenad1 modified the milestones: 3.46.0.2, 3.46.0.3 May 17, 2024
@shaunyogeshwaran shaunyogeshwaran changed the base branch from rel-3.44.0 to rel-3.46.0 May 20, 2024 07:44
@shaunyogeshwaran shaunyogeshwaran dismissed stale reviews from tomasfryda and hannah-tillman May 20, 2024 07:44

The base branch was changed.

h2o-bindings/bin/custom/python/gen_infogram.py Outdated Show resolved Hide resolved
h2o-bindings/bin/custom/python/gen_infogram.py Outdated Show resolved Hide resolved
h2o-bindings/bin/custom/python/gen_infogram.py Outdated Show resolved Hide resolved
h2o-bindings/bin/custom/python/gen_infogram.py Outdated Show resolved Hide resolved
h2o-py/h2o/estimators/infogram.py Outdated Show resolved Hide resolved
h2o-py/h2o/estimators/infogram.py Outdated Show resolved Hide resolved
h2o-py/h2o/estimators/infogram.py Outdated Show resolved Hide resolved
h2o-py/h2o/estimators/infogram.py Outdated Show resolved Hide resolved
@valenad1 valenad1 modified the milestones: 3.46.0.3, 3.46.0.4 Jun 10, 2024
@valenad1 valenad1 modified the milestones: 3.46.0.4, 3.46.0.5 Jul 9, 2024
@valenad1 valenad1 modified the milestones: 3.46.0.5, 3.46.0.6 Aug 22, 2024
@valenad1 valenad1 removed this from the 3.46.0.6 milestone Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Infogram pydocs updates
6 participants