[ENH] Add and Validate n_layers, n_units, activation & dropout_rate kwargs to MLPNetwork #2338
base: main
Conversation
Thanks for taking care of it.
In general:
- All parameters should be defined as private attributes in the constructor before anything else is done.
- The assertions and all of the list checking should live only in the build_network method, not in the constructor; this avoids causing issues on CI (check, for example, how the FCN network is implemented). A sketch of this split follows after this list.
- Also, given that we are parametrizing the network, the associated classifier and regressor should be parametrized in the same way (docs included) and should pass these arguments through when calling the network.
- I left some other comments to check.
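A minimal sketch of what this split could look like, loosely following the FCN network's pattern; the class name, defaults, and helper logic below are illustrative assumptions rather than the final implementation:

```python
# Illustrative only: parameters are stored untouched in __init__, and all
# list expansion and validation happens inside build_network.
class _MLPNetworkSketch:
    def __init__(self, n_layers=3, n_units=500, activation="relu", dropout_rate=None):
        self.n_layers = n_layers
        self.n_units = n_units
        self.activation = activation
        self.dropout_rate = dropout_rate

    def build_network(self, input_shape, **kwargs):
        # expand a single string to one activation per layer, then validate
        self._activation = (
            [self.activation] * self.n_layers
            if isinstance(self.activation, str)
            else list(self.activation)
        )
        assert len(self._activation) == self.n_layers, (
            "activation must have one entry per layer"
        )
        # ... same treatment for n_units and dropout_rate, then build the
        # keras layers and return (input_layer, output_layer)
```

The classifier and regressor would then expose the same keyword arguments, document them, and forward them unchanged when constructing the network.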
    ):
        super().__init__()

        self._n_layers = n_layers

        if isinstance(activation, str):
Better to declare all of them on self before defining the internal versions, i.e. self.activation = activation first and then define self._activation.
        assert all(
            isinstance(a, str) for a in activation
        ), "Activation must be a list of strings."
        assert (
You are asserting the len() twice here.
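A hedged sketch of collapsing the duplicated check into one assertion (n_layers here stands for whatever the validated layer count ends up being called):

```python
# One combined check instead of asserting len(activation) twice: verify the
# element types and the length in a single assertion.
assert (
    all(isinstance(a, str) for a in activation)
    and len(activation) == n_layers
), "activation must be a list of n_layers strings"
```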
    ):
        super().__init__()

        self._n_layers = n_layers
No need to define a private underscored version for this one, since it won't change.
        self._activation = activation

        if dropout_rate is None:
            self._dropout_rate = [0.2] * self._n_layers
The default value of dropout is not 0.2 in the original implementation (check the main branch); it is 0.1, then 0.2, then 0.2.
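For reference, a sketch of defaulting that matches the original network's values (how the default should extend when n_layers differs from 3 is left open here):

```python
if self.dropout_rate is None:
    # defaults of the original MLP: 0.1, then 0.2, then 0.2 (not 0.2 everywhere)
    self._dropout_rate = [0.1, 0.2, 0.2]  # assumes the default n_layers == 3
```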
        Number of units in each dense layer.
    activation : Union[str, list[str]], optional (default='relu')
        Activation function(s) for each dense layer.
    dropout_rate : Union[int, float, list[Union[int, float]]], optional (default=None)
I would require dropout_rate to always be a float, and document that it should be between 0 and 1.
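A sketch of the stricter validation being suggested (the _dropout_rate name is assumed from the surrounding diff, and the message wording is just an example):

```python
# Require floats with values between 0 and 1 for every layer's dropout rate.
assert all(
    isinstance(d, float) and 0.0 <= d <= 1.0 for d in self._dropout_rate
), "dropout_rate must be a float (or a list of floats) between 0 and 1."
```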
        assert all(
            isinstance(d, (int, float)) for d in dropout_rate
        ), "Dropout rates must be int or float."
        assert (
Asserting on len() twice here as well.
-        layer_2 = keras.layers.Dropout(0.2)(layer_1)
-        layer_2 = keras.layers.Dense(500, activation="relu")(layer_2)
+        x = keras.layers.Dropout(self._dropout_rate[0])(input_layer_flattened)
Better to include this block in the loop, where each block in the loop is one dropout layer followed by a dense layer.
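A sketch of the suggested loop, assuming _n_units, _activation, and _dropout_rate have already been expanded to per-layer lists:

```python
# Each iteration is one block: a Dropout layer followed by a Dense layer,
# which also replaces the hard-coded first block before the loop.
x = input_layer_flattened
for i in range(self._n_layers):
    x = keras.layers.Dropout(self._dropout_rate[i])(x)
    x = keras.layers.Dense(self._n_units[i], activation=self._activation[i])(x)
```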
-        output_layer = keras.layers.Dropout(0.3)(layer_3)
+        output_layer = keras.layers.Dropout(0.3)(x)
I would define a parameter called dropout_last, defaulting to 0.3, and document it in detail, explaining that this dropout layer is applied at the end.
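A sketch of that shape, with the parameter name and default taken from this comment and everything else assumed:

```python
# In __init__: self.dropout_last = dropout_last, with dropout_last=0.3 as the
# documented default. In build_network, it replaces the hard-coded 0.3:
output_layer = keras.layers.Dropout(self.dropout_last)(x)
```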
Closes #2337.