Init the refactoring #143
Merged
Conversation
nhuet force-pushed the refactoring branch 4 times, most recently from a246ea9 to 3500ee8 on March 18, 2024 at 22:24
Starting from python 3.9, we can
- replace Tuple, Dict, List, Type from typing by the builtins tuple, dict, list, type
- replace typing.Callable by collections.abc.Callable
- replace typing.Sequence by collections.abc.Sequence
See https://peps.python.org/pep-0585/
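For illustration, a minimal before/after of the kind of annotation change PEP 585 enables (the function itself is a made-up example):

```python
# Before (python < 3.9): generic aliases imported from typing
from typing import Callable, Dict, List, Tuple

def summarize(xs: List[float], key: Callable[[float], float]) -> Dict[str, Tuple[float, float]]:
    ys = [key(x) for x in xs]
    return {"range": (min(ys), max(ys))}

# After (python >= 3.9, PEP 585): builtin generics + collections.abc
from collections.abc import Callable

def summarize(xs: list[float], key: Callable[[float], float]) -> dict[str, tuple[float, float]]:
    ys = [key(x) for x in xs]
    return {"range": (min(ys), max(ys))}
```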
This will reflect the structure of keras.layers. For instance:
- Dense can be found in keras.layers.core.dense,
- DecomonDense will be found in decomon.layers.core.dense
Some imports are not working anymore because of the partial refactoring
With the refactoring, some code is not relevant anymore.
This is a multid equivalent of batch_dot() on tensors. We perform a dot product on tensors by summing over a range of axes instead of a single axis. In the first tensor, we perform it on the last axes, starting from a given one. In the second tensor, we perform it on the first axes (after the batch axis), ending at a given one. The option `missing_batchsize` is used to apply batch_multid_dot even when the batch axis is missing for one of the entries. (In that case, we use keras.ops.tensordot under the hood.) This can be useful when combining affine bounds having a batch size with a layer affine representation without the batch dimension. By default, the number of axes to merge is the number of non-batch axes in the first arg, so that a linear operator represented by a tensor `w` operates on an input tensor `x` as `batch_multid_dot(x, w, missing_batchsize=(False, True))`.
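A minimal numpy sketch of the semantics described above (the `nb_merging_axes` parameter name is an assumption; the real implementation works on keras tensors):

```python
import numpy as np

def batch_multid_dot(x, w, nb_merging_axes=None, missing_batchsize=(False, False)):
    """Sketch: dot product summing over a range of axes, batch element by
    batch element. `missing_batchsize` flags entries without a batch axis."""
    x_missing, w_missing = missing_batchsize
    if nb_merging_axes is None:
        # default: merge all non-batch axes of the first argument
        nb_merging_axes = x.ndim - (0 if x_missing else 1)
    if w_missing:
        # w has no batch axis: contract the last axes of x against the
        # first axes of w; the batch axis of x (if any) is carried along
        return np.tensordot(x, w, axes=nb_merging_axes)
    if x_missing:
        # x has no batch axis: skip w's batch axis when contracting,
        # then move that batch axis to the front of the result
        x_axes = list(range(x.ndim - nb_merging_axes, x.ndim))
        w_axes = list(range(1, 1 + nb_merging_axes))
        out = np.tensordot(x, w, axes=(x_axes, w_axes))
        return np.moveaxis(out, x.ndim - nb_merging_axes, 0)
    # both have a batch axis: flatten the merged axes and batch-matmul
    batch = x.shape[0]
    kept_shape = x.shape[1 : x.ndim - nb_merging_axes]
    out_shape = w.shape[1 + nb_merging_axes :]
    merged = int(np.prod(x.shape[x.ndim - nb_merging_axes :]))
    out = np.einsum(
        "bkm,bmo->bko",
        x.reshape(batch, -1, merged),
        w.reshape(batch, merged, -1),
    )
    return out.reshape((batch, *kept_shape, *out_shape))
```

With `missing_batchsize=(False, True)`, an `x` of shape `(batch, *in)` contracts against a batchless `w` of shape `(*in, *out)`, giving a result of shape `(batch, *out)`.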
The same layer class is now used for forward and backward propagation. The goal is to simplify the implementation of the decomon version of a custom layer, for users wanting to use decomon algorithms on a model involving such a custom layer. When implementing a new decomon layer, it is sufficient to implement
- get_affine_bounds(): returns an affine relaxation of the keras layer
- get_ibp_bounds(): returns a constant relaxation of the keras layer

In the case of a linear keras layer, one can set the class attribute `linear` to True and then implement get_affine_representation() instead of get_affine_bounds(). Some computations are simplified in this case; in particular, no oracle bounds are needed to propagate affine bounds. One can also, for performance reasons, directly override forward_affine_propagate() and backward_affine_propagate() instead of implementing get_affine_bounds().

The main attributes of the decomon layers are
- layer: the underlying keras layer
- perturbation_domain: the domain on which we propagate bounds
- ibp: whether we propagate constant bounds (forward only)
- affine: whether we propagate affine bounds (meaningless for backward)
- propagation: forward or backward, the direction of bounds propagation

The ibp and affine booleans replace the previous modes, as they seem to be the relevant variables to test inside decomon layer methods. A sketch of the linear case follows below.
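For illustration, a minimal sketch of such a linear decomon layer, assuming a hypothetical keras layer `Scale` computing `scale * x`; the import path, the use of keras.ops, and the exact attribute names are assumptions based on the description above:

```python
from keras import ops as K
from decomon.layers import DecomonLayer  # import path assumed

class DecomonScale(DecomonLayer):
    """Hypothetical decomon counterpart of a custom linear keras layer
    `Scale` computing layer(x) = scale * x (scale a fixed scalar)."""

    linear = True  # linear keras layer: get_affine_representation() suffices

    def get_affine_representation(self):
        # w, b such that layer(x) == x * w + b. Here we use the diagonal
        # representation, in which w has the same shape as b.
        b = K.zeros(self.layer.output.shape[1:])  # non-batch output shape (assumes the layer is built)
        w = self.layer.scale * K.ones_like(b)
        return w, b
```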
Previously, model_input_dim was used under the hypothesis that the model input was flattened. With this change, that hypothesis can be dropped.
We update get_lower_box and get_upper_box so that we can use multid tensors (like images) for x_min and x_max. For now,
- we do not update get_lower_ball,
- we do not treat the case where w.shape == b.shape.
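A sketch of the box lower bound with multid x_min/x_max, reusing the numpy `batch_multid_dot` sketch above (the signature is assumed; the upper bound is symmetric):

```python
import numpy as np

def get_lower_box(x_min, x_max, w, b):
    """Sketch: lower bound of x -> batch_multid_dot(x, w) + b over the box
    [x_min, x_max]; x_min/x_max may be multid tensors (e.g. images)."""
    w_pos = np.maximum(w, 0.0)  # positive part: minimized at x_min
    w_neg = np.minimum(w, 0.0)  # negative part: minimized at x_max
    return batch_multid_dot(x_min, w_pos) + batch_multid_dot(x_max, w_neg) + b
```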
We only implement get_affine_representation(). We could probably compute faster, for tensor-multid inputs, by avoiding artificially creating a "big" weights representation, and working directly with the kernel itself.
Less naive version where we directly override forward_ibp_propagate, forward_affine_propagate, and backward_affine_propagate to avoid repeating artificially the kernel, in case of tensor-multid inputs.
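For the IBP part, the idea behind such a direct override is standard interval bound propagation through an affine map; a numpy sketch (the function name is hypothetical):

```python
import numpy as np

def dense_forward_ibp_propagate(x_min, x_max, kernel, bias):
    """Sketch: propagate constant bounds through a Dense layer using the
    kernel directly, without building a repeated "big" weights tensor."""
    k_pos = np.maximum(kernel, 0.0)
    k_neg = np.minimum(kernel, 0.0)
    lower = x_min @ k_pos + x_max @ k_neg + bias
    upper = x_max @ k_pos + x_min @ k_neg + bias
    return lower, upper
```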
"Diagonal" tensors can be represented by their diagonal. In that case, batch_multid_dot simplifies and results in a mere element-wise product, with the correct broadcasting. Diagonal tensors are tensors that can be, batch element by batch element represented as x_full = K.reshape(K.diag(K.ravel(x_diag)), x_diag.shape+x_diag.shape ) x_diag being their "diagonal" representation that can be multid. It will be useful when we represent an affine operator by (w, b) with weights tensor w of the same shape as bias tensor b.
When layer affine bounds or affine bounds to propagate are represented in diagonal mode (i.e. w.shape == b.shape), we need to specify it to batch_multid_dot. For the non-naive DecomonDense implementation, this is not a trivial task in backward mode, so it is not yet implemented.
By convention, empty affine bounds mean identity bounds, i.e. w_l = w_u = identity and b_l = b_u = 0; thus we return the other bounds unchanged in that case.
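A sketch of how this convention plays out when applying affine bounds (the helper name is hypothetical; it reuses the `batch_multid_dot` sketch above):

```python
def apply_affine_bounds(x, affine_bounds):
    """Sketch: evaluate affine output bounds at a point x."""
    if not affine_bounds:
        # empty bounds stand for identity: w_l = w_u = I, b_l = b_u = 0,
        # so the input is returned unchanged
        return x, x
    w_l, b_l, w_u, b_u = affine_bounds
    return (
        batch_multid_dot(x, w_l) + b_l,
        batch_multid_dot(x, w_u) + b_u,
    )
```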
… types: empty affine bounds => identity bounds; diagonal bounds; bounds w/o batch
…ounds For empty affine bounds (i.e. identity), we add get_lower_x and get_upper_x methods that return lower and upper bounds on the model input x. We implement them in the box case. We also take care of diagonal / w/o-batchsize inputs in the box case. Ball perturbation domains are to be completed later.
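A sketch of the box case (the class shape and the packing convention are assumptions: the perturbation input is taken here to stack the box corners along axis 1):

```python
class BoxDomain:
    """Sketch of the two new accessors for a box perturbation domain,
    assuming x packs the corners as x[:, 0] = x_min and x[:, 1] = x_max."""

    def get_lower_x(self, x):
        return x[:, 0]

    def get_upper_x(self, x):
        return x[:, 1]
```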
Also add a notebook showing the graphs of several decomon models, with an example of customization to see more attributes and change the color of a specific layer.
- InputsOutputsSpecs:
  - remove all methods from the previous api
  - remove the perturbation_domain attribute
  - move into decomon.layers.inputs_outputs_specs
- ForwardMode: removed, replaced by the ibp + affine booleans
- PerturbationDomain and related stuff -> decomon.perturbation_domain
- decomon.core -> decomon.constants: contains only enumerations
- decomon.keras_utils: remove unused operations like BatchDiag and BatchIdentity
- decomon.utils: keep only activation relaxations, moved into decomon.layers.activations.utils. Remove get_linear_hull_s_shape(), which relies on the old inputs_outputs_specs api (and thus needs ForwardMode => cannot be imported)
- decomon.metrics: removed, as it relies on the previous api and would fail to import (needs ForwardMode)
- decomon.wrappers: kept but untested. Needs to be adapted
- decomon.wrappers_with_tuning: removed, as it needs decomon.metrics
- decomon.models.crown: removed. Specific layers for conversion (like ReduceCrown, Fuse, ...) are now in dedicated modules within decomon.layers
nhuet force-pushed the refactoring branch 2 times, most recently from 7e2c25b to d3cbec2 on March 19, 2024 at 13:11
(no meaning outside BoxDomain)
Remove xfail for add + crown + multid
(using data_format metadata)
Full refactoring of the library
The main goal is to ease the introduction of custom layers by users, and thus the implementation of new decomon layers.
Main features:
- `DecomonLayer`:
  - `get_affine_representation()` -> `w, b` so that `layer(x)` can be written as `x * w + b`
  - `forward_ibp_propagate()`: propagation of constant bounds through the layer
  - `get_affine_relaxation()` -> `w_l, b_l, w_u, b_u` such that `x * w_l + b_l <= layer(x) <= x * w_u + b_u`

  See the tutorial tutorials/z_Advanced/custom_layer.ipynb for more details.
- Some `w` are represented (batch by batch) by their "diagonal" (potentially multid) so that, for a given batch, `w_generic = reshape(diag(ravel(w_diag)), w_diag.shape + w_diag.shape)`. In that case, `w.shape == b.shape`.
- `batch_multid_dot` manages the product `x * w`. This is a batch-per-batch dot product of tensors allowing x to be multi-dimensional. This function is responsible for managing all the above-mentioned representations of `w`. It plays the role of `batch_dot()` or the `Dot` layer, calling `tensordot()` under the hood.
- `ForwardMode` is replaced by booleans `ibp` and `affine` (to unify the existing mix).
- Dedicated layers for conversion: `Fuse`, `ReduceCrownBounds`, `ConvertOutput`, `ForwardInput`, `BackwardInput`, `DecomonOracle`, ... The oracle bounds come from dedicated `DecomonOracle` layers that convert either the output of a forward layer from a first forward conversion, or the affine bounds from a subcrown launched starting from the layer of interest. In the future, other oracles could be used to inject external information on keras layers' input bounds.
- `InputsOutputsSpecs`: also responsible for detecting the representation of affine bounds.

Goodies:
- `plot_model()` allows showing the graph of a decomon model. This is a modified version of the original keras `plot_model()` (and thus still works on plain keras models).

To be continued: