Split RandomResizedCrop into two API surfaces (RandomlyZoomedCrop, RandomCropAndResize) #738
Conversation
Hi, I tried to test the new functionality but I'm getting an error every time.
@martin-gorner In your case it is an AutoGraph issue, caused by these debug print statements: print("original_height ", original_height) print("original_width ", original_width)
Got it, will take a look.
According to #676 (comment) we would need to know the dimensions of the input image. Is there any workaround for this?
These images do have a fixed shape, even if TF's static shape inference lost it. Maybe tf.shape can help?
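(A minimal sketch of that workaround, assuming only the static shape information was lost: tf.shape returns the runtime shape as a tensor even when image.shape reports None.)

```python
import tensorflow as tf

def dynamic_height_width(image):
    # image.shape may be (None, None, 3) once static inference loses the
    # dimensions; tf.shape(image) recovers them at runtime as int32 scalars.
    shape = tf.shape(image)
    return shape[0], shape[1]
```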
/gcbrun
Great, did you test with RaggedTensors?
Yes, I tested it using your notebook itself. I was wondering if there is any way to display the original images as well, so we can directly compare the effects of the layer. Here is the gist.
@AdityaKane2001 looks like the tests fail!
Thank you @AdityaKane2001. The gist is looking good.
I just noticed that RandomCrop and RandomResizedCrop do not use the same arguments for the target size: width, height vs. target_size (a tuple of (height, width)). Could these be aligned?
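(To illustrate the mismatch; the signatures below reflect the layers as discussed in this thread and are assumptions, since the API was still in flux at this point.)

```python
import keras_cv

# RandomCrop takes the target size as two scalar arguments...
crop = keras_cv.layers.RandomCrop(height=224, width=224)

# ...while the crop-and-resize layer takes a single (height, width) tuple.
crop_resize = keras_cv.layers.RandomCropAndResize(
    target_size=(224, 224),
    crop_area_factor=(0.8, 1.0),
    aspect_ratio_factor=(3 / 4, 4 / 3),
)
```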
I tested some more and I see black bars in the output.
I will fix the tests after we have finalized the API. I'll take a look and get back to you. Thanks for pointing that out.
@AdityaKane2001 any updates?
Sorry for the delay, but I am not able to pinpoint why the resulting images contain black bars.
Hi Aditya,
@martin-gorner @LukeWood @sayakpaul Please take a look at this gist. I have tried to incorporate all cases of output dimensions mentioned in the table below. I did not observe any distortion or black bars. Please let me know if there are any inconsistencies I might have missed. Shorthand used for brevity: nw: new_width
I will clean up the code and make sure that it is graph-mode compatible (we cannot use comparison operators directly; see the sketch below). Please let me know if the functionality is as expected; I'll make the rest of the changes all at once after that is confirmed. Sorry for the delay on my side so far 😓
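(A minimal sketch of the kind of rewrite meant here, not the PR's actual code: a Python comparison on symbolic tensors fails under tf.function, so the branch has to stay inside the graph via ops like tf.minimum, tf.where, or tf.cond.)

```python
import tensorflow as tf

@tf.function
def clamp_crop_size(crop_size, image_size):
    # `if crop_size > image_size:` would raise during graph tracing because
    # the truth value of a symbolic tensor is unknown; tf.minimum keeps the
    # comparison as a graph op instead.
    return tf.minimum(crop_size, image_size)
```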
We can unit test for this, right? Using tf.ones(), RRC, and then asserting no values are 0.
@LukeWood Yes we can (a sketch of such a test follows below). I just wanted to get the functionality correct first; compatibility and tests can be handled easily after that.
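(A sketch of the proposed test; the layer name and arguments are assumed from the final API and may not match this point in the PR.)

```python
import tensorflow as tf
import keras_cv

class NoBlackBarsTest(tf.test.TestCase):
    def test_output_contains_no_zero_padding(self):
        layer = keras_cv.layers.RandomCropAndResize(
            target_size=(224, 224),
            crop_area_factor=(0.8, 1.0),
            aspect_ratio_factor=(3 / 4, 4 / 3),
        )
        images = tf.ones((4, 512, 512, 3))
        outputs = layer(images)
        # If the crop window ever extends past the image, the resize pads
        # with zeros, which would surface here as exact-zero pixels.
        self.assertAllGreater(outputs, 0.0)

if __name__ == "__main__":
    tf.test.main()
```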
@AdityaKane2001 please update the
@AdityaKane2001 let's actually do bounding box support in a follow-up; this PR is already big enough, and the previous implementation was bugged.
/gcbrun
Check out keras-team/keras-cv#738 for more information. Once this is merged we're breaking backwards compatibility to have a much nicer API name.
Accompanying PR: I checked KerasCV's keras.io docs: no references, so we are safe to break this API for now.
All tests are passing; we have #823 to migrate to the new API.
Thank you @AdityaKane2001 for your hard work on these APIs. I feel confident that both will be useful, and now we can quickly ship OD workflows, contrastive workflows, etc.
…RandomCropAndResize) (keras-team#738)
* Sync
* Added zoom factor to RRC
* Used tf.shape
* dtype mismatch
* Debugging...
* Debugging...
* Fix example
* RRC uses preprocessing.transform now
* Minor error
* Minor bug
* minor issue
* minor issue
* minor bug
* Added unit tests
* Fix serialization test
* KerasCV simclr api update
* Augmenter
* Augmenter
* serialization test
* serialization test
* fix failing test
* Split RRC API into two layers
* Split RRC API into two layers
* Format serialization_test
* Implemented bounding box support
* Add preprocessing
* serialization test
* serialization test
* serialization test
* RandomCropAndResize in SimCLR
* RandomCropAndResize in SimCLR
* Update examples
* Update examples
* Update examples
* Update target_size

Co-authored-by: Luke Wood <[email protected]>
* [nightly] Increase version to 0.15.0.dev64 * Updates for contrastive model saving. * [nightly] Increase version to 0.15.0.dev65 * [nightly] Increase version to 0.15.0.dev66 * [nightly] Increase version to 0.15.0.dev67 * [nightly] Increase version to 0.15.0.dev68 * [nightly] Increase version to 0.15.0.dev69 * [nightly] Increase version to 0.15.0.dev70 * [nightly] Increase version to 0.15.0.dev71 * Update losses to use Loss reduction. Losses previously computed the mean loss over the examples within the call() method. This may create issues when using multi GPU training. The call() method now returns the per example loss, and the final loss is computed using the losses.Loss reduction method. We also updated the from_config() method to include the parent class's reduction and name args. * Resnet18 returns as a SimilarityModel. We may want Resnet18 as a regular model, but keeping the output type as SimilarityModel to avoid mixed output types. * Fix various mypy and linter errors. * Add support for contrastive_model save and load. * Update unsupervised notebook with save and load. * Update the save and load. Add updated example and docs for save and load in the supervised hello world. * Updates to visualization notebook. * [nightly] Increase version to 0.15.0.dev72 * Unsupervised notebook update. * [nightly] Increase version to 0.15.0.dev73 * [nightly] Increase version to 0.15.0.dev74 * [nightly] Increase version to 0.15.0.dev75 * [nightly] Increase version to 0.15.0.dev76 * Notes on the unsupervised notebook draft. * [nightly] Increase version to 0.15.0.dev77 * [nightly] Increase version to 0.15.0.dev78 * [nightly] Increase version to 0.15.0.dev79 * Remove get_backbone() method and just have users access the backbone attribute directly. * Add new diagrams and updated copy to teh unsupervised notebook. * [nightly] Increase version to 0.15.0.dev80 * [nightly] Increase version to 0.15.0.dev81 * First finished draft of unsupervised_hello_world notebook * Updates to the README file. Add self-supervised info. * [nightly] Increase version to 0.15.0.dev82 * [nightly] Increase version to 0.15.0.dev83 * Update README.md * Remove augmentation arg from architectures. Architectures previously took a callable stack of augmentation layers that would be added after the input of the model. This could cause issues with saving and training on TPU. Users are now expected to add augmentation to either the data samplers / datasets or manually add it to the model. * Clean up example dir. * Fix flake8 errors in architectures. * Update API docs. 
* Bump version to 0.15.0 * Bump minor version to 0.16.0.dev0 * [nightly] Increase version to 0.16.0.dev1 * [nightly] Increase version to 0.16.0.dev2 * [nightly] Increase version to 0.16.0.dev3 * Distance and losses refactor (#222) * refactor distances call signature and add appropriate tests * refactor metrics for new distance call signature * make similarity losses compatible with asymmetric and non-square distance matrices * adapt and add test * remove newline * [nightly] Increase version to 0.16.0.dev4 * [nightly] Increase version to 0.16.0.dev5 * [nightly] Increase version to 0.16.0.dev6 * [nightly] Increase version to 0.16.0.dev7 * [nightly] Increase version to 0.16.0.dev8 * Cross-batch memory (XBM) (#225) * initiate XBM loss * add todo * add XBM tests * WIP: XBM serialization * XBM serialization * class docstring * remove todo * improve docstring * remove comment * [nightly] Increase version to 0.16.0.dev9 * [nightly] Increase version to 0.16.0.dev10 * [nightly] Increase version to 0.16.0.dev11 * [nightly] Increase version to 0.16.0.dev12 * [nightly] Increase version to 0.16.0.dev13 * [nightly] Increase version to 0.16.0.dev14 * [nightly] Increase version to 0.16.0.dev15 * [nightly] Increase version to 0.16.0.dev16 * [nightly] Increase version to 0.16.0.dev17 * [nightly] Increase version to 0.16.0.dev18 * [nightly] Increase version to 0.16.0.dev19 * [nightly] Increase version to 0.16.0.dev20 * [nightly] Increase version to 0.16.0.dev21 * [nightly] Increase version to 0.16.0.dev22 * Augmentor for Barlow Twins (#229) * Use list(range()) instead of comprehension as it is more pythonic. * Create barlow.py * Bump three in /tensorflow_similarity/visualization/projector_v2 (#228) Bumps [three](https://github.com/mrdoob/three.js) from 0.132.2 to 0.137.0. - [Release notes](https://github.com/mrdoob/three.js/releases) - [Commits](https://github.com/mrdoob/three.js/commits) --- updated-dependencies: - dependency-name: three dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Restructure class to be like Augmenter * Minor fixing of dead links (#230) * Fixed dead links * augmenter main to master * Spelling changes Auto Augment * MixupAndCutmix main to master * RandAugment main to master * RandomErasing main to master * Update SimCLRAugmenter.md * Update ClassificationMatch.md * Update ClassificationMetric.md * Update Evaluator.md * Update MemoryEvaluator.md * Update SimilarityModel.md * Update BinaryAccuracy.md * Update F1Score.md * Update FalsePositiveRate.md * Update NegativePredictiveValue.md * Update Precision.md * Update Recall.md Co-authored-by: Owen Vallis <[email protected]> * Fix minor typos (#226) Co-authored-by: Owen Vallis <[email protected]> * Update barlow.py * Update barlow.py * Update setup.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * revisions * Update __init__.py * Update __init__.py * Update color_jitter.py * Update barlow.py * Update barlow.py * Update barlow.py * Update setup.py Co-authored-by: Owen S Vallis <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Owen Vallis <[email protected]> Co-authored-by: Genrry Hernandez <[email protected]> * Fixed some bugs in augmenter. 
(#232) * Create barlow.py * Restructure class to be like Augmenter * Update barlow.py * Update barlow.py * Update setup.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * revisions * Update __init__.py * Update __init__.py * Update color_jitter.py * Update barlow.py * Update barlow.py * Update barlow.py * Update setup.py * fixed some bugs * Remove seed instance variable Co-authored-by: Owen Vallis <[email protected]> * [nightly] Increase version to 0.16.0.dev23 * [nightly] Increase version to 0.16.0.dev24 * [nightly] Increase version to 0.16.0.dev25 * [nightly] Increase version to 0.16.0.dev26 * [nightly] Increase version to 0.16.0.dev27 * [nightly] Increase version to 0.16.0.dev28 * [nightly] Increase version to 0.16.0.dev29 * [nightly] Increase version to 0.16.0.dev30 * [nightly] Increase version to 0.16.0.dev31 * [nightly] Increase version to 0.16.0.dev32 * [nightly] Increase version to 0.16.0.dev33 * [nightly] Increase version to 0.16.0.dev34 * [nightly] Increase version to 0.16.0.dev35 * [nightly] Increase version to 0.16.0.dev36 * [nightly] Increase version to 0.16.0.dev37 * [nightly] Increase version to 0.16.0.dev38 * [nightly] Increase version to 0.16.0.dev39 * [nightly] Increase version to 0.16.0.dev40 * [nightly] Increase version to 0.16.0.dev41 * [nightly] Increase version to 0.16.0.dev42 * [nightly] Increase version to 0.16.0.dev43 * [nightly] Increase version to 0.16.0.dev44 * [nightly] Increase version to 0.16.0.dev45 * [nightly] Increase version to 0.16.0.dev46 * Added test coverage for augmentation functions + barlow, simCLR augmenter (#235) * Create test_blur.py * Create test_color_jitter.py * Create test_crop.py * Create test_flip.py * Update test_crop.py * Update test_color_jitter.py * Create test_solarize.py * Create test_augmenters.py * Update test_flip.py * Update test_flip.py * Update test_flip.py * Update blur.py * Update blur.py * [nightly] Increase version to 0.16.0.dev47 * Change augmenters to use augmentation_utils (#238) * Fix corrupted JSON formatting in unsupervised notebook. * Added features of SplitValidationLoss callback to EvalCallback (#242) * Added features of SplitValidationLoss callback to EvalCallback Merged SplitValidationLoss into EvalCallbaclk * Refactored EvalCallback using utils.unpack_results * [nightly] Increase version to 0.16.0.dev48 * [nightly] Increase version to 0.16.0.dev49 * [nightly] Increase version to 0.16.0.dev50 * VicReg Loss - Improvement of Barlow Twins (#243) * VicReg Loss * Update vicreg.py * Update vicreg.py * Update vicreg.py * fix big bug * Update vicreg.py * Update vicreg.py * fixes * Update vicreg.py * [nightly] Increase version to 0.16.0.dev51 * [nightly] Increase version to 0.16.0.dev52 * Update tests for algebra.py * Coverage now at 100% * Convert tests to use tf.testing.TestCase * [nightly] Increase version to 0.16.0.dev53 * [nightly] Increase version to 0.16.0.dev54 * Fix corrupted formatting in visualization notebook. * [bug] Fix multisim loss offsets. The tfsim version of multisim uses distances instead of the inner product. However, multisim requires that we "center" the pairwise distances around 0. Here we add a new center param, which we set to 1.0 for cosine distance. Additionally, we also flip the lambda (lmda) param to add the threshold to the values instead of subtracting it. These changes will help improve the pos and neg weighting in the log1psumexp. 
* [nightly] Increase version to 0.16.0.dev55 * [bug] In losses.utils.logsumexp() tf.math.log(1 + x) should be tf.math.log(tf.math.exp(-my_max) + x). This is needed to properly account for removing the rowwise max before computing the logsumexp. * Make Augmentation Utilities More Customizable(reupload due to branch issues) (#255) * modifications of benchmark * test commit 123 * new changes to training * testo changes * works in colab... kind of * code is neat now * working on sampler problem * Update barlow.py * Update blur.py * Update color_jitter.py * Update color_jitter.py * Update barlow.py * Update barlow.py * Added vicreg for sync * Update vicreg.py * Update vicreg.py * Update vicreg.py * Update barlow.py * randomresizedcrop edits * Update barlow.py * allow to customize loss reduction * Update __init__.py * Delete sampler_test.py * Delete benchmark/supervised directory * Update barlow.py * added docstring on random_resized_crop * Allow user to set normalization * Update barlow.py * Update barlow.py * Update setup.py * remove pipfile * Delete Pipfile * Delete Pipfile.lock * Update cropping.py * Update cropping.py * Additive multiplicative changes * Update simclr.py * change additive, multiplicative * Update barlow.py * Update solarize.py * Update barlow.py * Update solarize.py * Update barlow.py * Update test_solarize.py * Update test_solarize.py * Update test_solarize.py Co-authored-by: Owen Vallis <[email protected]> * Refactor test_basic to use TestCase to improve flaky test results. * Fix Flake8 warnings. * Freeze all batchnorm architecture layers. We now freeze all BN layers when loading pre-trained weights in the effnet and resnet50 architectures. Previously, we only froze the BN layers if trainable was partial or frozen. When trainable was full, the BN layers would be trainable as well and this led to suboptimal training losses. * Improve EfficientNetSim docstring and type hints (#254) * Fix typos in docstring * Remove reference to image augmentation Image augmentation was previously removed, so purge it from the comment and docstring. * Correct input image type annotation * Fix #251. Check for model._index before calling Indexer methods. The Indexer is core to a number of the Similarity model methods. Add support for checking if the index exists and return a more informative AttributeError if the index hasn't been created yet. * Set random seeds for tfrecord samplers test. * All augmenters use the Tensor type from tensorflow_similarity.types. * [nightly] Increase version to 0.16.0.dev56 * Fix Tensor type error in callbacks. Unpacking the Lookup objects converts the python types to Tensors. This can lead to Tensor type errors. This commit adds support for taking the expected dtype of the model Tensors where possible. We also fix a bug where the EvalCallback was not logging the split metric names in the history. * Update doc strings in color_jitter. * Update the create index AttributeError text * [nightly] Increase version to 0.16.0.dev57 * Update Notebook examples. * Remove unneeded tf.function and register_keras_serializable decorators. Subclasses of tf.keras.losses.Loss will trace all child functions and we only need to register the subclassed loss to support deserialization. * Simplify MetricEmbedding layer. * Fix mypy type error in simsiam. Convert all constants to tf.constant. * Simplify the MetricEmbedding layer. Subclass layers.Dense directly. This simplifies the layer and also fixes function tracing during model save. * Fix test_tfrecord_samplers tests. 
* Update api documentation. TODO: This just generated the new docs. We still need to go through and clean up the documentation. * Update doc string and api for MetricEmbedding layer. * Bump to version 0.16 * Fix static type check error in memory store. The np.savez functions expect array_like values but we were passing List. Casting as np array should solve the issue. * Fix effnet test for TF 2.9 * Fix TFRecordDatasetSampler now returns correct number of examples per batch. * Bump dev version to 0.17.0.dev0. * [nightly] Increase version to 0.17.0.dev1 * [nightly] Increase version to 0.17.0.dev2 * [nightly] Increase version to 0.17.0.dev3 * [nightly] Increase version to 0.17.0.dev4 * [nightly] Increase version to 0.17.0.dev5 * [nightly] Increase version to 0.17.0.dev6 * [nightly] Increase version to 0.17.0.dev7 * [nightly] Increase version to 0.17.0.dev8 * Add support for configuring and running benchmarks for supervised losses. Add support for passing the same examples for both the query and indexed set when calling retrieval_metrics. Added a new param to each retrieval_metric that enables dropping the nearest neighbor. This is useful if the nearest neighbor exists in the indexed examples. * Update benchmark README and max line length in .flake8 * Updates to the benchmark code - Add clean_dir func to train file. - Add support for creating precision@k and map@k eval metrics - Fix typing issue in map@k. We now take the class counts type from the query label dtype. - Remove 1 count from class counts if we are dropping the first result. - Refactor the make functions in train to use a Dict for all the parameterized modules. * [nightly] Increase version to 0.17.0.dev9 * Fixed typo in slice id * black formatting * black formatting * Fixed typo to resolve #284 The function should be tf.concat instead of tf.constant, according to the description given above. This also resolves issue #284 * [nightly] Increase version to 0.17.0.dev10 * Update to match the API of the latest keras_cv version Check out keras-team/keras-cv#738 for more information. Once this is merged we're breaking backwards compatibility to have a much nicer API name. * Add clip_at_r to support computing MAP@R from map_at_k module. * Refactor benchmark components into separate modules. * Update benchmark configs to use smaller 1e-6 learning rates. Update train.py main to loop through the various embedding sizes in the architectures. * Fix tests for clip_at_r in map_at_k retrieval metric. Refactor the clip at r changes to use map_fn. * [nightly] Increase version to 0.17.0.dev11 * Update to benchmark configs and experiments with adding LR Schedule. * Update benchmark README * Black formatting for map_at_k * Add requirements file for benchmarks * Refactor benchmark code - Support filtering the set of experiments using a regex pattern passed in the args. - Add typing - Refactor the config parsing into a separate dataclass - Refactor the cross product of all params to use itertools product - Update requirements to add open-cv. This is needed for the caltech birds dataset. - Refactor the config to have a consistent use of the dict keys for object creation and add a separate name field for identifying the specific set of params associated with the object. * Add user prompt to continue/exit benchmark run after run_grps are listed. Update README to include example of filter pattern. * make_eval_data now returns a new list of augmented examples instead of updating the original list. 
Remove return when user input == Y * Set soft_margin default to True. The default value was previously set to False but the doc string stated the default value as True. * Set nmslib to brute force search and remove agg results. - Brute force search removes any noise introduced by an aprox NN search. - Removing the agg results as we will provide a utility for aggregating the result folders from each experiment. * Update loss ids in the losses component. - Removed the '_loss' suffix from the loss ids as it was redundent. - Add xmb, triplet loss, and soft nn loss to the losses config section. * Google Summer of Code (#286) * Added multiple negatives ranking loss * Added multimodal example * Added support for multiple distances in mnrl loss Added support for different distances in multiple negatives ranking loss * Added link to multimodal example notebook * black formatting * Using numerically stable implementation of logsumexp * Delete pyproject.toml * Updated pyproject.toml * Black formatting in multinegrank_loss * Updated pip install url to dev branch Co-authored-by: Owen Vallis <[email protected]> * resolve #299 Fix WarmupCosineDecay. * Previous version scaled the cosine decay by a linear warmup value. So the max value was max_lr*0.5*(1+cos(warmup_steps/total_steps*pi)) * New version has a linear warmup and then begins the cosine decay from cos(0.0) so the max value is now max_lr. * Previous version accepted a tensor of values, this is not needed. Simplified to accept a single scaler step value. * Updated tests to be consistent with the keras LearningRateSchedule tests. * Renamed class from WarmUpCosine to WarmupCosineDecay. This is more consistent with the Keras LearningRateSchedules. * [nightly] Increase version to 0.17.0.dev12 * Update pn_loss default params and doc string formatting. * Make soft_margin the default. The doc string stated this was the default but the param was set to False. * Make the default margin 0.1. The previous value was 1.0 which produced sub-optimal results when using cosine distance. * Reformat the doc strings to align with the google docstring style. * Add support for the pep585 annotations. Removed Callable and Union. * Update triplet_loss default params and doc string formatting. * Make soft_margin the default. The doc string stated this was the default but the param was set to False. * Make the default margin 0.1. The previous value was 1.0 which produced sub-optimal results when using cosine distance. * Reformat the doc strings to align with the google docstring style. * Add support for the pep585 annotations. Removed Callable and Union. * Update train to use new WarmupCosineDecay. * Updates to config params for both prod and single configs * Updates to component/losses to use the new defaults * Benchmark updates and bug fixes * calls to model.predict() now convert input to tensor using the CPU context. This avoids a mem leak when calling predict in a loop. * expose all NMSLib params in NMSLibSearch. This enables custom parametrization of the nsmlib indexes. * Update indexer to save and load Search objects from configs. Saving now works when passing a Search object to model.compile() * Update the benchmark configs to use the unique name as the key and provide a 'component' field for identifying the component to build. * Manually delete and clear all objects at the end of each benchmark loop to try and avoid memory leaks. However, we still leak mem in tf frameworks/ops.py * Make flage ignore E206 whitespace after ":". 
This was in conflict with the black formatter. * Enable manual trigger for testing workflow. * Refactor benchmark datasets to be a class. * Now supports creating custom Dataset objects to load other sources. Currently supports loading TFDS data. * Add support for define hyper_parameter versions of parameters to support KerasTuner search * Split the single train script into create_dataset, hyper_parameter_search and train * Update configuration files to remove the benchmark prefix. * Add support for retrieval metrics in callback. * Add support for R_precision and refactor map@k * map_at_k is now a subclass of precision_at_k. Reduces code duplication. * update names for precision_at_k and map_at_k when clip_at_k is set to true. Name no longer return an @k suffix but instead return wither R_Precision or map@R. * distances now return their params when get_config() is called. * Fix info string for mem samplers. * Memory samplers now correctly report the number of augmenations in the sampler object. * Fix mypy errors from newest mypy version. * Migrate to support pep585 where possible * Fix several small typing errors * Provide typing for 100% of modules * [nightly] Increase version to 0.17.0.dev13 * * GEM layers create a general pooling layer in the init, but we didn't pass the kwargs. This means the general pooling layer didn't have the dtype policy. This caused the GEM layers to fail when using a mixed_float dtype policy as the general pooling layer returns float32 and the GEM dtype policy is float16. The fix is to pass all kwargs onto the general pooling layer. * Patch bump * Cap the TF version at 2.9 for the current master branch. * Resolves Error while using projector (#301) * Resolves Error while using projector Since the new major release of Bokeh version 3.0, the `plot_width` (and `plot_height`) properties have been removed. These have been replaced by standard `width` and `height` for all layout-able models according to [Migration Guide](https://github.com/bokeh/bokeh/wiki/Migration-Guides#300). The update fixes the error generated by the `projector`. * backward compatible This update makes `projector` backward compatible with `bokeh` * Apply formatting to projector changes to fix warnings from Black and ISort. * [nightly] Increase version to 0.17.0.dev14 * Model Save Updates (#305) * Update github workflow test matrix to include py 3.10 and tf 2.7 and 2.10 * Update github workflow to use py 3.11 and tensorflow 2.11. * Fix testing error in test_schedules. import from keras should now be import from tensorflow.keras. * The optimizer serialize and deserialize are under schedules in TF instead of the learning_rate_schedules module from keras. * Turns out the workflow version must be < 3.11 * Python 3.10 requires TF >= 2.8. * Fix and simplify Contrastive Model save and load. * The old save and load manually loaded each inner model. This was required because we didn't build the outer model graph. * The new solution uses a factory function to infer the input shape and then connect all the inner models and pass the input and output to the contrastive model. This is enough for the standard model save to work. * Also silenced the INFO logs from nmslib. * Small formatting and cleanup in other files. * Remove extra print in EvalCallback * Fix order of contrastive model metrics so that losses come first. * Update unsupervised notebook to use new save and create functions. 
* Fix formatting in contrastive model module * [nightly] Increase version to 0.17.0.dev15 * Add MultiShotFileSampler (#307) * Add MultiShotFileSampler * Refactor MultiShotMemorySampler to use load_example_fn * Update MultiShotFileSampler to inherit from MultiShotMemorySampler * Fix typing errors in file_samplers * Loss tests refactor (#308) * Refactor the tests for the losses. * Use the tf.TestCase packages * Create utils for perfect and bad embedding examples. * Refactor Triplet and PN Loss to have a single margin param now with float | None. None will now set the soft_margin. * Replace basic tf.logsumexp with the TF sim stable logsumexp in the soft margin. * Fix bug in semi-hard mining where we don't have any valid negatives > max positive. Previously this defaulted to selecting the example in idx 0. We now take the negative that is closest to the maximal positive without going over, i.e., max(d(a,n)) <= max(d(a,p)). * Refactor the triplet loss tests. * Create a losses dir under tests. * Refactor the tests for the losses. * Use the tf.TestCase packages * Create utils for perfect and bad embedding examples. * Refactor Triplet and PN Loss to have a single margin param now with float | None. None will now set the soft_margin. * Replace basic tf.logsumexp with the TF sim stable logsumexp in the soft margin. * Fix bug in semi-hard mining where we don't have any valid negatives > max positive. Previously this defaulted to selecting the example in idx 0. We now take the negative that is closest to the maximal positive without going over, i.e., max(d(a,n)) <= max(d(a,p)). * Refactor the triplet loss tests. * Create a losses dir under tests. * Type empty_mask as BoolTensor to fix mypy error. We create a matrix of zeros as dtype bool and vector of ones as dtype bool, but mypy doesn't see these as BoolTensor type. This commit adds an explicit BoolTensor type to the empty_mask to fix this. * Fix formatting errors * fix formatting errors * [nightly] Increase version to 0.17.0.dev16 * Adding benchmark datasets component that was ignored due to datasets/ filter in the .gitignore * Float16 (#310) * Fix casting to use the default floatx where possible to avoid type errors when training using mixed precision or float16. * Update tests for supporting float16 * Remove float dtype parameterization of readme tests. They were too slow. * Fix casting error when passing constant scalar. Set policy in multihead test to ensure we reset the policy to a good state. * Remove duplicate long running test. This should speed up test by ~3min. * [nightly] Increase version to 0.17.0.dev17 * Remove references to outputs in contrastive model. (#311) * Remove references to outputs in contrastive model. We use the inputs and outputs to support saving the contrastive model using the Keras API, however, we override train and test steps as well as predict. This means we don't currently support multiple output heads on the embedding output. This PR removes all references to multi-headed outputs and explicitly sets the indexer to use the predictor output. * Provide default contrastive projector and predictor. Users had to provide their own MLP models for the projector and predictor. This required understanding more about the underlying algorithms. This change now adds default projector and predictor models based on the original papers. * Update unsupervised colab. * Comment out projector and predictor create model functions. 
We now automatically create the MLP models for users, but the commented code is left in case the user wants to customize them. * Verify that the model trains and reloads. * Loss and performance is slightly better than before. * Update the create_contrastive_model function to pass a list of outputs to better track the outputs. The model still overrides the predict function though as we need to apply the L2 Norm at the output. * Fix mypy error. * Update ouput var name and use epsilon constant. * [nightly] Increase version to 0.17.0.dev18 * Update release notes for 0.17.x * Update example notebooks. * Add patch to support passing custom NMSLibSearch objects. * Add temporary fix that passes the search object config to the make_search function in order to support resetting the search index. * NOTE: This is only temporary and a more general solution will be added in the new backend updates to search and store. * Updated the supervised visualization notebook to demo using the custom NMSLibSearch object. * Added warnings about the reset issues with custom objects in indexer. * Remove the old benchmark dataset file. * [nightly] Increase version to 0.17.0.dev19 * Update CLIP notebook and include search example. * Remove benchmark code from release * Set release version to 0.17.0 --------- Co-authored-by: Github Actions Bot <> Co-authored-by: Christoffer Hjort <[email protected]> Co-authored-by: dewball345 <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Genrry Hernandez <[email protected]> Co-authored-by: Abhishar Sinha <[email protected]> Co-authored-by: Emil Larsson <[email protected]> Co-authored-by: Abhishar Sinha <[email protected]> Co-authored-by: Luke Wood <[email protected]> Co-authored-by: Zoheb Abai <[email protected]> Co-authored-by: Mohammad Amin Haghpanah <[email protected]>
Check out keras-team/keras-cv#738 for more information. Once this is merged we're breaking backwards compatibility to have a much nicer API name.
* [nightly] Increase version to 0.15.0.dev64 * Updates for contrastive model saving. * [nightly] Increase version to 0.15.0.dev65 * [nightly] Increase version to 0.15.0.dev66 * [nightly] Increase version to 0.15.0.dev67 * [nightly] Increase version to 0.15.0.dev68 * [nightly] Increase version to 0.15.0.dev69 * [nightly] Increase version to 0.15.0.dev70 * [nightly] Increase version to 0.15.0.dev71 * Update losses to use Loss reduction. Losses previously computed the mean loss over the examples within the call() method. This may create issues when using multi GPU training. The call() method now returns the per example loss, and the final loss is computed using the losses.Loss reduction method. We also updated the from_config() method to include the parent class's reduction and name args. * Resnet18 returns as a SimilarityModel. We may want Resnet18 as a regular model, but keeping the output type as SimilarityModel to avoid mixed output types. * Fix various mypy and linter errors. * Add support for contrastive_model save and load. * Update unsupervised notebook with save and load. * Update the save and load. Add updated example and docs for save and load in the supervised hello world. * Updates to visualization notebook. * [nightly] Increase version to 0.15.0.dev72 * Unsupervised notebook update. * [nightly] Increase version to 0.15.0.dev73 * [nightly] Increase version to 0.15.0.dev74 * [nightly] Increase version to 0.15.0.dev75 * [nightly] Increase version to 0.15.0.dev76 * Notes on the unsupervised notebook draft. * [nightly] Increase version to 0.15.0.dev77 * [nightly] Increase version to 0.15.0.dev78 * [nightly] Increase version to 0.15.0.dev79 * Remove get_backbone() method and just have users access the backbone attribute directly. * Add new diagrams and updated copy to teh unsupervised notebook. * [nightly] Increase version to 0.15.0.dev80 * [nightly] Increase version to 0.15.0.dev81 * First finished draft of unsupervised_hello_world notebook * Updates to the README file. Add self-supervised info. * [nightly] Increase version to 0.15.0.dev82 * [nightly] Increase version to 0.15.0.dev83 * Update README.md * Remove augmentation arg from architectures. Architectures previously took a callable stack of augmentation layers that would be added after the input of the model. This could cause issues with saving and training on TPU. Users are now expected to add augmentation to either the data samplers / datasets or manually add it to the model. * Clean up example dir. * Fix flake8 errors in architectures. * Update API docs. 
* Bump version to 0.15.0 * Bump minor version to 0.16.0.dev0 * [nightly] Increase version to 0.16.0.dev1 * [nightly] Increase version to 0.16.0.dev2 * [nightly] Increase version to 0.16.0.dev3 * Distance and losses refactor (tensorflow#222) * refactor distances call signature and add appropriate tests * refactor metrics for new distance call signature * make similarity losses compatible with asymmetric and non-square distance matrices * adapt and add test * remove newline * [nightly] Increase version to 0.16.0.dev4 * [nightly] Increase version to 0.16.0.dev5 * [nightly] Increase version to 0.16.0.dev6 * [nightly] Increase version to 0.16.0.dev7 * [nightly] Increase version to 0.16.0.dev8 * Cross-batch memory (XBM) (tensorflow#225) * initiate XBM loss * add todo * add XBM tests * WIP: XBM serialization * XBM serialization * class docstring * remove todo * improve docstring * remove comment * [nightly] Increase version to 0.16.0.dev9 * [nightly] Increase version to 0.16.0.dev10 * [nightly] Increase version to 0.16.0.dev11 * [nightly] Increase version to 0.16.0.dev12 * [nightly] Increase version to 0.16.0.dev13 * [nightly] Increase version to 0.16.0.dev14 * [nightly] Increase version to 0.16.0.dev15 * [nightly] Increase version to 0.16.0.dev16 * [nightly] Increase version to 0.16.0.dev17 * [nightly] Increase version to 0.16.0.dev18 * [nightly] Increase version to 0.16.0.dev19 * [nightly] Increase version to 0.16.0.dev20 * [nightly] Increase version to 0.16.0.dev21 * [nightly] Increase version to 0.16.0.dev22 * Augmentor for Barlow Twins (tensorflow#229) * Use list(range()) instead of comprehension as it is more pythonic. * Create barlow.py * Bump three in /tensorflow_similarity/visualization/projector_v2 (tensorflow#228) Bumps [three](https://github.com/mrdoob/three.js) from 0.132.2 to 0.137.0. - [Release notes](https://github.com/mrdoob/three.js/releases) - [Commits](https://github.com/mrdoob/three.js/commits) --- updated-dependencies: - dependency-name: three dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Restructure class to be like Augmenter * Minor fixing of dead links (tensorflow#230) * Fixed dead links * augmenter main to master * Spelling changes Auto Augment * MixupAndCutmix main to master * RandAugment main to master * RandomErasing main to master * Update SimCLRAugmenter.md * Update ClassificationMatch.md * Update ClassificationMetric.md * Update Evaluator.md * Update MemoryEvaluator.md * Update SimilarityModel.md * Update BinaryAccuracy.md * Update F1Score.md * Update FalsePositiveRate.md * Update NegativePredictiveValue.md * Update Precision.md * Update Recall.md Co-authored-by: Owen Vallis <[email protected]> * Fix minor typos (tensorflow#226) Co-authored-by: Owen Vallis <[email protected]> * Update barlow.py * Update barlow.py * Update setup.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * revisions * Update __init__.py * Update __init__.py * Update color_jitter.py * Update barlow.py * Update barlow.py * Update barlow.py * Update setup.py Co-authored-by: Owen S Vallis <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Owen Vallis <[email protected]> Co-authored-by: Genrry Hernandez <[email protected]> * Fixed some bugs in augmenter. 
(tensorflow#232) * Create barlow.py * Restructure class to be like Augmenter * Update barlow.py * Update barlow.py * Update setup.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * revisions * Update __init__.py * Update __init__.py * Update color_jitter.py * Update barlow.py * Update barlow.py * Update barlow.py * Update setup.py * fixed some bugs * Remove seed instance variable Co-authored-by: Owen Vallis <[email protected]> * [nightly] Increase version to 0.16.0.dev23 * [nightly] Increase version to 0.16.0.dev24 * [nightly] Increase version to 0.16.0.dev25 * [nightly] Increase version to 0.16.0.dev26 * [nightly] Increase version to 0.16.0.dev27 * [nightly] Increase version to 0.16.0.dev28 * [nightly] Increase version to 0.16.0.dev29 * [nightly] Increase version to 0.16.0.dev30 * [nightly] Increase version to 0.16.0.dev31 * [nightly] Increase version to 0.16.0.dev32 * [nightly] Increase version to 0.16.0.dev33 * [nightly] Increase version to 0.16.0.dev34 * [nightly] Increase version to 0.16.0.dev35 * [nightly] Increase version to 0.16.0.dev36 * [nightly] Increase version to 0.16.0.dev37 * [nightly] Increase version to 0.16.0.dev38 * [nightly] Increase version to 0.16.0.dev39 * [nightly] Increase version to 0.16.0.dev40 * [nightly] Increase version to 0.16.0.dev41 * [nightly] Increase version to 0.16.0.dev42 * [nightly] Increase version to 0.16.0.dev43 * [nightly] Increase version to 0.16.0.dev44 * [nightly] Increase version to 0.16.0.dev45 * [nightly] Increase version to 0.16.0.dev46 * Added test coverage for augmentation functions + barlow, simCLR augmenter (tensorflow#235) * Create test_blur.py * Create test_color_jitter.py * Create test_crop.py * Create test_flip.py * Update test_crop.py * Update test_color_jitter.py * Create test_solarize.py * Create test_augmenters.py * Update test_flip.py * Update test_flip.py * Update test_flip.py * Update blur.py * Update blur.py * [nightly] Increase version to 0.16.0.dev47 * Change augmenters to use augmentation_utils (tensorflow#238) * Fix corrupted JSON formatting in unsupervised notebook. * Added features of SplitValidationLoss callback to EvalCallback (tensorflow#242) * Added features of SplitValidationLoss callback to EvalCallback Merged SplitValidationLoss into EvalCallbaclk * Refactored EvalCallback using utils.unpack_results * [nightly] Increase version to 0.16.0.dev48 * [nightly] Increase version to 0.16.0.dev49 * [nightly] Increase version to 0.16.0.dev50 * VicReg Loss - Improvement of Barlow Twins (tensorflow#243) * VicReg Loss * Update vicreg.py * Update vicreg.py * Update vicreg.py * fix big bug * Update vicreg.py * Update vicreg.py * fixes * Update vicreg.py * [nightly] Increase version to 0.16.0.dev51 * [nightly] Increase version to 0.16.0.dev52 * Update tests for algebra.py * Coverage now at 100% * Convert tests to use tf.testing.TestCase * [nightly] Increase version to 0.16.0.dev53 * [nightly] Increase version to 0.16.0.dev54 * Fix corrupted formatting in visualization notebook. * [bug] Fix multisim loss offsets. The tfsim version of multisim uses distances instead of the inner product. However, multisim requires that we "center" the pairwise distances around 0. Here we add a new center param, which we set to 1.0 for cosine distance. Additionally, we also flip the lambda (lmda) param to add the threshold to the values instead of subtracting it. These changes will help improve the pos and neg weighting in the log1psumexp. 
* [nightly] Increase version to 0.16.0.dev55 * [bug] In losses.utils.logsumexp() tf.math.log(1 + x) should be tf.math.log(tf.math.exp(-my_max) + x). This is needed to properly account for removing the rowwise max before computing the logsumexp. * Make Augmentation Utilities More Customizable(reupload due to branch issues) (tensorflow#255) * modifications of benchmark * test commit 123 * new changes to training * testo changes * works in colab... kind of * code is neat now * working on sampler problem * Update barlow.py * Update blur.py * Update color_jitter.py * Update color_jitter.py * Update barlow.py * Update barlow.py * Added vicreg for sync * Update vicreg.py * Update vicreg.py * Update vicreg.py * Update barlow.py * randomresizedcrop edits * Update barlow.py * allow to customize loss reduction * Update __init__.py * Delete sampler_test.py * Delete benchmark/supervised directory * Update barlow.py * added docstring on random_resized_crop * Allow user to set normalization * Update barlow.py * Update barlow.py * Update setup.py * remove pipfile * Delete Pipfile * Delete Pipfile.lock * Update cropping.py * Update cropping.py * Additive multiplicative changes * Update simclr.py * change additive, multiplicative * Update barlow.py * Update solarize.py * Update barlow.py * Update solarize.py * Update barlow.py * Update test_solarize.py * Update test_solarize.py * Update test_solarize.py Co-authored-by: Owen Vallis <[email protected]> * Refactor test_basic to use TestCase to improve flaky test results. * Fix Flake8 warnings. * Freeze all batchnorm architecture layers. We now freeze all BN layers when loading pre-trained weights in the effnet and resnet50 architectures. Previously, we only froze the BN layers if trainable was partial or frozen. When trainable was full, the BN layers would be trainable as well and this led to suboptimal training losses. * Improve EfficientNetSim docstring and type hints (tensorflow#254) * Fix typos in docstring * Remove reference to image augmentation Image augmentation was previously removed, so purge it from the comment and docstring. * Correct input image type annotation * Fix tensorflow#251. Check for model._index before calling Indexer methods. The Indexer is core to a number of the Similarity model methods. Add support for checking if the index exists and return a more informative AttributeError if the index hasn't been created yet. * Set random seeds for tfrecord samplers test. * All augmenters use the Tensor type from tensorflow_similarity.types. * [nightly] Increase version to 0.16.0.dev56 * Fix Tensor type error in callbacks. Unpacking the Lookup objects converts the python types to Tensors. This can lead to Tensor type errors. This commit adds support for taking the expected dtype of the model Tensors where possible. We also fix a bug where the EvalCallback was not logging the split metric names in the history. * Update doc strings in color_jitter. * Update the create index AttributeError text * [nightly] Increase version to 0.16.0.dev57 * Update Notebook examples. * Remove unneeded tf.function and register_keras_serializable decorators. Subclasses of tf.keras.losses.Loss will trace all child functions and we only need to register the subclassed loss to support deserialization. * Simplify MetricEmbedding layer. * Fix mypy type error in simsiam. Convert all constants to tf.constant. * Simplify the MetricEmbedding layer. Subclass layers.Dense directly. This simplifies the layer and also fixes function tracing during model save. 
* Fix test_tfrecord_samplers tests. * Update api documentation. TODO: This just generated the new docs. We still need to go through and clean up the documentation. * Update doc string and api for MetricEmbedding layer. * Bump to version 0.16 * Fix static type check error in memory store. The np.savez functions expect array_like values but we were passing List. Casting as np array should solve the issue. * Fix effnet test for TF 2.9 * Fix TFRecordDatasetSampler now returns correct number of examples per batch. * Bump dev version to 0.17.0.dev0. * [nightly] Increase version to 0.17.0.dev1 * [nightly] Increase version to 0.17.0.dev2 * [nightly] Increase version to 0.17.0.dev3 * [nightly] Increase version to 0.17.0.dev4 * [nightly] Increase version to 0.17.0.dev5 * [nightly] Increase version to 0.17.0.dev6 * [nightly] Increase version to 0.17.0.dev7 * [nightly] Increase version to 0.17.0.dev8 * Add support for configuring and running benchmarks for supervised losses. Add support for passing the same examples for both the query and indexed set when calling retrieval_metrics. Added a new param to each retrieval_metric that enables dropping the nearest neighbor. This is useful if the nearest neighbor exists in the indexed examples. * Update benchmark README and max line length in .flake8 * Updates to the benchmark code - Add clean_dir func to train file. - Add support for creating precision@k and map@k eval metrics - Fix typing issue in map@k. We now take the class counts type from the query label dtype. - Remove 1 count from class counts if we are dropping the first result. - Refactor the make functions in train to use a Dict for all the parameterized modules. * [nightly] Increase version to 0.17.0.dev9 * Fixed typo in slice id * black formatting * black formatting * Fixed typo to resolve tensorflow#284 The function should be tf.concat instead of tf.constant, according to the description given above. This also resolves issue tensorflow#284 * [nightly] Increase version to 0.17.0.dev10 * Update to match the API of the latest keras_cv version Check out keras-team/keras-cv#738 for more information. Once this is merged we're breaking backwards compatibility to have a much nicer API name. * Add clip_at_r to support computing MAP@R from map_at_k module. * Refactor benchmark components into separate modules. * Update benchmark configs to use smaller 1e-6 learning rates. Update train.py main to loop through the various embedding sizes in the architectures. * Fix tests for clip_at_r in map_at_k retrieval metric. Refactor the clip at r changes to use map_fn. * [nightly] Increase version to 0.17.0.dev11 * Update to benchmark configs and experiments with adding LR Schedule. * Update benchmark README * Black formatting for map_at_k * Add requirements file for benchmarks * Refactor benchmark code - Support filtering the set of experiments using a regex pattern passed in the args. - Add typing - Refactor the config parsing into a separate dataclass - Refactor the cross product of all params to use itertools product - Update requirements to add open-cv. This is needed for the caltech birds dataset. - Refactor the config to have a consistent use of the dict keys for object creation and add a separate name field for identifying the specific set of params associated with the object. * Add user prompt to continue/exit benchmark run after run_grps are listed. Update README to include example of filter pattern. * make_eval_data now returns a new list of augmented examples instead of updating the original list. 
Remove return when user input == Y * Set soft_margin default to True. The default value was previously set to False but the doc string stated the default value as True. * Set nmslib to brute force search and remove agg results. - Brute force search removes any noise introduced by an aprox NN search. - Removing the agg results as we will provide a utility for aggregating the result folders from each experiment. * Update loss ids in the losses component. - Removed the '_loss' suffix from the loss ids as it was redundent. - Add xmb, triplet loss, and soft nn loss to the losses config section. * Google Summer of Code (tensorflow#286) * Added multiple negatives ranking loss * Added multimodal example * Added support for multiple distances in mnrl loss Added support for different distances in multiple negatives ranking loss * Added link to multimodal example notebook * black formatting * Using numerically stable implementation of logsumexp * Delete pyproject.toml * Updated pyproject.toml * Black formatting in multinegrank_loss * Updated pip install url to dev branch Co-authored-by: Owen Vallis <[email protected]> * resolve tensorflow#299 Fix WarmupCosineDecay. * Previous version scaled the cosine decay by a linear warmup value. So the max value was max_lr*0.5*(1+cos(warmup_steps/total_steps*pi)) * New version has a linear warmup and then begins the cosine decay from cos(0.0) so the max value is now max_lr. * Previous version accepted a tensor of values, this is not needed. Simplified to accept a single scaler step value. * Updated tests to be consistent with the keras LearningRateSchedule tests. * Renamed class from WarmUpCosine to WarmupCosineDecay. This is more consistent with the Keras LearningRateSchedules. * [nightly] Increase version to 0.17.0.dev12 * Update pn_loss default params and doc string formatting. * Make soft_margin the default. The doc string stated this was the default but the param was set to False. * Make the default margin 0.1. The previous value was 1.0 which produced sub-optimal results when using cosine distance. * Reformat the doc strings to align with the google docstring style. * Add support for the pep585 annotations. Removed Callable and Union. * Update triplet_loss default params and doc string formatting. * Make soft_margin the default. The doc string stated this was the default but the param was set to False. * Make the default margin 0.1. The previous value was 1.0 which produced sub-optimal results when using cosine distance. * Reformat the doc strings to align with the google docstring style. * Add support for the pep585 annotations. Removed Callable and Union. * Update train to use new WarmupCosineDecay. * Updates to config params for both prod and single configs * Updates to component/losses to use the new defaults * Benchmark updates and bug fixes * calls to model.predict() now convert input to tensor using the CPU context. This avoids a mem leak when calling predict in a loop. * expose all NMSLib params in NMSLibSearch. This enables custom parametrization of the nsmlib indexes. * Update indexer to save and load Search objects from configs. Saving now works when passing a Search object to model.compile() * Update the benchmark configs to use the unique name as the key and provide a 'component' field for identifying the component to build. * Manually delete and clear all objects at the end of each benchmark loop to try and avoid memory leaks. However, we still leak mem in tf frameworks/ops.py * Make flage ignore E206 whitespace after ":". 
This was in conflict with the black formatter. * Enable manual trigger for testing workflow. * Refactor benchmark datasets to be a class. * Now supports creating custom Dataset objects to load other sources. Currently supports loading TFDS data. * Add support for define hyper_parameter versions of parameters to support KerasTuner search * Split the single train script into create_dataset, hyper_parameter_search and train * Update configuration files to remove the benchmark prefix. * Add support for retrieval metrics in callback. * Add support for R_precision and refactor map@k * map_at_k is now a subclass of precision_at_k. Reduces code duplication. * update names for precision_at_k and map_at_k when clip_at_k is set to true. Name no longer return an @k suffix but instead return wither R_Precision or map@R. * distances now return their params when get_config() is called. * Fix info string for mem samplers. * Memory samplers now correctly report the number of augmenations in the sampler object. * Fix mypy errors from newest mypy version. * Migrate to support pep585 where possible * Fix several small typing errors * Provide typing for 100% of modules * [nightly] Increase version to 0.17.0.dev13 * * GEM layers create a general pooling layer in the init, but we didn't pass the kwargs. This means the general pooling layer didn't have the dtype policy. This caused the GEM layers to fail when using a mixed_float dtype policy as the general pooling layer returns float32 and the GEM dtype policy is float16. The fix is to pass all kwargs onto the general pooling layer. * Patch bump * Cap the TF version at 2.9 for the current master branch. * Resolves Error while using projector (tensorflow#301) * Resolves Error while using projector Since the new major release of Bokeh version 3.0, the `plot_width` (and `plot_height`) properties have been removed. These have been replaced by standard `width` and `height` for all layout-able models according to [Migration Guide](https://github.com/bokeh/bokeh/wiki/Migration-Guides#300). The update fixes the error generated by the `projector`. * backward compatible This update makes `projector` backward compatible with `bokeh` * Apply formatting to projector changes to fix warnings from Black and ISort. * [nightly] Increase version to 0.17.0.dev14 * Model Save Updates (tensorflow#305) * Update github workflow test matrix to include py 3.10 and tf 2.7 and 2.10 * Update github workflow to use py 3.11 and tensorflow 2.11. * Fix testing error in test_schedules. import from keras should now be import from tensorflow.keras. * The optimizer serialize and deserialize are under schedules in TF instead of the learning_rate_schedules module from keras. * Turns out the workflow version must be < 3.11 * Python 3.10 requires TF >= 2.8. * Fix and simplify Contrastive Model save and load. * The old save and load manually loaded each inner model. This was required because we didn't build the outer model graph. * The new solution uses a factory function to infer the input shape and then connect all the inner models and pass the input and output to the contrastive model. This is enough for the standard model save to work. * Also silenced the INFO logs from nmslib. * Small formatting and cleanup in other files. * Remove extra print in EvalCallback * Fix order of contrastive model metrics so that losses come first. * Update unsupervised notebook to use new save and create functions. 
* Fix formatting in contrastive model module
* [nightly] Increase version to 0.17.0.dev15
* Add MultiShotFileSampler (tensorflow#307)
* Add MultiShotFileSampler
* Refactor MultiShotMemorySampler to use load_example_fn
* Update MultiShotFileSampler to inherit from MultiShotMemorySampler
* Fix typing errors in file_samplers
* Loss tests refactor (tensorflow#308)
* Refactor the tests for the losses.
* Use the tf.test.TestCase packages
* Create utils for perfect and bad embedding examples.
* Refactor Triplet and PN Loss to have a single margin param, now with float | None. None will now set the soft_margin.
* Replace basic tf.logsumexp with the TF sim stable logsumexp in the soft margin.
* Fix bug in semi-hard mining where we don't have any valid negatives > max positive. Previously this defaulted to selecting the example in idx 0. We now take the negative that is closest to the maximal positive without going over, i.e., max(d(a,n)) <= max(d(a,p)).
* Refactor the triplet loss tests.
* Create a losses dir under tests.
* Type empty_mask as BoolTensor to fix mypy error. We create a matrix of zeros as dtype bool and a vector of ones as dtype bool, but mypy doesn't see these as BoolTensor type. This commit adds an explicit BoolTensor type to the empty_mask to fix this.
* Fix formatting errors
* [nightly] Increase version to 0.17.0.dev16
* Adding benchmark datasets component that was ignored due to the datasets/ filter in the .gitignore
* Float16 (tensorflow#310)
* Fix casting to use the default floatx where possible to avoid type errors when training using mixed precision or float16.
* Update tests for supporting float16
* Remove float dtype parameterization of readme tests. They were too slow.
* Fix casting error when passing a constant scalar. Set policy in multihead test to ensure we reset the policy to a good state.
* Remove duplicate long running test. This should speed up tests by ~3min.
* [nightly] Increase version to 0.17.0.dev17
* Remove references to outputs in contrastive model. (tensorflow#311) We use the inputs and outputs to support saving the contrastive model using the Keras API; however, we override train and test steps as well as predict. This means we don't currently support multiple output heads on the embedding output. This PR removes all references to multi-headed outputs and explicitly sets the indexer to use the predictor output.
* Provide default contrastive projector and predictor. Users had to provide their own MLP models for the projector and predictor. This required understanding more about the underlying algorithms. This change now adds default projector and predictor models based on the original papers.
* Update unsupervised colab.
* Comment out projector and predictor create model functions. We now automatically create the MLP models for users, but the commented code is left in case the user wants to customize them.
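A hedged usage sketch of the single-margin refactor noted above (margin: float | None, where None enables the soft margin); argument names follow the notes, not a verified signature:

```python
import tensorflow_similarity as tfsim

# hard margin: the standard hinge-style triplet formulation
hard_loss = tfsim.losses.TripletLoss(distance="cosine", margin=0.1)

# margin=None selects the soft-margin variant, which swaps the hinge for a
# smooth log(1 + exp(d_pos - d_neg)) computed with the stable logsumexp
soft_loss = tfsim.losses.TripletLoss(distance="cosine", margin=None)
```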
* Verify that the model trains and reloads.
* Loss and performance are slightly better than before.
* Update the create_contrastive_model function to pass a list of outputs to better track the outputs. The model still overrides the predict function, though, as we need to apply the L2 norm at the output.
* Fix mypy error.
* Update output var name and use epsilon constant.
* [nightly] Increase version to 0.17.0.dev18
* Update release notes for 0.17.x
* Update example notebooks.
* Add patch to support passing custom NMSLibSearch objects.
* Add temporary fix that passes the search object config to the make_search function in order to support resetting the search index.
* NOTE: This is only temporary and a more general solution will be added in the new backend updates to search and store.
* Updated the supervised visualization notebook to demo using the custom NMSLibSearch object.
* Added warnings about the reset issues with custom objects in indexer.
* Remove the old benchmark dataset file.
* [nightly] Increase version to 0.17.0.dev19
* Update CLIP notebook and include search example.
* Remove benchmark code from release
* Set release version to 0.17.0
---------
Co-authored-by: Github Actions Bot <>
Co-authored-by: Christoffer Hjort <[email protected]>
Co-authored-by: dewball345 <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Genrry Hernandez <[email protected]>
Co-authored-by: Abhishar Sinha <[email protected]>
Co-authored-by: Emil Larsson <[email protected]>
Co-authored-by: Abhishar Sinha <[email protected]>
Co-authored-by: Luke Wood <[email protected]>
Co-authored-by: Zoheb Abai <[email protected]>
Co-authored-by: Mohammad Amin Haghpanah <[email protected]>
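A sketch of the custom NMSLibSearch flow described above; the constructor arguments are illustrative (the notes only say that all NMSLib params are now exposed), and `model` is assumed to be a SimilarityModel built elsewhere:

```python
import tensorflow_similarity as tfsim
from tensorflow_similarity.search import NMSLibSearch

# build a custom search object (argument names illustrative)
search = NMSLibSearch(distance="cosine")

# per the notes, a Search object can be passed to compile and is now
# saved and reloaded with the index config
model.compile(optimizer="adam", loss=tfsim.losses.MultiSimilarityLoss(), search=search)
```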
* [nightly] Increase version to 0.16.0.dev1 * [nightly] Increase version to 0.16.0.dev2 * [nightly] Increase version to 0.16.0.dev3
* Distance and losses refactor (#222)
* refactor distances call signature and add appropriate tests
* refactor metrics for new distance call signature
* make similarity losses compatible with asymmetric and non-square distance matrices
* adapt and add test
* remove newline
* [nightly] Increase version to 0.16.0.dev4 * [nightly] Increase version to 0.16.0.dev5 * [nightly] Increase version to 0.16.0.dev6 * [nightly] Increase version to 0.16.0.dev7 * [nightly] Increase version to 0.16.0.dev8
* Cross-batch memory (XBM) (#225)
* initiate XBM loss
* add todo
* add XBM tests
* WIP: XBM serialization
* XBM serialization
* class docstring
* remove todo
* improve docstring
* remove comment
* [nightly] Increase version to 0.16.0.dev9 * [nightly] Increase version to 0.16.0.dev10 * [nightly] Increase version to 0.16.0.dev11 * [nightly] Increase version to 0.16.0.dev12 * [nightly] Increase version to 0.16.0.dev13 * [nightly] Increase version to 0.16.0.dev14 * [nightly] Increase version to 0.16.0.dev15 * [nightly] Increase version to 0.16.0.dev16 * [nightly] Increase version to 0.16.0.dev17 * [nightly] Increase version to 0.16.0.dev18 * [nightly] Increase version to 0.16.0.dev19 * [nightly] Increase version to 0.16.0.dev20 * [nightly] Increase version to 0.16.0.dev21 * [nightly] Increase version to 0.16.0.dev22
* Augmentor for Barlow Twins (#229)
* Use list(range()) instead of a comprehension as it is more pythonic.
* Create barlow.py
* Bump three in /tensorflow_similarity/visualization/projector_v2 (#228)
Bumps [three](https://github.com/mrdoob/three.js) from 0.132.2 to 0.137.0.
- [Release notes](https://github.com/mrdoob/three.js/releases)
- [Commits](https://github.com/mrdoob/three.js/commits)
---
updated-dependencies:
- dependency-name: three
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Restructure class to be like Augmenter
* Minor fixing of dead links (#230)
* Fixed dead links
* augmenter main to master
* Spelling changes Auto Augment
* MixupAndCutmix main to master
* RandAugment main to master
* RandomErasing main to master
* Update SimCLRAugmenter.md * Update ClassificationMatch.md * Update ClassificationMetric.md * Update Evaluator.md * Update MemoryEvaluator.md * Update SimilarityModel.md * Update BinaryAccuracy.md * Update F1Score.md * Update FalsePositiveRate.md * Update NegativePredictiveValue.md * Update Precision.md * Update Recall.md
Co-authored-by: Owen Vallis <[email protected]>
* Fix minor typos (#226)
Co-authored-by: Owen Vallis <[email protected]>
* Update barlow.py * Update barlow.py * Update setup.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * revisions * Update __init__.py * Update __init__.py * Update color_jitter.py * Update barlow.py * Update barlow.py * Update barlow.py * Update setup.py
Co-authored-by: Owen S Vallis <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Owen Vallis <[email protected]>
Co-authored-by: Genrry Hernandez <[email protected]>
* Fixed some bugs in augmenter. (#232)
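For context on the cross-batch memory (XBM) entries above, a generic sketch of the technique rather than the library's implementation: keep a fixed-size FIFO of detached embeddings from past batches and mine pairs against that memory as well as the current batch.

```python
import tensorflow as tf

class XBMQueue:
    """Generic cross-batch memory sketch: a fixed-size FIFO of past embeddings."""

    def __init__(self, memory_size: int):
        self.memory_size = memory_size
        self.embs: list[tf.Tensor] = []
        self.labels: list[tf.Tensor] = []

    def enqueue(self, embeddings: tf.Tensor, labels: tf.Tensor) -> None:
        # the memory holds detached embeddings; gradients flow only
        # through the current batch
        self.embs.append(tf.stop_gradient(embeddings))
        self.labels.append(labels)
        while sum(int(e.shape[0]) for e in self.embs) > self.memory_size:
            self.embs.pop(0)
            self.labels.pop(0)

    def memory(self) -> tuple[tf.Tensor, tf.Tensor]:
        return tf.concat(self.embs, axis=0), tf.concat(self.labels, axis=0)
```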
* Create barlow.py
* Restructure class to be like Augmenter
* Update barlow.py * Update barlow.py * Update setup.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * revisions * Update __init__.py * Update __init__.py * Update color_jitter.py * Update barlow.py * Update barlow.py * Update barlow.py * Update setup.py
* fixed some bugs
* Remove seed instance variable
Co-authored-by: Owen Vallis <[email protected]>
* [nightly] Increase version to 0.16.0.dev23 * [nightly] Increase version to 0.16.0.dev24 * [nightly] Increase version to 0.16.0.dev25 * [nightly] Increase version to 0.16.0.dev26 * [nightly] Increase version to 0.16.0.dev27 * [nightly] Increase version to 0.16.0.dev28 * [nightly] Increase version to 0.16.0.dev29 * [nightly] Increase version to 0.16.0.dev30 * [nightly] Increase version to 0.16.0.dev31 * [nightly] Increase version to 0.16.0.dev32 * [nightly] Increase version to 0.16.0.dev33 * [nightly] Increase version to 0.16.0.dev34 * [nightly] Increase version to 0.16.0.dev35 * [nightly] Increase version to 0.16.0.dev36 * [nightly] Increase version to 0.16.0.dev37 * [nightly] Increase version to 0.16.0.dev38 * [nightly] Increase version to 0.16.0.dev39 * [nightly] Increase version to 0.16.0.dev40 * [nightly] Increase version to 0.16.0.dev41 * [nightly] Increase version to 0.16.0.dev42 * [nightly] Increase version to 0.16.0.dev43 * [nightly] Increase version to 0.16.0.dev44 * [nightly] Increase version to 0.16.0.dev45 * [nightly] Increase version to 0.16.0.dev46
* Added test coverage for augmentation functions + barlow, simCLR augmenter (#235)
* Create test_blur.py * Create test_color_jitter.py * Create test_crop.py * Create test_flip.py * Update test_crop.py * Update test_color_jitter.py * Create test_solarize.py * Create test_augmenters.py * Update test_flip.py * Update test_flip.py * Update test_flip.py * Update blur.py * Update blur.py
* [nightly] Increase version to 0.16.0.dev47
* Change augmenters to use augmentation_utils (#238)
* Fix corrupted JSON formatting in unsupervised notebook.
* Added features of SplitValidationLoss callback to EvalCallback (#242). Merged SplitValidationLoss into EvalCallback.
* Refactored EvalCallback using utils.unpack_results
* [nightly] Increase version to 0.16.0.dev48 * [nightly] Increase version to 0.16.0.dev49 * [nightly] Increase version to 0.16.0.dev50
* VicReg Loss - Improvement of Barlow Twins (#243)
* VicReg Loss * Update vicreg.py * Update vicreg.py * Update vicreg.py * fix big bug * Update vicreg.py * Update vicreg.py * fixes * Update vicreg.py
* [nightly] Increase version to 0.16.0.dev51 * [nightly] Increase version to 0.16.0.dev52
* Update tests for algebra.py
* Coverage now at 100%
* Convert tests to use tf.test.TestCase
* [nightly] Increase version to 0.16.0.dev53 * [nightly] Increase version to 0.16.0.dev54
* Fix corrupted formatting in visualization notebook.
* [bug] Fix multisim loss offsets. The tfsim version of multisim uses distances instead of the inner product. However, multisim requires that we "center" the pairwise distances around 0. Here we add a new center param, which we set to 1.0 for cosine distance. Additionally, we also flip the lambda (lmda) param to add the threshold to the values instead of subtracting it. These changes will help improve the pos and neg weighting in the log1psumexp.
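A worked note on the multisim centering above: with cosine distance d(a, b) = 1 − cos(a, b), choosing center = 1.0 makes center − d(a, b) = cos(a, b), so the centered distances recover the similarities the original multi-similarity loss expects. A hedged sketch of the resulting loss terms (pair masks omitted; this follows the multi-similarity paper, not the library's exact code):

```python
import tensorflow as tf

def multisim_terms(pairwise_dists, lmda=0.5, alpha=2.0, beta=40.0, center=1.0):
    """Sketch of multi-similarity weighting on centered distances."""
    sim = center - pairwise_dists  # with cosine distance, this is cos(a, b)
    # positives are pulled together, negatives pushed apart; the masks that
    # select pos/neg pairs per row are omitted for brevity
    pos_term = tf.math.log1p(tf.reduce_sum(tf.exp(-alpha * (sim - lmda)), axis=1)) / alpha
    neg_term = tf.math.log1p(tf.reduce_sum(tf.exp(beta * (sim - lmda)), axis=1)) / beta
    return pos_term + neg_term
```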
* [nightly] Increase version to 0.16.0.dev55
* [bug] In losses.utils.logsumexp(), tf.math.log(1 + x) should be tf.math.log(tf.math.exp(-my_max) + x). This is needed to properly account for removing the rowwise max before computing the logsumexp.
* Make Augmentation Utilities More Customizable (reupload due to branch issues) (#255)
* modifications of benchmark * test commit 123 * new changes to training * test changes * works in colab... kind of * code is neat now * working on sampler problem * Update barlow.py * Update blur.py * Update color_jitter.py * Update color_jitter.py * Update barlow.py * Update barlow.py * Added vicreg for sync * Update vicreg.py * Update vicreg.py * Update vicreg.py * Update barlow.py * randomresizedcrop edits * Update barlow.py * allow to customize loss reduction * Update __init__.py * Delete sampler_test.py * Delete benchmark/supervised directory * Update barlow.py * added docstring on random_resized_crop * Allow user to set normalization * Update barlow.py * Update barlow.py * Update setup.py * remove pipfile * Delete Pipfile * Delete Pipfile.lock * Update cropping.py * Update cropping.py * Additive multiplicative changes * Update simclr.py * change additive, multiplicative * Update barlow.py * Update solarize.py * Update barlow.py * Update solarize.py * Update barlow.py * Update test_solarize.py * Update test_solarize.py * Update test_solarize.py
Co-authored-by: Owen Vallis <[email protected]>
* Refactor test_basic to use TestCase to improve flaky test results.
* Fix Flake8 warnings.
* Freeze all batchnorm architecture layers. We now freeze all BN layers when loading pre-trained weights in the effnet and resnet50 architectures. Previously, we only froze the BN layers if trainable was partial or frozen. When trainable was full, the BN layers would be trainable as well, and this led to suboptimal training losses.
* Improve EfficientNetSim docstring and type hints (#254)
* Fix typos in docstring
* Remove reference to image augmentation. Image augmentation was previously removed, so purge it from the comment and docstring.
* Correct input image type annotation
* Fix #251. Check for model._index before calling Indexer methods. The Indexer is core to a number of the Similarity model methods. Add support for checking if the index exists and return a more informative AttributeError if the index hasn't been created yet.
* Set random seeds for tfrecord samplers test.
* All augmenters use the Tensor type from tensorflow_similarity.types.
* [nightly] Increase version to 0.16.0.dev56
* Fix Tensor type error in callbacks. Unpacking the Lookup objects converts the python types to Tensors. This can lead to Tensor type errors. This commit adds support for taking the expected dtype of the model Tensors where possible. We also fix a bug where the EvalCallback was not logging the split metric names in the history.
* Update doc strings in color_jitter.
* Update the create index AttributeError text
* [nightly] Increase version to 0.16.0.dev57
* Update Notebook examples.
* Remove unneeded tf.function and register_keras_serializable decorators. Subclasses of tf.keras.losses.Loss will trace all child functions and we only need to register the subclassed loss to support deserialization.
* Simplify MetricEmbedding layer.
* Fix mypy type error in simsiam. Convert all constants to tf.constant.
* Simplify the MetricEmbedding layer. Subclass layers.Dense directly. This simplifies the layer and also fixes function tracing during model save.
* Fix test_tfrecord_samplers tests.
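The logsumexp fix above follows from the max-subtraction identity: log(1 + Σ exp(x_j)) = m + log(exp(−m) + Σ exp(x_j − m)) with m = max_j x_j, so the leading 1 must become exp(−m) once the rowwise max is removed. A minimal reimplementation sketch (not the library's exact function):

```python
import tensorflow as tf

def log1p_sum_exp(x: tf.Tensor) -> tf.Tensor:
    """Stable rowwise log(1 + sum_j exp(x_j))."""
    m = tf.reduce_max(x, axis=1, keepdims=True)
    # after factoring out exp(m), the leading 1 becomes exp(-m)
    return m + tf.math.log(tf.exp(-m) + tf.reduce_sum(tf.exp(x - m), axis=1, keepdims=True))
```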
* Update api documentation. TODO: This just generated the new docs. We still need to go through and clean up the documentation.
* Update doc string and api for MetricEmbedding layer.
* Bump to version 0.16
* Fix static type check error in memory store. The np.savez functions expect array_like values but we were passing List. Casting as np array should solve the issue.
* Fix effnet test for TF 2.9
* Fix TFRecordDatasetSampler so it now returns the correct number of examples per batch.
* Bump dev version to 0.17.0.dev0.
* [nightly] Increase version to 0.17.0.dev1 * [nightly] Increase version to 0.17.0.dev2 * [nightly] Increase version to 0.17.0.dev3 * [nightly] Increase version to 0.17.0.dev4 * [nightly] Increase version to 0.17.0.dev5 * [nightly] Increase version to 0.17.0.dev6 * [nightly] Increase version to 0.17.0.dev7 * [nightly] Increase version to 0.17.0.dev8
* Add support for configuring and running benchmarks for supervised losses. Add support for passing the same examples for both the query and indexed set when calling retrieval_metrics. Added a new param to each retrieval_metric that enables dropping the nearest neighbor. This is useful if the nearest neighbor exists in the indexed examples.
* Update benchmark README and max line length in .flake8
* Updates to the benchmark code
  - Add clean_dir func to train file.
  - Add support for creating precision@k and map@k eval metrics
  - Fix typing issue in map@k. We now take the class counts type from the query label dtype.
  - Remove 1 count from class counts if we are dropping the first result.
  - Refactor the make functions in train to use a Dict for all the parameterized modules.
* [nightly] Increase version to 0.17.0.dev9
* Fixed typo in slice id
* black formatting
* Fixed typo to resolve #284. The function should be tf.concat instead of tf.constant, according to the description given above.
* [nightly] Increase version to 0.17.0.dev10
* Update to match the API of the latest keras_cv version. Check out keras-team/keras-cv#738 for more information. Once this is merged we're breaking backwards compatibility to have a much nicer API name.
* Add clip_at_r to support computing MAP@R from the map_at_k module.
* Refactor benchmark components into separate modules.
* Update benchmark configs to use smaller 1e-6 learning rates. Update train.py main to loop through the various embedding sizes in the architectures.
* Fix tests for clip_at_r in the map_at_k retrieval metric. Refactor the clip at r changes to use map_fn.
* [nightly] Increase version to 0.17.0.dev11
* Update benchmark configs and experiments, adding an LR Schedule.
* Update benchmark README
* Black formatting for map_at_k
* Add requirements file for benchmarks
* Refactor benchmark code
  - Support filtering the set of experiments using a regex pattern passed in the args.
  - Add typing
  - Refactor the config parsing into a separate dataclass
  - Refactor the cross product of all params to use itertools product
  - Update requirements to add opencv. This is needed for the caltech birds dataset.
  - Refactor the config to have a consistent use of the dict keys for object creation and add a separate name field for identifying the specific set of params associated with the object.
* Add user prompt to continue/exit benchmark run after run_grps are listed. Update README to include example of filter pattern.
* make_eval_data now returns a new list of augmented examples instead of updating the original list.
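The drop-nearest-neighbor param above handles self-matches when the query set was also indexed; a hedged sketch of the idea (the lookup call shape is assumed, not the exact API):

```python
# when the queries were also indexed, neighbor 0 is the query itself, so
# fetch k + 1 neighbors and drop the first entry of each result list;
# `model` and `x_query` are assumed to exist, k is the desired neighbor count
k = 10
lookups = model.lookup(x_query, k=k + 1)      # assumed SimilarityModel method
lookups = [neighbors[1:] for neighbors in lookups]
```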
* adding linear search, faiss ann search, cached storage, and redis storage. Also refactoring the indexer class to ease the implementation of indexers that depend on packages that include both search and storage.
* formatting
* formatting and fixing a couple of issues
* fix the backward compatibility issue
* add dependencies * remove dependencies
* fixed typing issues * remove extra typing
* move evaluator
* fix tests
* switch from dbm to shelve
* set temp dir for storing cached store
* add debug logging * add debug logging
* specify dbm implementation for cross-machine compatibility
* switch to dbm.dumb as other options are not available on all machines
* fix import orders
* remove extraneous logging
* ensure only names are stored in metadata
* separate store from store_type, and search from search_type, needed for serialization of metadata
* use str path
* put typing in one place
* add canonical name for consistent reload
* accept canonical_name
* Remove optional
* adding more tests
* pass str for path
* support more distances for LinearSearch
* add indexing colab
* applying fixes to PR review comments
* typo
* fix the tests for no normalization
* add distance
* fix typing
* Update example notebooks.
* Add patch to support passing custom NMSLibSearch objects.
* Add temporary fix that passes the search object config to the make_search function in order to support resetting the search index.
* NOTE: This is only temporary and a more general solution will be added in the new backend updates to search and store.
* Updated the supervised visualization notebook to demo using the custom NMSLibSearch object.
* Added warnings about the reset issues with custom objects in indexer.
* Remove the old benchmark dataset file.
* [nightly] Increase version to 0.17.0.dev19
* Update CLIP notebook and include search example.
* Updating dev version to 0.18.0.dev0. Merged 0.17.0 into master and bumping the dev versions to prepare for the next release.
* remove double definition
* [nightly] Increase version to 0.18.0.dev1
* Indexing (#321)
* fix formatting
* small fixes
* adding reset to Search
* adding reset to stores
* add reset to indexer
* add more tests
---------
Co-authored-by: Ali Zand <[email protected]>
* [nightly] Increase version to 0.18.0.dev2
* Cherrypick master (#331)
* Update similarity_model.py. Update verbose printing to display the count of indexed items. Verbose output was missing an f-string prefix and also returned the entire shape. Now we just return the number of examples.
* 0.17 patches (#325)
* fixes #323 Default indexer distance is now cosine in Sim Model. Calling the create_index method now defaults to cosine distance. Additionally, auto distance defaults to cosine if no distance is passed to compile.
* fixes #322 remove all calls to tf.convert_to_tensor in SimModel.
* Update gitignore to exclude models and datasets from the example notebooks.
* Update multi-modal notebook to remove the call to compile.
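On the shelve/dbm entries above: dbm.dumb is the one dbm backend guaranteed on every platform, which is why the notes pin the implementation. A minimal sketch of a shelve-backed cached store under that assumption (path and record layout illustrative):

```python
import dbm.dumb
import shelve

# open the cache with an explicitly pinned dbm implementation
db = dbm.dumb.open("/tmp/tfsim_cache")
store = shelve.Shelf(db)          # values are pickled transparently

store["0"] = {"label": 1, "embedding": [0.1, 0.2, 0.3]}  # keys must be str
print(store["0"])
store.close()
```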
* Patch bump
* Remove check for tf.shape in index. Input can also be a tuple or dict, so we should use len() here.
* Update github workflow tests to use TF >= 2.8
* Tensor slice sampler (#329)
* Create tfdata_sampler.py. Initial version of the new tf.data.Dataset sampler.
* Refactor and clean up the tf data sampler.
* Add initial tests for tfdata_sampler
* Reformat TFDataSampler test file.
* Fix proto dep issue in github workflow tests. py 3.10 breaks with protobuf > 3.20.x
* Setting the env var didn't work. Trying again with pinning the protobuf version to 3.20.1
* Check TF version before creating the tf dataset counter.
* Format file
* Remove as_numpy_iterator when creating the list of grouped datasets.
* Also move the class_list filter to before the group_by function
* Apply the total_examples_per_class as a take() function on each grouped dataset
* Remove as much casting as possible from the dataset. Certain functions expect an int64 though and require casting.
* Refactor to move the filter by class list out of the window_group_by function.
* Add class list filter test.
* Move augment_fn and load_fn to before the repeat and batch functions. This change means the aug and load functions apply per example now. This will make it easier to apply random augmentations per example and is more consistent with how we implemented it in the existing memory sampler. This change also improves the tests for all parts of the module.
* Add support for handling tuple and dict values for y. This change adds support for passing a callable to parse the correct class id element for batch sampling. By default, y is assumed to be a 1D tensor with the class ids and the function is lambda y: y. Otherwise we accept an int or str and construct a parser to get the class id tensor.
* Update email for github actions bot to fix CLA errors in PR
* Fix import order and remove typing imports
* Fix import check in search init.
* Small updates to tfdata_sampler doc string
* [nightly] Increase version to 0.18.0.dev3
* Remove version check and replace with try/except AttributeError. (#332)
* [nightly] Increase version to 0.18.0.dev4
* Fix #333 correct typo in memory_store to_dataframe.
* [nightly] Increase version to 0.18.0.dev5
* Refactoring unit tests for increased test coverage (#320)
* Refactor similarity unit tests to Tensorflow TestCase, reduce usage of Numpy API
* Refactor similarity unittests to reduce usage of numpy and increase overall coverage
* Merge branch 'development' of https://github.com/tensorflow/similarity into development
* reformat tf search initialization file
* Update indexer test files from recent push
* Cleaned up and reformatted files
* Sort test_file_samplers file
* Fix formatting.
* [nightly] Increase version to 0.18.0.dev6
* Cleanup imports to legacy tensorflow.python.keras (#336)
* Cleanup imports to legacy tensorflow.python.keras for tensorflow_similarity. Remove call to conv_utils and explicitly define normalize_data_format for channels check.
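A hedged sketch of the tf.data pattern the TFDataSampler entries describe: group examples by class id with group_by_window, emit same-class mini-batches, then re-batch into class-balanced batches. Function and argument names are illustrative, and the real sampler also handles class filtering, repeats, and the augment/load fns.

```python
import tensorflow as tf

def class_balanced_batches(ds, classes_per_batch=4, examples_per_class=2):
    """Sketch: ds yields (x, y) where y is a scalar class id."""
    ds = ds.group_by_window(
        key_func=lambda x, y: tf.cast(y, tf.int64),  # group_by_window wants int64 keys
        reduce_func=lambda key, grp: grp.batch(examples_per_class),
        window_size=examples_per_class,
    )
    # each element is now a same-class mini-batch; unbatch and re-batch the
    # interleaved windows into batches spanning classes_per_batch classes
    return ds.unbatch().batch(classes_per_batch * examples_per_class)
```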
* reformat layers.py
* [nightly] Increase version to 0.18.0.dev7
* Dev cleanup (#344)
* ensure no -1 is returned from FAISS
* use string keys for redis store
* update indexing colab
* ensure faiss returned -1s are filtered out, given faiss returns -1s wherever it does not find the neighbors
* fix a bug with distances
* fix formatting
---------
Co-authored-by: Ali Zand <[email protected]>
* Fix formatting and module import sort order.
* [nightly] Increase version to 0.18.0.dev8
---------
Co-authored-by: Github Actions Bot <>
Co-authored-by: Christoffer Hjort <[email protected]>
Co-authored-by: dewball345 <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Genrry Hernandez <[email protected]>
Co-authored-by: Abhishar Sinha <[email protected]>
Co-authored-by: Emil Larsson <[email protected]>
Co-authored-by: Abhishar Sinha <[email protected]>
Co-authored-by: Luke Wood <[email protected]>
Co-authored-by: Zoheb Abai <[email protected]>
Co-authored-by: Mohammad Amin Haghpanah <[email protected]>
Co-authored-by: Ali Zand <[email protected]>
Co-authored-by: Ali Zand <[email protected]>
Co-authored-by: Github Actions Bot <[email protected]>
Co-authored-by: Abel Theodros <[email protected]>
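On the FAISS entries above: FAISS pads missing neighbors with -1 whenever the index holds fewer than k candidates, so results must be filtered before building lookups. A minimal runnable sketch:

```python
import numpy as np
import faiss

index = faiss.IndexFlatL2(4)                       # tiny exact index, dim=4
index.add(np.random.rand(3, 4).astype("float32"))  # only 3 vectors indexed

dists, idxs = index.search(np.random.rand(1, 4).astype("float32"), 5)
valid = idxs[0] != -1          # faiss fills the two missing slots with -1
neighbors = idxs[0][valid]
neighbor_dists = dists[0][valid]
```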
* Distance and losses refactor (#222) * refactor distances call signature and add appropriate tests * refactor metrics for new distance call signature * make similarity losses compatible with asymmetric and non-square distance matrices * adapt and add test * remove newline * [nightly] Increase version to 0.16.0.dev4 * [nightly] Increase version to 0.16.0.dev5 * [nightly] Increase version to 0.16.0.dev6 * [nightly] Increase version to 0.16.0.dev7 * [nightly] Increase version to 0.16.0.dev8 * Cross-batch memory (XBM) (#225) * initiate XBM loss * add todo * add XBM tests * WIP: XBM serialization * XBM serialization * class docstring * remove todo * improve docstring * remove comment * [nightly] Increase version to 0.16.0.dev9 * [nightly] Increase version to 0.16.0.dev10 * [nightly] Increase version to 0.16.0.dev11 * [nightly] Increase version to 0.16.0.dev12 * [nightly] Increase version to 0.16.0.dev13 * [nightly] Increase version to 0.16.0.dev14 * [nightly] Increase version to 0.16.0.dev15 * [nightly] Increase version to 0.16.0.dev16 * [nightly] Increase version to 0.16.0.dev17 * [nightly] Increase version to 0.16.0.dev18 * [nightly] Increase version to 0.16.0.dev19 * [nightly] Increase version to 0.16.0.dev20 * [nightly] Increase version to 0.16.0.dev21 * [nightly] Increase version to 0.16.0.dev22 * Augmentor for Barlow Twins (#229) * Use list(range()) instead of comprehension as it is more pythonic. * Create barlow.py * Bump three in /tensorflow_similarity/visualization/projector_v2 (#228) Bumps [three](https://github.com/mrdoob/three.js) from 0.132.2 to 0.137.0. - [Release notes](https://github.com/mrdoob/three.js/releases) - [Commits](https://github.com/mrdoob/three.js/commits) --- updated-dependencies: - dependency-name: three dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Restructure class to be like Augmenter * Minor fixing of dead links (#230) * Fixed dead links * augmenter main to master * Spelling changes Auto Augment * MixupAndCutmix main to master * RandAugment main to master * RandomErasing main to master * Update SimCLRAugmenter.md * Update ClassificationMatch.md * Update ClassificationMetric.md * Update Evaluator.md * Update MemoryEvaluator.md * Update SimilarityModel.md * Update BinaryAccuracy.md * Update F1Score.md * Update FalsePositiveRate.md * Update NegativePredictiveValue.md * Update Precision.md * Update Recall.md Co-authored-by: Owen Vallis <[email protected]> * Fix minor typos (#226) Co-authored-by: Owen Vallis <[email protected]> * Update barlow.py * Update barlow.py * Update setup.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * revisions * Update __init__.py * Update __init__.py * Update color_jitter.py * Update barlow.py * Update barlow.py * Update barlow.py * Update setup.py Co-authored-by: Owen S Vallis <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Owen Vallis <[email protected]> Co-authored-by: Genrry Hernandez <[email protected]> * Fixed some bugs in augmenter. 
(#232) * Create barlow.py * Restructure class to be like Augmenter * Update barlow.py * Update barlow.py * Update setup.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * Update barlow.py * revisions * Update __init__.py * Update __init__.py * Update color_jitter.py * Update barlow.py * Update barlow.py * Update barlow.py * Update setup.py * fixed some bugs * Remove seed instance variable Co-authored-by: Owen Vallis <[email protected]> * [nightly] Increase version to 0.16.0.dev23 * [nightly] Increase version to 0.16.0.dev24 * [nightly] Increase version to 0.16.0.dev25 * [nightly] Increase version to 0.16.0.dev26 * [nightly] Increase version to 0.16.0.dev27 * [nightly] Increase version to 0.16.0.dev28 * [nightly] Increase version to 0.16.0.dev29 * [nightly] Increase version to 0.16.0.dev30 * [nightly] Increase version to 0.16.0.dev31 * [nightly] Increase version to 0.16.0.dev32 * [nightly] Increase version to 0.16.0.dev33 * [nightly] Increase version to 0.16.0.dev34 * [nightly] Increase version to 0.16.0.dev35 * [nightly] Increase version to 0.16.0.dev36 * [nightly] Increase version to 0.16.0.dev37 * [nightly] Increase version to 0.16.0.dev38 * [nightly] Increase version to 0.16.0.dev39 * [nightly] Increase version to 0.16.0.dev40 * [nightly] Increase version to 0.16.0.dev41 * [nightly] Increase version to 0.16.0.dev42 * [nightly] Increase version to 0.16.0.dev43 * [nightly] Increase version to 0.16.0.dev44 * [nightly] Increase version to 0.16.0.dev45 * [nightly] Increase version to 0.16.0.dev46 * Added test coverage for augmentation functions + barlow, simCLR augmenter (#235) * Create test_blur.py * Create test_color_jitter.py * Create test_crop.py * Create test_flip.py * Update test_crop.py * Update test_color_jitter.py * Create test_solarize.py * Create test_augmenters.py * Update test_flip.py * Update test_flip.py * Update test_flip.py * Update blur.py * Update blur.py * [nightly] Increase version to 0.16.0.dev47 * Change augmenters to use augmentation_utils (#238) * Fix corrupted JSON formatting in unsupervised notebook. * Added features of SplitValidationLoss callback to EvalCallback (#242) * Added features of SplitValidationLoss callback to EvalCallback Merged SplitValidationLoss into EvalCallbaclk * Refactored EvalCallback using utils.unpack_results * [nightly] Increase version to 0.16.0.dev48 * [nightly] Increase version to 0.16.0.dev49 * [nightly] Increase version to 0.16.0.dev50 * VicReg Loss - Improvement of Barlow Twins (#243) * VicReg Loss * Update vicreg.py * Update vicreg.py * Update vicreg.py * fix big bug * Update vicreg.py * Update vicreg.py * fixes * Update vicreg.py * [nightly] Increase version to 0.16.0.dev51 * [nightly] Increase version to 0.16.0.dev52 * Update tests for algebra.py * Coverage now at 100% * Convert tests to use tf.testing.TestCase * [nightly] Increase version to 0.16.0.dev53 * [nightly] Increase version to 0.16.0.dev54 * Fix corrupted formatting in visualization notebook. * [bug] Fix multisim loss offsets. The tfsim version of multisim uses distances instead of the inner product. However, multisim requires that we "center" the pairwise distances around 0. Here we add a new center param, which we set to 1.0 for cosine distance. Additionally, we also flip the lambda (lmda) param to add the threshold to the values instead of subtracting it. These changes will help improve the pos and neg weighting in the log1psumexp. 
* [nightly] Increase version to 0.16.0.dev55 * [bug] In losses.utils.logsumexp() tf.math.log(1 + x) should be tf.math.log(tf.math.exp(-my_max) + x). This is needed to properly account for removing the rowwise max before computing the logsumexp. * Make Augmentation Utilities More Customizable(reupload due to branch issues) (#255) * modifications of benchmark * test commit 123 * new changes to training * testo changes * works in colab... kind of * code is neat now * working on sampler problem * Update barlow.py * Update blur.py * Update color_jitter.py * Update color_jitter.py * Update barlow.py * Update barlow.py * Added vicreg for sync * Update vicreg.py * Update vicreg.py * Update vicreg.py * Update barlow.py * randomresizedcrop edits * Update barlow.py * allow to customize loss reduction * Update __init__.py * Delete sampler_test.py * Delete benchmark/supervised directory * Update barlow.py * added docstring on random_resized_crop * Allow user to set normalization * Update barlow.py * Update barlow.py * Update setup.py * remove pipfile * Delete Pipfile * Delete Pipfile.lock * Update cropping.py * Update cropping.py * Additive multiplicative changes * Update simclr.py * change additive, multiplicative * Update barlow.py * Update solarize.py * Update barlow.py * Update solarize.py * Update barlow.py * Update test_solarize.py * Update test_solarize.py * Update test_solarize.py Co-authored-by: Owen Vallis <[email protected]> * Refactor test_basic to use TestCase to improve flaky test results. * Fix Flake8 warnings. * Freeze all batchnorm architecture layers. We now freeze all BN layers when loading pre-trained weights in the effnet and resnet50 architectures. Previously, we only froze the BN layers if trainable was partial or frozen. When trainable was full, the BN layers would be trainable as well and this led to suboptimal training losses. * Improve EfficientNetSim docstring and type hints (#254) * Fix typos in docstring * Remove reference to image augmentation Image augmentation was previously removed, so purge it from the comment and docstring. * Correct input image type annotation * Fix #251. Check for model._index before calling Indexer methods. The Indexer is core to a number of the Similarity model methods. Add support for checking if the index exists and return a more informative AttributeError if the index hasn't been created yet. * Set random seeds for tfrecord samplers test. * All augmenters use the Tensor type from tensorflow_similarity.types. * [nightly] Increase version to 0.16.0.dev56 * Fix Tensor type error in callbacks. Unpacking the Lookup objects converts the python types to Tensors. This can lead to Tensor type errors. This commit adds support for taking the expected dtype of the model Tensors where possible. We also fix a bug where the EvalCallback was not logging the split metric names in the history. * Update doc strings in color_jitter. * Update the create index AttributeError text * [nightly] Increase version to 0.16.0.dev57 * Update Notebook examples. * Remove unneeded tf.function and register_keras_serializable decorators. Subclasses of tf.keras.losses.Loss will trace all child functions and we only need to register the subclassed loss to support deserialization. * Simplify MetricEmbedding layer. * Fix mypy type error in simsiam. Convert all constants to tf.constant. * Simplify the MetricEmbedding layer. Subclass layers.Dense directly. This simplifies the layer and also fixes function tracing during model save. * Fix test_tfrecord_samplers tests. 
* Update api documentation. TODO: This just generated the new docs. We still need to go through and clean up the documentation. * Update doc string and api for MetricEmbedding layer. * Bump to version 0.16 * Fix static type check error in memory store. The np.savez functions expect array_like values but we were passing List. Casting as np array should solve the issue. * Fix effnet test for TF 2.9 * Fix TFRecordDatasetSampler now returns correct number of examples per batch. * Bump dev version to 0.17.0.dev0. * [nightly] Increase version to 0.17.0.dev1 * [nightly] Increase version to 0.17.0.dev2 * [nightly] Increase version to 0.17.0.dev3 * [nightly] Increase version to 0.17.0.dev4 * [nightly] Increase version to 0.17.0.dev5 * [nightly] Increase version to 0.17.0.dev6 * [nightly] Increase version to 0.17.0.dev7 * [nightly] Increase version to 0.17.0.dev8 * Add support for configuring and running benchmarks for supervised losses. Add support for passing the same examples for both the query and indexed set when calling retrieval_metrics. Added a new param to each retrieval_metric that enables dropping the nearest neighbor. This is useful if the nearest neighbor exists in the indexed examples. * Update benchmark README and max line length in .flake8 * Updates to the benchmark code - Add clean_dir func to train file. - Add support for creating precision@k and map@k eval metrics - Fix typing issue in map@k. We now take the class counts type from the query label dtype. - Remove 1 count from class counts if we are dropping the first result. - Refactor the make functions in train to use a Dict for all the parameterized modules. * [nightly] Increase version to 0.17.0.dev9 * Fixed typo in slice id * black formatting * black formatting * Fixed typo to resolve #284 The function should be tf.concat instead of tf.constant, according to the description given above. This also resolves issue #284 * [nightly] Increase version to 0.17.0.dev10 * Update to match the API of the latest keras_cv version Check out keras-team/keras-cv#738 for more information. Once this is merged we're breaking backwards compatibility to have a much nicer API name. * Add clip_at_r to support computing MAP@R from map_at_k module. * Refactor benchmark components into separate modules. * Update benchmark configs to use smaller 1e-6 learning rates. Update train.py main to loop through the various embedding sizes in the architectures. * Fix tests for clip_at_r in map_at_k retrieval metric. Refactor the clip at r changes to use map_fn. * [nightly] Increase version to 0.17.0.dev11 * Update to benchmark configs and experiments with adding LR Schedule. * Update benchmark README * Black formatting for map_at_k * Add requirements file for benchmarks * Refactor benchmark code - Support filtering the set of experiments using a regex pattern passed in the args. - Add typing - Refactor the config parsing into a separate dataclass - Refactor the cross product of all params to use itertools product - Update requirements to add open-cv. This is needed for the caltech birds dataset. - Refactor the config to have a consistent use of the dict keys for object creation and add a separate name field for identifying the specific set of params associated with the object. * Add user prompt to continue/exit benchmark run after run_grps are listed. Update README to include example of filter pattern. * make_eval_data now returns a new list of augmented examples instead of updating the original list. 
Remove return when user input == Y * Set soft_margin default to True. The default value was previously set to False but the doc string stated the default value as True. * Set nmslib to brute force search and remove agg results. - Brute force search removes any noise introduced by an aprox NN search. - Removing the agg results as we will provide a utility for aggregating the result folders from each experiment. * Update loss ids in the losses component. - Removed the '_loss' suffix from the loss ids as it was redundent. - Add xmb, triplet loss, and soft nn loss to the losses config section. * Google Summer of Code (#286) * Added multiple negatives ranking loss * Added multimodal example * Added support for multiple distances in mnrl loss Added support for different distances in multiple negatives ranking loss * Added link to multimodal example notebook * black formatting * Using numerically stable implementation of logsumexp * Delete pyproject.toml * Updated pyproject.toml * Black formatting in multinegrank_loss * Updated pip install url to dev branch Co-authored-by: Owen Vallis <[email protected]> * resolve #299 Fix WarmupCosineDecay. * Previous version scaled the cosine decay by a linear warmup value. So the max value was max_lr*0.5*(1+cos(warmup_steps/total_steps*pi)) * New version has a linear warmup and then begins the cosine decay from cos(0.0) so the max value is now max_lr. * Previous version accepted a tensor of values, this is not needed. Simplified to accept a single scaler step value. * Updated tests to be consistent with the keras LearningRateSchedule tests. * Renamed class from WarmUpCosine to WarmupCosineDecay. This is more consistent with the Keras LearningRateSchedules. * [nightly] Increase version to 0.17.0.dev12 * Update pn_loss default params and doc string formatting. * Make soft_margin the default. The doc string stated this was the default but the param was set to False. * Make the default margin 0.1. The previous value was 1.0 which produced sub-optimal results when using cosine distance. * Reformat the doc strings to align with the google docstring style. * Add support for the pep585 annotations. Removed Callable and Union. * Update triplet_loss default params and doc string formatting. * Make soft_margin the default. The doc string stated this was the default but the param was set to False. * Make the default margin 0.1. The previous value was 1.0 which produced sub-optimal results when using cosine distance. * Reformat the doc strings to align with the google docstring style. * Add support for the pep585 annotations. Removed Callable and Union. * Update train to use new WarmupCosineDecay. * Updates to config params for both prod and single configs * Updates to component/losses to use the new defaults * Benchmark updates and bug fixes * calls to model.predict() now convert input to tensor using the CPU context. This avoids a mem leak when calling predict in a loop. * expose all NMSLib params in NMSLibSearch. This enables custom parametrization of the nsmlib indexes. * Update indexer to save and load Search objects from configs. Saving now works when passing a Search object to model.compile() * Update the benchmark configs to use the unique name as the key and provide a 'component' field for identifying the component to build. * Manually delete and clear all objects at the end of each benchmark loop to try and avoid memory leaks. However, we still leak mem in tf frameworks/ops.py * Make flage ignore E206 whitespace after ":". 
This was in conflict with the black formatter. * Enable manual trigger for testing workflow. * Refactor benchmark datasets to be a class. * Now supports creating custom Dataset objects to load other sources. Currently supports loading TFDS data. * Add support for define hyper_parameter versions of parameters to support KerasTuner search * Split the single train script into create_dataset, hyper_parameter_search and train * Update configuration files to remove the benchmark prefix. * Add support for retrieval metrics in callback. * Add support for R_precision and refactor map@k * map_at_k is now a subclass of precision_at_k. Reduces code duplication. * update names for precision_at_k and map_at_k when clip_at_k is set to true. Name no longer return an @k suffix but instead return wither R_Precision or map@R. * distances now return their params when get_config() is called. * Fix info string for mem samplers. * Memory samplers now correctly report the number of augmenations in the sampler object. * Fix mypy errors from newest mypy version. * Migrate to support pep585 where possible * Fix several small typing errors * Provide typing for 100% of modules * [nightly] Increase version to 0.17.0.dev13 * * GEM layers create a general pooling layer in the init, but we didn't pass the kwargs. This means the general pooling layer didn't have the dtype policy. This caused the GEM layers to fail when using a mixed_float dtype policy as the general pooling layer returns float32 and the GEM dtype policy is float16. The fix is to pass all kwargs onto the general pooling layer. * Patch bump * Cap the TF version at 2.9 for the current master branch. * Resolves Error while using projector (#301) * Resolves Error while using projector Since the new major release of Bokeh version 3.0, the `plot_width` (and `plot_height`) properties have been removed. These have been replaced by standard `width` and `height` for all layout-able models according to [Migration Guide](https://github.com/bokeh/bokeh/wiki/Migration-Guides#300). The update fixes the error generated by the `projector`. * backward compatible This update makes `projector` backward compatible with `bokeh` * Apply formatting to projector changes to fix warnings from Black and ISort. * [nightly] Increase version to 0.17.0.dev14 * Model Save Updates (#305) * Update github workflow test matrix to include py 3.10 and tf 2.7 and 2.10 * Update github workflow to use py 3.11 and tensorflow 2.11. * Fix testing error in test_schedules. import from keras should now be import from tensorflow.keras. * The optimizer serialize and deserialize are under schedules in TF instead of the learning_rate_schedules module from keras. * Turns out the workflow version must be < 3.11 * Python 3.10 requires TF >= 2.8. * Fix and simplify Contrastive Model save and load. * The old save and load manually loaded each inner model. This was required because we didn't build the outer model graph. * The new solution uses a factory function to infer the input shape and then connect all the inner models and pass the input and output to the contrastive model. This is enough for the standard model save to work. * Also silenced the INFO logs from nmslib. * Small formatting and cleanup in other files. * Remove extra print in EvalCallback * Fix order of contrastive model metrics so that losses come first. * Update unsupervised notebook to use new save and create functions. 
* [nightly] Increase version to 0.17.0.dev15
* Add MultiShotFileSampler (#307)
* Refactor MultiShotMemorySampler to use load_example_fn
* Update MultiShotFileSampler to inherit from MultiShotMemorySampler
* Fix typing errors in file_samplers
* Loss tests refactor (#308)
* Refactor the tests for the losses. Use the tf.TestCase packages, create utils for perfect and bad embedding examples, and create a losses dir under tests.
* Refactor Triplet and PN Loss to have a single margin param, now with float | None. None will now set the soft_margin.
* Replace basic tf.logsumexp with the TF Sim stable logsumexp in the soft margin.
* Fix bug in semi-hard mining for the case where we don't have any valid negatives > max positive. Previously this defaulted to selecting the example at idx 0. We now take the negative that is closest to the maximal positive without going over, i.e., max(d(a,n)) <= max(d(a,p)) (see the sketch after this list).
* Refactor the triplet loss tests.
* Type empty_mask as BoolTensor to fix a mypy error. We create a matrix of zeros as dtype bool and a vector of ones as dtype bool, but mypy doesn't see these as BoolTensor type. This commit adds an explicit BoolTensor type to the empty_mask to fix this.
* Fix formatting errors
* [nightly] Increase version to 0.17.0.dev16
* Adding benchmark datasets component that was ignored due to the datasets/ filter in the .gitignore
* Float16 (#310)
* Fix casting to use the default floatx where possible to avoid type errors when training using mixed precision or float16.
* Update tests for supporting float16
* Remove float dtype parameterization of readme tests. They were too slow.
* Fix casting error when passing a constant scalar. Set policy in multihead test to ensure we reset the policy to a good state.
* Remove duplicate long-running test. This should speed up tests by ~3 min.
* [nightly] Increase version to 0.17.0.dev17
* Remove references to outputs in contrastive model. (#311) We use the inputs and outputs to support saving the contrastive model using the Keras API; however, we override train and test steps as well as predict. This means we don't currently support multiple output heads on the embedding output. This PR removes all references to multi-headed outputs and explicitly sets the indexer to use the predictor output.
* Provide default contrastive projector and predictor. Users had to provide their own MLP models for the projector and predictor, which required understanding more about the underlying algorithms. This change adds default projector and predictor models based on the original papers.
* Update unsupervised colab. Comment out the projector and predictor create model functions. We now automatically create the MLP models for users, but the commented code is left in case the user wants to customize them.
* Verify that the model trains and reloads. Loss and performance is slightly better than before.
* Update the create_contrastive_model function to pass a list of outputs to better track the outputs. The model still overrides the predict function, though, as we need to apply the L2 norm at the output.
* Fix mypy error.
* Update output var name and use the epsilon constant.
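The semi-hard mining fix above amounts to a masked argmax: when no negative sits farther out than the hardest positive, pick the negative closest to max(d(a, p)) from below instead of blindly taking index 0. A sketch with hypothetical argument names:

```python
import tensorflow as tf

def fallback_negative(distances, positive_mask, negative_mask):
    """Pick, per anchor row, the negative closest to the maximal positive
    distance without going over: max(d(a, n)) <= max(d(a, p))."""
    max_pos = tf.reduce_max(
        tf.where(positive_mask, distances, tf.zeros_like(distances)),
        axis=1,
        keepdims=True,
    )
    # Valid fallbacks are negatives at or below the hardest positive.
    valid = negative_mask & (distances <= max_pos)
    masked = tf.where(valid, distances, tf.fill(tf.shape(distances), -1e9))
    # The largest remaining distance is the closest-from-below negative.
    return tf.argmax(masked, axis=1)
```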
* [nightly] Increase version to 0.17.0.dev18
* Update release notes for 0.17.x
* Adding linear search, FAISS ANN search, cached storage, and Redis storage. Also refactoring the indexer class to ease implementation of indexers that depend on packages that include both search and storage.
* Formatting, and fixing a couple of issues
* Fix the backward compatibility issue
* Add dependencies
* Remove dependencies
* Fixed typing issues; remove extra typing
* Move evaluator
* Fix tests
* Switch from dbm to shelve
* Set temp dir for storing the cached store
* Add debug logging
* Specify the dbm implementation for cross-machine compatibility; switch to dbm.dumb, as the other options are not available on all machines (see the sketch after this list).
* Fix import orders
* Remove extraneous logging
* Ensure only names are stored in metadata
* Separate store from store_type, and search from search_type, needed for serialization of metadata
* Use str path; pass str for path
* Put typing in one place
* Add canonical name for consistent reload; accept canonical_name
* Remove optional
* Adding more tests
* Support more distances for LinearSearch
* Add indexing colab
* Applying fixes to PR review comments
* Typo
* Fix the tests for no normalization
* Add distance
* Fix typing
* Update example notebooks.
* Add patch to support passing custom NMSLibSearch objects. Add a temporary fix that passes the search object config to the make_search function in order to support resetting the search index. NOTE: This is only temporary, and a more general solution will be added in the new backend updates to search and store.
* Updated the supervised visualization notebook to demo using the custom NMSLibSearch object.
* Added warnings about the reset issues with custom objects in indexer.
* Remove the old benchmark dataset file.
* [nightly] Increase version to 0.17.0.dev19
* Update CLIP notebook and include search example.
* Updating dev version to 0.18.0.dev0. Merged 0.17.0 into master and bumping the dev versions to prepare for the next release.
* Remove double definition
* [nightly] Increase version to 0.18.0.dev1
* Indexing (#321)
* Fix formatting; small fixes
* Adding reset to Search, the stores, and the indexer; add more tests

---------

Co-authored-by: Ali Zand <[email protected]>

* [nightly] Increase version to 0.18.0.dev2
* Cherrypick master (#331)
* Update similarity_model.py: update verbose printing to display the count of indexed items. Verbose output was missing an f-string prefix and also returned the entire shape. Now we just return the number of examples.
* 0.17 patches (#325)
* Fixes #323: the default indexer distance is now cosine in Sim Model. Calling the create_index method now defaults to cosine distance. Additionally, auto distance defaults to cosine if no distance is passed to compile.
* Fixes #322: remove all calls to tf.convert_to_tensor in SimModel.
* Update gitignore to exclude models and datasets from the example notebooks.
* Update multi-modal notebook to remove the call to compile.
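On the cached-store work above: shelve defaults to whichever dbm backend the platform provides, which makes cache files unreadable across machines; pinning dbm.dumb avoids that. A minimal sketch of the pattern, with an illustrative path and key:

```python
import dbm.dumb
import shelve

# dbm.dumb is pure Python and available everywhere, so caches written on
# one machine can be opened on another, unlike the platform-default dbm.
db = dbm.dumb.open("/tmp/cached_store", "c")
store = shelve.Shelf(db)
store["embedding_0"] = [0.1, 0.2, 0.3]
store.close()
```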
* Patch bump
* Remove check for tf.shape in index. Input can also be a tuple or dict, so we should use len() here.
* Update github workflow tests to use TF >= 2.8
* Tensor slice sampler (#329): create tfdata_sampler.py, the initial version of the new tf.data.Dataset sampler (see the sketch after this list).
* Refactor and clean up the tf data sampler.
* Add initial tests for tfdata_sampler; reformat the TFDataSampler test file.
* Fix proto dep issue in github workflow tests: py 3.10 breaks with protobuf > 3.20.x. Setting an env var didn't work; trying again with pinning the protobuf version to 3.20.1.
* Check the TF version before creating the tf dataset counter.
* Remove as_numpy_iterator when creating the list of grouped datasets. Also move the class_list filter to before the group_by function.
* Apply the total_examples_per_class as a take() function on each grouped dataset.
* Remove as much casting as possible from the dataset. Certain functions expect an int64, though, and require casting.
* Refactor to move the filter by class list out of the window_group_by function. Add a class list filter test.
* Move augment_fn and load_fn to before the repeat and batch functions. This change means the aug and load functions apply per example now. This will make it easier to apply random augmentations per example and is more consistent with how we implemented it in the existing memory sampler. This change also improves the tests for all parts of the module.
* Add support for handling tuple and dict values for y. This change adds support for passing a callable to parse the correct class id element for batch sampling. By default, y is assumed to be a 1D tensor with the class ids and the function is lambda y: y. Otherwise, we accept an int or str and construct a parser to get the class id tensor.
* Update email for github actions bot to fix CLA errors in PR
* Fix import order and remove typing imports
* Fix import check in search init.
* Small updates to the tfdata_sampler doc string
* [nightly] Increase version to 0.18.0.dev3
* Remove version check and replace with try/except AttributeError. (#332)
* [nightly] Increase version to 0.18.0.dev4
* Fix #333: correct typo in memory_store to_dataframe.
* [nightly] Increase version to 0.18.0.dev5
* Refactoring unit tests for increased test coverage (#320)
* Refactor similarity unit tests to the TensorFlow TestCase, reducing usage of the NumPy API and increasing overall coverage
* Merge branch 'development' of https://github.com/tensorflow/similarity into development
* Reformat tf search initialization file
* Update indexer test files from recent push
* Cleaned up and reformatted files
* Sort test_file_samplers file
* Fix formatting.
* [nightly] Increase version to 0.18.0.dev6
* Cleanup imports to legacy tensorflow.python.keras (#336): remove the call to conv_utils and explicitly define normalize_data_format for the channels check.
* Reformat layers.py
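The sampler in #329 builds class-balanced batches with tf.data. A minimal sketch of the group-by-class idea; the function and argument names here are assumptions, not the TFDataSampler API, and `group_by_window` as a Dataset method needs a recent TF (older versions go through `tf.data.experimental.group_by_window`):

```python
import tensorflow as tf

def balanced_batches(ds, examples_per_class=2, classes_per_batch=4):
    """Group (x, y) examples by class id, then interleave fixed-size
    per-class windows into class-balanced batches."""
    ds = ds.group_by_window(
        # group_by_window requires an int64 key, hence the cast noted above.
        key_func=lambda x, y: tf.cast(y, tf.int64),
        reduce_func=lambda key, group: group.batch(examples_per_class),
        window_size=examples_per_class,
    )
    # Each element is now one class's window; flatten and rebatch so a
    # batch holds classes_per_batch * examples_per_class examples.
    return ds.unbatch().batch(classes_per_batch * examples_per_class)
```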
* [nightly] Increase version to 0.18.0.dev7
* Dev cleanup (#344)
* Ensure no -1 is returned from FAISS: filter out the returned -1s, given FAISS returns -1 wherever it does not find a neighbor (see the sketch after this list).
* Use string keys for the Redis store
* Update indexing colab
* Fix a bug with distances
* Fix formatting

---------

Co-authored-by: Ali Zand <[email protected]>

* [nightly] Increase version to 0.18.0.dev8
* Merge master into dev (#348): merging master back into dev for consistency.
* Checking changes for 0.18 release. (#355)
* Refactor setup to make as many deps optional as possible. [Note: might come back to this if it's awkward to run the hello world]
* Clean up types and try to increase consistency between the different Search and Store classes
* Implement lazy loading for all Search and Store modules to avoid having to load the deps.
* Fix and update tests
* Migrate to using pathlib where possible
* Other small fixes and updates.
* Add all deps needed for testing in the git workflow and try using only typing for search and store utils.
* Migrate basic flow test to tf.TestCase.
* Update to use BatchNorm with synchronized and finish eval of notebooks. Bump the TF version check for BatchNorm sync to 2.12.
* Reorder the Search class methods to better align with the Search base class
* Update Faiss to return cosine distance values for distance type cosine
* Add support for using inner_product with Faiss
* Small refactor to Search methods
* Flatten the example directory.
* Refactor distance, store, search, and losses deserialization and serialization to better support custom 'str' for loading.
* Fix distance imports in the models.
* Add new distance module.
* Flaky test on git workflows. Try updating to match the local venv.
* Set typing imports to only import during type checking. This cleans up our imports and prevents us from loading unused modules.
* Update all notebooks
* Remove all but the basic model and sampler modules from main package imports
* Update the dataset and hyper_param scripts for benchmarking
* Clean up the Redis tests and mocks to check all methods called. Fix mock import for the Redis test.
* Wrap tf.data.Dataset in a CPU context to avoid GPU OOM errors.
* Add ModuleNotFoundError with pip install info for missing extra_require deps when loading the relevant modules.
* [nightly] Increase version to 0.18.0.dev9

---------

Co-authored-by: Owen S Vallis <[email protected]>
Co-authored-by: Christoffer Hjort <[email protected]>
Co-authored-by: Github Actions Bot <>
Co-authored-by: dewball345 <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Owen Vallis <[email protected]>
Co-authored-by: Genrry Hernandez <[email protected]>
Co-authored-by: Abhishar Sinha <[email protected]>
Co-authored-by: Emil Larsson <[email protected]>
Co-authored-by: Abhishar Sinha <[email protected]>
Co-authored-by: Luke Wood <[email protected]>
Co-authored-by: Zoheb Abai <[email protected]>
Co-authored-by: Mohammad Amin Haghpanah <[email protected]>
Co-authored-by: Ali Zand <[email protected]>
Co-authored-by: Ali Zand <[email protected]>
Co-authored-by: Github Actions Bot <[email protected]>
Co-authored-by: Abel Theodros <[email protected]>
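On the FAISS fix in the dev cleanup above: FAISS pads search results with -1 wherever it cannot find k neighbors, so those placeholder ids must be dropped before they are used to look anything up in the store. A small sketch:

```python
import faiss  # assumes the faiss extra is installed
import numpy as np

index = faiss.IndexFlatL2(64)
index.add(np.random.rand(10, 64).astype("float32"))

# Ask for more neighbors than the index holds to force -1 padding.
dists, idxs = index.search(np.random.rand(1, 64).astype("float32"), 32)
valid = idxs[0] != -1
neighbors, distances = idxs[0][valid], dists[0][valid]
```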
…RandomCropAndResize) (keras-team#738)

* Sync
* Added zoom factor to RRC
* Used tf.shape
* dtype mismatch
* Debugging...
* Debugging...
* Fix example
* RRC uses preprocessing.transform now
* Minor error
* Minor bug
* minor issue
* minor issue
* minor bug
* Added unit tests
* Fix serialization test
* KerasCV simclr api update
* Augmenter
* Augmenter
* serialization test
* serialization test
* fix failing test
* Split RRC API into two layers
* Split RRC API into two layers
* Format serialization_test
* Implemented bounding box support
* Add preprocessing
* serialization test
* serialization test
* serialization test
* RandomCropAndResize in SimCLR
* RandomCropAndResize in SimCLR
* Update examples
* Update examples
* Update examples
* Update target_size

Co-authored-by: Luke Wood <[email protected]>
@sayakpaul @LukeWood @martin-gorner
As discussed in #676, I have updated the RRC code to incorporate `zoom_factor`. I will be adding a Colab gist demonstrating the layer and verifying that it works as expected.

Closes #676
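For anyone landing here later, a usage sketch of the two resulting layers. The argument names below are illustrative assumptions; check the merged layer docstrings for the exact signatures:

```python
import keras_cv

# Zoom-based crop driven by zoom_factor, resized to a fixed height/width.
zoomed_crop = keras_cv.layers.RandomlyZoomedCrop(
    height=224,
    width=224,
    zoom_factor=(0.8, 1.2),
    aspect_ratio_factor=(3 / 4, 4 / 3),
)

# Inception-style crop: sample a crop area fraction and aspect ratio,
# then resize the crop to target_size.
crop_and_resize = keras_cv.layers.RandomCropAndResize(
    target_size=(224, 224),
    crop_area_factor=(0.08, 1.0),
    aspect_ratio_factor=(3 / 4, 4 / 3),
)
```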