[Converter] Support Keras V3 Conversion (#7691)
* Support Keras V3 Conversion

* Add script mapping TFJS classes to TF modules

This reverts commit 3bd7e8f.

* Put into another branch

* Script for building the map

Draft a script for building a map of TFJS classes to TF module paths.

* Add tfjs to v3 converter

* Rename the file

* Merge remote-tracking branch 'upstream/master' into V3ScriptForMapping

* Improve the code snippet for texture to tensor (#7694)

DOC

* Improve example

* add

* Fix tfjs-release not updating all tfjs versions of subpackages (#7550)

Some TFJS packages, like wasm, have examples or demos in them. These usually depend on the parent package, but the parent package is not marked to be updated when updating the subpackage dependency versions. For an example of this, see #7547.

Update the TFJS dependencies of these subpackages to the release version if they are `link:` dependencies.
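To illustrate the idea of the fix (not the actual release-script code, which is TypeScript), here is a minimal Python sketch of rewriting `link:` dependencies to a concrete release version; the `4.7.0` version and the package names are hypothetical:

```python
import json

RELEASE_VERSION = "4.7.0"  # hypothetical release version

def pin_link_deps(package_json, release_version):
    """Rewrite workspace-local `link:` specs of TFJS packages to the release version."""
    for section in ("dependencies", "devDependencies"):
        for name, spec in package_json.get(section, {}).items():
            # Only rewrite `link:` specs that point at TFJS packages.
            if spec.startswith("link:") and name.startswith("@tensorflow/"):
                package_json[section][name] = release_version
    return package_json

# An example/demo package.json depending on its parent package via `link:`.
pkg = {"dependencies": {"@tensorflow/tfjs-backend-wasm": "link:../..",
                        "left-pad": "^1.0.0"}}
pinned = pin_link_deps(pkg, RELEASE_VERSION)
print(json.dumps(pinned["dependencies"], indent=2))
```

Non-TFJS and non-`link:` dependencies are left untouched, matching the PR's intent of only pinning workspace-local TFJS references.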

* [wasm] Fix cos and tan for large float numbers (#7689)

* Fix sin/cos workaround

* Add tests for large numbers

* Fix tan

* Exclude new tests in webgl and webgpu

* Fix

* Exclude tests in tfjs-node

* Update

* Fix

* Fix

* Fix

* Remove comments

* [wasm] Update xnnpack (#7507)

* wip

* Add xnn_caches

* Upgrade xnnpack

* exp

* Update xnnpack deps

* Fix xnn cache

* TEST

* Cleanup

* Cleanup

* Cleanup

* Update xnnpack

* Add flag to avoid unused function

* Add comment

* Add config to turn xnnpack logs off

* Add sha256 for emsdk

* Update xnnpack and toolchain, and disable xnn caches

* Fix lint

* Remove unused include

* Recover the default backend (#7709)

* Do not throw an error when killing the verdaccio process (#7695)

Killing the verdaccio process throws an error because the disconnect event is emitted when the process is killed. We throw an error on disconnect to catch any unexpected verdaccio disconnections.

Fix this by deregistering the disconnect handler before killing the verdaccio process.
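The ordering matters: the handler must be removed before the kill, or the kill's own disconnect event takes the error path. A minimal Python sketch of the pattern (the actual script is TypeScript and listens on Node child-process events; the `Emitter` class here is a stand-in):

```python
class Emitter:
    """Tiny stand-in for a Node-style event emitter."""
    def __init__(self):
        self.handlers = {}

    def on(self, event, fn):
        self.handlers.setdefault(event, []).append(fn)

    def off(self, event, fn):
        self.handlers.get(event, []).remove(fn)

    def emit(self, event):
        for fn in list(self.handlers.get(event, [])):
            fn()

errors = []
verdaccio = Emitter()

def on_disconnect():
    # Normally a disconnect is treated as an unexpected failure.
    errors.append("unexpected verdaccio disconnect")

verdaccio.on("disconnect", on_disconnect)

# Shutdown path: deregister the handler BEFORE killing the process,
# so the disconnect fired by the kill is not treated as an error.
verdaccio.off("disconnect", on_disconnect)
verdaccio.emit("disconnect")  # emitted by the kill; now harmless

print(errors)  # → []
```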

* webgpu: Optimize SpaceToBatchND (#7703)

* webgpu: Optimize SpaceToBatchND

Fuse pad and transpose into one shader.
This shows a ~20% improvement for SpaceToBatchND in DeepLabV3.

* webgpu: Replace timestamp-query-in-passes with timestamp-query (#7714)

* webgpu: Replace timestamp-query-in-passes with timestamp-query

Timestamp-query has broader support than timestamp-query-in-passes
across platforms, including macOS. Note that the Chrome switch
'--disable-dawn-features=disallow_unsafe_apis' is still needed for now, as
timestamps have nanosecond accuracy, which is considered too precise to be
safe. Later changes in Chrome may lift this limitation.

* webgpu: Fix timestamp query (#7723)

If a pass that needs timestamp query follows a pass without timestamp
query, querySet may not be created as expected. This PR fixes this
issue.

* webgpu: Move the readSync warning position (#7724)

The warning should only happen when data is actually read from the GPU to the CPU.

* Fix indexeddb for 1GB models (#7725)

* Fix indexeddb for 1GB models

Fixes #7702 by concatenating model weights into a single ArrayBuffer before
sending them to IndexedDB. A better solution would be to store the model as
multiple records, but this quick fix is easy to implement and solves the issue
for most current models (~1GB).

* Use CompositeArrayBuffer.join
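A rough Python sketch of the joining idea (the real code is TypeScript and uses `CompositeArrayBuffer.join`; this only illustrates the concept of concatenating weight shards into one contiguous buffer):

```python
def join_buffers(buffers):
    """Concatenate weight buffers into one contiguous buffer."""
    total = sum(len(b) for b in buffers)
    out = bytearray(total)
    offset = 0
    for b in buffers:
        # Copy each shard into its slot of the preallocated buffer.
        out[offset:offset + len(b)] = b
        offset += len(b)
    return bytes(out)

# Three weight "shards" become one record suitable for a single IndexedDB entry.
parts = [b"\x00\x01", b"\x02", b"\x03\x04\x05"]
joined = join_buffers(parts)
print(len(joined))  # → 6
```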

* Add npmVersion command line argument to benchmark NPM versions (#7674)

* npmVersion CLIarg

* add readme description

---------

Co-authored-by: Linchenn <[email protected]>
Co-authored-by: Matthew Soulanille <[email protected]>

* Migrate stale management probot to Github action. (#7570)

Update the workflow file to replace the 'stale-master' probot with a GitHub action. It will add the 'stale' label to inactive issues/PRs that carry the 'stat:awaiting response' label; in case of further inactivity, it will close the issue with an appropriate comment.

* Fix getLayer() API (#7665)

* Fix getLayer() API

* Apply suggested changes

* Script for building the map

Draft a script for building a map of TFJS classes to TF module paths.

* build(deps): bump socket.io-parser from 4.2.2 to 4.2.4 in /tfjs-vis (#7731)

Bumps [socket.io-parser](https://github.com/socketio/socket.io-parser) from 4.2.2 to 4.2.4.
- [Release notes](https://github.com/socketio/socket.io-parser/releases)
- [Changelog](https://github.com/socketio/socket.io-parser/blob/main/CHANGELOG.md)
- [Commits](socketio/socket.io-parser@4.2.2...4.2.4)

---
updated-dependencies:
- dependency-name: socket.io-parser
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matthew Soulanille <[email protected]>

* build(deps): bump socket.io-parser from 4.2.1 to 4.2.4 in /tfjs-automl (#7730)

Bumps [socket.io-parser](https://github.com/socketio/socket.io-parser) from 4.2.1 to 4.2.4.
- [Release notes](https://github.com/socketio/socket.io-parser/releases)
- [Changelog](https://github.com/socketio/socket.io-parser/blob/main/CHANGELOG.md)
- [Commits](socketio/socket.io-parser@4.2.1...4.2.4)

---
updated-dependencies:
- dependency-name: socket.io-parser
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump socket.io-parser from 4.2.1 to 4.2.4 in /tfjs (#7729)

Bumps [socket.io-parser](https://github.com/socketio/socket.io-parser) from 4.2.1 to 4.2.4.
- [Release notes](https://github.com/socketio/socket.io-parser/releases)
- [Changelog](https://github.com/socketio/socket.io-parser/blob/main/CHANGELOG.md)
- [Commits](socketio/socket.io-parser@4.2.1...4.2.4)

---
updated-dependencies:
- dependency-name: socket.io-parser
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [webgpu] More preparation for element-wise binary op restructuring (#7666)

The current code isn't great in that the vec4 shaders have diverged from
the scalar ones more than necessary. Here is the common preparation
work, so that the following refactoring can be done on a per-op basis.

* Revert "[wasm] Update xnnpack (#7507)" (#7735)

This reverts commit c66f302.

* Github token (#7734)

* Github token

* Github token

* Github token

Github token

* Apply suggestions from code review

Co-authored-by: Matthew Soulanille <[email protected]>

---------

Co-authored-by: Matthew Soulanille <[email protected]>

* [webgpu] Update ADD,COMPLEX_MULTIPLY_*,DIV,MUL,SQUARED_DIFFERENCE,SUB (#7737)

* Add register name when register the class object (#7717)

* Save model

* Support Keras V3 Conversion

* Run yarn before running the release e2e tests (#7687)

* Registered name prototype

* Update register class method to support registered name

* Revert "Support Keras V3 Conversion"

This reverts commit 3bd7e8f.

* revert converter changes

* Apply suggested changes

* Apply suggested changes

* Fix lint

* fix lint

* Remove throw errors

---------

Co-authored-by: Matthew Soulanille <[email protected]>

* Rename the file

* Merge remote-tracking branch 'upstream/master' into V3ScriptForMapping

* move script location

* Merge branch 'V3ScriptForMapping' into SupportV3

* Add v3 conversion functions in converter

* Fix some nit

* Merge branch 'SupportV3' into AddE2ETestForV3

* remove unused import

* remove blank

* fix import

* fix import

* Add tests for the mapper and rename the files

* update import

* fix store path

* resolve comments

* Update license and remove build_map() function

* use private function

* remove build_map() usage since script has been updated.

* add exception

* Update the build file

* add module mapper into build file

* Remove unused functions.

* remove comments

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Linchenn <[email protected]>
Co-authored-by: Matthew Soulanille <[email protected]>
Co-authored-by: Chunnien Chan <[email protected]>
Co-authored-by: Jiajia Qin <[email protected]>
Co-authored-by: Yang Gu <[email protected]>
Co-authored-by: wrighkv1 <[email protected]>
Co-authored-by: Shivam Mishra <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jiajie Hu <[email protected]>
Co-authored-by: Dedongala <[email protected]>
Co-authored-by: Matthew Soulanille <[email protected]>
12 people authored Jun 23, 2023
1 parent fb85e90 commit 6ab5271
Showing 5 changed files with 380 additions and 6 deletions.
9 changes: 9 additions & 0 deletions tfjs-converter/python/tensorflowjs/converters/BUILD.bazel
@@ -27,6 +27,13 @@ py_library(
deps = ["//tfjs-converter/python/tensorflowjs:version"],
)

py_library(
name = "tf_module_mapper",
srcs = ["tf_module_mapper.py"],
srcs_version = "PY3",
deps = [],
)

py_library(
name = "keras_h5_conversion",
srcs = ["keras_h5_conversion.py"],
@@ -74,6 +81,7 @@ py_test(
deps = [
":keras_h5_conversion",
":keras_tfjs_loader",
":tf_module_mapper",
"//tfjs-converter/python/tensorflowjs:expect_numpy_installed",
"//tfjs-converter/python/tensorflowjs:expect_tensorflow_installed",
],
@@ -272,6 +280,7 @@ py_binary(
":common",
":keras_h5_conversion",
":keras_tfjs_loader",
":tf_module_mapper",
":tf_saved_model_conversion_v2",
"//tfjs-converter/python/tensorflowjs:expect_h5py_installed",
"//tfjs-converter/python/tensorflowjs:expect_tensorflow_installed",
1 change: 1 addition & 0 deletions tfjs-converter/python/tensorflowjs/converters/common.py
@@ -40,6 +40,7 @@
# Model formats.
KERAS_SAVED_MODEL = 'keras_saved_model'
KERAS_MODEL = 'keras'
KERAS_KERAS_MODEL = 'keras_keras'
TF_SAVED_MODEL = 'tf_saved_model'
TF_HUB_MODEL = 'tf_hub'
TFJS_GRAPH_MODEL = 'tfjs_graph_model'
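Together with the converter changes, the new `keras_keras` constant enables two new conversion directions. A condensed, hypothetical sketch (the real `_dispatch_converter` is an if/elif chain; the lookup table and `dispatcher_for` helper here are illustrative only, though the dispatch function names are the ones added by this PR):

```python
KERAS_KERAS_MODEL = 'keras_keras'
TFJS_LAYERS_MODEL = 'tfjs_layers_model'

# The two directions added by this PR, mapped to their dispatch functions.
SUPPORTED_PAIRS = {
    (KERAS_KERAS_MODEL, TFJS_LAYERS_MODEL):
        'dispatch_keras_keras_to_tfjs_layers_model_conversion',
    (TFJS_LAYERS_MODEL, KERAS_KERAS_MODEL):
        'dispatch_tensorflowjs_to_keras_keras_conversion',
}

def dispatcher_for(input_format, output_format):
    """Look up the conversion routine for a (input, output) format pair."""
    try:
        return SUPPORTED_PAIRS[(input_format, output_format)]
    except KeyError:
        raise ValueError('Unsupported conversion: %s -> %s'
                         % (input_format, output_format))

print(dispatcher_for('keras_keras', 'tfjs_layers_model'))
```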
137 changes: 134 additions & 3 deletions tfjs-converter/python/tensorflowjs/converters/converter.py
@@ -35,6 +35,7 @@
from tensorflowjs.converters import keras_h5_conversion as conversion
from tensorflowjs.converters import keras_tfjs_loader
from tensorflowjs.converters import tf_saved_model_conversion_v2
from zipfile import ZipFile, is_zipfile


def dispatch_keras_h5_to_tfjs_layers_model_conversion(
@@ -100,6 +101,88 @@ def dispatch_keras_h5_to_tfjs_layers_model_conversion(

return model_json, groups

def dispatch_keras_keras_to_tfjs_layers_model_conversion(
v3_path,
output_dir=None,
quantization_dtype_map=None,
split_weights_by_layer=False,
weight_shard_size_bytes=1024 * 1024 * 4,
metadata=None,
):
"""Converts a Keras v3 .keras file to TensorFlow.js format.
Args:
v3_path: path to a `.keras` file containing Keras model data, as a `str`.
output_dir: Output directory to which the TensorFlow.js-format model JSON
file and weights files will be written. If the directory does not exist,
it will be created.
quantization_dtype_map: A mapping from dtype (`uint8`, `uint16`, `float16`)
to weights. The weight mapping supports wildcard substitution.
split_weights_by_layer: Whether to split the weights into separate weight
groups (corresponding to separate binary weight files) layer by layer
(Default: `False`).
weight_shard_size_bytes: Shard size (in bytes) of the weight files.
The size of each weight file will be <= this value.
metadata: User defined metadata map.
Returns:
(model_json, groups)
model_json: a json dictionary (empty if unused) for model topology.
groups: an array of weight_groups as defined in tfjs weights_writer.
"""
if not os.path.exists(v3_path):
raise ValueError("Nonexistent path to .keras file: %s" % v3_path)
if os.path.isdir(v3_path):
raise ValueError(
"Expected path to point to a .keras file, but it points to a "
"directory: %s" % v3_path
)
file_path = str(v3_path)
if not file_path.endswith(".keras"):
raise ValueError(
"Invalid `filepath` argument: expected a `.keras` extension. "
f"Received: filepath={file_path}"
)
with ZipFile(v3_path, "r") as zip_file:
zip_file.extractall(path=os.path.dirname(v3_path))
dir_path = os.path.dirname(file_path)
meta_data_json_path = os.path.join(dir_path, "metadata.json")
config_json_path = os.path.join(dir_path, "config.json")
model_weights_path = os.path.join(dir_path, "model.weights.h5")
h5_file = h5py.File(model_weights_path, "r")
with open(config_json_path, "rt") as conf:
try:
config_file = json.load(conf)
except (ValueError, IOError):
raise ValueError(
"The input path is expected to contain valid JSON content, "
"but cannot read valid JSON content from %s." % config_json_path
)

with open(meta_data_json_path, "rt") as meta_json:
try:
meta_file = json.load(meta_json)
except (ValueError, IOError):
raise ValueError(
"The input path is expected to contain valid JSON content, "
"but cannot read valid JSON content from %s." % meta_data_json_path
)

model_json, groups = conversion.h5_v3_merged_saved_model_to_tfjs_format(
h5_file, meta_file, config_file, split_by_layer=split_weights_by_layer
)

if output_dir:
if os.path.isfile(output_dir):
raise ValueError(
'Output path "%s" already exists as a file' % output_dir)
if not os.path.isdir(output_dir):
os.makedirs(output_dir)
conversion.write_artifacts(
model_json, groups, output_dir, quantization_dtype_map,
weight_shard_size_bytes=weight_shard_size_bytes, metadata=metadata)

return model_json, groups

def dispatch_keras_h5_to_tfjs_graph_model_conversion(
h5_path, output_dir=None,
@@ -209,7 +292,6 @@ def dispatch_keras_saved_model_to_tensorflowjs_conversion(
# Delete temporary .h5 file.
os.remove(temp_h5_path)


def dispatch_tensorflowjs_to_keras_h5_conversion(config_json_path, h5_path):
"""Converts a TensorFlow.js Layers model format to Keras H5.
@@ -247,6 +329,42 @@ def dispatch_tensorflowjs_to_keras_h5_conversion(config_json_path, h5_path):
model = keras_tfjs_loader.load_keras_model(config_json_path)
model.save(h5_path)

def dispatch_tensorflowjs_to_keras_keras_conversion(config_json_path, v3_path):
"""Converts a TensorFlow.js Layers model format to Keras V3 format.
Args:
config_json_path: Path to the JSON file that includes the model's
topology and weights manifest, in tensorflowjs format.
v3_path: Path for the to-be-created Keras V3 model file.
Raises:
ValueError, if `config_json_path` is not a path to a valid JSON
file.
"""
if os.path.isdir(config_json_path):
raise ValueError(
'For input_type=tfjs_layers_model & output_format=keras_keras, '
'the input path should be a model.json '
'file, but received a directory.')
if os.path.isdir(v3_path):
raise ValueError(
'For input_type=tfjs_layers_model & output_format=keras_keras, '
'the output path should be the path to a .keras file, '
'but received an existing directory (%s).' % v3_path)

# Verify that config_json_path points to a JSON file.
with open(config_json_path, 'rt') as f:
try:
json.load(f)
except (ValueError, IOError):
raise ValueError(
'For input_type=tfjs_layers_model & output_format=keras_keras, '
'the input path is expected to contain valid JSON content, '
'but cannot read valid JSON content from %s.' % config_json_path)

model = keras_tfjs_loader.load_keras_keras_model(config_json_path)
tf.keras.saving.save_model(model, v3_path, save_format="keras")


def dispatch_tensorflowjs_to_keras_saved_model_conversion(
config_json_path, keras_saved_model_path):
@@ -503,6 +621,14 @@ def _dispatch_converter(input_format,
split_weights_by_layer=args.split_weights_by_layer,
weight_shard_size_bytes=weight_shard_size_bytes,
metadata=metadata_map)
elif (input_format == common.KERAS_KERAS_MODEL and
output_format == common.TFJS_LAYERS_MODEL):
dispatch_keras_keras_to_tfjs_layers_model_conversion(
args.input_path, output_dir=args.output_path,
quantization_dtype_map=quantization_dtype_map,
split_weights_by_layer=args.split_weights_by_layer,
weight_shard_size_bytes=weight_shard_size_bytes,
metadata=metadata_map)
elif (input_format == common.KERAS_MODEL and
output_format == common.TFJS_GRAPH_MODEL):
dispatch_keras_h5_to_tfjs_graph_model_conversion(
@@ -555,6 +681,10 @@ def _dispatch_converter(input_format,
output_format == common.KERAS_MODEL):
dispatch_tensorflowjs_to_keras_h5_conversion(args.input_path,
args.output_path)
elif (input_format == common.TFJS_LAYERS_MODEL and
output_format == common.KERAS_KERAS_MODEL):
dispatch_tensorflowjs_to_keras_keras_conversion(args.input_path,
args.output_path)
elif (input_format == common.TFJS_LAYERS_MODEL and
output_format == common.KERAS_SAVED_MODEL):
dispatch_tensorflowjs_to_keras_saved_model_conversion(args.input_path,
@@ -615,7 +745,7 @@ def get_arg_parser():
type=str,
required=False,
default=common.TF_SAVED_MODEL,
choices=set([common.KERAS_MODEL, common.KERAS_SAVED_MODEL,
choices=set([common.KERAS_MODEL, common.KERAS_SAVED_MODEL, common.KERAS_KERAS_MODEL,
common.TF_SAVED_MODEL, common.TF_HUB_MODEL,
common.TFJS_LAYERS_MODEL, common.TF_FROZEN_MODEL]),
help='Input format. '
@@ -637,7 +767,7 @@ def get_arg_parser():
type=str,
required=False,
choices=set([common.KERAS_MODEL, common.KERAS_SAVED_MODEL,
common.TFJS_LAYERS_MODEL, common.TFJS_GRAPH_MODEL]),
common.TFJS_LAYERS_MODEL, common.TFJS_GRAPH_MODEL, common.KERAS_KERAS_MODEL]),
help='Output format. Default: tfjs_graph_model.')
parser.add_argument(
'--%s' % common.SIGNATURE_NAME,
@@ -751,6 +881,7 @@ def get_arg_parser():

def convert(arguments):
args = get_arg_parser().parse_args(arguments)

if args.show_version:
print('\ntensorflowjs %s\n' % version.version)
print('Dependency versions:')
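For readers unfamiliar with the Keras v3 format: a `.keras` file is a zip archive containing `config.json`, `metadata.json`, and `model.weights.h5`, which is why the new dispatch function extracts the archive next to itself before reading. A runnable sketch of that extraction step (the archive built here is a stub holding only the two JSON members, so no h5py is needed):

```python
import json
import os
import tempfile
import zipfile

tmp = tempfile.mkdtemp()
v3_path = os.path.join(tmp, "model.keras")

# A .keras archive is a zip holding config.json, metadata.json and
# model.weights.h5; only the JSON members are stubbed here.
with zipfile.ZipFile(v3_path, "w") as zf:
    zf.writestr("metadata.json", json.dumps({"keras_version": "3.0.0"}))
    zf.writestr("config.json", json.dumps({"class_name": "Sequential"}))

# Mirrors the converter's extraction step: unpack next to the archive.
with zipfile.ZipFile(v3_path, "r") as zf:
    zf.extractall(path=os.path.dirname(v3_path))

with open(os.path.join(tmp, "metadata.json")) as f:
    meta = json.load(f)
print(meta["keras_version"])  # → 3.0.0
```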
tfjs-converter/python/tensorflowjs/converters/keras_h5_conversion.py
@@ -87,6 +87,33 @@ def _convert_h5_group(group):

return group_out

def _convert_v3_group(group, actual_layer_name):
"""Construct a weights group entry.
Args:
group: The HDF5 group data, possibly nested.
actual_layer_name: Name of the layer this group belongs to; used to prefix
the weight names.
Returns:
An array of weight groups (see `write_weights` in TensorFlow.js).
"""
group_out = []
list_of_folder = [as_text(name) for name in group]
if 'vars' in list_of_folder:
names = group['vars']
if not names:
return group_out
name_list = [as_text(name) for name in names]
weight_values = [np.array(names[weight_name]) for weight_name in name_list]
name_list = [os.path.join(actual_layer_name, item) for item in name_list]
group_out += [{
'name': normalize_weight_name(weight_name),
'data': weight_value
} for (weight_name, weight_value) in zip(name_list, weight_values)]
else:
for key in list_of_folder:
group_out += _convert_v3_group(group[key], actual_layer_name)
return group_out


def _check_version(h5file):
"""Check version compatibility.
@@ -128,6 +155,18 @@ def _ensure_h5file(h5file):
def _ensure_json_dict(item):
return item if isinstance(item, dict) else json.loads(as_text(item))

def _discard_v3_keys(json_dict, keys_to_delete):
if isinstance(json_dict, dict):
keys = list(json_dict.keys())
for key in keys:
if key in keys_to_delete:
del json_dict[key]
else:
_discard_v3_keys(json_dict[key], keys_to_delete)
elif isinstance(json_dict, list):
for item in json_dict:
_discard_v3_keys(item, keys_to_delete)


# https://github.com/tensorflow/tfjs/issues/1255, b/124791387
# In tensorflow version 1.13 and some alpha and nightly-preview versions,
@@ -211,6 +250,56 @@ def h5_merged_saved_model_to_tfjs_format(h5file, split_by_layer=False):
groups[0] += group
return model_json, groups

def h5_v3_merged_saved_model_to_tfjs_format(h5file, meta_file, config_file, split_by_layer=False):
"""Load topology & weight values from a Keras v3 HDF5 file and convert them.
The HDF5 weights file is one generated by Keras's save_model method or model.save().
N.B.:
1) This function works only on HDF5 values from Keras version 3.
2) This function does not perform conversion for special weights including
ConvLSTM2D and CuDNNLSTM.
Args:
h5file: An instance of h5py.File, or the path to an h5py file.
meta_file: A JSON dictionary loaded from the metadata.json file of the
.keras archive.
config_file: A JSON dictionary loaded from the config.json file of the
.keras archive.
split_by_layer: (Optional) whether the weights of different layers are
to be stored in separate weight groups (Default: `False`).
Returns:
(model_json, groups)
model_json: a JSON dictionary holding topology and system metadata.
groups: an array of weight groups as defined in tfjs write_weights.
Raises:
ValueError: If the Keras version of the HDF5 file is not supported.
"""
h5file = _ensure_h5file(h5file)
model_json = dict()
model_json['keras_version'] = meta_file['keras_version']

keys_to_remove = ["module", "registered_name", "date_saved"]
config = _ensure_json_dict(config_file)
_discard_v3_keys(config, keys_to_remove)
model_json['model_config'] = config
translate_class_names(model_json['model_config'])
if 'training_config' in h5file.attrs:
model_json['training_config'] = _ensure_json_dict(
h5file.attrs['training_config'])

groups = [] if split_by_layer else [[]]

model_weights = h5file['_layer_checkpoint_dependencies']
layer_names = [as_text(n) for n in model_weights]

for index, layer_name in enumerate(layer_names):
group_of_weights = model_weights[layer_name]
group = _convert_v3_group(group_of_weights, layer_name)
if group:
if split_by_layer:
groups.append(group)
else:
groups[0] += group
return model_json, groups

def h5_weights_to_tfjs_format(h5file, split_by_layer=False):
"""Load weight values from a Keras HDF5 file and convert them to a binary format.
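The key-stripping step above can be exercised in isolation. Below is a self-contained copy of the `_discard_v3_keys` logic (renamed `discard_v3_keys` here), showing how the Keras-v3-only serialization keys are removed recursively from a nested config:

```python
def discard_v3_keys(obj, keys_to_delete):
    """Recursively drop Keras-v3-only serialization keys from a config dict."""
    if isinstance(obj, dict):
        # Snapshot the keys so deletion during iteration is safe.
        for key in list(obj.keys()):
            if key in keys_to_delete:
                del obj[key]
            else:
                discard_v3_keys(obj[key], keys_to_delete)
    elif isinstance(obj, list):
        for item in obj:
            discard_v3_keys(item, keys_to_delete)

config = {
    "class_name": "Dense",
    "module": "keras.layers",
    "registered_name": None,
    "config": {"layers": [{"class_name": "ReLU", "module": "keras.layers"}]},
}
discard_v3_keys(config, ["module", "registered_name", "date_saved"])
print(config)  # → {'class_name': 'Dense', 'config': {'layers': [{'class_name': 'ReLU'}]}}
```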