Hi,

```python
dummy_token = np.zeros((1, 512))
```

My problem is that I cannot find a way for the BentoML Keras runner to serve this model. I do the following:

```python
keras_runner = bentoml.keras.load_runner("same_model:latest")
```

This runs into a problem: the runner interprets the input as `[None, 2, 512]` instead of the required `[[None, 512], [None, 512]]`. I can understand how this happens due to BentoML's adaptive batching, but I have not yet been able to come up with a solution. Are multi-input models just not supported yet, or is there a specific way I can tell BentoML that I need it to do adaptive batching on both inputs?
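To see where the `[None, 2, 512]` shape comes from: when the two `(1, 512)` inputs are passed together as one list, stacking them produces a leading dimension of 2. A minimal numpy sketch of just the shape behavior (illustrative only, not BentoML itself):

```python
import numpy as np

# Two model inputs, each with batch size 1 and sequence length 512.
dummy_token = np.zeros((1, 512))
dummy_mask = np.zeros((1, 512))

# Passing them together as one list and stacking them yields shape
# (2, 1, 512) - the leading 2 is why the runner reports [None, 2, 512]
# rather than two separate [None, 512] inputs.
stacked = np.array([dummy_token, dummy_mask])
print(stacked.shape)  # (2, 1, 512)
```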
Hi @cchrkoc - multi-input models are well supported in BentoML.

If you don't need adaptive batching for your use case, you can save your model again with a `batchable=False` signature (with the new 1.0.0rc1 release):

```python
bentoml.keras.save_model(
    "same_model:latest",
    model_inst,
    signatures={"predict": {"batchable": False}},
)

keras_runner = bentoml.keras.get("same_model:latest").to_runner()
keras_runner.init_local()

dummy_token = np.zeros((1, 512))
dummy_mask = np.zeros((1, 512))
res = keras_runner.predict.run([dummy_token, dummy_mask])
```

It is also possible to do adaptive batching on both inputs. In your case, the input batch dimension should be set to 1, since the first dimension is always 2 (containing the token array and the mask array). You can learn more about model/runner signatures here: https://docs.bentoml.org/en/latest/concepts/model.html#model-signatures

```python
bentoml.keras.save_model(
    "same_model:latest",
    model_inst,
    signatures={"predict": {"batchable": True, "batch_dim": (1, 0)}},
)

keras_runner = bentoml.keras.get("same_model:latest").to_runner()
keras_runner.init_local()

dummy_token = np.zeros((1, 512))
dummy_mask = np.zeros((1, 512))
res = keras_runner.predict.run([dummy_token, dummy_mask])
```

Note that in both cases, the …
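To make `batch_dim=(1, 0)` concrete, here is a hedged numpy sketch of what the adaptive batcher conceptually does with this setting: incoming requests are concatenated along input axis 1 (so the leading token/mask dimension of 2 is preserved), and model outputs are split back per request along axis 0. This is illustrative only, not BentoML's internal implementation:

```python
import numpy as np

# Each request carries [token, mask], i.e. an array of shape (2, batch, 512).
req_a = np.zeros((2, 1, 512))  # a request with batch size 1
req_b = np.ones((2, 2, 512))   # a request with batch size 2

# Input batch dim = 1: requests are joined along axis 1, leaving the
# leading "2" (token/mask) dimension intact.
batched = np.concatenate([req_a, req_b], axis=1)
print(batched.shape)  # (2, 3, 512)

# Output batch dim = 0: the model's combined output is split back per
# request along axis 0, according to each request's batch size.
fake_output = np.zeros((3, 10))  # e.g. 3 rows of logits (hypothetical)
out_a, out_b = np.split(fake_output, [1], axis=0)
print(out_a.shape, out_b.shape)  # (1, 10) (2, 10)
```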