Decision forests prediction question #19

Closed
Arnold1 opened this issue Jun 3, 2022 · 22 comments
@Arnold1

Arnold1 commented Jun 3, 2022

Hi,

Is the generated Yggdrasil decision forests model in the same format as other TF models?
Could I use https://github.com/galeone/tfgo and call predict from a Golang app?

I ran into an issue with Bazel when building the standalone example - any idea what the problem could be?

root@efc8844082ba:/notebooks/yggdrasil-decision-forests# uname -a
Linux efc8844082ba 5.10.103-0-virt #1-Alpine SMP Tue, 08 Mar 2022 10:06:11 +0000 x86_64 x86_64 x86_64 GNU/Linux
root@efc8844082ba:/notebooks/yggdrasil-decision-forests# 
root@efc8844082ba:/notebooks/yggdrasil-decision-forests# 
root@efc8844082ba:/notebooks/yggdrasil-decision-forests# cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.4 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

root@3ac4b2a4ad3d:/notebooks/yggdrasil-decision-forests# uname -a
Linux efc8844082ba 5.10.103-0-virt #1-Alpine SMP Tue, 08 Mar 2022 10:06:11 +0000 x86_64 x86_64 x86_64 GNU/Linux

root@3ac4b2a4ad3d:/notebooks/yggdrasil-decision-forests# bazel --version
bazel 5.1.1

root@3ac4b2a4ad3d:/notebooks/yggdrasil-decision-forests# bazel build //yggdrasil_decision_forests/cli:all --config=linux_cpp17 --config=linux_avx2
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
INFO: Reading rc options for 'build' from /notebooks/yggdrasil-decision-forests/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec --incompatible_restrict_string_escapes=false
ERROR: --incompatible_restrict_string_escapes=false :: Unrecognized option: --incompatible_restrict_string_escapes=false

Here is how I installed Bazel: https://docs.bazel.build/versions/main/install-ubuntu.html#19

How I fixed the issue:
I disabled the offending flag on this line: https://github.com/google/yggdrasil-decision-forests/blob/main/.bazelrc#L43
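A minimal sketch of that workaround, shown here on a stand-in file (demo.bazelrc is created only for the demo, with the flags seen in the log above); in the real checkout you would run the same sed against .bazelrc:

```shell
# Demo on a stand-in file; in the real repository run the sed line on .bazelrc.
printf 'common --experimental_repo_remote_exec --incompatible_restrict_string_escapes=false\n' > demo.bazelrc
# Strip only the flag Bazel 5.x no longer recognizes, keeping the rest of the line.
sed -i 's/ --incompatible_restrict_string_escapes=false//' demo.bazelrc
cat demo.bazelrc
```

This keeps `--experimental_repo_remote_exec` in place, which matches the later build log where that option is still inherited.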

But then I got this error:

root@efc8844082ba:/notebooks/yggdrasil-decision-forests# bazel build //yggdrasil_decision_forests/cli:all --config=linux_cpp17 --config=linux_avx2
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=96
INFO: Reading rc options for 'build' from /notebooks/yggdrasil-decision-forests/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from /notebooks/yggdrasil-decision-forests/.bazelrc:
  'build' options: -c opt --spawn_strategy=standalone --announce_rc --noincompatible_strict_action_env --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --define=grpc_no_ares=true --color=yes
INFO: Found applicable config definition build:linux_cpp17 in file /notebooks/yggdrasil-decision-forests/.bazelrc: --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --config=linux
INFO: Found applicable config definition build:linux in file /notebooks/yggdrasil-decision-forests/.bazelrc: --copt=-fdiagnostics-color=always --copt=-w --host_copt=-w
INFO: Found applicable config definition build:linux_avx2 in file /notebooks/yggdrasil-decision-forests/.bazelrc: --copt=-mavx2
DEBUG: /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/repo.bzl:108:14: 
Warning: skipping import of repository 'com_google_absl' because it already exists.
DEBUG: /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/repo.bzl:108:14: 
Warning: skipping import of repository 'farmhash_archive' because it already exists.
DEBUG: /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/repo.bzl:108:14: 
Warning: skipping import of repository 'com_google_protobuf' because it already exists.
DEBUG: /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/repo.bzl:108:14: 
Warning: skipping import of repository 'com_google_googletest' because it already exists.
DEBUG: /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/repo.bzl:108:14: 
Warning: skipping import of repository 'zlib' because it already exists.
DEBUG: /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/repo.bzl:108:14: 
Warning: skipping import of repository 'rules_cc' because it already exists.
DEBUG: /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/repo.bzl:108:14: 
Warning: skipping import of repository 'rules_python' because it already exists.
DEBUG: /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/repo.bzl:108:14: 
Warning: skipping import of repository 'bazel_skylib' because it already exists.
INFO: Repository local_execution_config_python instantiated at:
  /notebooks/yggdrasil-decision-forests/WORKSPACE:38:4: in <toplevel>
  /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/tensorflow/workspace2.bzl:1108:19: in workspace
  /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/tensorflow/workspace2.bzl:84:27: in _tf_toolchains
  /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/tf_toolchains/toolchains/remote_config/configs.bzl:6:28: in initialize_rbe_configs
  /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/tf_toolchains/toolchains/remote_config/rbe_config.bzl:158:27: in _tensorflow_local_config
Repository rule local_python_configure defined at:
  /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/py/python_configure.bzl:275:41: in <toplevel>
ERROR: An error occurred during the fetch of repository 'local_execution_config_python':
   Traceback (most recent call last):
	File "/root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/py/python_configure.bzl", line 213, column 39, in _create_local_python_repository
		numpy_include = _get_numpy_include(repository_ctx, python_bin) + "/numpy"
	File "/root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/py/python_configure.bzl", line 187, column 19, in _get_numpy_include
		return execute(
	File "/root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/remote_config/common.bzl", line 219, column 13, in execute
		fail(
Error in fail: Problem getting numpy include path.
OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
Is numpy installed?
ERROR: /notebooks/yggdrasil-decision-forests/WORKSPACE:38:4: fetching local_python_configure rule //external:local_execution_config_python: Traceback (most recent call last):
	File "/root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/py/python_configure.bzl", line 213, column 39, in _create_local_python_repository
		numpy_include = _get_numpy_include(repository_ctx, python_bin) + "/numpy"
	File "/root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/py/python_configure.bzl", line 187, column 19, in _get_numpy_include
		return execute(
	File "/root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/third_party/remote_config/common.bzl", line 219, column 13, in execute
		fail(
Error in fail: Problem getting numpy include path.
OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
Is numpy installed?
INFO: Repository go_sdk instantiated at:
  /notebooks/yggdrasil-decision-forests/WORKSPACE:42:4: in <toplevel>
  /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/org_tensorflow/tensorflow/workspace0.bzl:117:20: in workspace
  /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/com_github_grpc_grpc/bazel/grpc_extra_deps.bzl:36:27: in grpc_extra_deps
  /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/io_bazel_rules_go/go/toolchain/toolchains.bzl:379:28: in go_register_toolchains
  /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/io_bazel_rules_go/go/private/sdk.bzl:65:21: in go_download_sdk
Repository rule _go_download_sdk defined at:
  /root/.cache/bazel/_bazel_root/e69e42dd9f08c8f44fd8644c44ecd3fd/external/io_bazel_rules_go/go/private/sdk.bzl:53:35: in <toplevel>
ERROR: Analysis of target '//yggdrasil_decision_forests/cli:all_file_systems' failed; build aborted: Problem getting numpy include path.
OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
Is numpy installed?
INFO: Elapsed time: 49.501s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (11 packages loaded, 15 targets configured)
    currently loading: @bazel_tools//tools/python ... (2 packages)
    Fetching https://dl.google.com/go/go1.12.5.linux-amd64.tar.gz; 1,613,824B

My Dockerfile (to reproduce the Bazel error):

# image is based on:
# https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/dockerfiles/dockerfiles/cpu.Dockerfile

ARG UBUNTU_VERSION=20.04

FROM ubuntu:${UBUNTU_VERSION} as base

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y curl

# See http://bugs.python.org/issue19846
ENV LANG C.UTF-8

RUN apt-get update && apt-get install -y \
    python3 \
    python3-pip

RUN python3 -m pip --no-cache-dir install --upgrade \
    "pip<20.3" \
    setuptools

# Some TF tools expect a "python" binary
RUN ln -s $(which python3) /usr/local/bin/python

# Options:
#   tensorflow
#   tensorflow-gpu
#   tf-nightly
#   tf-nightly-gpu
# Set --build-arg TF_PACKAGE_VERSION=1.11.0rc0 to install a specific version.
# Installs the latest version by default.
ARG TF_PACKAGE=tensorflow
ARG TF_PACKAGE_VERSION=
RUN python3 -m pip install --no-cache-dir ${TF_PACKAGE}${TF_PACKAGE_VERSION:+==${TF_PACKAGE_VERSION}}

# install tensorflow_decision_forests and numpy
RUN pip3 install tensorflow_decision_forests --upgrade
RUN python3 -m pip install numpy

# install bazel
RUN apt install apt-transport-https curl gnupg -y
RUN curl -fsSL https://bazel.build/bazel-release.pub.gpg | gpg --dearmor > bazel.gpg
RUN mv bazel.gpg /etc/apt/trusted.gpg.d/
RUN echo "deb [arch=amd64] https://storage.googleapis.com/bazel-apt stable jdk1.8" | tee /etc/apt/sources.list.d/bazel.list
RUN apt update && apt install bazel -y
RUN apt update && apt full-upgrade -y
RUN apt install bazel-1.0.0 -y
#RUN ln -s /usr/bin/bazel-1.0.0 /usr/bin/bazel
RUN bazel --version

# WORKDIR /tf
# VOLUME ["/tf"]

COPY bashrc /etc/bash.bashrc
RUN chmod a+rwx /etc/bash.bashrc

cc @achoum

@SnoopJ

SnoopJ commented Jun 3, 2022

User came to #python on the Libera.chat IRC network for help with this issue, and after a good bit of investigation, we discovered that this failure is caused by the OpenBLAS warning emitted when importing numpy. This warning is benign and the include path is still correctly reported, but TF's bazel configuration treats it as a fatal error. I have reported this as an upstream bug, see tensorflow/tensorflow#56346

Users who want to patch around this problem will need to pass allow_failure=True in the execute() call of _get_numpy_include() (third_party/py/python_configure.bzl) in versions of TensorFlow that have this parameter, or, in older versions, edit their third_party/remote_config/common.bzl file to modify execute() to treat this failure as nonfatal (which is what ended up being necessary in this user's case).
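The allow_failure patch can be sketched as below. The .bzl body is a simplified stand-in (the real _get_numpy_include differs); it only illustrates where the allow_failure=True argument goes in the execute() call:

```shell
# Stand-in for third_party/py/python_configure.bzl; the real function body
# differs, this only shows where allow_failure = True is inserted.
cat > demo_python_configure.bzl <<'EOF'
def _get_numpy_include(repository_ctx, python_bin):
    return execute(
        repository_ctx,
        [python_bin, "-c", "import numpy; print(numpy.get_include())"],
        error_msg = "Problem getting numpy include path.",
    ).stdout.splitlines()[-1]
EOF
# Add allow_failure = True after the error_msg argument of execute().
sed -i 's/error_msg = "Problem getting numpy include path.",/&\n        allow_failure = True,/' demo_python_configure.bzl
grep 'allow_failure' demo_python_configure.bzl
```

The real file lives under Bazel's output base (the external/org_tensorflow path seen in the traceback above).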

@Arnold1
Author

Arnold1 commented Jun 3, 2022

@SnoopJeDi thanks again. I installed Bazel 4.0.0 - now the build works fine. The fix @SnoopJeDi mentioned is still needed.

@rstz
Collaborator

rstz commented Jun 7, 2022

Hi,

regarding your first question:
YDF models cannot be used with vanilla Tensorflow since they rely on the YDF code (the one in this repository) for inference as well. The Tensorflow Decision Forests (TF-DF) package provides integration into standard Tensorflow, adding two custom ops to Tensorflow that import the YDF code. In Python, this process is frictionless and the resulting models can be combined freely e.g. through Keras; have a look at our tutorials for more info on that. The TF-DF repository also contains information about integration into other products such as TF-Serving.

I'm not familiar with the project you linked, but it sounds like it is not compatible with TF custom ops and will therefore not work with YDF/TF-DF at this time.

If you have interesting use cases not covered by our current APIs, please let us know, so we might consider them for future releases!

@Arnold1
Author

Arnold1 commented Jun 8, 2022

@rstz I'm interested in whether there is a way to call model prediction from Golang - any idea if that's possible, perhaps with Go calling into C or C++? I would like more information about it.
I'm assuming https://github.com/tensorflow/tensorflow/tree/master/tensorflow/go will not help - unless I add those new custom ops myself, which might be a lot of work?

I'm not familiar with the project you linked, but per galeone/tfgo#58 it is not compatible with TF custom ops and will therefore not work with YDF/TF-DF at this time.

Are these custom ops difficult to add? Is there more documentation on them?

Also, is there even a way to call decision forests through the C API?
Regarding the C API, I saw some discussion here: https://discuss.tensorflow.org/t/decision-forests-issue-with-c-api/7434 - has anything changed since?

@janpfeifer
Collaborator

Hi @Arnold1, for inference in Golang, let's follow up on issue #115. Hopefully we can have a native implementation (albeit not complete) in the not-too-distant future.

About galeone/tfgo: indeed, there needs to be a way of linking the TF custom ops. I would guess (but I'm not sure) there should be a way; one could use the compiled custom ops from TF-DF's PyPI (pip) packages -- but I have no idea how to get them loaded. Maybe ask in the TensorFlow Forum?

@janpfeifer
Collaborator

Btw, the official TensorFlow Go API has the LoadLibrary function; maybe that would work?

One just has to be careful to use the pre-built TF library and the pre-built TF-DF custom ops at the same version. Taking the most recent of each should theoretically work -- but I haven't tried it.

@Arnold1
Author

Arnold1 commented Jun 11, 2022

@janpfeifer attached is an example in C - does it look good from the logs? How would I convert that code to Go? For Go, do I also need the TF .so?

The TF_SessionRun call gives me a segmentation fault - any idea what I'm doing wrong? I'm not sure whether I define the input and output tensors correctly...

#include <stdio.h>
#include <stdlib.h>                   /* malloc */
#include <tensorflow/c/c_api.h>
#include <tensorflow/c/tf_tstring.h>  /* TF_TString_Init, TF_TString_Copy */

void NoOpDeallocator(void* data, size_t a, void* b) {}

int main() {
  TF_Graph *Graph = TF_NewGraph();
  TF_Status *Status = TF_NewStatus();
  TF_SessionOptions *SessionOpts = TF_NewSessionOptions();
  TF_Buffer *RunOpts = NULL;
  TF_Library *library;

  library = TF_LoadLibrary("/usr/local/lib/python3.10/dist-packages/tensorflow_decision_forests/tensorflow/ops/inference/inference.so",
                              Status);

  const char *saved_model_dir = "/tmp/my_saved_model/";
  const char *tags = "serve";
  int ntags = 1;

  TF_Session *Session = TF_LoadSessionFromSavedModel(
      SessionOpts, RunOpts, saved_model_dir, &tags, ntags, Graph, NULL, Status);

  printf("status: %s\n", TF_Message(Status));

  if(TF_GetCode(Status) == TF_OK) {
    printf("loaded\n");
  }else{
    printf("not loaded\n");
  }

  /* Get Input Tensor */
  int NumInputs = 14;

  TF_Output* Input = malloc(sizeof(TF_Output) * NumInputs);
  TF_Output t0 = {TF_GraphOperationByName(Graph, "serving_default_age"), 0};

  if(t0.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_1\n");
  else
    printf("TF_GraphOperationByName serving_default_input_1 is OK\n");

  Input[0] = t0;

  TF_Output t1 = {TF_GraphOperationByName(Graph, "serving_default_capital_gain"), 0};

  if(t1.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_2\n");
  else
    printf("TF_GraphOperationByName serving_default_input_2 is OK\n");

  Input[1] = t1;

  TF_Output t2 = {TF_GraphOperationByName(Graph, "serving_default_capital_loss"), 0};

  if(t2.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_3\n");
  else
    printf("TF_GraphOperationByName serving_default_input_3 is OK\n");

  Input[2] = t2;

  TF_Output t3 = {TF_GraphOperationByName(Graph, "serving_default_education"), 0};

  if(t3.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_4\n");
  else
    printf("TF_GraphOperationByName serving_default_input_4 is OK\n");

  Input[3] = t3;

  TF_Output t4 = {TF_GraphOperationByName(Graph, "serving_default_education_num"), 0};

  if(t4.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_5\n");
  else
    printf("TF_GraphOperationByName serving_default_input_5 is OK\n");

  Input[4] = t4;

  TF_Output t5 = {TF_GraphOperationByName(Graph, "serving_default_fnlwgt"), 0};

  if(t5.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_6\n");
  else
    printf("TF_GraphOperationByName serving_default_input_6 is OK\n");

  Input[5] = t5;

  TF_Output t6 = {TF_GraphOperationByName(Graph, "serving_default_hours_per_week"), 0};

  if(t6.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_7\n");
  else
    printf("TF_GraphOperationByName serving_default_input_7 is OK\n");

  Input[6] = t6;

  TF_Output t7 = {TF_GraphOperationByName(Graph, "serving_default_marital_status"), 0};

  if(t7.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_8\n");
  else
    printf("TF_GraphOperationByName serving_default_input_8 is OK\n");

  Input[7] = t7;

  TF_Output t8 = {TF_GraphOperationByName(Graph, "serving_default_native_country"), 0};

  if(t8.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_9\n");
  else
    printf("TF_GraphOperationByName serving_default_input_9 is OK\n");

  Input[8] = t8;

  TF_Output t9 = {TF_GraphOperationByName(Graph, "serving_default_occupation"), 0};

  if(t9.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_10\n");
  else
    printf("TF_GraphOperationByName serving_default_input_10 is OK\n");

  Input[9] = t9;

  TF_Output t10 = {TF_GraphOperationByName(Graph, "serving_default_race"), 0};

  if(t10.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_11\n");
  else
    printf("TF_GraphOperationByName serving_default_input_11 is OK\n");

  Input[10] = t10;

  TF_Output t11 = {TF_GraphOperationByName(Graph, "serving_default_relationship"), 0};

  if(t11.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_12\n");
  else
    printf("TF_GraphOperationByName serving_default_input_12 is OK\n");

  Input[11] = t11;

  TF_Output t12 = {TF_GraphOperationByName(Graph, "serving_default_sex"), 0};

  if(t12.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_13\n");
  else
    printf("TF_GraphOperationByName serving_default_input_13 is OK\n");

  Input[12] = t12;

  TF_Output t13 = {TF_GraphOperationByName(Graph, "serving_default_workclass"), 0};

  if(t13.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_14\n");
  else
    printf("TF_GraphOperationByName serving_default_input_14 is OK\n");

  Input[13] = t13;

  // Get Output tensor
  int NumOutputs = 1;
  TF_Output* Output = malloc(sizeof(TF_Output) * NumOutputs);
  TF_Output tout = {TF_GraphOperationByName(Graph, "StatefulPartitionedCall_9"), 0};

  if(tout.oper == NULL)
      printf("ERROR: Failed TF_GraphOperationByName StatefulPartitionedCall\n");
  else
    printf("TF_GraphOperationByName StatefulPartitionedCall is OK\n");

  Output[0] = tout;

  /* Allocate data for inputs and outputs */
  TF_Tensor** InputValues  = (TF_Tensor**)malloc(sizeof(TF_Tensor*)*NumInputs);
  TF_Tensor** OutputValues = (TF_Tensor**)malloc(sizeof(TF_Tensor*)*NumOutputs);

  int ndims = 1;
  int64_t dims[] = {1};
  int64_t data[] = {20};

  int ndata = sizeof(int64_t);
  TF_Tensor* int_tensor0 = TF_NewTensor(TF_INT64, dims, ndims, data, ndata, &NoOpDeallocator, 0);

  if (int_tensor0 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  TF_Tensor* int_tensor1 = TF_NewTensor(TF_INT64, dims, ndims, data, ndata, &NoOpDeallocator, 0);

  if (int_tensor1 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  TF_Tensor* int_tensor2 = TF_NewTensor(TF_INT64, dims, ndims, data, ndata, &NoOpDeallocator, 0);

  if (int_tensor2 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  const char test_string[] = "borkborkborkborkborkborkborkbork";
  TF_TString tstr[1];
  TF_TString_Init(&tstr[0]);
  TF_TString_Copy(&tstr[0], test_string, sizeof(test_string) - 1);
  /* shape (1), not a scalar: the signature expects rank-1 string inputs */
  TF_Tensor* int_tensor3 = TF_NewTensor(TF_STRING, dims, ndims, &tstr[0], sizeof(tstr), &NoOpDeallocator, 0);

  if (int_tensor3 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  TF_Tensor* int_tensor4 = TF_NewTensor(TF_INT64, dims, ndims, data, ndata, &NoOpDeallocator, 0);

  if (int_tensor4 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  TF_Tensor* int_tensor5 = TF_NewTensor(TF_INT64, dims, ndims, data, ndata, &NoOpDeallocator, 0);

  if (int_tensor5 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  TF_Tensor* int_tensor6 = TF_NewTensor(TF_INT64, dims, ndims, data, ndata, &NoOpDeallocator, 0);

  if (int_tensor6 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  TF_Tensor* int_tensor7 = TF_NewTensor(TF_INT64, dims, ndims, data, ndata, &NoOpDeallocator, 0);

  if (int_tensor7 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  const char test_string2[] = "borkborkborkborkborkborkborkbork";
  TF_TString tstr2[1];
  TF_TString_Init(&tstr2[0]);
  TF_TString_Copy(&tstr2[0], test_string2, sizeof(test_string2) - 1);
  TF_Tensor* int_tensor8 = TF_NewTensor(TF_STRING, dims, ndims, &tstr2[0], sizeof(tstr2), &NoOpDeallocator, 0);

  if (int_tensor8 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  const char test_string3[] = "borkborkborkborkborkborkborkbork";
  TF_TString tstr3[1];
  TF_TString_Init(&tstr3[0]);
  TF_TString_Copy(&tstr3[0], test_string3, sizeof(test_string3) - 1);
  TF_Tensor* int_tensor9 = TF_NewTensor(TF_STRING, dims, ndims, &tstr3[0], sizeof(tstr3), &NoOpDeallocator, 0);

  if (int_tensor9 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  const char test_string4[] = "borkborkborkborkborkborkborkbork";
  TF_TString tstr4[1];
  TF_TString_Init(&tstr4[0]);
  TF_TString_Copy(&tstr4[0], test_string4, sizeof(test_string4) - 1);
  TF_Tensor* int_tensor10 = TF_NewTensor(TF_STRING, dims, ndims, &tstr4[0], sizeof(tstr4), &NoOpDeallocator, 0);

  if (int_tensor10 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  const char test_string5[] = "borkborkborkborkborkborkborkbork";
  TF_TString tstr5[1];
  TF_TString_Init(&tstr5[0]);
  TF_TString_Copy(&tstr5[0], test_string5, sizeof(test_string5) - 1);
  TF_Tensor* int_tensor11 = TF_NewTensor(TF_STRING, dims, ndims, &tstr5[0], sizeof(tstr5), &NoOpDeallocator, 0);

  if (int_tensor11 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  const char test_string6[] = "borkborkborkborkborkborkborkbork";
  TF_TString tstr6[1];
  TF_TString_Init(&tstr6[0]);
  TF_TString_Copy(&tstr6[0], test_string6, sizeof(test_string6) - 1);
  TF_Tensor* int_tensor12 = TF_NewTensor(TF_STRING, dims, ndims, &tstr6[0], sizeof(tstr6), &NoOpDeallocator, 0);

  if (int_tensor12 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  const char test_string7[] = "borkborkborkborkborkborkborkbork";
  TF_TString tstr7[1];
  TF_TString_Init(&tstr7[0]);
  TF_TString_Copy(&tstr7[0], test_string7, sizeof(test_string7) - 1);
  TF_Tensor* int_tensor13 = TF_NewTensor(TF_STRING, dims, ndims, &tstr7[0], sizeof(tstr7), &NoOpDeallocator, 0);

  if (int_tensor13 != NULL)
    printf("TF_NewTensor is OK\n");
  else
    printf("ERROR: Failed TF_NewTensor\n");

  InputValues[0] = int_tensor0;
  InputValues[1] = int_tensor1;
  InputValues[2] = int_tensor2;
  InputValues[3] = int_tensor3;
  InputValues[4] = int_tensor4;
  InputValues[5] = int_tensor5;
  InputValues[6] = int_tensor6;
  InputValues[7] = int_tensor7;
  InputValues[8] = int_tensor8;
  InputValues[9] = int_tensor9;
  InputValues[10] = int_tensor10;
  InputValues[11] = int_tensor11;
  InputValues[12] = int_tensor12;
  InputValues[13] = int_tensor13;

  // Run the Session
  TF_SessionRun(Session, NULL, Input, InputValues, NumInputs, Output, OutputValues, NumOutputs, NULL, 0, NULL, Status);

  if(TF_GetCode(Status) == TF_OK)
    printf("Session is OK\n");
  else
    printf("%s",TF_Message(Status));

  /* Get Output Result (only valid if the run succeeded) */
  if(TF_GetCode(Status) == TF_OK) {
    void* buff = TF_TensorData(OutputValues[0]);
    float* offsets = (float*)buff;
    printf("Result Tensor :\n");
    printf("%f\n", offsets[0]);
  }

  // Free memory (delete the session before the graph it uses)
  TF_DeleteSession(Session, Status);
  TF_DeleteSessionOptions(SessionOpts);
  TF_DeleteGraph(Graph);
  TF_DeleteStatus(Status);

  return 0;
}

Here is the output of the program:

root@d7dd04522b8d:~# ./main 
2022-06-12 02:20:23.600451: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/my_saved_model/
2022-06-12 02:20:23.602913: I tensorflow/cc/saved_model/reader.cc:81] Reading meta graph with tags { serve }
2022-06-12 02:20:23.602969: I tensorflow/cc/saved_model/reader.cc:122] Reading SavedModel debug info (if present) from: /tmp/my_saved_model/
2022-06-12 02:20:23.603030: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-06-12 02:20:23.616634: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
2022-06-12 02:20:23.618329: I tensorflow/cc/saved_model/loader.cc:228] Restoring SavedModel bundle.
2022-06-12 02:20:23.651074: I tensorflow/cc/saved_model/loader.cc:212] Running initialization op on SavedModel bundle at path: /tmp/my_saved_model/
[INFO kernel.cc:1176] Loading model from path /tmp/my_saved_model/assets/ with prefix fefa6cf7d5f74ce5
[INFO decision_forest.cc:639] Model loaded with 300 root(s), 365320 node(s), and 14 input feature(s).
[INFO abstract_model.cc:1246] Engine "RandomForestGeneric" built
[INFO kernel.cc:1022] Use fast generic engine
2022-06-12 02:20:24.659070: I tensorflow/cc/saved_model/loader.cc:301] SavedModel load for tags { serve }; Status: success: OK. Took 1058629 microseconds.
status: 
loaded
TF_GraphOperationByName serving_default_input_1 is OK
TF_GraphOperationByName serving_default_input_2 is OK
TF_GraphOperationByName serving_default_input_3 is OK
TF_GraphOperationByName serving_default_input_4 is OK
TF_GraphOperationByName serving_default_input_5 is OK
TF_GraphOperationByName serving_default_input_6 is OK
TF_GraphOperationByName serving_default_input_7 is OK
TF_GraphOperationByName serving_default_input_8 is OK
TF_GraphOperationByName serving_default_input_9 is OK
TF_GraphOperationByName serving_default_input_10 is OK
TF_GraphOperationByName serving_default_input_11 is OK
TF_GraphOperationByName serving_default_input_12 is OK
TF_GraphOperationByName serving_default_input_13 is OK
TF_GraphOperationByName serving_default_input_14 is OK
TF_GraphOperationByName StatefulPartitionedCall is OK
TF_NewTensor is OK
TF_NewTensor is OK
TF_NewTensor is OK
TF_NewTensor is OK
TF_NewTensor is OK
TF_NewTensor is OK
TF_NewTensor is OK
TF_NewTensor is OK
TF_NewTensor is OK
TF_NewTensor is OK
TF_NewTensor is OK
TF_NewTensor is OK
TF_NewTensor is OK
TF_NewTensor is OK
Segmentation fault

The model is from here, btw: https://github.com/tensorflow/decision-forests/blob/main/examples/minimal.py
It looks like this shows how many input/output tensors I need:

# saved_model_cli show --dir /tmp/my_saved_model
The given SavedModel contains the following tag-sets:
'serve'

# saved_model_cli show --dir /tmp/my_saved_model --tag_set serve 
The given SavedModel MetaGraphDef contains SignatureDefs with the following keys:
SignatureDef key: "__saved_model_init_op"
SignatureDef key: "serving_default"

# saved_model_cli show --dir /tmp/my_saved_model --tag_set serve --signature_def serving_default
The given SavedModel SignatureDef contains the following input(s):
  inputs['age'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_age:0
  inputs['capital_gain'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_capital_gain:0
  inputs['capital_loss'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_capital_loss:0
  inputs['education'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_education:0
  inputs['education_num'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_education_num:0
  inputs['fnlwgt'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_fnlwgt:0
  inputs['hours_per_week'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: serving_default_hours_per_week:0
  inputs['marital_status'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_marital_status:0
  inputs['native_country'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_native_country:0
  inputs['occupation'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_occupation:0
  inputs['race'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_race:0
  inputs['relationship'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_relationship:0
  inputs['sex'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_sex:0
  inputs['workclass'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_workclass:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: StatefulPartitionedCall_9:0
Method name is: tensorflow/serving/predict

The example above is based on: https://discuss.tensorflow.org/t/decision-forests-issue-with-c-api/7434

@Arnold1
Author

Arnold1 commented Jun 16, 2022

@janpfeifer I assume inference through the Golang API will run much slower than the yggdrasil-decision-forests C++ code?

@janpfeifer
Collaborator

hey @Arnold1 sorry I missed the thread, I was out on vacation.

The pure Go API we want to open source here will be slower than the C++ version -- it was just a straightforward re-implementation. Still, it's pretty fast -- in our use case, generating the features was far more expensive than the inference itself.

If this turns out to be a limiting factor (let us know the size of your forests and your time constraints), consider:

(A) Using cgo to link the C++ inference from Yggdrasil Decision Forests (not the TensorFlow one, since it adds some overhead, plus lots of memory). Consider benchmarking the C++ inference (again, Yggdrasil Decision Forests) time before going there, but our C++ inference is one of the fastest implementations out there (maybe the fastest under certain constraints?);

(B) Re-implementing our C++ optimized engines in Go. There will be some challenges -- memory locality played a huge role -- but it would probably be a super fun project :)

On your C++ implementation: it's been a while since I used the TensorFlow C API, but it seems correct. That's the TensorFlow API, as opposed to Yggdrasil's, which is simpler.

@janpfeifer
Collaborator

hi all,

Took a while, but we just released a YDF (and TF-DF) inference API in Go.

It's fresh from the oven, so we consider it beta. But internally it has already been used successfully.

Also, it doesn't support all model types yet, but it should now be easier to add any model type. If you see the need, just ping us.

@Arnold1
Author

Arnold1 commented Aug 19, 2022

@janpfeifer amazing! I already wrote some code to load/serve the model with TensorFlow...

Can I just use this lib to load the model and do the inference with it, and expect it to be faster?

How is that related to Python? Will I need Python installed?

// When running a model trained with the TensorFlow Decision Forests API (in
// TensorFlow python, as opposed from the command-line), use the
// `NewEngineWithCompatibility` method instead. This is a temporary
// compatibility issue, and will be resolved soon, when "NewEngine" will
// automatically apply the correct compatibility.
engine, err := serving.NewEngineWithCompatibility(model, example.CompatibilityTensorFlowDecisionForests)

do you currently support tfdf.keras.RandomForestModel() models? I currently use that...

@janpfeifer
Collaborator

No need for Python. I mean, you can train the model in Python+TensorFlow (or with the Yggdrasil command-line tool). But after that, this is a pure Go inference engine.

Ugh, no, we haven't implemented RandomForestModel yet -- our internal use case was GradientBoostedTrees. Let me take a stab at it; I'll post back in a bit.

@Arnold1
Author

Arnold1 commented Aug 20, 2022

we train the model in python and serve it in go.

my_model = tfdf.keras.RandomForestModel(
            features=my_features,
            exclude_non_specified_features=True,
            task=tfdf.keras.Task.REGRESSION,
           ...)

@janpfeifer ok waiting for RandomForestModel code :) amazing thanks!

@Arnold1
Author

Arnold1 commented Aug 31, 2022

hi @janpfeifer, I'd like to check for an update on this - do you think you'll have time in the next 1-2 weeks to add support for RandomForestModel?

@rstz
Collaborator

rstz commented Aug 31, 2022

I think we never give timelines, but I can confidently say that what we have looks very good :)

(answering for jan, I'm also a member of the team)

@janpfeifer
Collaborator

hi @Arnold1 , +1 for what @rstz said. But I think in this case I can add some details, thanks to @achoum : he has a change (PR) that adds support for RF, it's under review. Likely early next week. Hopefully it will work for you!

@Arnold1
Author

Arnold1 commented Aug 31, 2022

@rstz yeah I know.... just wanted to check back :) @janpfeifer amazing!

@rstz
Collaborator

rstz commented Sep 6, 2022

Good news: e03e98d just landed with support for

  • Binary classification Random Forest
  • Regression Random Forest
  • Regression Gradient Boosted Trees
  • Ranking Gradient Boosted Trees

This is still very fresh (no guarantees it won't be rolled back or changed or ...) but feel free to have a look and tell us what you think

@geraldstanje

geraldstanje commented Sep 6, 2022

awesome - I'm also interested in testing it.

@Arnold1
Author

Arnold1 commented Dec 21, 2022

@rstz is there also support for multiclass classification with decision trees? It looks like it's in the repo. I have a use case for multiclass classification - could you also add support in Go?

@rstz
Collaborator

rstz commented Dec 21, 2022

Hi, multiclass support is not available in the Go port of YDF yet.

@Arnold1
Author

Arnold1 commented Dec 21, 2022

@rstz could I run it with TensorFlow and the custom ops (inference.so) in the meantime, and switch over later when the Go port is ready?

@achoum achoum closed this as completed Dec 15, 2023