camunda · npepinpe · Aug 5, 2024 · Jul 4, 2024 · Jul 4, 2024 · Jul 5, 2024
diff --git a/docs/apis-tools/go-client/job-worker.md b/docs/apis-tools/go-client/job-worker.md
@@ -106,6 +106,18 @@ To avoid your workers being overloaded with too many jobs, e.g. running out of m
 
 **If streaming is enabled, back pressure applies to both pushing and polling**. You can then use `MaxJobsActive` and `Concurrency` as a way to soft-bound the memory usage of your worker. For example, given a maximum variable payload for a job of 1MB, `MaxJobsActive = 32`, and `Concurrency = 10`, then a single worker could use up to 42MB of memory. You can estimate a worst case scenario using the configured maximum message size, as no job payload will ever exceed this.
 
+#### Proxying
+
+If you're using a reverse proxy or a load balancer between your worker and your gateway, you may need to configure additional parameters to ensure the job stream is not closed unexpectedly with an error. If you observe regular 504 timeouts, read our guide on [job streaming](../../../self-managed/zeebe-deployment/zeebe-gateway/job-streaming).
+
+By default, the Go job workers have a stream timeout of one hour. You can override this by calling `StreamRequestTimeout` on the job worker builder:
+
+```go
+var builder JobWorkerBuilderStep3
+// builder is set in some way
+builder.StreamRequestTimeout(30 * time.Minute)
+```
+
 ## Additional resources
 
 - [Job worker reference](/components/concepts/job-workers.md)
diff --git a/docs/apis-tools/java-client/job-worker.md b/docs/apis-tools/java-client/job-worker.md
@@ -185,6 +185,17 @@ To avoid your workers being overloaded with too many jobs, e.g. running out of m
 If the worker blocks longer than the job's deadline, the job will **not** be passed to the worker, but will be dropped. As it will time out on the broker side, it will be pushed again.
 :::
 
+#### Proxying
+
+If you're using a reverse proxy or a load balancer between your worker and your gateway, you may need to configure additional parameters to ensure the job stream is not closed unexpectedly with an error. If you observe regular 504 timeouts, read our guide on [job streaming](../../../self-managed/zeebe-deployment/zeebe-gateway/job-streaming).
+
+By default, the Java job workers have a stream timeout of one hour. You can override this by calling `streamTimeout` on the job worker builder:
+
+```java
+final JobWorkerBuilderStep3 builder = ...;
+builder.streamTimeout(Duration.ofMinutes(30));
+```
+
 ## Multi-tenancy
 
 You can configure a job worker to pick up jobs belonging to one or more tenants. When using the builder, you can configure

diff --git a/docs/components/concepts/job-workers.md b/docs/components/concepts/job-workers.md
@@ -169,6 +169,10 @@ If you're using Prometheus, you can use the following query to estimate the queu
 
 On the server side (e.g. if you're running a self-managed cluster), you can measure the rate of jobs which are not pushed due to clients which are not ready via the metric `zeebe_broker_jobs_push_fail_try_count_total{code="BLOCKED"}`. If the rate of this metric is high for a sustained amount of time, it may be a good indicator that you need to scale your workers. Unfortunately, on the server side we don't differentiate between clients, so this metric doesn't tell you which worker deployment needs to be scaled. We thus recommend using client metrics whenever possible.
 
+### Proxying
+
+If you're using a reverse proxy or a load balancer between your worker and your gateway, you may need to configure additional parameters to ensure the job stream is not closed unexpectedly with an error. If you observe regular 504 timeouts, read our guide on [job streaming](../../../self-managed/zeebe-deployment/zeebe-gateway/job-streaming).
+
 ### Troubleshooting
 
 Since this feature requires a good amount of coordination between various components over the network, we've built in some tools to help monitor the health of the job streams.

diff --git a/docs/self-managed/zeebe-deployment/zeebe-gateway/job-streaming.md b/docs/self-managed/zeebe-deployment/zeebe-gateway/job-streaming.md
@@ -0,0 +1,25 @@
+---
+id: job-streaming
+title: "Job streaming"
+sidebar_label: "Job streaming"
+description: "Job streams are expected to be long-lived to cut down on the latency overhead involved with re-creating a stream and propagating it throughout the cluster."
+---
+
+[Job streaming](../../../components/concepts/job-workers.md#job-streaming) relies on long-lived streams to avoid the latency overhead of re-creating a stream and propagating it throughout the cluster.
+
+When using a reverse proxy or a load balancer between your worker and your gateway, you may need to configure additional parameters to ensure the job stream is not closed unexpectedly. With an impacted proxy, the job streaming worker receives HTTP 504 (Gateway Timeout) errors at regular intervals.
+
+:::note
+This configuration is _only_ required for reverse proxies which do not support forwarding HTTP/2 keepalive (on either side). See [this nginx ticket](https://trac.nginx.org/nginx/ticket/1887), for example.
+
+Proxies which support forwarding HTTP/2 keepalive do not require any change.
+:::
+
+The following configuration is recommended for impacted reverse proxies:
+
+- On your client, set an explicit stream timeout of one hour. See additional examples in [Java](../../../../apis-tools/java-client/job-worker) and [Go](../../../../apis-tools/go-client/job-worker).
+- On your reverse proxy, ensure the read response timeout is set slightly higher than your client's stream timeout (for example, an hour and ten minutes).
+
+## Nginx
+
+Nginx is a known proxy which does not support forwarding HTTP/2 pings from either side as a form of keepalive. To resolve related gateway timeouts, configure a `grpc_send_timeout` that is _higher_ than your job worker stream timeout.
diff --git a/sidebars.js b/sidebars.js
@@ -1032,6 +1032,7 @@ module.exports = {
               "Zeebe Gateway": [
                 "self-managed/zeebe-deployment/zeebe-gateway/overview",
                 "self-managed/zeebe-deployment/zeebe-gateway/interceptors",
+                "self-managed/zeebe-deployment/zeebe-gateway/job-streaming",
               ],
             },
             {

diff --git a/versioned_docs/version-8.4/apis-tools/go-client/job-worker.md b/versioned_docs/version-8.4/apis-tools/go-client/job-worker.md
@@ -106,6 +106,18 @@ To avoid your workers being overloaded with too many jobs, e.g. running out of m
 
 **If streaming is enabled, back pressure applies to both pushing and polling**. You can then use `MaxJobsActive` and `Concurrency` as a way to soft-bound the memory usage of your worker. For example, given a maximum variable payload for a job of 1MB, `MaxJobsActive = 32`, and `Concurrency = 10`, then a single worker could use up to 42MB of memory. You can estimate a worst case scenario using the configured maximum message size, as no job payload will ever exceed this.
 
+#### Proxying
+
+If you're using a reverse proxy or a load balancer between your worker and your gateway, you may need to configure additional parameters to ensure the job stream is not closed unexpectedly with an error. If you observe regular 504 timeouts, read our guide on [job streaming](../../../self-managed/zeebe-deployment/zeebe-gateway/job-streaming).
+
+By default, the Go job workers have a stream timeout of one hour. You can override this by calling `StreamRequestTimeout` on the job worker builder:
+
+```go
+var builder JobWorkerBuilderStep3
+// builder is set in some way
+builder.StreamRequestTimeout(30 * time.Minute)
+```
+
 ## Additional resources
 
 - [Job worker reference](/components/concepts/job-workers.md)
diff --git a/versioned_docs/version-8.4/apis-tools/java-client/job-worker.md b/versioned_docs/version-8.4/apis-tools/java-client/job-worker.md
@@ -185,6 +185,17 @@ To avoid your workers being overloaded with too many jobs, e.g. running out of m
 If the worker blocks longer than the job's deadline, the job will **not** be passed to the worker, but will be dropped. As it will time out on the broker side, it will be pushed again.
 :::
 
+#### Proxying
+
+If you're using a reverse proxy or a load balancer between your worker and your gateway, you may need to configure additional parameters to ensure the job stream is not closed unexpectedly with an error. If you observe regular 504 timeouts, read our guide on [job streaming](../../../self-managed/zeebe-deployment/zeebe-gateway/job-streaming).
+
+By default, the Java job workers have a stream timeout of one hour. You can override this by calling `streamTimeout` on the job worker builder:
+
+```java
+final JobWorkerBuilderStep3 builder = ...;
+builder.streamTimeout(Duration.ofMinutes(30));
+```
+
 ## Multi-tenancy
 
 You can configure a job worker to pick up jobs belonging to one or more tenants. When using the builder, you can configure

diff --git a/versioned_docs/version-8.4/components/concepts/job-workers.md b/versioned_docs/version-8.4/components/concepts/job-workers.md
@@ -169,6 +169,10 @@ If you're using Prometheus, you can use the following query to estimate the queu
 
 On the server side (e.g. if you're running a self-managed cluster), you can measure the rate of jobs which are not pushed due to clients which are not ready via the metric `zeebe_broker_jobs_push_fail_try_count_total{code="BLOCKED"}`. If the rate of this metric is high for a sustained amount of time, it may be a good indicator that you need to scale your workers. Unfortunately, on the server side we don't differentiate between clients, so this metric doesn't tell you which worker deployment needs to be scaled. We thus recommend using client metrics whenever possible.
 
+### Proxying
+
+If you're using a reverse proxy or a load balancer between your worker and your gateway, you may need to configure additional parameters to ensure the job stream is not closed unexpectedly with an error. If you observe regular 504 timeouts, read our guide on [job streaming](../../../self-managed/zeebe-deployment/zeebe-gateway/job-streaming).
+
 ### Troubleshooting
 
 Since this feature requires a good amount of coordination between various components over the network, we've built in some tools to help monitor the health of the job streams.

diff --git a/...d_docs/version-8.4/self-managed/zeebe-deployment/zeebe-gateway/job-streaming.md b/...d_docs/version-8.4/self-managed/zeebe-deployment/zeebe-gateway/job-streaming.md
@@ -0,0 +1,25 @@
+---
+id: job-streaming
+title: "Job streaming"
+sidebar_label: "Job streaming"
+description: "Job streams are expected to be long-lived to cut down on the latency overhead involved with re-creating a stream and propagating it throughout the cluster."
+---
+
+[Job streaming](../../../components/concepts/job-workers.md#job-streaming) relies on long-lived streams to avoid the latency overhead of re-creating a stream and propagating it throughout the cluster.
+
+When using a reverse proxy or a load balancer between your worker and your gateway, you may need to configure additional parameters to ensure the job stream is not closed unexpectedly. With an impacted proxy, the job streaming worker receives HTTP 504 (Gateway Timeout) errors at regular intervals.
+
+:::note
+This configuration is _only_ required for reverse proxies which do not support forwarding HTTP/2 keepalive (on either side). See [this nginx ticket](https://trac.nginx.org/nginx/ticket/1887), for example.
+
+Proxies which support forwarding HTTP/2 keepalive do not require any change.
+:::
+
+The following configuration is recommended for impacted reverse proxies:
+
+- On your client, set an explicit stream timeout of one hour. See additional examples in [Java](../../../../apis-tools/java-client/job-worker) and [Go](../../../../apis-tools/go-client/job-worker).
+- On your reverse proxy, ensure the read response timeout is set slightly higher than your client's stream timeout (for example, an hour and ten minutes).
+
+## Nginx
+
+Nginx is a known proxy which does not support forwarding HTTP/2 pings from either side as a form of keepalive. To resolve related gateway timeouts, configure a `grpc_send_timeout` that is _higher_ than your job worker stream timeout.
diff --git a/versioned_docs/version-8.5/apis-tools/go-client/job-worker.md b/versioned_docs/version-8.5/apis-tools/go-client/job-worker.md
@@ -106,6 +106,18 @@ To avoid your workers being overloaded with too many jobs, e.g. running out of m
 
 **If streaming is enabled, back pressure applies to both pushing and polling**. You can then use `MaxJobsActive` and `Concurrency` as a way to soft-bound the memory usage of your worker. For example, given a maximum variable payload for a job of 1MB, `MaxJobsActive = 32`, and `Concurrency = 10`, then a single worker could use up to 42MB of memory. You can estimate a worst case scenario using the configured maximum message size, as no job payload will ever exceed this.
 
+#### Proxying
+
+If you're using a reverse proxy or a load balancer between your worker and your gateway, you may need to configure additional parameters to ensure the job stream is not closed unexpectedly with an error. If you observe regular 504 timeouts, read our guide on [job streaming](../../../self-managed/zeebe-deployment/zeebe-gateway/job-streaming).
+
+By default, the Go job workers have a stream timeout of one hour. You can override this by calling `StreamRequestTimeout` on the job worker builder:
+
+```go
+var builder JobWorkerBuilderStep3
+// builder is set in some way
+builder.StreamRequestTimeout(30 * time.Minute)
+```
+
 ## Additional resources
 
 - [Job worker reference](/components/concepts/job-workers.md)
diff --git a/versioned_docs/version-8.5/apis-tools/java-client/job-worker.md b/versioned_docs/version-8.5/apis-tools/java-client/job-worker.md
@@ -185,6 +185,17 @@ To avoid your workers being overloaded with too many jobs, e.g. running out of m
 If the worker blocks longer than the job's deadline, the job will **not** be passed to the worker, but will be dropped. As it will time out on the broker side, it will be pushed again.
 :::
 
+#### Proxying
+
+If you're using a reverse proxy or a load balancer between your worker and your gateway, you may need to configure additional parameters to ensure the job stream is not closed unexpectedly with an error. If you observe regular 504 timeouts, read our guide on [job streaming](../../../self-managed/zeebe-deployment/zeebe-gateway/job-streaming).
+
+By default, the Java job workers have a stream timeout of one hour. You can override this by calling `streamTimeout` on the job worker builder:
+
+```java
+final JobWorkerBuilderStep3 builder = ...;
+builder.streamTimeout(Duration.ofMinutes(30));
+```
+
 ## Multi-tenancy
 
 You can configure a job worker to pick up jobs belonging to one or more tenants. When using the builder, you can configure

diff --git a/versioned_docs/version-8.5/components/concepts/job-workers.md b/versioned_docs/version-8.5/components/concepts/job-workers.md
@@ -169,6 +169,10 @@ If you're using Prometheus, you can use the following query to estimate the queu
 
 On the server side (e.g. if you're running a self-managed cluster), you can measure the rate of jobs which are not pushed due to clients which are not ready via the metric `zeebe_broker_jobs_push_fail_try_count_total{code="BLOCKED"}`. If the rate of this metric is high for a sustained amount of time, it may be a good indicator that you need to scale your workers. Unfortunately, on the server side we don't differentiate between clients, so this metric doesn't tell you which worker deployment needs to be scaled. We thus recommend using client metrics whenever possible.
 
+### Proxying
+
+If you're using a reverse proxy or a load balancer between your worker and your gateway, you may need to configure additional parameters to ensure the job stream is not closed unexpectedly with an error. If you observe regular 504 timeouts, read our guide on [job streaming](../../../self-managed/zeebe-deployment/zeebe-gateway/job-streaming).
+
 ### Troubleshooting
 
 Since this feature requires a good amount of coordination between various components over the network, we've built in some tools to help monitor the health of the job streams.

diff --git a/...d_docs/version-8.5/self-managed/zeebe-deployment/zeebe-gateway/job-streaming.md b/...d_docs/version-8.5/self-managed/zeebe-deployment/zeebe-gateway/job-streaming.md
@@ -0,0 +1,25 @@
+---
+id: job-streaming
+title: "Job streaming"
+sidebar_label: "Job streaming"
+description: "Job streams are expected to be long-lived to cut down on the latency overhead involved with re-creating a stream and propagating it throughout the cluster."
+---
+
+[Job streaming](../../../components/concepts/job-workers.md#job-streaming) relies on long-lived streams to avoid the latency overhead of re-creating a stream and propagating it throughout the cluster.
+
+When using a reverse proxy or a load balancer between your worker and your gateway, you may need to configure additional parameters to ensure the job stream is not closed unexpectedly. With an impacted proxy, the job streaming worker receives HTTP 504 (Gateway Timeout) errors at regular intervals.
+
+:::note
+This configuration is _only_ required for reverse proxies which do not support forwarding HTTP/2 keepalive (on either side). See [this nginx ticket](https://trac.nginx.org/nginx/ticket/1887), for example.
+
+Proxies which support forwarding HTTP/2 keepalive do not require any change.
+:::
+
+The following configuration is recommended for impacted reverse proxies:
+
+- On your client, set an explicit stream timeout of one hour. See additional examples in [Java](../../../../apis-tools/java-client/job-worker) and [Go](../../../../apis-tools/go-client/job-worker).
+- On your reverse proxy, ensure the read response timeout is set slightly higher than your client's stream timeout (for example, an hour and ten minutes).
+
+## Nginx
+
+Nginx is a known proxy which does not support forwarding HTTP/2 pings from either side as a form of keepalive. To resolve related gateway timeouts, configure a `grpc_send_timeout` that is _higher_ than your job worker stream timeout.
diff --git a/versioned_sidebars/version-8.4-sidebars.json b/versioned_sidebars/version-8.4-sidebars.json
@@ -1406,7 +1406,8 @@
             {
               "Zeebe Gateway": [
                 "self-managed/zeebe-deployment/zeebe-gateway/overview",
-                "self-managed/zeebe-deployment/zeebe-gateway/interceptors"
+                "self-managed/zeebe-deployment/zeebe-gateway/interceptors",
+                "self-managed/zeebe-deployment/zeebe-gateway/job-streaming"
               ]
             },
             {

diff --git a/versioned_sidebars/version-8.5-sidebars.json b/versioned_sidebars/version-8.5-sidebars.json
@@ -1536,7 +1536,8 @@
             {
               "Zeebe Gateway": [
                 "self-managed/zeebe-deployment/zeebe-gateway/overview",
-                "self-managed/zeebe-deployment/zeebe-gateway/interceptors"
+                "self-managed/zeebe-deployment/zeebe-gateway/interceptors",
+                "self-managed/zeebe-deployment/zeebe-gateway/job-streaming"
               ]
             },
             {