adding otel example + small typos in instance settings (#5009)
Showing 6 changed files with 312 additions and 3 deletions.
@@ -0,0 +1,46 @@
Tracing with Jaeger
===================

[Jaeger](https://www.jaegertracing.io/) is an open-source distributed tracing system for monitoring and debugging microservices. Originally developed by Uber, it helps track requests across services, analyze latency, identify bottlenecks, and diagnose failures.

Key use cases include debugging production issues, monitoring performance, visualizing service dependencies, and improving system reliability. Because Jaeger supports the OpenTelemetry protocol (OTLP), it can collect traces from Windmill.

For more details, follow the guide on [setting up Jaeger](https://windmill.dev/docs/misc/guides/otel#setting-up-jaeger).

## Setting up Jaeger along with Windmill

Start all services by running the following from the directory that contains `docker-compose.yml`:

```bash
docker-compose up -d
```

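The compose file in this example reads `WM_IMAGE` and `DATABASE_URL` from the environment, typically from a `.env` file next to `docker-compose.yml`. The sketch below writes such a file. Both values are assumptions rather than part of this commit: the image name is a guess at the community image (the compose file itself only names `ghcr.io/windmill-labs/windmill-ee:main` for the Enterprise Edition), and the database URL is derived from the `db` service's password and database name, so adjust both to your setup.

```bash
# Write a minimal .env next to docker-compose.yml.
# Note: this overwrites any existing .env in the current directory.
# WM_IMAGE is an assumed image name; DATABASE_URL assumes the default
# "postgres" user together with the password and database name set
# on the "db" service in the compose file.
cat > .env <<'EOF'
WM_IMAGE=ghcr.io/windmill-labs/windmill:main
DATABASE_URL=postgres://postgres:changeme@db/windmill?sslmode=disable
EOF
```

Running `docker-compose config` afterwards prints the resolved configuration, which is a quick way to confirm that both variables were substituted.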
## Configuring Windmill to use Jaeger

In the Windmill UI, available at `http://localhost`, complete the initial setup. Then go to "Instance Settings", open the "OTEL/Prom" tab, fill in the Jaeger endpoint and service name, and toggle the Tracing option to send traces to Jaeger. With the compose file from this example, the `jaeger` service receives OTLP over gRPC on port 4317.

## Open the Jaeger UI

If Jaeger is hosted with the `docker-compose.yml` file from this example, its UI is available at `http://localhost:16686`. After you run a script or workflow in Windmill, the resulting traces appear in the Jaeger UI, where you can inspect them. This helps you understand the performance of a workflow and identify bottlenecks in the Windmill server or client.

## Searching for specific traces

To search or filter for a specific trace, such as a single workflow run, use the search form in the Jaeger UI and filter by the tags that Windmill sets.

The following tags are useful to filter for specific traces:

- `job_id`: The ID of the job
- `root_job`: The ID of the root job (flow)
- `parent_job`: The ID of the parent job (flow)
- `flow_step_id`: The ID of the step within the workflow
- `script_path`: The path of the script
- `workspace_id`: The name of the workspace
- `worker_id`: The ID of the worker
- `language`: The language of the script
- `tag`: The queue tag of the workflow

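Several of these tags can be combined in one search. As a rough sketch, the helper below builds a filter string in the `key="value"` (logfmt) form that the Tags field of the Jaeger search form accepts. The function name `jaeger_tag_filter` and the example values (`demo`, `f/examples/hello`) are hypothetical and only serve as an illustration.

```bash
# Build a logfmt-style tag filter from key=value arguments.
# The function body runs in a subshell, so its variables do not leak.
# Values that themselves contain double quotes are not escaped here.
jaeger_tag_filter() (
  filter=""
  for pair in "$@"; do
    key=${pair%%=*}      # text before the first "="
    value=${pair#*=}     # text after the first "="
    filter="${filter:+$filter }$key=\"$value\""
  done
  printf '%s\n' "$filter"
)

# Hypothetical workspace and script path:
jaeger_tag_filter workspace_id=demo script_path=f/examples/hello
# prints: workspace_id="demo" script_path="f/examples/hello"
```

The printed string can be pasted into the Tags field of the search form; only traces that match all listed tags are returned.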
## Monitoring metrics with Jaeger

Jaeger can generate metric time series from the traces it collects. These time series let you compare the performance of individual steps within a workflow, track their overall performance and relative contribution over time, and identify and troubleshoot issues and anomalies. In this example, the `spanmetrics` connector derives the metrics from incoming spans, Prometheus scrapes them from `jaeger:8889`, and Jaeger reads them back from `http://prometheus:9090`.

Once traces have been collected, the metric time series are shown in the "Monitor" tab of the Jaeger UI.
@@ -0,0 +1,201 @@
version: "3.7"

services:
  db:
    deploy:
      # To use an external database, set replicas to 0 and set DATABASE_URL to the external database url in the .env file
      replicas: 1
    image: postgres:16
    shm_size: 1g
    restart: unless-stopped
    volumes:
      - db_data:/var/lib/postgresql/data
    expose:
      - 5432
    ports:
      - 5432:5432
    environment:
      POSTGRES_PASSWORD: changeme
      POSTGRES_DB: windmill
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  windmill_server:
    image: ${WM_IMAGE}
    pull_policy: always
    deploy:
      replicas: 1
    restart: unless-stopped
    expose:
      - 8000
      - 2525
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - MODE=server
    depends_on:
      db:
        condition: service_healthy
    volumes:
      - worker_logs:/tmp/windmill/logs

  windmill_worker:
    image: ${WM_IMAGE}
    pull_policy: always
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: "1"
          memory: 2048M
          # for GB, use syntax '2Gi'
    restart: unless-stopped
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - MODE=worker
      - WORKER_GROUP=default
    depends_on:
      db:
        condition: service_healthy
    # to mount the worker folder to debug, KEEP_JOB_DIR=true and mount /tmp/windmill
    volumes:
      # mount the docker socket to allow to run docker containers from within the workers
      - /var/run/docker.sock:/var/run/docker.sock
      - worker_dependency_cache:/tmp/windmill/cache
      - worker_logs:/tmp/windmill/logs

  ## This worker is specialized for "native" jobs. Native jobs run in-process and thus are much more lightweight than other jobs
  windmill_worker_native:
    # Use ghcr.io/windmill-labs/windmill-ee:main for the ee
    image: ${WM_IMAGE}
    pull_policy: always
    deploy:
      replicas: 1
      resources:
        limits:
          cpus: "1"
          memory: 2048M
          # for GB, use syntax '2Gi'
    restart: unless-stopped
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - MODE=worker
      - WORKER_GROUP=native
      - NUM_WORKERS=8
      - SLEEP_QUEUE=200
    depends_on:
      db:
        condition: service_healthy
    volumes:
      - worker_logs:/tmp/windmill/logs
  # This worker is specialized for reports or scraping jobs. It is assigned the "reports" worker group which has an init script that installs chromium and can be targeted by using the "chromium" worker tag.
  # windmill_worker_reports:
  #   image: ${WM_IMAGE}
  #   pull_policy: always
  #   deploy:
  #     replicas: 1
  #     resources:
  #       limits:
  #         cpus: "1"
  #         memory: 2048M
  #         # for GB, use syntax '2Gi'
  #   restart: unless-stopped
  #   environment:
  #     - DATABASE_URL=${DATABASE_URL}
  #     - MODE=worker
  #     - WORKER_GROUP=reports
  #   depends_on:
  #     db:
  #       condition: service_healthy
  #   # to mount the worker folder to debug, KEEP_JOB_DIR=true and mount /tmp/windmill
  #   volumes:
  #     # mount the docker socket to allow to run docker containers from within the workers
  #     - /var/run/docker.sock:/var/run/docker.sock
  #     - worker_dependency_cache:/tmp/windmill/cache
  #     - worker_logs:/tmp/windmill/logs

  # The indexer powers full-text job and log search, an EE feature.
  windmill_indexer:
    image: ${WM_IMAGE}
    pull_policy: always
    deploy:
      replicas: 0 # set to 1 to enable full-text job and log search
    restart: unless-stopped
    expose:
      - 8001
    environment:
      - PORT=8001
      - DATABASE_URL=${DATABASE_URL}
      - MODE=indexer
    depends_on:
      db:
        condition: service_healthy
    volumes:
      - windmill_index:/tmp/windmill/search
      - worker_logs:/tmp/windmill/logs

  lsp:
    image: ghcr.io/windmill-labs/windmill-lsp:latest
    pull_policy: always
    restart: unless-stopped
    expose:
      - 3001
    volumes:
      - lsp_cache:/root/.cache

  multiplayer:
    image: ghcr.io/windmill-labs/windmill-multiplayer:latest
    deploy:
      replicas: 0 # Set to 1 to enable multiplayer, only available on Enterprise Edition
    restart: unless-stopped
    expose:
      - 3002

  caddy:
    image: ghcr.io/windmill-labs/caddy-l4:latest
    restart: unless-stopped
    # Configure the mounted Caddyfile and the exposed ports or use another reverse proxy if needed
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      # - ./certs:/certs # Provide custom certificate files like cert.pem and key.pem to enable HTTPS - See the corresponding section in the Caddyfile
    ports:
      # To change the exposed port, simply change 80:80 to <desired_port>:80. No other changes needed
      - 80:80
      - 25:25
      # - 443:443 # Uncomment to enable HTTPS handling by Caddy
    environment:
      - BASE_URL=":80"
      # - BASE_URL=":443" # uncomment and comment line above to enable HTTPS via custom certificate and key files
      # - BASE_URL=mydomain.com # Uncomment and comment line above to enable HTTPS handling by Caddy

  # Jaeger OpenTelemetry Example
  # https://windmill.dev/docs/misc/guides/otel#setting-up-jaeger
  jaeger:
    image: jaegertracing/jaeger:latest
    ports:
      - "16686:16686"
    expose:
      - 4317
      - 8889
    volumes:
      - ./jaeger-config.yaml:/etc/jaeger/config.yml
    command: ["--config", "/etc/jaeger/config.yml"]

  prometheus:
    image: prom/prometheus:latest
    restart: unless-stopped
    expose:
      - 9090
    volumes:
      - ./prometheus-config.yaml:/etc/prometheus/prometheus.yml
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"

volumes:
  db_data: null
  worker_dependency_cache: null
  worker_logs: null
  windmill_index: null
  lsp_cache: null
@@ -0,0 +1,53 @@
service:
  extensions: [jaeger_storage, jaeger_query]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger_storage_exporter, spanmetrics]
    metrics/spanmetrics:
      receivers: [spanmetrics]
      exporters: [prometheus]
  telemetry:
    resource:
      service.name: jaeger
    metrics:
      level: detailed
      address: 0.0.0.0:8888
    logs:
      level: DEBUG

extensions:
  jaeger_query:
    storage:
      traces: some_storage
      metrics: some_metrics_storage
  jaeger_storage:
    backends:
      some_storage:
        memory:
          max_traces: 100000
    metric_backends:
      some_metrics_storage:
        prometheus:
          endpoint: http://prometheus:9090
          normalize_calls: true
          normalize_duration: true

connectors:
  spanmetrics:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"

processors:
  batch:

exporters:
  jaeger_storage_exporter:
    trace_storage: some_storage
  prometheus:
    endpoint: "0.0.0.0:8889"
@@ -0,0 +1,9 @@
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

scrape_configs:
  - job_name: aggregated-trace-metrics
    static_configs:
      - targets: ['jaeger:8889']