[HWORKS-284] Documentation for exporting logs #142

Open
wants to merge 1 commit into
base: main

Conversation

Contributor

@kouzant kouzant commented Nov 18, 2022


## Introduction
Hopsworks collects services and applications logs to [Logstash](https://www.elastic.co/logstash/) which then forwards them to OpenSearch for indexing.
Often organizations already have logging systems in place so streaming Hopsworks logs is necessary.
Contributor

So exporting Hopsworks logs is necessary?

## Prerequisites
To configure Logstash streaming logs outside of Hopsworks you will need SSH access to the cluster (Logstash node). Also, depending on the target system you might
Contributor

To configure Logstash to stream logs.

need authentication tokens or opening firewall rules.

## Export logs
Logstash is a well established log collection service with many output [plugins](https://www.elastic.co/guide/en/logstash/7.17/output-plugins.html) available.
Contributor

Logstash is not a log collection service, it's just a processing pipeline.

Logstash processes logs in *pipelines* where each pipeline is responsible for a logical group of logs. In Hopsworks we have multiple pipelines and their configuration files are under `/srv/hops/logstash/config`.

### Export services logs
To stream various services' logs outside of Hopsworks you will need to **create another pipeline** similar to `services`.
Contributor

similar to the services pipeline.

!!! note
Take a note of the pipeline address as we will use it in Step 2

At the end of the file is the `output` section which currently forwards them to OpenSearch. Replace the output section with a sample block such as
Contributor

forwards the logs to OpenSearch
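
As a rough illustration of the kind of replacement the guide describes (this is not the block from the PR, which is not visible in this view), a new pipeline could listen on its own pipeline address and forward events with Logstash's `http` output plugin. The address `external-services`, the URL, and the `EXPORT_TOKEN` environment variable are hypothetical placeholders, and the right output plugin depends on the target system.

```
# Sketch only: the address, URL and token variable below are assumptions.
input {
  pipeline {
    # Pipeline address that other pipelines send events to
    address => "external-services"
  }
}

output {
  http {
    # Hypothetical ingestion endpoint of the external logging system
    url => "https://logs.example.com/ingest"
    http_method => "post"
    format => "json"
    # Token read from the environment rather than hard-coded in the file
    headers => { "Authorization" => "Bearer ${EXPORT_TOKEN}" }
  }
}
```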

```
pipeline.batch.size: 50
```

**Instruct** the services pipeline to push logs also in the newly created pipeline by appending to `services-intake` for example:
Contributor

also to the newly created pipeline
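
A minimal sketch of what appending the new pipeline could look like, assuming `services-intake` ends in a `pipeline` output that lists downstream pipeline addresses. The existing address `services` and the new address `external-services` are assumptions for illustration; the names actually used in the Hopsworks configuration may differ.

```
# Sketch only: both addresses below are assumptions.
output {
  pipeline {
    # Keep the existing downstream pipeline and append the new one,
    # so each event is delivered to both addresses
    send_to => ["services", "external-services"]
  }
}
```

The appended address has to match the `address` declared in the `pipeline` input of the newly created pipeline, which is why the guide asks you to note it down.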



### Export Spark logs
To stream applications' logs to another system the Steps are fairly similar to exporting services logs but need some additional configuration.
Contributor

lower s in Steps.



Contributor

I would also specify Spark applications' logs so users don't confuse them with Hopsworks applications.

Finally, restart Logstash with `sudo systemctl restart logstash`.

## Conclusion
It is not easy to write a guide for a task that can be achieved in many different ways but in this guide we gave solid
Contributor

Users don't care that it's not easy. Just refer them to the different plugins' configuration so they understand how to configure their new pipeline to send data wherever they need to.
