
Add support for Socket Logging Handler, enable json format for it #43232

Open · wants to merge 1 commit into base: main
Conversation

@troosan (Contributor) commented Sep 12, 2024

This pull request is a follow-up to PR #23128 and is greatly inspired by the changes provided by @geoand: https://github.com/quarkusio/quarkus/compare/main...geoand:%2323127?expand=1

It allows sending logs in ECS format to a socket (typically a Logstash TCP input).
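To make this concrete, here is a minimal configuration sketch; the quarkus.log.socket.* property names are assumed from this PR's config group and the endpoint value is an arbitrary example:

# sketch: send JSON logs to a Logstash TCP input (property names assumed from this PR)
quarkus.log.socket.enable=true
quarkus.log.socket.endpoint=localhost:4560
quarkus.log.socket.json=true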

Fixes #23127 and answers my own question on StackOverflow.

@troosan (Contributor Author) commented Sep 12, 2024

I'm still working on updating the documentation for it (logging and centralized-log-management).

The logging one is easy; the centralized log management one is more complicated, as it promotes the use of the GELF format.
I think I'll just add a section under "GELF alternative: use Syslog" for now.

But I believe logging in ECS format should be promoted over the GELF format, as it fits better into the ELK stack.

@geoand (Contributor) commented Sep 12, 2024

Looks good, thanks!

@dmlloyd any comments?

github-actions bot commented Sep 12, 2024

🎊 PR Preview 1ff7c69 has been successfully built and deployed to https://quarkus-pr-main-43232-preview.surge.sh/version/main/guides/



@troosan (Contributor Author) commented Sep 12, 2024

The resulting JSON is the following:

{
    "threadId": 449,
    "timestamp": "2024-09-12T08:56:17.609151+02:00",
    "@timestamp": "2024-09-12T06:56:17.610853593Z",
    "threadName": "Shutdown thread",
    "message": "my-app stopped in 0.118s",
    "ndc": "",
    "sequence": 4335,
    "loggerClassName": "org.jboss.logging.Logger",
    "loggerName": "io.quarkus",
    "processName": "/Users/antoine/.sdkman/candidates/java/17.0.9-graalce/bin/java",
    "processId": 65262,
    "level": "INFO",
    "@version": "1",
    "mdc": {},
    "hostName": "macbook-m1-pro.localdomain"
}

I guess it's a good start, but we could add default values for parameters such as serviceName, serviceVersion, serviceEnvironment, ...

https://www.elastic.co/guide/en/ecs-logging/java/1.x/setup.html#_add_the_dependency

or we could leave the generation of an ECS-compatible JSON to the https://github.com/quarkiverse/quarkus-logging-json/blob/main/runtime/src/main/java/io/quarkiverse/loggingjson/LoggingJsonRecorder.java#L92 extension.

I'm not sure why there are two implementations for JSON logging, actually.


@troosan (Contributor Author) commented Sep 13, 2024

@dmlloyd @geoand should we natively provide a valid ECS JSON?

Or should I update the documentation to mention the Quarkiverse version of quarkus-logging-json, which supports ECS encoding? That would require a PR for that extension, though, to enable using the socket handler.

@geoand (Contributor) commented Sep 13, 2024

@loicmathieu you've used the logging-json stuff IIRC, what's your take?

@loicmathieu (Contributor)

quarkus-logging-json is the official way to log in JSON.

What I don't understand is why this mandates modifications in the logging-json library; socket logging should be a log handler, so shouldn't it work with all the supported log formats/patterns?

By the way, logging-gelf already supports that.

@troosan (Contributor Author) commented Sep 13, 2024

I think the reason this required a change in logging-json is that it's configured in a different way.
The JSON logging can be configured per handler (console, file, and now socket), whereas the GELF handler handles only its own format.
Maybe this should indeed be reversed, and the socket handler should allow only the ECS format, just as the GELF handler allows only the GELF format.

I know the GELF logging already supports sending logs to Logstash, but my operations team is not too keen on enabling & maintaining another input (and the filters that go with it, see the sketch below) in the Logstash pipeline.
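For context, the kind of pipeline entry this targets is a plain TCP input consuming JSON lines; a minimal Logstash sketch (the port is an arbitrary example):

input {
  tcp {
    port => 4560
    codec => json_lines
  }
}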

@loicmathieu (Contributor)

OK then.

@troosan (Contributor Author) commented Sep 16, 2024

@geoand @gsmet Contrary to what I put in the documentation, with the current PR the format of the JSON does not follow the ECS standard.

For now I have worked around this by setting field-key overrides through the configuration, hiding some fields and adding others, like this:

quarkus.log.socket.json.exception-output-type=formatted
quarkus.log.socket.json.key-overrides=timestamp=@timestamp,\
	logger_name=log.logger,\
	level=log.level,\
	thread_name=process.thread.name,\
	thread_id=process.thread.id,\
	host_name=host.hostname,\
	stack_trace=error.stack_trace
quarkus.log.socket.json.excluded-keys=sequence,loggerClassName
quarkus.log.socket.json.additional-field."service.environment".value=${quarkus.profile}
quarkus.log.socket.json.additional-field."service.name".value=${quarkus.application.name}
quarkus.log.socket.json.additional-field."service.version".value=${quarkus.application.version}
quarkus.log.socket.json.additional-field."data_stream.type".value=logs
quarkus.log.socket.json.additional-field."ecs.version".value=1.12

This is, however, not a very clean solution. I could also do it programmatically in the LoggingJsonRecorder, like this:

    public RuntimeValue<Optional<Formatter>> initializeSocketJsonLogging(JsonLogConfig config) {
        // Only rewrite the field mapping when the ECS log format is requested
        if (config.socketJson.logFormat == JsonConfig.LogFormat.ECS) {
            addEcsFields(config.socketJson);
        }
        return getFormatter(config.socketJson);
    }

    private void addEcsFields(JsonConfig config) {
        // Map the default JSON keys to their ECS equivalents, preserving any
        // overrides the user configured explicitly
        EnumMap<Key, String> keyOverrides = PropertyValues.stringToEnumMap(Key.class, config.keyOverrides.orElse(null));
        keyOverrides.putIfAbsent(Key.TIMESTAMP, "@timestamp");
        keyOverrides.putIfAbsent(Key.LOGGER_CLASS_NAME, "log.logger");
        keyOverrides.putIfAbsent(Key.LOGGER_NAME, "log.logger");
        keyOverrides.putIfAbsent(Key.LEVEL, "log.level");
        keyOverrides.putIfAbsent(Key.PROCESS_ID, "process.pid");
        keyOverrides.putIfAbsent(Key.PROCESS_NAME, "process.name");
        keyOverrides.putIfAbsent(Key.THREAD_NAME, "process.thread.name");
        keyOverrides.putIfAbsent(Key.THREAD_ID, "process.thread.id");
        keyOverrides.putIfAbsent(Key.HOST_NAME, "host.hostname");
        keyOverrides.putIfAbsent(Key.SEQUENCE, "event.sequence");
        keyOverrides.putIfAbsent(Key.EXCEPTION_MESSAGE, "error.message");
        keyOverrides.putIfAbsent(Key.STACK_TRACE, "error.stack_trace");
        config.keyOverrides = Optional.of(PropertyValues.mapToString(keyOverrides));

        // Static ECS metadata fields
        config.additionalField.computeIfAbsent("ecs.version", k -> buildFieldConfig("1.12.2", Type.STRING));
        config.additionalField.computeIfAbsent("data_stream.type", k -> buildFieldConfig("logs", Type.STRING));

        // Derive the ECS service.* fields from the application config, when present
        quarkusConfig.getOptionalValue("quarkus.application.name", String.class).ifPresent(
                s -> config.additionalField.computeIfAbsent("service.name", k -> buildFieldConfig(s, Type.STRING)));
        quarkusConfig.getOptionalValue("quarkus.application.version", String.class).ifPresent(
                s -> config.additionalField.computeIfAbsent("service.version", k -> buildFieldConfig(s, Type.STRING)));
        quarkusConfig.getOptionalValue("quarkus.profile", String.class).ifPresent(
                s -> config.additionalField.computeIfAbsent("service.environment", k -> buildFieldConfig(s, Type.STRING)));
    }

which is not very clean either.

Also missing is the addition of fields like trace.id and span.id instead of (or in addition to) having them as mdc.traceId and mdc.spanId.

So the question is: shall I commit this working (but far from perfect) solution and adapt the documentation accordingly?
Or do I just specify in the documentation how to make the output ECS compatible through properties?

@geoand (Contributor) commented Sep 17, 2024

Or do I just specify in the documentation how to make it ECS compatible through properties?

I think this is best for now. @loicmathieu WDYT?

@troosan (Contributor Author) commented Sep 19, 2024

@geoand in case we go for the code solution instead of the properties solution, do you know how I can get the traceId and spanId? Those will obviously only be available if the OpenTelemetry dependency is present. Is there a way to retrieve those two values without relying on the io.opentelemetry.context classes?
I would like to add those to the ECS tracing fields.
My current solution is to copy them in the Logstash pipeline with the following filter, but it would be nicer to support this natively.

filter {
  if ![span][id] and [mdc][spanId] {
    mutate { rename => { "[mdc][spanId]" => "[span][id]" } }
  }
  if ![trace][id] and [mdc][traceId] {
    mutate { rename => {"[mdc][traceId]" => "[trace][id]"} }
  }
}

@geoand (Contributor) commented Sep 19, 2024

@geoand in case we go for the code solution instead of the properties solution, do you know how I can get the traceId and spanId? Those will obviously only be available if the OpenTelemetry dependency is present. Is there a way to retrieve those two values without relying on the io.opentelemetry.context classes?

@brunobat can give you the best way to access those, although in order to reduce unnecessary coupling, we'd need to put some kind of SPI in place.

@brunobat (Contributor)

@troosan You need access to the context, like the MDCEnabledContextStorage does.

But this data is maintained by us in the MDC context, so your solution sounds OK.
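In that case a handler could read the IDs straight from the MDC; a minimal sketch, assuming Quarkus has already copied them there under the traceId/spanId keys:

import org.jboss.logmanager.MDC;

// sketch: read the correlation IDs Quarkus maintains in the MDC
// (the "traceId"/"spanId" key names are an assumption)
String traceId = MDC.get("traceId");
String spanId = MDC.get("spanId");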

@loicmathieu (Contributor) commented Sep 19, 2024

I think we should have a utility that can extract the spanId/traceId if present, as we need to do this for all handlers. For example, it would be useful to be able to forward these IDs to Google Cloud Logging; today that must be done manually by implementing a trace extractor: https://docs.quarkiverse.io/quarkus-google-cloud-services/main/logging.html#_tracing

This is a cross-extension concern, so we should have something in the core that doesn't depend on OTel.
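A hypothetical sketch of what such a core-level SPI could look like (all names invented for illustration, not an existing Quarkus API):

import java.util.Optional;

// hypothetical SPI: handlers ask the core for the current correlation IDs,
// and a tracing extension (e.g. OTel) registers the single implementation
public interface TraceContextProvider {

    /** Current trace ID, if a trace is active. */
    Optional<String> traceId();

    /** Current span ID, if a span is active. */
    Optional<String> spanId();
}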

@troosan

Or do I just specify in the documentation how to make it ECS compatible through properties?

Yes, again we may want to provide something that works cross-handler; for example, this is the code we have for Google Cloud Logging: https://github.com/quarkiverse/quarkus-google-cloud-services/blob/main/logging/runtime/src/main/java/io/quarkiverse/googlecloudservices/logging/runtime/ecs/EscJsonFormat.java

@troosan (Contributor Author) commented Sep 19, 2024

@loicmathieu that is indeed exactly what we need. Ideally, the generation of the JSON (ECS or not) and where to send it (Google, Logstash, ...) should be decoupled; otherwise we're just duplicating code all over.

@brunobat (Contributor) commented Sep 19, 2024

This is a cross-extension concern so we should have something in the core that didn't depends on OTEL.

But all tracing is currently implemented by OTel...
What we can have is some service you can inject to get the OTel tracing data in a simple way.
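A minimal sketch of what such a service could wrap, using the public OpenTelemetry API (the class name and its injectability are assumptions, not an existing Quarkus API):

import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.SpanContext;

// hypothetical injectable service exposing the current OTel span context
public class TracingInfo {

    /** Returns the current trace ID, or null when no valid span is active. */
    public String currentTraceId() {
        SpanContext ctx = Span.current().getSpanContext();
        return ctx.isValid() ? ctx.getTraceId() : null;
    }

    /** Returns the current span ID, or null when no valid span is active. */
    public String currentSpanId() {
        SpanContext ctx = Span.current().getSpanContext();
        return ctx.isValid() ? ctx.getSpanId() : null;
    }
}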


@troosan (Contributor Author) commented Sep 23, 2024

@geoand I have updated the documentation on how to make the JSON ECS compatible through properties:
https://quarkus-pr-main-43232-preview.surge.sh/version/main/guides/centralized-log-management#logstash_ecs

Do you think this could get merged as is in the next iteration? (I can squash the commits before merging.)

Longer term, the logging facility probably needs to be reviewed to avoid unnecessary duplication across the extensions.

@gsmet (Member) left a comment

I added some minor suggestions and questions.

Other than that, I think the patch looks good and it's a very nice contribution.

Resolved review threads (now outdated) on docs/src/main/asciidoc/centralized-log-management.adoc (4) and docs/src/main/asciidoc/logging.adoc (1).
import io.quarkus.runtime.annotations.ConfigItem;

@ConfigGroup
public class SocketConfig {
(Member)

Sorry for the late review, but if this is being used specifically to talk to one kind of service, should we have a general socket config? I think users might have a harder time configuring a generic socket handler to talk to some service than if they had a configuration specific to the kind of service they're talking to.

(Contributor Author)

Theoretically, the socket to which you are writing is independent of the format in which you send the data.
The current setup lets you format the logs however you want; JSON is only one option.

The socket handler should let you send logs to Splunk too, for instance; I've not tried it out though.
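For illustration, plain pattern output over the same handler might look like this (the property names are assumed from this PR and may differ):

# sketch: non-JSON, pattern-formatted output over the socket handler (names assumed)
quarkus.log.socket.enable=true
quarkus.log.socket.format=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p [%c] %s%e%n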

* socket logging formatting configuration and use the formatter provided instead. If multiple formatters
* are enabled at runtime, a warning message is printed and only one is used.
*/
public final class LogSocketFormatBuildItem extends MultiBuildItem {
(Member)

Won't this break if we have more than one socket handler configured for some reason?

(Contributor Author)

You mean we should be able to work with "named" handlers, to allow writing logs to two different socket appenders (Logstash and Splunk, for instance)?

I must admit I don't know. This is just a copy of LogFileFormatBuildItem.


@troosan (Contributor Author) commented Sep 28, 2024

I added some minor suggestions and questions.

Other than that, I think the patch looks good and it's a very nice contribution.

Fixed what you reported, squashed, and force-pushed again. For some reason one of the tests fails (the native HTTP tests); can it be rerun? On my machine it works fine.

@geoand geoand requested a review from gsmet September 30, 2024 11:22
Successfully merging this pull request may close these issues.

Logging into socket
6 participants