Log raw events to a separate log file (#38767)
This commit introduces a new logger core that can be configured through logging.event_data and is used to log any message that contains the whole event or could contain any sensitive data. This is accomplished by adding log.type: event to the log entry. The logger core is responsible for filtering the log entries and directing them to the correct files.

At the moment it is used by multiple outputs to log indexing errors containing the whole event and errors returned by Elasticsearch that can potentially contain the whole event.

The debug processor is also using the new logger core. The "Publish event:..." log entries are now directed to the event log file.
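
The routing described above can be pictured with a small sketch. This is not the logger core from this commit: `Entry` and `Router` are hypothetical names, and both sinks are in-memory buffers instead of files. It only illustrates the idea of sending entries tagged with `log.type: event` to a separate destination.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a hypothetical, simplified log entry: a message plus
// structured fields. The real logger core works on richer types.
type Entry struct {
	Message string
	Fields  map[string]string
}

// Router mimics the filtering described in the commit message: entries
// tagged with log.type=event go to the event sink, everything else goes
// to the default sink. Both sinks are in-memory buffers for illustration.
type Router struct {
	Default strings.Builder
	Events  strings.Builder
}

// Write routes a single entry to exactly one sink.
func (r *Router) Write(e Entry) {
	if e.Fields["log.type"] == "event" {
		r.Events.WriteString(e.Message + "\n")
		return
	}
	r.Default.WriteString(e.Message + "\n")
}

func main() {
	r := &Router{}
	// A regular entry: no log.type field, so it stays in the default log.
	r.Write(Entry{Message: "Cannot index event (status=400)"})
	// An entry carrying raw event data: tagged, so it is kept apart.
	r.Write(Entry{
		Message: `Cannot index event {"int":"not a number"}`,
		Fields:  map[string]string{"log.type": "event"},
	})
	fmt.Print("default log:\n", r.Default.String())
	fmt.Print("event log:\n", r.Events.String())
}
```

Keeping the decision on a single field means callers only have to tag an entry; they do not need to know where the event file lives.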

---------

Co-authored-by: Pierre HILBERT <[email protected]>
belimawr and pierrehilbert authored May 23, 2024
1 parent 44679fb commit de3318d
Showing 36 changed files with 1,154 additions and 104 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.next.asciidoc
@@ -198,7 +198,7 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
- The environment variable `BEATS_ADD_CLOUD_METADATA_PROVIDERS` overrides configured/default `add_cloud_metadata` providers {pull}38669[38669]
- Introduce log message for not supported annotations for Hints based autodiscover {pull}38213[38213]
- Add persistent volume claim name to volume if available {pull}38839[38839]

- Raw events are now logged to a separate file; this prevents potentially sensitive information from leaking into the regular log files {pull}38767[38767]

*Auditbeat*

48 changes: 48 additions & 0 deletions auditbeat/auditbeat.reference.yml
@@ -1549,6 +1549,54 @@ logging.files:
# file. Defaults to true.
# rotateonstartup: true

#=============================== Events Logging ===============================
# Some outputs log raw events on errors, such as indexing errors in the
# Elasticsearch output. To prevent raw events (which may contain
# sensitive information) from being logged together with other log
# messages, a separate log file is used only for log entries containing
# raw events. It uses the same level, selectors, and all other
# configuration from the default logger, but it has its own file
# configuration.
#
# Having a different log file for raw events also prevents event data
# from drowning out the regular log files.
#
# IMPORTANT: No matter the default logger output configuration, raw events
# will **always** be logged to a file configured by `logging.event_data.files`.

# logging.event_data:
# Logging to rotating files. Set logging.to_files to false to disable logging to
# files.
#logging.event_data.to_files: true
#logging.event_data:
# Configure the path where the logs are written. The default is the logs directory
# under the home path (the binary location).
#path: /var/log/auditbeat

# The name of the files where the logs are written to.
#name: auditbeat-event-data

# Configure log file size limit. If the limit is reached, log file will be
# automatically rotated.
#rotateeverybytes: 5242880 # = 5MB

# Number of rotated log files to keep. The oldest files will be deleted first.
#keepfiles: 2

# The permissions mask to apply when rotating log files. The default value is 0600.
# Must be a valid Unix-style file permissions mask expressed in octal notation.
#permissions: 0600

# Enable log file rotation on time intervals in addition to the size-based rotation.
# Intervals must be at least 1s. Values of 1m, 1h, 24h, 7*24h, 30*24h, and 365*24h
# are boundary-aligned with minutes, hours, days, weeks, months, and years as
# reported by the local system clock. All other intervals are calculated from the
# Unix epoch. Defaults to disabled.
#interval: 0

# Rotate existing logs on startup rather than appending them to the existing
# file. Defaults to false.
# rotateonstartup: false

# ============================= X-Pack Monitoring ==============================
# Auditbeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
48 changes: 48 additions & 0 deletions filebeat/filebeat.reference.yml
@@ -2640,6 +2640,54 @@ logging.files:
# file. Defaults to true.
# rotateonstartup: true

#=============================== Events Logging ===============================
# Some outputs log raw events on errors, such as indexing errors in the
# Elasticsearch output. To prevent raw events (which may contain
# sensitive information) from being logged together with other log
# messages, a separate log file is used only for log entries containing
# raw events. It uses the same level, selectors, and all other
# configuration from the default logger, but it has its own file
# configuration.
#
# Having a different log file for raw events also prevents event data
# from drowning out the regular log files.
#
# IMPORTANT: No matter the default logger output configuration, raw events
# will **always** be logged to a file configured by `logging.event_data.files`.

# logging.event_data:
# Logging to rotating files. Set logging.to_files to false to disable logging to
# files.
#logging.event_data.to_files: true
#logging.event_data:
# Configure the path where the logs are written. The default is the logs directory
# under the home path (the binary location).
#path: /var/log/filebeat

# The name of the files where the logs are written to.
#name: filebeat-event-data

# Configure log file size limit. If the limit is reached, log file will be
# automatically rotated.
#rotateeverybytes: 5242880 # = 5MB

# Number of rotated log files to keep. The oldest files will be deleted first.
#keepfiles: 2

# The permissions mask to apply when rotating log files. The default value is 0600.
# Must be a valid Unix-style file permissions mask expressed in octal notation.
#permissions: 0600

# Enable log file rotation on time intervals in addition to the size-based rotation.
# Intervals must be at least 1s. Values of 1m, 1h, 24h, 7*24h, 30*24h, and 365*24h
# are boundary-aligned with minutes, hours, days, weeks, months, and years as
# reported by the local system clock. All other intervals are calculated from the
# Unix epoch. Defaults to disabled.
#interval: 0

# Rotate existing logs on startup rather than appending them to the existing
# file. Defaults to false.
# rotateonstartup: false

# ============================= X-Pack Monitoring ==============================
# Filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
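
To get a feel for how the rotation settings above interact, the following sketch copies the commented-out values from the reference file into a struct and derives a rough upper bound on disk usage. `EventDataFiles`, `referenceValues`, and `roughMaxBytes` are hypothetical names, and the bound assumes that `keepfiles` rotated files plus one active file can exist at the same time.

```go
package main

import "fmt"

// EventDataFiles is a hypothetical struct mirroring some of the keys
// shown in the reference configuration. It is not a type from Beats.
type EventDataFiles struct {
	Name             string
	RotateEveryBytes uint64
	KeepFiles        uint64
	Permissions      uint32
	RotateOnStartup  bool
}

// referenceValues returns the values shown commented out in the
// reference file for the given Beat name.
func referenceValues(beat string) EventDataFiles {
	return EventDataFiles{
		Name:             beat + "-event-data",
		RotateEveryBytes: 5242880, // = 5MB
		KeepFiles:        2,
		Permissions:      0o600,
		RotateOnStartup:  false,
	}
}

// roughMaxBytes estimates the space used by the event log files:
// KeepFiles rotated files plus the file currently being written.
// Assumption: each file is rotated at RotateEveryBytes; the real size
// can be slightly different.
func roughMaxBytes(c EventDataFiles) uint64 {
	return (c.KeepFiles + 1) * c.RotateEveryBytes
}

func main() {
	c := referenceValues("filebeat")
	fmt.Printf("%s: about %d bytes at most\n", c.Name, roughMaxBytes(c))
}
```

With the values above the estimate is 15728640 bytes, that is, three files of 5MB each.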
136 changes: 136 additions & 0 deletions filebeat/tests/integration/event_log_file_test.go
@@ -0,0 +1,136 @@
// Licensed to Elasticsearch B.V. under one or more contributor
// license agreements. See the NOTICE file distributed with
// this work for additional information regarding copyright
// ownership. Elasticsearch B.V. licenses this file to you under
// the Apache License, Version 2.0 (the "License"); you may
// not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

//go:build integration

package integration

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"testing"
	"time"

	"github.com/stretchr/testify/require"

	"github.com/elastic/beats/v7/libbeat/tests/integration"
)

var eventsLogFileCfg = `
filebeat.inputs:
  - type: filestream
    id: filestream-input-id
    enabled: true
    parsers:
      - ndjson:
          target: ""
          overwrite_keys: true
          expand_keys: true
          add_error_key: true
          ignore_decoding_error: false
    paths:
      - %s
output:
  elasticsearch:
    hosts:
      - localhost:9200
    protocol: http
    username: admin
    password: testing
logging:
  level: info
  event_data:
    files:
      name: filebeat-my-event-log
`

func TestEventsLoggerESOutput(t *testing.T) {
	// First things first, ensure ES is running and we can connect to it.
	// If ES is not running, the test will timeout and the only way to know
	// what caused it is going through Filebeat's logs.
	integration.EnsureESIsRunning(t)

	filebeat := integration.NewBeat(
		t,
		"filebeat",
		"../../filebeat.test",
	)

	logFilePath := filepath.Join(filebeat.TempDir(), "log.log")
	filebeat.WriteConfigFile(fmt.Sprintf(eventsLogFileCfg, logFilePath))

	logFile, err := os.Create(logFilePath)
	if err != nil {
		t.Fatalf("could not create file '%s': %s", logFilePath, err)
	}

	_, _ = logFile.WriteString(`
{"message":"foo bar","int":10,"string":"str"}
{"message":"another message","int":20,"string":"str2"}
{"message":"index failure","int":"not a number","string":10}
{"message":"second index failure","int":"not a number","string":10}
`)
	if err := logFile.Sync(); err != nil {
		t.Fatalf("could not sync log file '%s': %s", logFilePath, err)
	}
	if err := logFile.Close(); err != nil {
		t.Fatalf("could not close log file '%s': %s", logFilePath, err)
	}

	filebeat.Start()

	// Wait for a log entry that indicates an entry in the events
	// logger file.
	msg := "Cannot index event (status=400)"
	require.Eventually(t, func() bool {
		return filebeat.LogContains(msg)
	}, time.Minute, 100*time.Millisecond,
		fmt.Sprintf("String '%s' not found on Filebeat logs", msg))

	// The glob here matches the configured value for the filename
	glob := filepath.Join(filebeat.TempDir(), "filebeat-my-event-log*.ndjson")
	files, err := filepath.Glob(glob)
	if err != nil {
		t.Fatalf("could not read files matching glob '%s': %s", glob, err)
	}
	if len(files) != 1 {
		t.Fatalf("there must be only one file matching the glob '%s', found: %s", glob, files)
	}

	eventsLogFile := files[0]
	data, err := os.ReadFile(eventsLogFile)
	if err != nil {
		t.Fatalf("could not read '%s': %s", eventsLogFile, err)
	}

	strData := string(data)
	eventMsg := "not a number"
	if !strings.Contains(strData, eventMsg) {
		t.Errorf("expecting to find '%s' on '%s'", eventMsg, eventsLogFile)
		t.Errorf("Contents:\n%s", strData)
		t.FailNow()
	}

	// Ensure the normal log file does not contain the event data
	if filebeat.LogContains(eventMsg) {
		t.Fatalf("normal log file must NOT contain event data, '%s' found in the logs", eventMsg)
	}
}
11 changes: 6 additions & 5 deletions filebeat/tests/system/test_reload_inputs.py
@@ -105,9 +105,9 @@ def test_start_stop(self):

         self.wait_until(lambda: self.output_lines() == 1)

-        # Remove input
-        with open(self.working_dir + "/configs/input.yml", 'w') as f:
-            f.write("")
+        # Remove input by moving the file
+        # we keep it around to help debugging
+        os.rename(self.working_dir + "/configs/input.yml", self.working_dir + "/configs/input.yml.disabled")

         # Wait until input is stopped
         self.wait_until(
@@ -152,8 +152,9 @@ def test_start_stop_replace(self):
         self.wait_until(lambda: self.output_lines() == 1)

         # Remove input
-        with open(self.working_dir + "/configs/input.yml", 'w') as f:
-            f.write("")
+        # Remove input by moving the file
+        # we keep it around to help debugging
+        os.rename(self.working_dir + "/configs/input.yml", self.working_dir + "/configs/input.yml.disabled")

         # Wait until input is stopped
         self.wait_until(
6 changes: 3 additions & 3 deletions filebeat/tests/system/test_reload_modules.py
@@ -144,9 +144,9 @@ def test_start_stop(self):
         self.wait_until(lambda: self.output_lines() == 1, max_timeout=10)
         print(self.output_lines())

-        # Remove input
-        with open(self.working_dir + "/configs/system.yml", 'w') as f:
-            f.write("")
+        # Remove input by moving the file
+        # we keep it around to help debugging
+        os.rename(self.working_dir + "/configs/system.yml", self.working_dir + "/configs/system.yml.disabled")

         # Wait until input is stopped
         self.wait_until(
48 changes: 48 additions & 0 deletions heartbeat/heartbeat.reference.yml
@@ -1636,6 +1636,54 @@ logging.files:
# file. Defaults to true.
# rotateonstartup: true

#=============================== Events Logging ===============================
# Some outputs log raw events on errors, such as indexing errors in the
# Elasticsearch output. To prevent raw events (which may contain
# sensitive information) from being logged together with other log
# messages, a separate log file is used only for log entries containing
# raw events. It uses the same level, selectors, and all other
# configuration from the default logger, but it has its own file
# configuration.
#
# Having a different log file for raw events also prevents event data
# from drowning out the regular log files.
#
# IMPORTANT: No matter the default logger output configuration, raw events
# will **always** be logged to a file configured by `logging.event_data.files`.

# logging.event_data:
# Logging to rotating files. Set logging.to_files to false to disable logging to
# files.
#logging.event_data.to_files: true
#logging.event_data:
# Configure the path where the logs are written. The default is the logs directory
# under the home path (the binary location).
#path: /var/log/heartbeat

# The name of the files where the logs are written to.
#name: heartbeat-event-data

# Configure log file size limit. If the limit is reached, log file will be
# automatically rotated.
#rotateeverybytes: 5242880 # = 5MB

# Number of rotated log files to keep. The oldest files will be deleted first.
#keepfiles: 2

# The permissions mask to apply when rotating log files. The default value is 0600.
# Must be a valid Unix-style file permissions mask expressed in octal notation.
#permissions: 0600

# Enable log file rotation on time intervals in addition to the size-based rotation.
# Intervals must be at least 1s. Values of 1m, 1h, 24h, 7*24h, 30*24h, and 365*24h
# are boundary-aligned with minutes, hours, days, weeks, months, and years as
# reported by the local system clock. All other intervals are calculated from the
# Unix epoch. Defaults to disabled.
#interval: 0

# Rotate existing logs on startup rather than appending them to the existing
# file. Defaults to false.
# rotateonstartup: false

# ============================= X-Pack Monitoring ==============================
# Heartbeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The