diff --git a/docs/readers.md b/docs/readers.md
index 2922937..6006df1 100644
--- a/docs/readers.md
+++ b/docs/readers.md
@@ -2,10 +2,10 @@

 - [class `LogReader`](#class-logreader)
   - [`new LogReader(source)`](#new-logreadersource)
-  - [`reader.tail(minTimestamp[, options])`](#readertailmintimestamp-options)
+  - [`reader.tail([minTimestamp[, options]])`](#readertailmintimestamp-options)
   - [`reader.range(minTimestamp, maxTimestamp)`](#readerrangemintimestamp-maxtimestamp)
   - [`reader.rangeReversed(minTimestamp, maxTimestamp)`](#readerrangereversedmintimestamp-maxtimestamp)
-  - [`reader.bulkTail(minTimestamp[, options])`](#readerbulktailmintimestamp-options)
+  - [`reader.bulkTail([minTimestamp[, options]])`](#readerbulktailmintimestamp-options)
   - [`reader.bulkRange(minTimestamp, maxTimestamp)`](#readerbulkrangemintimestamp-maxtimestamp)
   - [`reader.bulkRangeReversed(minTimestamp, maxTimestamp)`](#readerbulkrangereversedmintimestamp-maxtimestamp)
 - [class `LogEntry`](#class-logentry)
@@ -23,91 +23,353 @@
   - [`new LogDirectorySource(dirname[, options])`](#new-logdirectorysourcedirname-options)
 - [class `Vfs`](#class-vfs)
   - [`new Vfs(options)`](#new-vfsoptions)
-- [enum `LogType`](#class-logtype)
-- [enum `LogLevel`](#class-loglevel)
-- [enum `Lifecycle`](#class-lifecycle)
-- [enum `HttpMethod`](#class-httpmethod)
+- [enum `LogType`](#enum-logtype)
+- [enum `LogLevel`](#enum-loglevel)
+- [enum `Lifecycle`](#enum-lifecycle)
+- [enum `HttpMethod`](#enum-httpmethod)

 # class *LogReader*

+This class is the primary interface used to read logs. Logs are stored in an efficient binary format, so they can't be read directly by humans.
+
+LogReader supports two main ways of reading logs:
+
+- Reading a **range** of logs, between a minimum and maximum timestamp.
+- Reading the **tail** of logs, which means reading the latest logs in real-time, as they are written.
+
 ### new LogReader(source)

-### reader.tail(minTimestamp[, options])
+- `source` [<Vfs>][Vfs] The source from which to read logs. In most cases this will be a [LogDirectorySource][LogDirectorySource], which reads logs from files within a log directory.
+
+```js
+const dirname = './logs';
+const reader = new LogReader(new LogDirectorySource(dirname));
+```
+
+### reader.tail([minTimestamp[, options]])
+
+- `minTimestamp` [<number>][number] The minimum timestamp (inclusive). This indicates where to start reading from. **Default:** `Date.now()`.
+- `options` [<Object>][Object]
+  - `pollInterval` [<number>][number] The delay, in milliseconds, between each poll to the underlying log source (e.g., the filesystem). This controls how frequently new logs can be detected. **Default:** `200`.
+- Returns: [<AsyncIterable][AsyncIterable][<LogEntry>][LogEntry][>][AsyncIterable]
+
+Yields all logs with a timestamp greater than or equal to `minTimestamp`. New logs will be yielded indefinitely, until the returned [AsyncIterable][AsyncIterable] is stopped (e.g., by a `break` statement).
+
+Logs are yielded in ascending order (older logs first).
+
+> Note that logs are only ordered chronologically when compared to logs written by the same process. In other words, logs written by different workers in a [server cluster](https://nodejs.org/api/cluster.html) are not guaranteed to be in exact chronological order.
+
+```js
+for await (const log of reader.tail()) {
+  console.log(log.timestamp);
+}
+```
+
 ### reader.range(minTimestamp, maxTimestamp)

+- `minTimestamp` [<number>][number] The minimum timestamp (inclusive). This indicates where to start reading from.
+- `maxTimestamp` [<number>][number] The maximum timestamp (inclusive). This indicates when to stop reading.
+- Returns: [<AsyncIterable][AsyncIterable][<LogEntry>][LogEntry][>][AsyncIterable]
+
+Yields all logs with a timestamp greater than or equal to `minTimestamp`, and less than or equal to `maxTimestamp`.
+
+Logs are yielded in ascending order (older logs first).
+
+> Note that logs are only ordered chronologically when compared to logs written by the same process. In other words, logs written by different workers in a [server cluster](https://nodejs.org/api/cluster.html) are not guaranteed to be in exact chronological order.
+
+```js
+const DAY = 1000 * 60 * 60 * 24;
+
+for await (const log of reader.range(Date.now() - DAY, Date.now())) {
+  console.log(log.timestamp);
+}
+```
+
 ### reader.rangeReversed(minTimestamp, maxTimestamp)

-### reader.bulkTail(minTimestamp[, options])
+- `minTimestamp` [<number>][number] The minimum timestamp (inclusive). This indicates when to stop reading.
+- `maxTimestamp` [<number>][number] The maximum timestamp (inclusive). This indicates where to start reading from.
+- Returns: [<AsyncIterable][AsyncIterable][<LogEntry>][LogEntry][>][AsyncIterable]
+
+This is the same as [`reader.range()`](#readerrangemintimestamp-maxtimestamp), except the logs are yielded in the reverse order (newer logs first).
+
+> Note that logs are only ordered chronologically when compared to logs written by the same process. In other words, logs written by different workers in a [server cluster](https://nodejs.org/api/cluster.html) are not guaranteed to be in exact chronological order.
+
+```js
+const DAY = 1000 * 60 * 60 * 24;
+
+for await (const log of reader.rangeReversed(Date.now() - DAY, Date.now())) {
+  console.log(log.timestamp);
+}
+```
+
+### reader.bulkTail([minTimestamp[, options]])
+
+- `minTimestamp` [<number>][number] The minimum timestamp (inclusive). This indicates where to start reading from. **Default:** `Date.now()`.
+- `options` [<Object>][Object]
+  - `pollInterval` [<number>][number] The delay, in milliseconds, between each poll to the underlying log source (e.g., the filesystem). This controls how frequently new logs can be detected. **Default:** `200`.
+- Returns: [<AsyncIterable][AsyncIterable][<Buffer>][Buffer][>][AsyncIterable]
+
+Yields chunks of raw binary logs, such that the chunks include all logs with a timestamp greater than or equal to `minTimestamp`. Newly written chunks will be yielded indefinitely, until the returned [AsyncIterable][AsyncIterable] is stopped (e.g., by a `break` statement).
+
+All "bulk" methods (including this one) may include logs outside the specified timestamp bounds. The only guarantee is that no logs *within* bounds will be missing. The bulk methods provide an efficient way of piping logs to an external system (perhaps over a network). On the receiving end, [BulkParser][BulkParser] can be used to parse the raw chunks into [LogEntry][LogEntry]s.
+
+Logs are yielded in ascending order (older logs first).
+
+> Note that logs are only ordered chronologically when compared to logs written by the same process. In other words, logs written by different workers in a [server cluster](https://nodejs.org/api/cluster.html) are not guaranteed to be in exact chronological order.
+
+```js
+for await (const chunk of reader.bulkTail()) {
+  socket.write(chunk);
+}
+```
+
 ### reader.bulkRange(minTimestamp, maxTimestamp)

+- `minTimestamp` [<number>][number] The minimum timestamp (inclusive). This indicates where to start reading from.
+- `maxTimestamp` [<number>][number] The maximum timestamp (inclusive). This indicates when to stop reading.
+- Returns: [<AsyncIterable][AsyncIterable][<Buffer>][Buffer][>][AsyncIterable]
+
+Yields chunks of raw binary logs, such that the chunks include all logs with a timestamp greater than or equal to `minTimestamp`, and less than or equal to `maxTimestamp`.
+
+All "bulk" methods (including this one) may include logs outside the specified timestamp bounds. The only guarantee is that no logs *within* bounds will be missing. The bulk methods provide an efficient way of piping logs to an external system (perhaps over a network). On the receiving end, [BulkParser][BulkParser] can be used to parse the raw chunks into [LogEntry][LogEntry]s.
+
+Logs are yielded in ascending order (older logs first).
+
+> Note that logs are only ordered chronologically when compared to logs written by the same process. In other words, logs written by different workers in a [server cluster](https://nodejs.org/api/cluster.html) are not guaranteed to be in exact chronological order.
+
+```js
+const DAY = 1000 * 60 * 60 * 24;
+
+for await (const chunk of reader.bulkRange(Date.now() - DAY, Date.now())) {
+  socket.write(chunk);
+}
+```
+
 ### reader.bulkRangeReversed(minTimestamp, maxTimestamp)

+- `minTimestamp` [<number>][number] The minimum timestamp (inclusive). This indicates when to stop reading.
+- `maxTimestamp` [<number>][number] The maximum timestamp (inclusive). This indicates where to start reading from.
+- Returns: [<AsyncIterable][AsyncIterable][<Buffer>][Buffer][>][AsyncIterable]
+
+This is the same as [`reader.bulkRange()`](#readerbulkrangemintimestamp-maxtimestamp), except the logs are yielded in the reverse order (newer logs first).
+
+> Note that logs are only ordered chronologically when compared to logs written by the same process. In other words, logs written by different workers in a [server cluster](https://nodejs.org/api/cluster.html) are not guaranteed to be in exact chronological order.
+
+```js
+const DAY = 1000 * 60 * 60 * 24;
+
+for await (const chunk of reader.bulkRangeReversed(Date.now() - DAY, Date.now())) {
+  socket.write(chunk);
+}
+```
+
 # class *LogEntry*

+This class represents individual log entries. Every LogEntry has the following properties:
+
+- `log.timestamp` [<number>][number] A millisecond unix timestamp indicating when the log was written.
+- `log.nonce` [<number>][number] An unsigned 16-bit integer. Logs can be uniquely identified by combining their `timestamp`, `nonce`, and `workerId`.
+- `log.level` [<number>][number] The log level. Possible values are defined by the [LogLevel](#enum-loglevel) enum.
+- `log.type` [<number>][number] The log type. Possible values are defined by the [LogType](#enum-logtype) enum.
+- `log.workerId` [<number>][number] | [<null>][null] The ID of the worker that wrote this log. A value of `null` indicates that it was written by the master process of the [server cluster](https://nodejs.org/api/cluster.html).
+
+Additional properties depend on the log's [type](#enum-logtype):
+
+- `log.type === LogType.REQUEST`:
+  - `log.requestId` [<Buffer>][Buffer] A 16-byte [UUIDv4](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)) used to uniquely identify an HTTP request.
+  - `log.httpVersionMajor` [<number>][number] The first integer of the request's HTTP version.
+  - `log.httpVersionMinor` [<number>][number] The second integer of the request's HTTP version.
+  - `log.ipAddress` [<number>][number] | [<Buffer>][Buffer] Either the [IPv4 address](https://en.wikipedia.org/wiki/Internet_Protocol_version_4) as an unsigned 32-bit integer, or the [IPv6 address](https://en.wikipedia.org/wiki/IPv6) as a 16-byte [<Buffer>][Buffer].
+  - `log.method` [<number>][number] | [<string>][string] Either a well-known HTTP method defined by the [HttpMethod](#enum-httpmethod) enum, or an unrecognized HTTP method as a literal string.
+  - `log.url` [<string>][string] The URL that was sent in the request's header ([`req.url`](https://nodejs.org/api/http.html#messageurl)).
+- `log.type === LogType.REQUEST_META`:
+  - `log.requestId` [<Buffer>][Buffer] A 16-byte [UUIDv4](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)) used to uniquely identify an HTTP request.
+  - `log.data` [<string>][string] Application-specific metadata that was associated with a request, stored as stringified JSON.
+- `log.type === LogType.RESPONSE`:
+  - `log.requestId` [<Buffer>][Buffer] A 16-byte [UUIDv4](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)) used to uniquely identify an HTTP request.
+  - `log.statusCode` [<number>][number] The status code used to respond to the request.
+  - `log.error` [<string>][string] | [<null>][null] Details describing an exception that occurred while trying to handle the request. To get a human-readable version, use [`log.getError()`](#loggeterror).
+- `log.type === LogType.RESPONSE_FINISHED`:
+  - `log.requestId` [<Buffer>][Buffer] A 16-byte [UUIDv4](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)) used to uniquely identify an HTTP request.
+  - `log.error` [<string>][string] | [<null>][null] Details describing an exception that occurred after responding to a request, but before the response stream was finished. To get a human-readable version, use [`log.getError()`](#loggeterror).
+- `log.type === LogType.LOG`:
+  - `log.requestId` [<Buffer>][Buffer] | [<null>][null] A 16-byte [UUIDv4](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)) used to uniquely identify an HTTP request. A value of `null` is used for logs that are not associated with a request.
+  - `log.data` [<string>][string] The logged data, stored as stringified JSON.
+- `log.type === LogType.LIFECYCLE`:
+  - `log.event` [<number>][number] The type of lifecycle event that occurred. Possible values are defined by the [Lifecycle](#enum-lifecycle) enum.
+  - Additional properties for `log.event === Lifecycle.WORKER_EXITED`:
+    - `log.exitCode` [<number>][number] The exit code of the worker process.
+    - `log.signal` [<string>][string] | [<null>][null] The POSIX signal that terminated the worker process (if any).
+- `log.type === LogType.UNCAUGHT_EXCEPTION`:
+  - `log.error` [<string>][string] Details describing an uncaught exception that occurred within the process. To get a human-readable version, use [`log.getError()`](#loggeterror).

 ### log.getRequestId()

+- Returns: [<string>][string]
+
+Returns the [UUIDv4](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)) `log.requestId` converted to a string.

 ### log.getIpAddress()

+- Returns: [<string>][string]
+
+Returns the [IPv4](https://en.wikipedia.org/wiki/Internet_Protocol_version_4) or [IPv6](https://en.wikipedia.org/wiki/IPv6) address `log.ipAddress` converted to a string.
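+
+For example, here's a minimal sketch that prints a one-line summary of every REQUEST log, using the helper methods documented in this section (and assuming `reader` is a [LogReader](#class-logreader)):
+
+```js
+for await (const log of reader.tail()) {
+  // Only REQUEST logs carry the ipAddress, method, and url properties.
+  if (log.type === LogType.REQUEST) {
+    console.log(`${log.getIpAddress()} ${log.getHttpMethod()} ${log.url}`);
+  }
+}
+```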
+
 ### log.getHttpVersion()

+- Returns: [<string>][string]
+
+Returns the HTTP version (`log.httpVersionMajor` and `log.httpVersionMinor`) converted to a string such as `"1.1"` or `"2.0"`.
+
 ### log.getHttpMethod()

+- Returns: [<string>][string]
+
+Returns the HTTP method `log.method` converted to a string.
+
 ### log.getError()

+- Returns: [<Object>][Object]
+  - `stack` [<string>][string] | [<undefined>][undefined] The stack trace of the thrown exception. If the exception was not an [Error][Error], this will be undefined.
+  - `value` [<string>][string] | [<undefined>][undefined] The thrown exception converted to a string. If the exception was an [Error][Error], this will be undefined.
+  - `properties` [<Object>][Object] Additional key-value pairs that were found on the thrown exception.
+  - `debug` [<Array][Array][<Object>][Object][>][Array] An array of DEBUG logs that were written prior to the exception being thrown.
+    - `timestamp` [<number>][number] A millisecond unix timestamp indicating when the DEBUG log was written.
+    - `data` [<string>][string] The logged data, stored as stringified JSON.
+
+Returns an object that describes an exception that was thrown. The object will either have a `stack` or `value` property, but never both. If any DEBUG logs were written prior to the exception being thrown, they will be included in the returned object.
+
 ### log.toJSON()

+- Returns: [<Object>][Object]
+
+Returns a JSON-compatible representation of the LogEntry.
+
 # namespace *BulkParser*

 ### BulkParser.read(chunks)

+- `chunks` [<AsyncIterable][AsyncIterable][<Buffer>][Buffer][>][AsyncIterable] A "bulk" stream of raw binary logs.
+- Returns: [<AsyncIterable][AsyncIterable][<Buffer>][Buffer][>][AsyncIterable]
+
+Given an [AsyncIterable][AsyncIterable] containing chunks of raw binary logs (such as one returned by [`reader.bulkTail()`](#readerbulktailmintimestamp-options) or [`reader.bulkRange()`](#readerbulkrangemintimestamp-maxtimestamp)), this returns an [AsyncIterable][AsyncIterable] that yields well-formed "log blocks". These blocks can subsequently be parsed by [`BulkParser.parse(block)`](#bulkparserparseblock) to extract the [LogEntry][LogEntry]s.
+
+```js
+for await (const block of BulkParser.read(rawChunks)) {
+  for (const log of BulkParser.parse(block)) {
+    console.log(log.timestamp);
+  }
+}
+```
+
 ### BulkParser.readReversed(chunks)

+- `chunks` [<AsyncIterable][AsyncIterable][<Buffer>][Buffer][>][AsyncIterable] A reverse "bulk" stream of raw binary logs.
+- Returns: [<AsyncIterable][AsyncIterable][<Buffer>][Buffer][>][AsyncIterable]
+
+Given an [AsyncIterable][AsyncIterable] containing reverse-order chunks of raw binary logs (such as one returned by [`reader.bulkRangeReversed()`](#readerbulkrangereversedmintimestamp-maxtimestamp)), this returns an [AsyncIterable][AsyncIterable] that yields well-formed "log blocks". These blocks can subsequently be parsed by [`BulkParser.parse(block)`](#bulkparserparseblock) to extract the [LogEntry][LogEntry]s.
+
+```js
+for await (const block of BulkParser.readReversed(rawChunks)) {
+  for (const log of BulkParser.parse(block)) {
+    console.log(log.timestamp);
+  }
+}
+```
+
 ### BulkParser.parse(block)

+- `block` [<Buffer>][Buffer] A well-formed "log block".
+- Returns: [<Iterable][Iterable][<LogEntry>][LogEntry][>][Iterable]
+
+This is used in conjunction with [`BulkParser.read()`](#bulkparserreadchunks) or [`BulkParser.readReversed()`](#bulkparserreadreversedchunks) to parse a "bulk" log stream.
+
+```js
+for await (const block of BulkParser.read(rawChunks)) {
+  for (const log of BulkParser.parse(block)) {
+    console.log(log.timestamp);
+  }
+}
+```
+
 # class *LogDirectorySource*

+Every [LogReader](#class-logreader) needs a source from which to read logs. The most common source is simply the filesystem directory that the logs were written to. LogDirectorySource does exactly that; it allows you to read log files from a directory on the filesystem. This class is not available in the browser.
+
 ### new LogDirectorySource(dirname[, options])

+- `dirname` [<string>][string] The path of the log directory.
+- `options` [<Object>][Object]
+  - `cacheSize` [<number>][number] The maximum amount of filesystem data (in bytes) to cache in memory at any given time. **Default:** `16777216` (16 MiB).
+  - `pollInterval` [<number>][number] | [<null>][null] The delay, in milliseconds, between each poll to the filesystem. This controls how frequently new logs can be detected. **Default:** `null`.
+  - `lazy` [<boolean>][boolean] If `true`, files are only kept open while they are being actively read. **Default:** `false`.
+  - `immutable` [<boolean>][boolean] If `true`, some optimizations will be enabled. Only use this if no live server is writing to the logs. **Default:** `false`.
+
+By default, logs are read from a single snapshot in time. To support tailing, the `pollInterval` option must be set, which allows the LogDirectorySource to periodically detect new logs being written.
+
+By default, all log files in the directory are opened eagerly, which prevents the logs from being deleted (e.g., by log rotation) while a read operation is in progress. This might be undesirable if you're tailing the logs for a very long time. In such situations, you can set the `lazy` option to `true`, which causes files to only be opened while they are being actively read. In this mode, attempting to read old logs which have been deleted will cause an exception to be thrown.
+
+By default, the LogDirectorySource assumes that a live server might be actively writing to the logs. Some extra book-keeping is required to maintain correctness in such a situation. If you're sure that no live server is writing to the logs, you can set the `immutable` option to `true`, which allows the LogDirectorySource to avoid some unnecessary work.
+
+```js
+const dirname = './logs';
+const reader = new LogReader(new LogDirectorySource(dirname, {
+  lazy: true,
+  pollInterval: 200,
+}));
+
+for await (const log of reader.tail(Date.now())) {
+  console.log(log.timestamp);
+}
+```
+
 # class *Vfs*

+Instead of using [LogDirectorySource][LogDirectorySource], you can read logs from any arbitrary source by creating your own implementation of Vfs. For example, this could allow you to read raw binary logs in a browser.
+
 ### new Vfs(options)

+To learn how to implement a Vfs, see the [source code](../src/shared/vfs.js).
+
 # enum *LogType*

-- `LogType.REQUEST`
-- `LogType.REQUEST_META`
-- `LogType.RESPONSE`
-- `LogType.RESPONSE_FINISHED`
-- `LogType.LOG`
-- `LogType.LIFECYCLE`
-- `LogType.UNCAUGHT_EXCEPTION`
+- `LogType.REQUEST`: This log indicates that an HTTP request was received. This log is only concerned with the HTTP request's "head"; the request body may still be pending.
+- `LogType.REQUEST_META`: This log contains application-specific metadata that was associated with an HTTP request.
+- `LogType.RESPONSE`: This log indicates that an HTTP response was sent. This log is only concerned with the response's "head"; the response body may still be pending.
+- `LogType.RESPONSE_FINISHED`: This log indicates that an HTTP response's body is done being sent (successfully or not).
+- `LogType.LOG`: This is an ad-hoc log with arbitrary data (analogous to `console.log()`).
+- `LogType.LIFECYCLE`: This log indicates that a lifecycle event occurred within the [server cluster's](https://nodejs.org/api/cluster.html) master process or one of its worker processes.
+- `LogType.UNCAUGHT_EXCEPTION`: This log indicates that an uncaught exception was thrown within the [server cluster's](https://nodejs.org/api/cluster.html) master process or one of its worker processes.

 # enum *LogLevel*

-- `LogType.CRITICAL`
-- `LogType.ERROR`
-- `LogType.WARN`
-- `LogType.INFO`
-- `LogType.INTERNAL`
+- `LogLevel.CRITICAL`: This log represents a critical error that prevented the application from functioning even at a basic level.
+- `LogLevel.ERROR`: This log represents an unexpected error that prevented a user action from being satisfied.
+- `LogLevel.WARN`: This log represents an unexpected condition that doesn't directly impact the user, but may warrant some investigation by the application developer.
+- `LogLevel.INFO`: This log represents normal application behavior that has long-term value (if it didn't have value, it wouldn't be logged).
+- `LogLevel.INTERNAL`: This log is used by the logging system itself (e.g., for maintaining correctness in concurrent situations), but otherwise has no actual value.
+
+Note that there's no DEBUG level. That's because DEBUG logs are only *conditionally* written (retroactively) when an error actually occurs. As such, DEBUG logs are not written as part of the main log stream, but are instead nested within the log that contains the error.
+
+> Currently, there are three log types capable of containing error information (and thus containing DEBUG logs): `RESPONSE`, `RESPONSE_FINISHED`, and `UNCAUGHT_EXCEPTION`.

 # enum *Lifecycle*

-- `Lifecycle.WORKER_STARTED`
-- `Lifecycle.WORKER_GOING_ONLINE`
-- `Lifecycle.WORKER_ONLINE`
-- `Lifecycle.WORKER_GOING_OFFLINE`
-- `Lifecycle.WORKER_OFFLINE`
-- `Lifecycle.WORKER_DONE`
-- `Lifecycle.WORKER_PING`
-- `Lifecycle.STARTING_UP`
-- `Lifecycle.STARTING_UP_COMPLETED`
-- `Lifecycle.SHUTTING_DOWN`
-- `Lifecycle.SHUTTING_DOWN_COMPLETED`
-- `Lifecycle.WORKER_SPAWNED`
-- `Lifecycle.WORKER_EXITED`
-- `Lifecycle.MASTER_PING`
+All logs that have `log.type === LogType.LIFECYCLE` additionally have an `event` property, indicating the lifecycle event that has occurred:
+
+- `Lifecycle.STARTING_UP`: The master process is starting up, and will soon start spawning workers.
+- `Lifecycle.WORKER_SPAWNED`: The master process has spawned a new worker process.
+- `Lifecycle.WORKER_STARTED`: A worker process is about to perform its setup procedure.
+- `Lifecycle.WORKER_GOING_ONLINE`: A worker process has finished its setup procedure, and is about to start its HTTP server.
+- `Lifecycle.WORKER_ONLINE`: A worker process has successfully started its HTTP server.
+- `Lifecycle.STARTING_UP_COMPLETED`: The master process has finished starting up, and all workers have started their HTTP servers successfully.
+- `Lifecycle.SHUTTING_DOWN`: The master process is shutting down, and will soon instruct all workers to gracefully shut down.
+- `Lifecycle.WORKER_GOING_OFFLINE`: A worker process is about to gracefully shut down its HTTP server.
+- `Lifecycle.WORKER_OFFLINE`: A worker process has gracefully shut down its HTTP server, and is about to perform its teardown procedure.
+- `Lifecycle.WORKER_DONE`: A worker process has finished its teardown procedure.
+- `Lifecycle.WORKER_EXITED`: The master process has detected that a worker process exited.
+- `Lifecycle.SHUTTING_DOWN_COMPLETED`: The master process has finished shutting down, and all workers have exited.
+- `Lifecycle.MASTER_PING`: Used internally.
+- `Lifecycle.WORKER_PING`: Used internally.

 # enum *HttpMethod*

@@ -121,6 +383,8 @@
 - `HttpMethod.TRACE`
 - `HttpMethod.CONNECT`

+
+
 [any]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures#Data_types
 [undefined]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures#undefined_type
 [null]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures#null_type
@@ -136,3 +400,7 @@
 [Iterable]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Iteration_protocols#the_iterable_protocol
 [AsyncIterable]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for-await...of
 [Buffer]: https://nodejs.org/api/buffer.html#class-buffer
+[BulkParser]: #class-bulkparser
+[LogEntry]: #class-logentry
+[LogDirectorySource]: #class-logdirectorysource
+[Vfs]: #class-vfs
diff --git a/docs/writers.md b/docs/writers.md
index e1a714a..2e8f9a3 100644
--- a/docs/writers.md
+++ b/docs/writers.md
@@ -10,167 +10,215 @@
 - [`manager.filename`](#managerfilename)
 - [class `MasterLogger`](#class-masterlogger)
   - [`new MasterLogger(filename[, options])`](#new-masterloggerfilename-options)
-  - [`masterLogger.STARTING_UP()`](#masterloggerstarting_up)
-  - [`masterLogger.STARTING_UP_COMPLETED()`](#masterloggerstarting_up_completed)
-  - [`masterLogger.SHUTTING_DOWN()`](#masterloggershutting_down)
-  - [`masterLogger.SHUTTING_DOWN_COMPLETED()`](#masterloggershutting_down_completed)
-  - [`masterLogger.WORKER_SPAWNED(workerId)`](#masterloggerworker_spawnedworkerid)
-  - [`masterLogger.WORKER_EXITED(workerId, exitCode[, signal])`](#masterloggerworker_exitedworkerid-exitcode-signal)
-  - [`masterLogger.UNCAUGHT_EXCEPTION(error)`](#masterloggeruncaught_exceptionerror)
-  - [`masterLogger.critical(data)`](#masterloggercriticaldata)
-  - [`masterLogger.error(data)`](#masterloggererrordata)
-  - [`masterLogger.warn(data)`](#masterloggerwarndata)
-  - [`masterLogger.info(data)`](#masterloggerinfodata)
-  - [`masterLogger.debug(data)`](#masterloggerdebugdata)
+  - [Master logging methods](#master-logging-methods)
   - [`masterLogger.flush()`](#masterloggerflush)
   - [`masterLogger.rotate(filename)`](#masterloggerrotatefilename)
   - [`masterLogger.close()`](#masterloggerclose)
   - [`masterLogger.closed`](#masterloggerclosed)
 - [class `WorkerLogger`](#class-workerlogger)
-  - [`new WorkerLogger(filename[, options])`](#new-masterloggerfilename-options)
-  - [`workerLogger.WORKER_STARTED()`](#workerloggerworker_started)
-  - [`workerLogger.WORKER_GOING_ONLINE()`](#workerloggerworker_going_online)
-  - [`workerLogger.WORKER_ONLINE()`](#workerloggerworker_online)
-  - [`workerLogger.WORKER_GOING_OFFLINE()`](#workerloggerworker_going_offline)
-  - [`workerLogger.WORKER_OFFLINE()`](#workerloggerworker_offline)
-  - [`workerLogger.WORKER_DONE()`](#workerloggerworker_done)
-  - [`workerLogger.UNCAUGHT_EXCEPTION(error)`](#workerloggeruncaught_exceptionerror)
+  - [`new WorkerLogger(filename[, options])`](#new-workerloggerfilename-options)
+  - [Worker logging methods](#worker-logging-methods)
   - [`workerLogger.newRequest()`](#workerloggernewrequest)
-  - [`workerLogger.critical(data)`](#workerloggercriticaldata)
-  - [`workerLogger.error(data)`](#workerloggererrordata)
-  - [`workerLogger.warn(data)`](#workerloggerwarndata)
-  - [`workerLogger.info(data)`](#workerloggerinfodata)
-  - [`workerLogger.debug(data)`](#workerloggerdebugdata)
   - [`workerLogger.flush()`](#workerloggerflush)
   - [`workerLogger.rotate(filename)`](#workerloggerrotatefilename)
   - [`workerLogger.close()`](#workerloggerclose)
   - [`workerLogger.closed`](#workerloggerclosed)
 - [class `RequestLogger`](#class-requestlogger)
-  - [`requestLogger.REQUEST(req)`](#requestloggerrequestreq)
-  - [`requestLogger.REQUEST_META(data)`](#requestloggerrequest_metadata)
-  - [`requestLogger.RESPONSE(statusCode[, error])`](#requestloggerresponsestatuscode-error)
-  - [`requestLogger.RESPONSE_FINISHED([error])`](#requestloggerresponse_finishederror)
-  - [`requestLogger.critical(data)`](#requestloggercriticaldata)
-  - [`requestLogger.error(data)`](#requestloggererrordata)
-  - [`requestLogger.warn(data)`](#requestloggerwarndata)
-  - [`requestLogger.info(data)`](#requestloggerinfodata)
-  - [`requestLogger.debug(data)`](#requestloggerdebugdata)
+  - [Request logging methods](#request-logging-methods)
   - [`requestLogger.closed`](#requestloggerclosed)
   - [`requestLogger.requestId`](#requestloggerrequestid)

 # class *LogManager*

+This class provides log rotation. It keeps track of log files within a directory, detects when the current log file is too big or too old, and deletes the oldest log files when necessary.
+
+LogManager is a subclass of [EventEmitter][EventEmitter].
+
 ### new LogManager(dirname[, options])

+- `dirname` [<string>][string] The path of the log directory.
+- `options` [<Object>][Object]
+  - `pollInterval` [<number>][number] The delay, in milliseconds, between each poll to the filesystem. This controls how frequently it tries to detect oversized or outdated log files. **Default:** `5000`.
+  - `logSizeLimit` [<number>][number] The total maximum size (in bytes) of all logs in the directory. The oldest logs in the directory will be automatically deleted to keep the log size below this limit. **Default:** `2147483648` (2 GiB).
+  - `logAgeLimit` [<number>][number] The maximum age, in milliseconds, that logs are allowed to have. Logs that are too old will be automatically deleted. **Default:** `31536000000` (1 year).
+  - `granularity` [<number>][number] The granularity that is used to enforce `logSizeLimit` and `logAgeLimit` (see below for details). **Default:** `20`.
+
+If the given `dirname` does not yet exist, it is created.
+
+LogManager is able to enforce `logSizeLimit` and `logAgeLimit` by simply unlinking files on the filesystem, without needing to actually move any data around. This is possible because all loggers are periodically instructed to write to new log files, effectively splitting the logs into many small-ish files.
+
+The `granularity` option controls how much splitting to do. For example, if `logAgeLimit` is 10 months and `granularity` is 10, then each log file would span no more than 1 month. Increasing the granularity increases the number of log files that will be created, but also improves the precision of the `logSizeLimit` and `logAgeLimit` options. These limits are expected to have an error of approximately `±1 / granularity`. For example, with the default granularity of `20`, the expected error is `±5%`. Thus, with a `logSizeLimit` of 2 GiB, you should expect the logs to grow to 2.1 GiB, and possibly more depending on how many logs can be written before the `pollInterval` detects them.
+
 ### event `"rotate"`

+- `filename` [<string>][string] The path of the new log file that should be used.
+
+This event is emitted when the LogManager wants you to change the file that all loggers are writing to. You should first call [`.rotate(filename)`](#masterloggerrotatefilename) on the master process's [MasterLogger](#class-masterlogger), and then call [`.rotate(filename)`](#workerloggerrotatefilename) on each worker process's [WorkerLogger](#class-workerlogger). You'll need to utilize [IPC](https://nodejs.org/api/cluster.html#workersendmessage-sendhandle-options-callback) to send this instruction from the master process to each worker process.
+
 ### event `"error"`

+- `error` [<Error>][Error] An unexpected error.
+
+This event indicates that an error occurred during some filesystem operation. The LogManager will continue trying to function until explicitly [closed](#managerclose).
+
 ### manager.close()

+Closes the LogManager. All operations are stopped and no more events will be emitted.
+
 ### manager.closed

-### manager.dirname
+- [<boolean>][boolean]

-### manager.filename
+Indicates whether or not the LogManager is closed.

-# class *MasterLogger*
+### manager.dirname

-### new MasterLogger(filename[, options])
+- [<string>][string]

-### masterLogger.STARTING_UP()
+The file path of the log directory being managed.

-### masterLogger.STARTING_UP_COMPLETED()
+### manager.filename

-### masterLogger.SHUTTING_DOWN()
+- [<string>][string]
+
+The file path of the current log file that all loggers should be writing to.

-### masterLogger.SHUTTING_DOWN_COMPLETED()
+# class *MasterLogger*

-### masterLogger.WORKER_SPAWNED(workerId)
+This is the logger used by the [server cluster's](https://nodejs.org/api/cluster.html) master process. It internally maintains state that is necessary for writing well-formed logs, so only one instance should be used for the entire lifespan of the master process.

-### masterLogger.WORKER_EXITED(workerId, exitCode[, signal])
+### new MasterLogger(filename[, options])

-### masterLogger.UNCAUGHT_EXCEPTION(error)
+- `filename` [<string>][string] The path of the log file.
+- `options` [<Object>][Object]
+  - `highWaterMark` [<number>][number] The maximum amount of data (in bytes) that can be buffered before being flushed. **Default:** `32768` (32 KiB).
+  - `outputDelay` [<number>][number] The maximum amount of time (in milliseconds) that data can be buffered for, before being flushed. **Default:** `200`.
+  - `compression` [<boolean>][boolean] Whether or not to compress logs. Compressing saves disk space but lowers the throughput of reading logs. **Default:** `true`.
+  - `pingDelay` [<number>][number] How often an internal "ping" should be written to the logs. Reducing this delay uses more disk space, but may improve the performance of reading logs. **Default:** `60000` (60 seconds).
+  - `debugLogLimit` [<number>][number] The maximum number of DEBUG logs to keep in memory before discarding the oldest ones. DEBUG logs are kept in memory so they can be conditionally logged if/when an error occurs. **Default:** `100`.

-### masterLogger.critical(data)
+If the given `filename` does not yet exist, it is created.

-### masterLogger.error(data)
+The `highWaterMark` and `outputDelay` options control how logs are batched. If either of these options is set to `0`, batching will be disabled, which drastically reduces the performance of logging.

-### masterLogger.warn(data)
+Most applications shouldn't have to worry about the `pingDelay` option. Internally, "pings" are emitted by loggers periodically to record the current state of the server cluster. When reading logs, this information is used internally to establish timestamp bounds without needing to read every single log. By default, these pings amount to a few bytes being written every 60 seconds, which should be negligible to most applications. Increasing the `pingDelay` can save a little disk space, but will likely negatively impact the performance of reading logs. By contrast, decreasing the `pingDelay` results in more disk space being used, but may improve the performance of reading logs, particularly if your application writes many logs per second.

-### masterLogger.info(data)
+### Master logging methods

-### masterLogger.debug(data)
+- `masterLogger.STARTING_UP()`: This MUST be the first thing logged whenever a master process first starts up.
+- `masterLogger.STARTING_UP_COMPLETED()`: This should be logged after all workers have been spawned, and their HTTP servers have all been started.
+- `masterLogger.SHUTTING_DOWN()`: This should be logged when the master process wants to initiate a graceful shutdown procedure, but before any workers have been instructed to shut down.
+- `masterLogger.SHUTTING_DOWN_COMPLETED()`: This should be logged after the master process finishes shutting down, and all workers have exited (regardless of whether the shutdown was actually graceful).
+- `masterLogger.WORKER_SPAWNED(workerId)`: This MUST be logged whenever the master process spawns a new worker process.
+- `masterLogger.WORKER_EXITED(workerId, exitCode[, signal])`: This MUST be logged whenever the master process detects that a worker process has exited.
+- `masterLogger.UNCAUGHT_EXCEPTION(error)`: This should be logged whenever an uncaught exception is detected within the master process.
+- `masterLogger.critical(data)`: This writes a CRITICAL-level log.
+- `masterLogger.error(data)`: This writes an ERROR-level log.
+- `masterLogger.warn(data)`: This writes a WARN-level log.
+- `masterLogger.info(data)`: This writes an INFO-level log.
+- `masterLogger.debug(data)`: This buffers a DEBUG-level log, which will be written if/when the next UNCAUGHT_EXCEPTION log is written.

 ### masterLogger.flush()

+Flushes the logger's internal buffer, writing the logs to the filesystem. This happens synchronously, so it can be used within [`process.on('exit')`](https://nodejs.org/api/process.html#event-exit).
+
 ### masterLogger.rotate(filename)

+Changes the logger's file to the new `filename`. This happens synchronously.
+
 ### masterLogger.close()

-### masterLogger.closed
+Closes the logger, flushing any remaining logs to the filesystem. This happens synchronously, so it can be used within [`process.on('exit')`](https://nodejs.org/api/process.html#event-exit).

-# class *WorkerLogger*
-
-### new WorkerLogger(filename[, options])
+### masterLogger.closed

-### workerLogger.WORKER_STARTED()
+- [<boolean>][boolean]

-### workerLogger.WORKER_GOING_ONLINE()
+Indicates whether or not the logger is closed.

-### workerLogger.WORKER_ONLINE()
+# class *WorkerLogger*

-### workerLogger.WORKER_GOING_OFFLINE()
+This is the logger used by each worker process within a [server cluster](https://nodejs.org/api/cluster.html). Each worker process should only have one WorkerLogger.

-### workerLogger.WORKER_OFFLINE()
+### new WorkerLogger(filename[, options])

-### workerLogger.WORKER_DONE()
+- `filename` [<string>][string] The path of the log file.
+- `options` [<Object>][Object]
+  - `highWaterMark` [<number>][number] The maximum amount of data (in bytes) that can be buffered before being flushed. **Default:** `32768` (32 KiB).
+  - `outputDelay` [<number>][number] The maximum amount of time (in milliseconds) that data can be buffered for, before being flushed. **Default:** `200`.
+  - `compression` [<boolean>][boolean] Whether or not to compress logs. Compressing saves disk space but lowers the throughput of reading logs. **Default:** `true`.
+  - `debugLogLimit` [<number>][number] The maximum number of DEBUG logs to keep in memory before discarding the oldest ones. DEBUG logs are kept in memory so they can be conditionally logged if/when an error occurs. **Default:** `100`.

-### workerLogger.UNCAUGHT_EXCEPTION(error)
+If the given `filename` does not yet exist, it is created.

-### workerLogger.newRequest()
+The `highWaterMark` and `outputDelay` options control how logs are batched. If either of these options is set to `0`, batching will be disabled, which drastically reduces the performance of logging.

-### workerLogger.critical(data)
+### Worker logging methods

-### workerLogger.error(data)
+- `workerLogger.WORKER_STARTED()`: This should be logged when the worker process starts, before starting its HTTP server or performing its setup procedure (if any).
+- `workerLogger.WORKER_GOING_ONLINE()`: This should be logged after the worker process completes its setup procedure (if any), but before starting its HTTP server.
+- `workerLogger.WORKER_ONLINE()`: This should be logged when the worker process successfully starts its HTTP server.
+- `workerLogger.WORKER_GOING_OFFLINE()`: This should be logged when the worker process wants to initiate a graceful shutdown procedure.
+- `workerLogger.WORKER_OFFLINE()`: This should be logged when the worker process has successfully shut down its HTTP server (regardless of whether the shutdown was actually graceful) and all connections/requests have ended, but before performing its teardown procedure (if any).
+- `workerLogger.WORKER_DONE()`: This should be logged after the worker process completes its teardown procedure (if any).
+- `workerLogger.UNCAUGHT_EXCEPTION(error)`: This should be logged whenever an uncaught exception is detected within the worker process.
+- `workerLogger.critical(data)`: This writes a CRITICAL-level log.
+- `workerLogger.error(data)`: This writes an ERROR-level log.
+- `workerLogger.warn(data)`: This writes a WARN-level log.
+- `workerLogger.info(data)`: This writes an INFO-level log.
+- `workerLogger.debug(data)`: This buffers a DEBUG-level log, which will be written if/when the next UNCAUGHT_EXCEPTION log is written.

-### workerLogger.warn(data)
+### workerLogger.newRequest()

-### workerLogger.info(data)
+- Returns: [<RequestLogger>][RequestLogger]

-### workerLogger.debug(data)
+Creates a new [RequestLogger][RequestLogger]. Each RequestLogger is assigned a unique `requestId` which can be used to uniquely identify the request.

 ### workerLogger.flush()

+Flushes the logger's internal buffer, writing the logs to the filesystem. This happens synchronously, so it can be used within [`process.on('exit')`](https://nodejs.org/api/process.html#event-exit).
+
 ### workerLogger.rotate(filename)

+Changes the logger's file to the new `filename`. This happens synchronously.
+
 ### workerLogger.close()

+Closes the logger, flushing any remaining logs to the filesystem. This happens synchronously, so it can be used within [`process.on('exit')`](https://nodejs.org/api/process.html#event-exit).
+
 ### workerLogger.closed

+- [<boolean>][boolean]
+
+Indicates whether or not the logger is closed.
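+
+For example, here's a minimal sketch of a worker process's startup sequence (assuming your application provides the log `filename` and an HTTP `server`):
+
+```js
+// `filename` and `server` are hypothetical names provided by your application.
+const logger = new WorkerLogger(filename);
+logger.WORKER_STARTED();
+// ... perform the setup procedure (if any) ...
+logger.WORKER_GOING_ONLINE();
+server.listen(8080, () => {
+  logger.WORKER_ONLINE();
+});
+```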
+
 # class *RequestLogger*

-### requestLogger.REQUEST(req)
+Whenever a worker process receives an HTTP request, you should use the [WorkerLogger][WorkerLogger] to spawn a new RequestLogger, and then use that RequestLogger for all request-related activity. Each HTTP request should have its own RequestLogger.

-### requestLogger.REQUEST_META(data)
+### Request logging methods

-### requestLogger.RESPONSE(statusCode[, error])
+- `requestLogger.REQUEST(req)`: This should be logged when the associated HTTP request is first received by the server. Only the request's "head" needs to be received; the request body may still be pending.
+- `requestLogger.REQUEST_META(data)`: This can be logged to associate arbitrary application-specific metadata with the request.
+- `requestLogger.RESPONSE(statusCode[, error])`: This should be logged when a response is sent for the associated HTTP request. Only the response's "head" needs to be sent; the response body may still be pending. Passing an `error` indicates that an unexpected error occurred while trying to handle the request.
+- `requestLogger.RESPONSE_FINISHED([error])`: This should be logged when the response body is done being sent (even if the response body was empty). Passing an `error` indicates that an unexpected error occurred while trying to send the response body, but after the response's "head" was already sent.
+- `requestLogger.critical(data)`: This writes a CRITICAL-level log.
+- `requestLogger.error(data)`: This writes an ERROR-level log.
+- `requestLogger.warn(data)`: This writes a WARN-level log.
+- `requestLogger.info(data)`: This writes an INFO-level log.
+- `requestLogger.debug(data)`: This buffers a DEBUG-level log, which will be written if/when the next RESPONSE or RESPONSE_FINISHED log is written with an `error` passed to it.

-### requestLogger.RESPONSE_FINISHED([error])
+### requestLogger.closed

-### requestLogger.critical(data)
+- [<boolean>][boolean]

-### requestLogger.error(data)
+Indicates whether or not the logger is closed. This value will always be equal to that of the associated [WorkerLogger][WorkerLogger].

-### requestLogger.warn(data)
+### requestLogger.requestId

-### requestLogger.info(data)
+- [<string>][string]

-### requestLogger.debug(data)
+A [UUIDv4](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)) that can be used to uniquely identify the associated HTTP request.
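+
+For example, here's a minimal sketch of wiring a RequestLogger into a worker's HTTP server (assuming `workerLogger` is the process's [WorkerLogger][WorkerLogger] and `server` is your application's HTTP server):
+
+```js
+server.on('request', (req, res) => {
+  const logger = workerLogger.newRequest();
+  logger.REQUEST(req);
+  res.on('finish', () => {
+    // The response body is done being sent.
+    logger.RESPONSE_FINISHED();
+  });
+  // ... application logic goes here ...
+  res.statusCode = 200;
+  res.end();
+  logger.RESPONSE(200);
+});
+```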
-### requestLogger.closed -### requestLogger.requestId [any]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures#Data_types [undefined]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures#undefined_type @@ -187,3 +235,6 @@ [Iterable]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Iteration_protocols#the_iterable_protocol [AsyncIterable]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for-await...of [Buffer]: https://nodejs.org/api/buffer.html#class-buffer +[EventEmitter]: https://nodejs.org/api/events.html#class-eventemitter +[WorkerLogger]: #class-workerlogger +[RequestLogger]: #class-requestlogger diff --git a/src/nodejs/logger.js b/src/nodejs/logger.js index e395751..075f645 100644 --- a/src/nodejs/logger.js +++ b/src/nodejs/logger.js @@ -16,7 +16,7 @@ const { compress, escapeBlock, SEPARATOR } = require('../shared/common'); */ module.exports = class Logger { - constructor(filename, { highWaterMark = 1024 * 32, outputDelay = 100, compression = true } = {}) { + constructor(filename, { highWaterMark = 1024 * 32, outputDelay = 200, compression = true } = {}) { if (typeof filename !== 'string' && filename !== null) { throw new TypeError('Expected filename to be a string or null'); }