feat: pre-register partial transactions in AWS Lambda (elastic#3285)
When used with the Lambda extension >=v1.4.0, this results in
transactions being reported for Lambda timeout, uncaughtException,
unhandledRejection, crashes.

Closes: elastic#3136
Closes: elastic#2379
trentm authored and fpm-peter committed Aug 20, 2024
1 parent e199968 commit fa89ad7
Showing 22 changed files with 373 additions and 72 deletions.
23 changes: 23 additions & 0 deletions CHANGELOG.asciidoc
@@ -32,6 +32,29 @@ Notes:
=== Node.js Agent version 3.x
==== Unreleased
[float]
===== Breaking changes
[float]
===== Features
* Improve error handling with AWS Lambda. When used together with the
https://github.com/elastic/apm-aws-lambda[Elastic AWS Lambda extension]
v1.4.0 or greater, the APM agent will pre-register a partial transaction
before the user's handler function is run. If the handler function fails
with a Lambda timeout, `uncaughtException`, `unhandledRejection`, or crash
then the Lambda extension will report the failed transaction so it can be
seen in the Kibana APM app. ({pull}3285[#3285])
[float]
===== Bug fixes
[float]
===== Chores
[[release-notes-3.45.0]]
==== 3.45.0 2023/04/28
39 changes: 15 additions & 24 deletions docs/lambda.asciidoc
@@ -55,38 +55,29 @@ You can optionally <<configuration, fine-tune the Node.js agent>> or the {apm-la
That's it. After following the steps above, you're ready to go!
Your Lambda function invocations should be traced from now on.

Read on to learn more about the features and limitations of the Node.js APM Agent on AWS Lambda Functions.

[float]
[[aws-lambda-features-and-caveats]]
=== Features and Caveats
[[aws-lambda-features]]
=== Features

AWS Lambda, as a runtime, behaves differently from conventional runtimes.
While most APM and monitoring concepts apply to AWS Lambda, there are a few differences and limitations to be aware of.
The AWS Lambda instrumentation will report a transaction for all function invocations
and trace any <<compatibility-frameworks,supported modules>>. In addition, the
created transactions will capture additional data for a number of Lambda
trigger types -- API Gateway, SNS, SQS, S3 (when the trigger is a single event),
and ELB.

[float]
[[aws-lambda-performance-monitoring]]
==== Performance monitoring

Elastic APM automatically measures the performance of your lambda function executions.
It records traces for database queries, external HTTP requests,
and other slow operations that happen during execution.

By default, the agent will trace <<supported-technologies,the most common modules>>.
To trace other events,
you can use custom traces.
For information about custom traces,
see the <<custom-spans,Custom Spans section>>.

[float]
[[aws-lambda-error-monitoring]]
==== Error monitoring
A transaction will be reported for Lambda invocations that fail due to a
timeout, crash, `uncaughtException`, or `unhandledRejection`. (This requires
APM agent v3.45.0 or later and
https://www.elastic.co/guide/en/apm/lambda/current/aws-lambda-arch.html[Elastic's APM Lambda extension]
version 1.4.0 or later.)

include::./shared-set-up.asciidoc[tag=error-logging]

[float]
[[aws-lambda-caveats]]
==== Caveats
=== Caveats and Troubleshooting

* System and custom metrics are not collected for Lambda functions. This is both because most of those are irrelevant
and because the interval-based event sending model is not suitable for FaaS environments.
* Lambda instrumentation does not currently work when the (deprecated) <<context-manager,`contextManager: 'patch'`>> configuration setting is used.
* The APM agent does not yet support a Lambda handler module that uses ECMAScript modules (ESM). That means your handler file name should end with ".js" (and not have `"type": "module"` in package.json if you have one) or end with ".cjs". A handler file that uses the ".mjs" suffix will not be instrumented by the APM agent.
80 changes: 72 additions & 8 deletions lib/lambda.js
@@ -439,6 +439,54 @@ function setS3SingleData (trans, event, context, faasId, isColdStart) {
function elasticApmAwsLambda (agent) {
const log = agent.logger

/**
* Register this transaction with the Lambda extension, if possible. This
* function is `await`able so that the transaction is registered before
* executing the user's Lambda handler.
*
* Perf note: Using a Lambda sized to have 1 vCPU (1769MB memory), some
* rudimentary perf tests showed an average of 0.8ms for this call to the ext.
*/
function registerTransaction (trans, awsRequestId) {
if (!agent._transport) {
return
}
if (!agent._transport.lambdaShouldRegisterTransactions()) {
return
}

// Reproduce the filtering logic from `Instrumentation.prototype.addEndedTransaction`.
if (agent._conf.contextPropagationOnly) {
return
}
if (!trans.sampled && !agent._transport.supportsKeepingUnsampledTransaction()) {
return
}

var payload = trans.toJSON()
// If this partial transaction is used, the Lambda Extension will fill in:
// - `transaction.result` will be set to one of:
// - The "status" field from the Logs API platform `runtimeDone` message.
// https://docs.aws.amazon.com/lambda/latest/dg/runtimes-logs-api.html#runtimes-logs-api-ref-done
// Values: "success", "failure"
// - The "shutdownReason" field from the `Shutdown` event from the Extensions API.
// https://docs.aws.amazon.com/lambda/latest/dg/runtimes-extensions-api.html#runtimes-lifecycle-shutdown
// Values: "spindown", "timeout", "failure" (I think these are the values.)
// - `transaction.outcome` will be set to "failure" if the status above is
// not "success". Therefore we want a default outcome value.
// - `transaction.duration` will be estimated
delete payload.result
delete payload.duration

payload = agent._transactionFilters.process(payload)
if (!payload) {
log.trace({ traceId: trans.traceId, transactionId: trans.id }, 'transaction ignored by filter')
return
}

return agent._transport.lambdaRegisterTransaction(payload, awsRequestId)
}

function endAndFlushTransaction (err, result, trans, event, context, triggerType, cb) {
log.trace({ awsRequestId: context && context.awsRequestId }, 'lambda: fn end')

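
The payload preparation in `registerTransaction` above can be illustrated in isolation. The following is a sketch, not the agent's code: `preparePartialTransaction` and `dropHealthchecks` are hypothetical names, and a plain array of functions stands in for `agent._transactionFilters`.

```javascript
// Hypothetical helper (not part of elastic-apm-node): builds the partial
// transaction payload that would be pre-registered with the Lambda extension.
function preparePartialTransaction (serialized, filters) {
  // Work on a shallow copy so the caller's object is left untouched.
  let payload = Object.assign({}, serialized)

  // These fields are left for the extension to fill in if the invocation
  // never completes normally.
  delete payload.result
  delete payload.duration

  // Run each filter in turn; a falsy return value drops the transaction.
  for (const filter of filters) {
    payload = filter(payload)
    if (!payload) {
      return null
    }
  }
  return payload
}

// Hypothetical filter: ignore transactions named "healthcheck".
const dropHealthchecks = (p) => (p.name === 'healthcheck' ? null : p)

const partial = preparePartialTransaction(
  { id: 'abc123', name: 'my-function', outcome: 'success', result: 'success', duration: 12.3 },
  [dropHealthchecks]
)
console.log(partial)
```

The `outcome` field is deliberately kept, in line with the comment in the diff that a default outcome value is wanted.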
@@ -509,7 +557,7 @@ function elasticApmAwsLambda (agent) {
}
}

return function wrapLambda (type, fn) {
return function wrapLambdaHandler (type, fn) {
if (typeof type === 'function') {
fn = type
type = 'request'
@@ -519,7 +567,7 @@ function elasticApmAwsLambda (agent) {
return fn
}

return function wrappedLambda (event, context, callback) {
return async function wrappedLambdaHandler (event, context, callback) {
if (!(event && context && typeof callback === 'function')) {
// Skip instrumentation if arguments are unexpected.
// https://docs.aws.amazon.com/lambda/latest/dg/nodejs-handler.html
@@ -592,16 +640,32 @@ function elasticApmAwsLambda (agent) {
log.warn(`not setting transaction data for triggerType=${triggerType}`)
}

// Wrap context and callback to finish and send transaction
// Wrap context and callback to finish and send transaction.
// Note: Wrapping context needs to happen *before any `await` calls* in
// this function, otherwise the Lambda Node.js Runtime will call the
// *unwrapped* `context.{succeed,fail,done}()` methods.
wrapContext(trans, event, context, triggerType)
if (typeof callback === 'function') {
callback = wrapLambdaCallback(trans, event, context, triggerType, callback)
}
const wrappedCallback = wrapLambdaCallback(trans, event, context, triggerType, callback)

await registerTransaction(trans, context.awsRequestId)

try {
return fn.call(this, event, context, callback)
var retval = fn.call(this, event, context, wrappedCallback)
if (retval instanceof Promise) {
return retval
} else {
// In this case, our wrapping of the user's handler has changed it
// from a sync function to an async function. We need to ensure the
// Lambda Runtime does not end the invocation based on this returned
// promise -- the invocation should end when the `callback` is called
// -- so we return a promise that never resolves.
return new Promise((resolve, reject) => { /* never resolves */ })
}
} catch (handlerErr) {
callback(handlerErr)
wrappedCallback(handlerErr)
// Return a promise that never resolves, so that the Lambda Runtime
// doesn't attempt its "success" handling.
return new Promise((resolve, reject) => { /* never resolves */ })
}
}
}
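
The reason for returning a promise that never resolves can be seen in a small standalone sketch. Everything here is an assumption for illustration: `invoke` is a toy stand-in for the Lambda Node.js runtime (not its real implementation), and `wrapNaive` and `wrapPending` are hypothetical wrappers that mimic the two possible behaviours.

```javascript
// Toy stand-in for the runtime: the invocation completes with whichever
// arrives first, the handler's returned promise settling or the callback
// being called.
function invoke (handler, event) {
  return new Promise((resolve, reject) => {
    const callback = (err, result) => (err ? reject(err) : resolve(result))
    const retval = handler(event, {}, callback)
    if (retval instanceof Promise) {
      retval.then(resolve, reject)
    }
  })
}

// A callback-style user handler that responds asynchronously.
function userHandler (event, context, callback) {
  setTimeout(() => callback(null, 'from callback'), 10)
}

// Naive async wrapper: its promise resolves with `undefined` right away,
// so the toy runtime ends the invocation before the callback fires.
function wrapNaive (fn) {
  return async function (event, context, callback) {
    return fn.call(this, event, context, callback)
  }
}

// Wrapper that returns a never-settling promise for non-promise handlers,
// leaving the callback as the only way to finish the invocation.
function wrapPending (fn) {
  return async function (event, context, callback) {
    const retval = fn.call(this, event, context, callback)
    if (retval instanceof Promise) {
      return retval
    }
    return new Promise(() => { /* never settles */ })
  }
}

invoke(wrapNaive(userHandler), {}).then((r) => console.log('naive:', r))
invoke(wrapPending(userHandler), {}).then((r) => console.log('pending:', r))
```

In this sketch the naive wrapper completes with `undefined`, while the pending-promise wrapper completes with the value passed to the callback.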
3 changes: 3 additions & 0 deletions lib/noop-transport.js
@@ -17,7 +17,10 @@ class NoopTransport {

addMetadataFilter (fn) {}
setExtraMetadata (metadata) {}

lambdaStart () {}
lambdaShouldRegisterTransactions () { return true }
lambdaRegisterTransaction (trans, awsRequestId) { }

sendSpan (span, cb) {
if (cb) {
49 changes: 33 additions & 16 deletions package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion package.json
@@ -94,7 +94,7 @@
"basic-auth": "^2.0.1",
"cookie": "^0.5.0",
"core-util-is": "^1.0.2",
"elastic-apm-http-client": "11.3.1",
"elastic-apm-http-client": "11.4.0",
"end-of-stream": "^1.4.4",
"error-callsites": "^2.0.4",
"error-stack-parser": "^2.0.6",
7 changes: 7 additions & 0 deletions test/_capturing_transport.js
@@ -94,6 +94,13 @@ class CapturingTransport {
}
}

lambdaShouldRegisterTransactions () {
return true
}

lambdaRegisterTransaction (trans, awsRequestId) {
}

supportsKeepingUnsampledTransaction () {
return true
}
23 changes: 18 additions & 5 deletions test/_mock_apm_server.js
@@ -5,6 +5,7 @@
*/

// A mock APM server to use in tests.
// It also has an option to attempt to behave like the Elastic Lambda extension.
//
// Usage:
// const server = new MockAPMServer(opts)
@@ -22,14 +23,20 @@ const { URL } = require('url')
const zlib = require('zlib')

class MockAPMServer {
// - @param {Object} opts
// - {String} opts.apmServerVersion - The version to report in the
// "GET /" response body. Defaults to "8.0.0".
/**
* @param {object} opts
* - {string} opts.apmServerVersion - The version to report in the `GET /`
* response body. Defaults to "8.0.0".
* - {boolean} opts.mockLambdaExtension - Default false. If enabled then
* this will add some behaviour expected of the APM Lambda extension, e.g.
* responding to the `POST /register/transaction` endpoint.
*/
constructor (opts) {
opts = opts || {}
this.clear()
this.serverUrl = null // set in .start()
this.apmServerVersion = opts.apmServerVersion || '8.0.0'
this._apmServerVersion = opts.apmServerVersion || '8.0.0'
this._mockLambdaExtension = !!opts.mockLambdaExtension
this._http = http.createServer(this._onRequest.bind(this))
}

@@ -60,7 +67,7 @@ class MockAPMServer {
resBody = JSON.stringify({
build_date: '2021-09-16T02:05:39Z',
build_sha: 'a183f675ecd03fca4a897cbe85fda3511bc3ca43',
version: this.apmServerVersion
version: this._apmServerVersion
})
} else if (parsedUrl.pathname === '/config/v1/agents') {
// Central config mocking.
@@ -75,6 +82,12 @@ class MockAPMServer {
})
resBody = '{}'
res.writeHead(202)
} else if (this._mockLambdaExtension && req.method === 'POST' && parsedUrl.pathname === '/register/transaction') {
// See `func handleTransactionRegistration` in apm-aws-lambda.git.
// This mock doesn't handle the various checks there. It only handles
// the status code, so the APM agent will continue to register
// transactions.
res.writeHead(200)
} else {
res.writeHead(404)
}
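
The status-code-only behaviour of the mocked `POST /register/transaction` endpoint can be sketched with Node's built-in `http` module. This is a standalone illustration, not the `MockAPMServer` class itself: `startMockExtension`, `post`, and `demo` are hypothetical helpers, and the request body is a placeholder rather than a real transaction payload.

```javascript
const http = require('http')

// Minimal mock of the registration endpoint: 200 for
// `POST /register/transaction`, 404 for anything else.
function startMockExtension () {
  const server = http.createServer((req, res) => {
    req.resume() // discard the request body
    req.on('end', () => {
      if (req.method === 'POST' && req.url === '/register/transaction') {
        res.writeHead(200)
      } else {
        res.writeHead(404)
      }
      res.end()
    })
  })
  return new Promise((resolve) => {
    // Port 0 asks the OS for any free port.
    server.listen(0, '127.0.0.1', () => resolve(server))
  })
}

// Hypothetical client helper: POST a body and resolve with the status code.
function post (port, path, body) {
  return new Promise((resolve, reject) => {
    const req = http.request(
      { host: '127.0.0.1', port, path, method: 'POST' },
      (res) => {
        res.resume()
        res.on('end', () => resolve(res.statusCode))
      }
    )
    req.on('error', reject)
    req.end(body)
  })
}

async function demo () {
  const server = await startMockExtension()
  const { port } = server.address()
  try {
    const registered = await post(port, '/register/transaction', '{"transaction":{}}')
    const unknown = await post(port, '/no/such/path', '')
    return { registered, unknown }
  } finally {
    server.close()
  }
}

demo().then((statuses) => console.log(statuses))
```

As in the mock above, none of the checks performed by the real extension are reproduced; only the success status that lets the agent keep registering transactions is returned.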