Skip to content

Commit

Permalink
Add alert metadata (#3764)
Browse files Browse the repository at this point in the history
* alert metadata WIP

* add alert metada changes

* add alert type

* add validation

* generic api handler correction

* add meta field to webhook notification

* remove log statements

* resolve go lint issues

* Update migrate/migrations/20240212125328-alert-metadata.sql

Co-authored-by: Nathaniel Caza <[email protected]>

* Update graphql2/schema.graphql

Co-authored-by: Nathaniel Caza <[email protected]>

* Update graphql2/graphqlapp/alert.go

Co-authored-by: Nathaniel Caza <[email protected]>

* change as per review

* Add createMetaTx method

* fix urlform values

* fix generic api tests

* fix lint error

* remove commented code

* move metadata out of CreateOrUpdate

Since we're handling this separatly with a specific store method, we
don't want to add a second place to set metadata here.

If/when we update things using CreateOrUpdate, we'll make a Tx method
for it. This way we don't need to update call sites across the app for
the new signature.

* update SetMetadata to support service permissions validation

So this query is a bit more complicated, but we want to make sure no
future bug allows a service/integration key to update the metadata of an
alert that doesn't belong to it.

Integration keys in GoAlert are tied to a single service, and all DB
calls regarding alerts should be guarded such that only alerts belonging
to that specific service may be modified, including the newly introduced
metadata.

The approach I took here allows us to pass a service ID, optionally
(i.e., ONLY when a request comes from an integration), where the query
will fail to insert/update any data if it's mismatched. If the service
ID is null, it will skip this check (i.e., a user action).

* fix type for manymetadata and add single query

This is just updating the type from int to bigint (64 bit) and using
@alert_ids instead of $1 so that sqlc generates a better name for the
argument.

* regen sqlc

* move store methods to metadata.go

For organization sake, moving the store methods to a separate file (we
have a bad habbit of having the store.go files growing forever).

I also implemented things using the `gadb` and taking a `DBTX` rather
than a sql.Tx. This is the most recent pattern, although most DB code
has not been refactored for this.

I'm using the `map[string]string` Go type directly to get rid of any
indirection on the metadata format external to this package. Internally,
I added a `metadataDBFormat` that hopefully by name makes it clearer
it's only intended to be used at rest.

Compared to what was there before, I also:
- Added assertions on the type string
- Added a MetadataAlertID type that includes an alert ID for the
  dataloader
- Added an Access Denied error if SetMetadataTx fails because of an
  invalid alert or service ID mismatch

* move validation to metadata.go

For this I just moved the validation to metadata.go and implemented it
as a standalone function rather than a method.

This was a mistake on my part of recommending a method -- we really
don't need a specific type for metadata as it's just a
`map[string]string` we also don't have a use-case to "normalize" the
data, as we always store it as-is.

This is different to things like services, schedules, etc... where we
will do things like trim whitespace, remove double-spaces, to
"normalize" the input.

For this it's just pass or fail.

* missing sig. updates

Some leftover updates I forgot to commit

* add ServiceNullUUID method to permission pkg

Similar to the UserNullUUID -- a convenience/helper method so we can
pull the service ID from the current context and pass it directly as a
query argument.

* update grafana call

* regen sqlc

* fix CreateOrUpdate call

* add db in call for metadata

* fix edge case handling in MetaValue

ErrNoRows is handled by the method itself, so here we can just return any err we get.

We also want to check if the map is nil before doing `md[key]` so we don't panic if the alert has no metadata.

* move map handling out of Tx function

This is a bit of an optimization -- we can handle allocating the memory, and processing the map outside of the transaction to keep the transaction as quick as possible.

Additionally, we can do some early validation to jump out _before_ we get to the withContextTx call.

This helps because to create the alert, we actually:
- Start a transaction
- Grab a global exclusive lock on the service's row
- Create the alert itself
- Add the log entry about the alert

So if we already know the metadata is going to be rejected, we can skip that.

Lastly, now that the alert store only takes `map[string]map` it simplifies the code here as well.

* add create CreateOrUpdateWithMeta store method and update generic api handler call to point to new method

* add comment

* remove unused queries

* clairify comment

* tweak error message

* cleanup graphql code

- removed debug print
- removed redundant nil check on Metadata() result
- updated Metadata comment regarding return values

* update documentation

* remove changes to existing tests (will create new)

* add metadata smoke test

* add dataloader for batching metadata lookups

* use dataloader

* regen

* add id column for switchover

* regen sqlc

---------

Co-authored-by: Forfold <[email protected]>
Co-authored-by: Nathaniel Caza <[email protected]>
  • Loading branch information
3 people authored Apr 2, 2024
1 parent 677be7d commit cbe89ca
Show file tree
Hide file tree
Showing 22 changed files with 1,094 additions and 12 deletions.
1 change: 1 addition & 0 deletions alert/alert.go
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@ func (a Alert) Normalize() (*Alert, error) {
}
a.Summary = strings.Replace(a.Summary, "\n", " ", -1)
a.Summary = strings.Replace(a.Summary, " ", " ", -1)

err := validate.Many(
validate.Text("Summary", a.Summary, 1, MaxSummaryLength),
validate.Text("Details", a.Details, 0, MaxDetailsLength),
Expand Down
158 changes: 158 additions & 0 deletions alert/metadata.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,158 @@
package alert

import (
"context"
"database/sql"
"encoding/json"
"errors"

"github.com/sqlc-dev/pqtype"
"github.com/target/goalert/gadb"
"github.com/target/goalert/permission"
"github.com/target/goalert/validation"
"github.com/target/goalert/validation/validate"
)

const metaV1 = "alert_meta_v1"

type metadataDBFormat struct {
Type string
AlertMetaV1 map[string]string
}

// Metadata returns the metadata for a single alert. If err == nil, meta is guaranteed to be non-nil. If the alert has no metadata, an empty map is returned.
func (s *Store) Metadata(ctx context.Context, db gadb.DBTX, alertID int) (meta map[string]string, err error) {
err = permission.LimitCheckAny(ctx, permission.System, permission.User)
if err != nil {
return nil, err
}

md, err := gadb.New(db).AlertMetadata(ctx, int64(alertID))
if errors.Is(err, sql.ErrNoRows) || !md.Valid {
return map[string]string{}, nil
}
if err != nil {
return nil, err
}

var doc metadataDBFormat
err = json.Unmarshal(md.RawMessage, &doc)
if err != nil {
return nil, err
}

if doc.Type != metaV1 || doc.AlertMetaV1 == nil {
return nil, errors.New("unsupported metadata type")
}

return doc.AlertMetaV1, nil
}

type MetadataAlertID struct {
// AlertID is the ID of the alert.
ID int64
Meta map[string]string
}

func (s Store) FindManyMetadata(ctx context.Context, db gadb.DBTX, alertIDs []int) ([]MetadataAlertID, error) {
err := permission.LimitCheckAny(ctx, permission.System, permission.User)
if err != nil {
return nil, err
}

err = validate.Range("AlertIDs", len(alertIDs), 1, maxBatch)
if err != nil {
return nil, err
}
ids := make([]int64, len(alertIDs))
for i, id := range alertIDs {
ids[i] = int64(id)
}

rows, err := gadb.New(db).AlertManyMetadata(ctx, ids)
if errors.Is(err, sql.ErrNoRows) {
return nil, nil
}
if err != nil {
return nil, err
}

res := make([]MetadataAlertID, len(rows))
for i, r := range rows {
var doc metadataDBFormat
err = json.Unmarshal(r.Metadata.RawMessage, &doc)
if err != nil {
return nil, err
}

if doc.Type != metaV1 || doc.AlertMetaV1 == nil {
return nil, errors.New("unsupported metadata type")
}

res[i] = MetadataAlertID{
ID: r.AlertID,
Meta: doc.AlertMetaV1,
}
}

return res, nil
}

func (s Store) SetMetadataTx(ctx context.Context, db gadb.DBTX, alertID int, meta map[string]string) error {
err := permission.LimitCheckAny(ctx, permission.User, permission.Service)
if err != nil {
return err
}

err = ValidateMetadata(meta)
if err != nil {
return err
}

var doc metadataDBFormat
doc.Type = metaV1
doc.AlertMetaV1 = meta

md, err := json.Marshal(&doc)
if err != nil {
return err
}

rowCount, err := gadb.New(db).AlertSetMetadata(ctx, gadb.AlertSetMetadataParams{
ID: int64(alertID),
ServiceID: permission.ServiceNullUUID(ctx), // only provide service_id restriction if request is from a service
Metadata: pqtype.NullRawMessage{Valid: true, RawMessage: json.RawMessage(md)},
})
if err != nil {
return err
}

if rowCount == 0 {
// shouldn't happen, but just in case
return permission.NewAccessDenied("alert closed, invalid, or wrong service")
}

return nil
}

func ValidateMetadata(m map[string]string) error {
if m == nil {
return validation.NewFieldError("Meta", "cannot be nil")
}

var totalSize int
for k, v := range m {
err := validate.ASCII("Meta[<key>]", k, 1, 255)
if err != nil {
return err
}

totalSize += len(k) + len(v)
}

if totalSize > 32768 {
return validation.NewFieldError("Meta", "cannot exceed 32KiB in size")
}

return nil
}
35 changes: 35 additions & 0 deletions alert/queries.sql
Original file line number Diff line number Diff line change
Expand Up @@ -60,3 +60,38 @@ ON CONFLICT (alert_id)
RETURNING
alert_id;

-- name: AlertMetadata :one
SELECT
metadata
FROM
alert_data
WHERE
alert_id = $1;

-- name: AlertManyMetadata :many
SELECT
alert_id,
metadata
FROM
alert_data
WHERE
alert_id = ANY (@alert_ids::bigint[]);

-- name: AlertSetMetadata :execrows
INSERT INTO alert_data(alert_id, metadata)
SELECT
a.id,
$2
FROM
alerts a
WHERE
a.id = $1
AND a.status != 'closed'
AND (a.service_id = $3
OR $3 IS NULL) -- ensure the alert is associated with the service, if coming from an integration
ON CONFLICT (alert_id)
DO UPDATE SET
metadata = $2
WHERE
alert_data.alert_id = $1;

18 changes: 18 additions & 0 deletions alert/store.go
Original file line number Diff line number Diff line change
Expand Up @@ -610,6 +610,15 @@ func (s *Store) CreateOrUpdateTx(ctx context.Context, tx *sql.Tx, a *Alert) (*Al
// In the case that Status is closed but a matching alert is not present, nil is returned.
// Otherwise the current alert is returned.
func (s *Store) CreateOrUpdate(ctx context.Context, a *Alert) (*Alert, bool, error) {
return s.createOrUpdate(ctx, a, nil)
}

// CreateOrUpdateWithMeta behaves the same as CreateOrUpdate, but also sets metadata on the alert if it is new.
func (s *Store) CreateOrUpdateWithMeta(ctx context.Context, a *Alert, meta map[string]string) (*Alert, bool, error) {
return s.createOrUpdate(ctx, a, meta)
}

func (s *Store) createOrUpdate(ctx context.Context, a *Alert, meta map[string]string) (*Alert, bool, error) {
err := permission.LimitCheckAny(ctx,
permission.System,
permission.Admin,
Expand All @@ -631,10 +640,19 @@ func (s *Store) CreateOrUpdate(ctx context.Context, a *Alert) (*Alert, bool, err
return nil, false, err
}

// Set metadata only if meta is not nil and isNew is true
if meta != nil && isNew {
err = s.SetMetadataTx(ctx, tx, n.ID, meta)
if err != nil {
return nil, false, err
}
}

err = tx.Commit()
if err != nil {
return nil, false, err
}

if n == nil {
return nil, false, nil
}
Expand Down
5 changes: 5 additions & 0 deletions engine/sendmessage.go
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,10 @@ func (p *Engine) sendMessage(ctx context.Context, msg *message.Message) (*notifi
// set to nil if it's the current message
stat = nil
}
meta, err := p.a.Metadata(ctx, p.b.db, msg.AlertID)
if err != nil {
return nil, errors.Wrap(err, "lookup alert metadata")
}
notifMsg = notification.Alert{
Dest: msg.Dest,
AlertID: msg.AlertID,
Expand All @@ -79,6 +83,7 @@ func (p *Engine) sendMessage(ctx context.Context, msg *message.Message) (*notifi
CallbackID: msg.ID,
ServiceID: a.ServiceID,
ServiceName: name,
Meta: meta,

OriginalStatus: stat,
}
Expand Down
6 changes: 6 additions & 0 deletions gadb/models.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

87 changes: 87 additions & 0 deletions gadb/queries.sql.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading

0 comments on commit cbe89ca

Please sign in to comment.