Releases: mozilla-it/ctms-api

v1.1.2 - Improvements to Stripe object ingestion

15 Dec 18:49
v1.1.2
b0d8dd5

Improvements to POST /stripe_from_pubsub, for better behavior with production-level traffic.

API

  • POST /stripe_from_pubsub and POST /stripe return a 409 Conflict if the changes fail due to a database error, such as an IntegrityError due to duplicate IDs, or a deadlock. Previously, these returned 500 Server Error.
  • If a Stripe customer is submitted that has the same Firefox Account ID (FxA ID) as an existing Stripe customer, the existing Stripe customer is deleted. This was seen on stage, but not production, and may be due to a bug or direct interaction with Stripe. The deletion most closely matches what we believe happens in the FxA Stripe cache, which is indexed by FxA ID.
  • Stripe ingest now correctly updates invoice.default_source_id and invoice_line_item.stripe_subscription_item_id.
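
The 409 conversion described above can be sketched roughly as follows. This is a minimal illustration, not the CTMS implementation: it uses the standard library's sqlite3 exception classes as stand-ins for the database layer's IntegrityError and OperationalError, and the `status_for_db_error` and `ingest` helpers, the `customer` table, and the 200 success status are all hypothetical.

```python
import logging
import sqlite3

logger = logging.getLogger("stripe_ingest_sketch")

# Stand-ins for the database layer's exception classes (assumption: the
# real service catches the same-named exceptions from its own driver).
CONFLICT_ERRORS = (sqlite3.IntegrityError, sqlite3.OperationalError)


def status_for_db_error(exc: Exception) -> int:
    """Return the HTTP status an ingest endpoint would send for exc."""
    if isinstance(exc, CONFLICT_ERRORS):
        # Retryable conflict: log at error level and report 409 Conflict.
        logger.error("%s converted to 409", type(exc).__name__)
        return 409
    # Anything else is still an unexpected server error.
    return 500


def ingest(conn: sqlite3.Connection, stripe_id: str) -> int:
    """Insert a customer row, mapping database conflicts to a status code."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO customer (stripe_id) VALUES (?)", (stripe_id,))
    except sqlite3.Error as exc:
        return status_for_db_error(exc)
    return 200  # placeholder success status for this sketch


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (stripe_id TEXT PRIMARY KEY)")
first = ingest(conn, "cus_abc123")
second = ingest(conn, "cus_abc123")  # duplicate ID raises IntegrityError
```

Returning 409 rather than 500 lets a caller distinguish a conflict that may succeed on retry from an unexpected failure.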

Deployments

  • Database changes that may impact request timing and success:
    • Stripe ingest endpoints now use SELECT ... FOR UPDATE. This is intended to reduce write conflicts, but may increase the number of deadlocks.
    • Duplicate FxA ID detection adds an extra database SELECT to customer creation and some customer updates (when the FxA ID changes), potentially slowing requests.
    • Stripe ingest updates now correctly parse Stripe timestamps as UTC timestamps rather than timezone-naive timestamps, which avoids database writes that do not change the data.
  • Log changes:
    • Structured request logs for /stripe_from_pubsub and /stripe have context changes:
      • Added fxa_id_conflict, listing the FxA ID that was found on a new and existing Stripe customer.
      • Added ingest_actions to detail the contents of the Stripe payload. The keys are the actions taken (created, updated, no_change, deleted, and skipped), and each value is a list of objects, represented as object_type:object_id, such as ["subscription:sub_abc123"].
      • Removed stripe_unknown. Unknown objects are now found in ingest_actions["skipped"].
    • An error-level log ("Severity": 3) is emitted with the message "IntegrityError converted to 409" or "OperationalError converted to 409" for database exceptions handled by the Stripe endpoints. The log includes the request context, and these exceptions are no longer sent to Sentry. This is followed by the standard request log for the 409 Conflict returned to the caller.
    • In the Acoustic sync service log message "sync_service cycle complete", the context retry_backlog now has the correct value. Previously, it was a duplicate of the sync_backlog value. The associated metric gauge ctms_background_acoustic_sync_retries had the correct value.
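
As an illustration of the ingest_actions log context described above, the grouping might look like the sketch below. The `build_ingest_actions` helper and the sample object IDs are hypothetical, and omitting actions with no objects is an assumption rather than documented behavior.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# The action names listed in the release note.
ACTIONS = ("created", "updated", "no_change", "deleted", "skipped")


def build_ingest_actions(
    results: Iterable[Tuple[str, str, str]]
) -> Dict[str, List[str]]:
    """Group (action, object_type, object_id) results by action."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for action, object_type, object_id in results:
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action}")
        # Each object is rendered as object_type:object_id.
        grouped[action].append(f"{object_type}:{object_id}")
    return dict(grouped)


context = build_ingest_actions(
    [
        ("created", "customer", "cus_abc123"),
        ("updated", "subscription", "sub_abc123"),
        ("skipped", "charge", "ch_abc123"),
    ]
)
```

With this shape, objects that previously appeared under stripe_unknown would be found with `context["skipped"]`.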

Other

  • Updated from Python 3.9.7 to 3.9.9.
  • Updated fastapi from 0.65.3 to 0.70.0, starlette from 0.14.2 to 0.16.0, and lxml from 4.6.4 to 4.7.1.
  • Updated several documents:
    • Updated overview architecture diagram for Cinchy interaction and Stripe ingestion from FxA.
    • Added a Stripe entity diagram.
    • Synced docs/configuration.md with the current environment configuration, including adding the Acoustic Sync configuration variables.
    • Updated docs/deployment_guide.md, syncing with the current deployment methods, updating the logging section, and adding overview, metrics, and dashboards sections.
    • Updated docs/developer_setup.md with an "Updating Dependencies" section.

v1.1.1 - Bugfix for product segments

06 Dec 21:11
v1.1.1
37d2a44

This release fixes a bug when generating the product segment for some users.

API Changes

  • When a Stripe customer has multiple subscriptions to a product, and the latest status was a failure such as incomplete_expired, the code attempted to set the product segment to re-other, which was an invalid value. In these cases, the segment is now other, the same as when they have a single subscription with a status other than active or canceled.
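
A rough sketch of the corrected segment logic follows. It is an illustration only: the `product_segment` helper is hypothetical, and the "re-" prefixed segment names for repeat subscriptions are inferred from the invalid re-other value mentioned above rather than taken from the CTMS code.

```python
from typing import List

# Statuses that map directly to a segment, per the release note.
KNOWN_STATUSES = ("active", "canceled")


def product_segment(statuses: List[str]) -> str:
    """Pick a segment from a product's subscription statuses, oldest first."""
    if not statuses:
        raise ValueError("at least one subscription is required")
    latest = statuses[-1]
    if latest not in KNOWN_STATUSES:
        # Fixed behavior: never build "re-other"; fall back to "other".
        return "other"
    # Assumed naming: a "re-" prefix marks a repeat subscription.
    prefix = "re-" if len(statuses) > 1 else ""
    return prefix + latest
```

Checking the status before applying the prefix is what keeps the invalid value from being produced.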

v1.1.0 - Stripe objects and product subscriptions

06 Dec 20:26
v1.1.0
cde9694

This release adds the ability to ingest Stripe objects from the Firefox Accounts (FxA) Firestore cache via a PubSub queue. These are processed to determine the product subscriptions for a contact, and these are synced to a relational table in Acoustic. The product subscriptions are not exposed on the contact in the API.

API

  • A new endpoint, POST /stripe, takes Stripe objects and adds them to the CTMS database. The supported objects are customer, subscription, and invoice. This endpoint takes CTMS OAuth2 credentials.
  • A new endpoint, POST /stripe_from_pubsub, takes PubSub push requests with a Stripe object, or a dictionary of keys to Stripe objects, as the payload. This endpoint checks the JSON Web Token (JWT) in the authentication header, and verifies the claimed audience and email. The endpoint also takes a client "secret" as a URL parameter. This endpoint returns 202 for content issues, to prevent PubSub from submitting the message again.
  • Loading contacts now loads the related Stripe data, and converts them to products. This will increase the number of database requests to read or update a contact.
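
The checks on POST /stripe_from_pubsub could be sketched as below. This assumes the JWT signature has already been verified elsewhere and only the decoded claims are inspected; the `pubsub_request_allowed` helper is hypothetical, and the `aud` and `email` claim names are the ones normally found in Google-signed push tokens, not something the release note states.

```python
import hmac
from typing import Mapping


def pubsub_request_allowed(
    claims: Mapping[str, object],
    client_param: str,
    expected_audience: str,
    expected_email: str,
    expected_client: str,
) -> bool:
    """Check already-verified JWT claims and the pubsub_client parameter."""
    # Constant-time comparison avoids leaking the secret through timing.
    client_ok = hmac.compare_digest(
        client_param.encode("utf-8"), expected_client.encode("utf-8")
    )
    audience_ok = claims.get("aud") == expected_audience
    email_ok = claims.get("email") == expected_email
    return client_ok and audience_ok and email_ok
```

In a deployment, the three expected values would come from configuration rather than being passed in by hand.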

Acoustic Sync Service

  • A contact's product subscriptions are synced to a new Acoustic relational table. This includes placeholder columns for future subscription data.
  • The Acoustic sync service does not sleep if it processed a full batch of contacts, to speed up processing a backlog of contacts.
  • Added a timeout to Acoustic requests, with a default of 5.0 seconds. If the timeout is reached, syncing fails for that contact and it is retried later.
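
The "no sleep after a full batch" rule can be expressed as a small function. This is a sketch under assumptions: the `sleep_seconds` helper is hypothetical, and padding a short loop out to a minimum duration is a guess at how the remaining sleep is computed.

```python
def sleep_seconds(
    processed: int, batch_limit: int, loop_min_secs: float, elapsed: float
) -> float:
    """Return how long the sync loop should sleep after one batch."""
    if processed >= batch_limit:
        # A full batch suggests more work is waiting: continue immediately.
        return 0.0
    # Otherwise pad the loop out to its minimum duration (assumption).
    return max(0.0, loop_min_secs - elapsed)
```

For example, with a batch limit of 20, a loop that processed 20 contacts continues at once, while a loop that processed 3 contacts in 1.5 seconds sleeps for the rest of a 5 second minimum.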

Deployments

  • The database includes new tables for Stripe data, added by migrations: stripe_customer, stripe_price, stripe_invoice, stripe_invoice_line_item, stripe_subscription, and stripe_subscription_item. The primary key is the stripe_id column. The tables refer to each other - stripe_subscription.stripe_customer_id refers to a stripe_customer.stripe_id - but foreign keys are not used because the data may come in an unexpected order from FxA.
  • The API __heartbeat__ endpoint now includes details of the Acoustic sync backlog. Optional settings set maximum levels for the backlog and the retry backlog; the heartbeat fails when a level is exceeded. The default is no maximum.
  • The API process now reads the background process settings from environment variables as well. Some are reported in the __heartbeat__ endpoint.
  • The background sync service can optionally write the current time to a file, at startup and once per loop. This can be checked by a new process, ctms/bin/healthcheck_sync.py, as a Kubernetes startup and liveness check.
  • Environment Variables:
    • Added CTMS_PUBSUB_AUDIENCE and CTMS_PUBSUB_EMAIL, to validate the JWT claim for POST /stripe_from_pubsub.
    • Added CTMS_PUBSUB_CLIENT, checked against the query string parameter in POST /stripe_from_pubsub?pubsub_client=<client_id>.
    • Added CTMS_ACOUSTIC_PRODUCT_SUBSCRIPTIONS_ID, required in the background process, for the product relational table ID.
    • Added optional CTMS_ACOUSTIC_MAX_BACKLOG and CTMS_ACOUSTIC_MAX_RETRY_BACKLOG. If set, __heartbeat__ will fail if the backlog or the retry backlog exceeds these limits.
    • Added optional CTMS_BACKGROUND_HEALTHCHECK_PATH and CTMS_BACKGROUND_HEALTHCHECK_AGE_S. If the path is set, the background process will write the current timestamp. If both are set, ctms/bin/healthcheck_sync.py will read the timestamp file and exit with a failing code if it is older than the age in seconds.
    • Added optional CTMS_ACOUSTIC_TIMEOUT_S, to set the timeout for requests to Acoustic. The default is 5.0 seconds.
  • Metrics updates:
    • The new counter ctms_pending_acoustic_sync_total is incremented when an Acoustic sync is scheduled, from an existing endpoint like POST /ctms or PATCH /ctms/<email_id>, as well as the new Stripe ingest endpoints.
    • The ctms_background_acoustic_requests_duration and ctms_background_acoustic_sync_loops metrics now include the tag table, to identify the table synced (main for the main contact table, newsletter and product for the relational tables).
    • The new counter ctms_background_acoustic_sync_loops increments when a sync loop completes processing a batch of contacts and before sleeping (if requested). This can be used to detect if the sync process is stuck.
    • The new gauge ctms_background_acoustic_sync_age_s gives the age of the sync request for the last synced item that was not re-queued for retrying. This can be used to determine the impact of Acoustic API slowdowns or large backlogs.
  • Log updates:
    • The background process now emits structured logs, and the log lines have been reduced.
    • The background process emits one INFO message at startup, "Setting up sync_service.", with the sync_feature_flag in context.
    • The background process emits one INFO message per loop, "sync_service cycle complete". The log context includes:
      • How many contacts were synced, and the count by sync status.
      • "trivial": true if no contacts were synced.
      • The duration of the loop, and the planned sleep duration.
    • The background process emits one DEBUG message per contact ("Successfully sync'd contact to acoustic..." or "Failure for contact in sync to acoustic..."). The log context includes:
      • The email_id.
      • The email address, if a contact's email matches the +trace_me_mozilla_ pattern.
      • The names of skipped columns, except for known columns, such as update_timestamp, which are silently skipped.
      • If the fxa_created date was successfully parsed into a datetime, or what went wrong.
      • The slugs of any skipped newsletters.
      • The status and duration of Acoustic sync requests.
      • The count of rows for the newsletter and product relational tables.
    • The new Stripe endpoints log the payload if the Stripe object has an email that matches the +trace_me_mozilla_ pattern.
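
The timestamp-file health check described under Deployments can be sketched as follows. The helper names and the file format (a single float of epoch seconds) are assumptions; the release note only says that the background process writes the current timestamp and that ctms/bin/healthcheck_sync.py fails when it is older than the configured age.

```python
import os
import tempfile
import time
from typing import Optional


def write_heartbeat(path: str, now: Optional[float] = None) -> None:
    """Record the current time, as the sync loop would once per cycle."""
    timestamp = time.time() if now is None else now
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(repr(timestamp))


def heartbeat_is_fresh(
    path: str, max_age_s: float, now: Optional[float] = None
) -> bool:
    """Return True if the recorded time is no older than max_age_s."""
    current = time.time() if now is None else now
    try:
        with open(path, encoding="utf-8") as handle:
            recorded = float(handle.read().strip())
    except (OSError, ValueError):
        # A missing or unreadable file counts as unhealthy.
        return False
    return (current - recorded) <= max_age_s


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "healthcheck")
    write_heartbeat(path, now=1000.0)
    fresh = heartbeat_is_fresh(path, max_age_s=30.0, now=1020.0)
    stale = heartbeat_is_fresh(path, max_age_s=30.0, now=1100.0)
    missing = heartbeat_is_fresh(os.path.join(tmp, "absent"), max_age_s=30.0)
```

A liveness probe built on this would exit with a non-zero code whenever `heartbeat_is_fresh` returns False, which is how a stuck sync loop would be detected.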

Other

  • Added adminer to the development database as postgres-admin, to allow viewing the database.
  • Added new script ctms/bin/ingest_stripe_data.py that can import one or more Stripe objects from a JSON file.
  • Updated to Python 3.9.7. The accepted range is 3.7.x to 3.10.x (raised from 3.9.x).
  • The PostgreSQL client psycopg2 is now built from source rather than installed as a wheel, meaning that libpq5 is shipped in the deployment object, and development libraries are needed when building on a local developer's machine. This allows an arm64 build for Apple Silicon.
  • Updated several dependencies, such as fastapi 0.65.3, alembic 1.7.5, google-cloud-core 2.2.1, psycopg2 2.9.2, and uvicorn 0.15.0.
  • Updated several development tools, such as black 21.10b0, bandit 1.7.1, mypy 0.910, pylint 2.12.1, and black 21.11.b1.
  • Switched pre-commit to the Poetry environment, to avoid out-of-date dependencies.
  • Moved documentation from guides/ to docs/, and refreshed and reworded documentation.
  • Added docs/adrs for Architectural Decision Records, with ADR for Stripe syncing.
  • Moved scripts/lint.sh to docker/lint.sh and scripts/test.sh to docker/test.sh. Removed some unused scripts.
  • Removed auto-documentation stubs and documentation deploy to Github pages.
  • Set CODEOWNERS from a team to the current development staff.

v1.0.2 - Adding a newsletter!

23 Jul 18:28

This really shouldn't require a release!

Fixing of Date-Formatting in CTMS to Acoustic Sync

25 Jun 20:27
3d4bd46

This version includes changes desired by Marketing to enable time-based queries for VPN-based offers.

Previously, this data could not be queried in Acoustic, because it was sent as timestamp strings in a format that Acoustic did not understand.

v1.0.0 - Production! Acoustic batched processing, metrics

09 Jun 20:39
v1.0.0
d905ddc

This release updates how the Acoustic synchronization job processes large backlogs, and adds metrics.

Tag v0.8.3 has been running in production without Salesforce for a few weeks, so we're bumping the version number to 1.0.0. Scripts used during the final import have been updated in this release.

Acoustic Synchronization Job

  • Pending updates are now processed in batches, rather than all at once. This avoids long processing runs without feedback. The default is 20 updates per batch.
  • Prometheus metrics are pushed to the pushgateway, if configured.

Deployments

  • Two new environment variables to tune the Acoustic Synchronization Job:
    • ACOUSTIC_BATCH_LIMIT - set the number of updates per batch
    • PROMETHEUS_PUSHGATEWAY_URL - set the URL of the Prometheus push gateway
  • New metrics are available, if configured:
    • ctms_background_acoustic_request_total - Total count of acoustic requests by method and status
    • ctms_background_acoustic_requests_duration - Histogram of requests processing time by method (in seconds)
    • ctms_background_acoustic_sync_total - Total count of contacts synced to acoustic
    • ctms_background_acoustic_sync_retries - Gauge of pending records with >0 retries to acoustic
    • ctms_background_acoustic_sync_backlog - Gauge of the number of contacts in the sync backlog. Not counting over-retried records.

Other Changes

  • The import scripts scripts/importers/setup.sql and scripts/importers/finish.sql include updates for the final import, such as index dropping and creation, case-insensitive duplicate email dropping, and newsletter source column cleanup.

v0.8.3 - Update Acoustic column map

21 May 22:54
v0.8.3
1d2375d

Acoustic Synchronization Job

  • Change the column names for the newsletter table timestamps to match the names in Acoustic production. The Acoustic sandbox will be updated to match production.

v0.8.2 - Case-insensitive emails

21 May 00:51
v0.8.2
f76181b

This release makes email matching case-insensitive.

API Changes

  • When searching by email, such as GET /[email protected], a case-insensitive match is used. The two searches are by primary email and by Firefox Accounts primary email.

Deployments

  • A lowercase index has been added for the primary email addresses as well as the Firefox Accounts email address. A unique lowercase index has been manually added for the primary email in stage and production.
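
The effect of a unique lowercase index can be demonstrated with a small sketch. It uses an in-memory SQLite database as a stand-in for PostgreSQL, and the `email` table, column, index, and helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE email (email_id INTEGER PRIMARY KEY, primary_email TEXT NOT NULL)"
)
# Unique index on the lowercased value, so case variants collide.
conn.execute(
    "CREATE UNIQUE INDEX idx_email_primary_email_lower ON email (lower(primary_email))"
)
conn.execute("INSERT INTO email (primary_email) VALUES (?)", ("User@Example.com",))


def find_by_email(address: str):
    """Case-insensitive lookup that can use the lowercase index."""
    return conn.execute(
        "SELECT email_id, primary_email FROM email "
        "WHERE lower(primary_email) = lower(?)",
        (address,),
    ).fetchone()


def insert_email(address: str) -> bool:
    """Return False when a case-insensitive duplicate already exists."""
    try:
        conn.execute("INSERT INTO email (primary_email) VALUES (?)", (address,))
    except sqlite3.IntegrityError:
        return False
    return True
```

The stored address keeps its original case; only the comparison and the uniqueness check are case-insensitive.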

Other Changes

  • A script import-mofo.sql has been added to import Mozilla Foundation data from a CSV file.

v0.8.1 - Monitoring and debugging

21 May 00:44
v0.8.1
2f88444

This release adds and improves monitoring, and adds some debugging capabilities:

  • Increased API logging of contacts with a tracing string in the email address
  • Improve logging and add Sentry integration to the Acoustic sync background job
  • Fix web metrics

API Changes

  • Added the ability to trace requests by contact email. If a contact has a primary email with the string +trace-me-mozilla- in it, like [email protected], then the API request logs will include the email address and, when provided, the request JSON.
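
A minimal sketch of the tracing check follows. The `trace_context` helper and the `trace` and `trace_json` log keys are hypothetical names chosen for illustration; only the +trace-me-mozilla- marker comes from the note above.

```python
from typing import Optional

TRACE_MARKER = "+trace-me-mozilla-"


def trace_context(email: Optional[str], request_json: Optional[dict] = None) -> dict:
    """Return extra log context for a traced contact, or an empty dict."""
    if not email or TRACE_MARKER not in email:
        return {}
    context: dict = {"trace": email}
    if request_json is not None:
        # The request body is only logged when one was provided.
        context["trace_json"] = request_json
    return context
```

Untraced contacts add nothing to the log context, so email addresses and payloads stay out of the logs by default.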

Acoustic Synchronization Job

  • The log level is set by the environment variable CTMS_LOGGING_LEVEL, and defaults to DEBUG. The variable CTMS_USE_MOZLOG will also set the log format to the MozLog JSON format.
  • Sentry integration added to capture exceptions

Deployments

  • Fixed a bug with web metrics, where the gunicorn process serving /metrics counted its own metrics twice, causing counters to fluctuate based on which process was randomly chosen.

Other Changes

  • make lint now builds the lint target of the Docker image, running the same checks as CI. This now includes pylint.
  • scripts/lint.sh now skips detect-secrets if git is unavailable, such as in the container

v0.8.0 - Acoustic integration

13 May 01:48
v0.8.0
a7991c7

This release adds Acoustic integration, completing a major remaining data flow for CTMS.

  • When a contact is updated, the related contact is updated in Acoustic
  • The interactive docs are easier to use now that more endpoints return representations rather than redirects
  • The web API emits request metrics and structured logs

API Changes

  • Endpoints that create or update contacts, like POST /ctms, PUT /ctms/{email_id}, and PATCH /ctms/{email_id}, now return a 200 OK or a 201 Created and the new contact representation. Previously, they returned a 303 See Other to /ctms/{email_id}, which required requesting with the OAuth2 access token and caused problems for the Swagger interactive docs and for other clients.
  • For /updates, invalid values for limit, after, and other parameters should now return a 422 Validation Error, instead of a 500 Server Error.
  • More timestamps are consistently in the UTC timezone

Acoustic Integration

  • Add pending_acoustic table, which gets a new row for each update to a record from the POST /ctms, PUT /ctms/{email_id}, and PATCH /ctms/{email_id} APIs.
  • Add initial integration with Acoustic API using the silverpop library.
  • Add ctms/bin/acoustic_sync.py, which processes records in the pending_acoustic table and syncs them to Acoustic.

Deployments

  • The health endpoints /__heartbeat__ and /__lbheartbeat__ now accept HEAD as well as GET requests.
  • Add Prometheus metrics, served from /metrics. These run in multiprocessing mode when run in production under gunicorn. The metrics are:
    • ctms_requests - A counter for each endpoint, labelled by method, path template (like /ctms/{email_id}), status code, and status code family (like 2xx)
    • ctms_requests_duration - A histogram of request time as seen by the web server, with bucket breakpoints from 10 ms to 10s, and labelled with method, path template, and status code family
    • ctms_api_requests - A counter for each API request, labelled by method, path template, API client ID, and status code family
  • Stop sending error-level logs from uvicorn and web requests to Sentry. These logs report exceptions that are already captured by Sentry, and appear as duplicate issues.
  • Re-implement structured logging with structlog, using the same code paths for development and deployments with gunicorn
  • Add a "trivial" = True logging tag when a bot makes an expected request to a monitoring endpoint, for filtering out in log viewer.
  • Add new environment variables:
    • FASTAPI_ENV - Set in Dockerfile to development or production
    • IS_GUNICORN - Set in Dockerfile to 1 in production
    • PROMETHEUS_MULTIPROC_DIR - Set in docker_entrypoint.sh to a fresh directory for Prometheus multiprocessing metrics
    • ACOUSTIC_RETRY_LIMIT - Default 6, how many times to try to sync a record
    • ACOUSTIC_SERVER_NUMBER - Default 6, identifies the acoustic API server
    • ACOUSTIC_LOOP_MIN_SECS - Default 5, sets the rate of sync requests
    • ACOUSTIC_CLIENT_ID - The client_id for Acoustic OAuth2 credentials
    • ACOUSTIC_CLIENT_SECRET - The client_secret for Acoustic OAuth2 credentials
    • ACOUSTIC_REFRESH_TOKEN - The initial refresh_token for Acoustic OAuth2 refresh requests
    • ACOUSTIC_MAIN_TABLE_ID - The identifier of the main contact table
    • ACOUSTIC_NEWSLETTER_TABLE_ID - The identifier of the newsletter relational table
    • ACOUSTIC_SYNC_FEATURE_FLAG - Default False, set to True to enable running acoustic_sync.py
    • ACOUSTIC_INTEGRATION_FEATURE_FLAG - Default False, set to True to sync contacts to Acoustic
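
As a rough illustration of how the documented defaults could be read, the sketch below loads a few of these variables with the standard library. It is not how CTMS loads its settings: the `AcousticSettings` class, the `load_settings` function, and the accepted boolean spellings are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings (assumed set) as True."""
    return value.strip().lower() in ("1", "true", "yes")


@dataclass(frozen=True)
class AcousticSettings:
    retry_limit: int
    server_number: int
    loop_min_secs: int
    sync_feature_flag: bool
    integration_feature_flag: bool


def load_settings(env: Mapping[str, str] = os.environ) -> AcousticSettings:
    """Read Acoustic settings, falling back to the documented defaults."""
    return AcousticSettings(
        retry_limit=int(env.get("ACOUSTIC_RETRY_LIMIT", "6")),
        server_number=int(env.get("ACOUSTIC_SERVER_NUMBER", "6")),
        loop_min_secs=int(env.get("ACOUSTIC_LOOP_MIN_SECS", "5")),
        sync_feature_flag=_as_bool(env.get("ACOUSTIC_SYNC_FEATURE_FLAG", "False")),
        integration_feature_flag=_as_bool(
            env.get("ACOUSTIC_INTEGRATION_FEATURE_FLAG", "False")
        ),
    )


defaults = load_settings({})
enabled = load_settings(
    {"ACOUSTIC_SYNC_FEATURE_FLAG": "True", "ACOUSTIC_RETRY_LIMIT": "3"}
)
```

Because both feature flags default to False, a deployment that sets none of these variables neither runs the sync job nor queues contacts for Acoustic.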

Other Changes

  • Update from Python 3.8.2 to 3.8.10, and update several dependencies to recent versions.
  • Refresh documentation, and get auto-generated docs working again.
  • Add mypy.ini, and start stricter type checking on some files