Releases: navapbc/template-infra
v0.6.1
Summary
Layer | Has changes | Needs migration |
---|---|---|
Account | | |
Network | ✅ | 🚚 (optional) |
Build repository | | |
Database | | |
Service | | |
Fixes
- Add route table associations to fix S3 Gateway VPC endpoint and remove NAT gateway by @lorenyu in #499
Tech debt
- Move VPC endpoints into network module by @lorenyu in #502
- Filter has_database setting to applications in the network (relevant for multi-app projects) by @lorenyu in #502
Documentation
- Remove link to live running app. by @anybodys in #498
- Update network setup instructions by @lorenyu in #502
- Add comments to network-related settings in app-config by @lorenyu in #502
None of these are breaking changes, but you can have a smoother upgrade by following the migration notes below, which move resources rather than destroying and recreating them.
Migration notes
First initialize the network you want to migrate/update
terraform -chdir=infra/networks init -reconfigure -backend-config=<NETWORK_NAME>.s3.tfbackend
Then move the VPC endpoints over to where they now reside
terraform -chdir=infra/networks state mv 'aws_security_group.aws_services[0]' 'module.network.aws_security_group.aws_services'
terraform -chdir=infra/networks state mv 'aws_vpc_endpoint.aws_service["ecr.api"]' 'module.network.aws_vpc_endpoint.interface["ecr.api"]'
terraform -chdir=infra/networks state mv 'aws_vpc_endpoint.aws_service["ecr.dkr"]' 'module.network.aws_vpc_endpoint.interface["ecr.dkr"]'
terraform -chdir=infra/networks state mv 'aws_vpc_endpoint.aws_service["kms"]' 'module.network.aws_vpc_endpoint.interface["kms"]'
terraform -chdir=infra/networks state mv 'aws_vpc_endpoint.aws_service["logs"]' 'module.network.aws_vpc_endpoint.interface["logs"]'
terraform -chdir=infra/networks state mv 'aws_vpc_endpoint.aws_service["secretsmanager"]' 'module.network.aws_vpc_endpoint.interface["secretsmanager"]'
terraform -chdir=infra/networks state mv 'aws_vpc_endpoint.aws_service["ssm"]' 'module.network.aws_vpc_endpoint.interface["ssm"]'
terraform -chdir=infra/networks state mv 'aws_vpc_endpoint.aws_service["s3"]' 'module.network.aws_vpc_endpoint.gateway["s3"]'
Then apply the rest of the terraform changes
make infra-update-network NETWORK_NAME=<NETWORK_NAME>
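To double-check that the moves took, you can list the network layer's state and confirm the endpoints now live under the network module (optional; the addresses assume the moves above):
terraform -chdir=infra/networks state list | grep module.network.aws_vpc_endpoint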
Full Changelog: v0.6.0...v0.6.1
v0.6.0 🚨 Breaking changes
Layer | Has changes | Needs migration |
---|---|---|
Account | | |
Network | ✅ | |
Build repository | | |
Database | ✅ | 🚚 |
Service | ✅ | 🚚 |
🚨 Breaking changes
See migration notes below for how to migrate from v0.5.0
Infra layers with changes that need to be applied
- Network
New and updated functionality
- Create non-default VPC in network layer by @lorenyu in #496
- Lint markdown files for broken links. by @anybodys in #497
Documentation
- Add more documentation by @rocketnova in #471
- Document database access control by @lorenyu in #495
Migration notes
The following migration steps cause downtime. You will want to set a maintenance window of a few hours in which to perform these operations. A zero-downtime migration may be possible, but it would likely be much more complex. If that is required, please consult @lorenyu to discuss.
1. Creating your VPC
Configure and create a new non-default VPC for your application environment
make infra-configure-network NETWORK_NAME=<NETWORK_NAME>
make infra-update-network NETWORK_NAME=<NETWORK_NAME>
The plan will list 38 resources, including the VPC itself, public and private subnets, a NAT gateway, route table entries for the NAT gateway, and VPC endpoints for a variety of AWS services: at a minimum ECR, S3, and CloudWatch Logs. If you have a database layer, the plan will also include KMS, SSM, and Secrets Manager VPC endpoints.
Now you can move on to migrating the application environments to the new VPC. Note that the build repository layer is not associated with a VPC and therefore does not need to be migrated.
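If you want to confirm the endpoints after the apply, the AWS CLI can list them for the new VPC (an optional sketch; substitute <VPC_ID> with the ID of the VPC that was just created, visible in the terraform output or the AWS console):
aws ec2 describe-vpc-endpoints \
  --filters "Name=vpc-id,Values=<VPC_ID>" \
  --query 'VpcEndpoints[].{Service:ServiceName,Type:VpcEndpointType}'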
2. Migrating the database layer to the new VPC
(You can skip this step if you don't have a database)
Before you do this step, make sure you have database backups if you want to keep your data.
First, manually delete the role manager Lambda function and tell terraform to "forget" the role manager security group. The security group can only be deleted after the elastic network interfaces associated with the Lambda function have been deleted; AWS deletes those automatically a few hours after the function is deleted, and it can theoretically take up to 24 hours. To expedite the migration, do these steps now and delete the orphaned security group later.
- Get the role manager function name, then delete it:
ROLE_MANAGER_FUNCTION_NAME=$(terraform -chdir=infra/<APP_NAME>/database show -json | jq -r '.values.root_module.child_modules[] | .resources[] | select(.address == "module.database.aws_lambda_function.role_manager").values.function_name')
aws lambda delete-function --function-name "$ROLE_MANAGER_FUNCTION_NAME"
- Get the security group ID for the role manager:
ROLE_MANAGER_SECURITY_GROUP_ID=$(terraform -chdir=infra/<APP_NAME>/database show -json | jq -r '.values.root_module.child_modules[] | .resources[] | select(.address == "module.database.aws_security_group.role_manager").values.id')
echo "$ROLE_MANAGER_SECURITY_GROUP_ID"
- Now tell terraform to forget about this security group:
terraform -chdir=infra/<APP_NAME>/database state rm module.database.aws_security_group.role_manager
You now need to modify the database to disable deletion protection since the database needs to be recreated.
- Disable deletion protection on the database cluster:
DB_CLUSTER_ID=$(terraform -chdir=infra/<APP_NAME>/database show -json | jq -r '.values.root_module.child_modules[] | .resources[] | select(.address == "module.database.aws_rds_cluster.db").values.id')
aws --no-cli-pager rds modify-db-cluster --db-cluster-identifier "$DB_CLUSTER_ID" --no-deletion-protection --apply-immediately
- Now apply the remaining changes to the database layer. This will delete your cluster:
make infra-update-app-database APP_NAME=<APP_NAME> ENVIRONMENT=<ENVIRONMENT>
- Eventually, in a few hours or the following day, delete the orphaned security group (see the readiness check at the end of this list):
aws ec2 delete-security-group --group-id "$ROLE_MANAGER_SECURITY_GROUP_ID"
- Now that your cluster is created, run the role manager to create the roles in the new database:
make infra-update-app-database-roles APP_NAME=<APP_NAME> ENVIRONMENT=<ENVIRONMENT>
- Then check that the roles were created successfully:
make infra-check-app-database-roles APP_NAME=<APP_NAME> ENVIRONMENT=<ENVIRONMENT>
- If you need to load data from backups, now is a good time.
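Readiness check for deleting the orphaned security group: the delete will fail while the Lambda's elastic network interfaces still exist, so you can first list any interfaces that still reference the group (an optional sketch; it assumes ROLE_MANAGER_SECURITY_GROUP_ID is still set from the earlier step):
aws ec2 describe-network-interfaces \
  --filters "Name=group-id,Values=$ROLE_MANAGER_SECURITY_GROUP_ID" \
  --query 'NetworkInterfaces[].NetworkInterfaceId'
An empty list means the security group is safe to delete.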
3. Migrating the service layer to the new VPC
- Manually delete the application load balancer before applying changes to the service layer. Otherwise you get an error where terraform fails to attach the security group from the new VPC to the load balancer, which is still in the old VPC:
LOAD_BALANCER_ARN=$(terraform -chdir=infra/<APP_NAME>/service show -json | jq -r '.values.root_module.child_modules[] | .resources[] | select(.address == "module.service.aws_lb.alb").values.arn')
aws elbv2 delete-load-balancer --load-balancer-arn "$LOAD_BALANCER_ARN"
- Then apply the rest of the service layer changes:
make infra-update-app-service APP_NAME=<APP_NAME> ENVIRONMENT=<ENVIRONMENT>
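After the apply, you can optionally confirm that a new load balancer was created and is active (a sketch; this lists every load balancer in the account and region, so pick yours out by name):
aws elbv2 describe-load-balancers \
  --query 'LoadBalancers[].{Name:LoadBalancerName,State:State.Code,VPC:VpcId}'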
This is a lot, so if you run into trouble, don't hesitate to reach out for help.
Full Changelog: v0.5.0...v0.6.0
v0.5.0
Summary
Layer | Has changes | Needs migration |
---|---|---|
Account | | |
Network | | |
Build repository | | |
Database | ✅ | |
Service | ✅ | |
New functionality
- Add feature flag system by @lorenyu in #483
- Add ability to pass in extra environment variables to the service by @lorenyu in #483
- Add ability to pass in extra IAM policies to be attached to the service by @lorenyu in #483
- Enable db query logging by @lorenyu in #479
Documentation
- Feature flags system design ADR by @lorenyu in #482
- Update some documentation around APP_NAME by @rocketnova in #484
Full Changelog: v0.4.0...v0.5.0
v0.4.0 🚨 Breaking changes
Summary
Layer | Has changes | Needs migration |
---|---|---|
Account | | |
Network | ✅ | |
Build repository | | |
Database | ✅ | 🚚 |
Service | ✅ | |
🚨 Breaking changes
- Use secrets manager to manage db master password by @charles-nava and @lorenyu in #461, #469, #476
See migration notes below for how to migrate from v0.3.0
New and updated functionality
- Use secrets manager to manage db master password by @charles-nava and @lorenyu in #461, #469, #476
- Update CI vulnerability scans to work with multiple apps by @daphnegold in #454
Fixes
- Add concurrency limit to build and publish workflow by @charles-nava in #451 — Fixes race condition when multiple runs of the workflow run on the same branch
- Install role manager dependencies on every terraform apply by @charles-nava in #452 — Fixes missing dependencies in deployed role manager lambda function when a second engineer deploys without modifying requirements.txt
- Shorten database IAM role name prefixes due to character limits by @rocketnova in #472 — Fixes errors with database layer when using long app names, environment names, and/or workspace names
Tech debt
- Refactor env configs into separate blocks by @lorenyu in #437
- Replace `terraform refresh` (deprecated) with `terraform apply -refresh-only` by @charles-nava in #449
- Remove redundant comment by @rocketnova in #450
- Rename ci-app-vulnerability-scans.yml.yml to ci-app-vulnerability-scans.yml by @daphnegold in #465
Migration notes
If the database cluster already exists and has `manage_master_user_password` set to `false`, then running `terraform plan` on the database layer will fail with an error.
In order to migrate, perform the following steps:
- First, do a targeted apply of the aws_rds_cluster by running the following command (replace <ENVIRONMENT_NAME> with the correct environment):
TF_CLI_ARGS_apply='-target="module.database.aws_rds_cluster.db"' make infra-update-app-database APP_NAME=app ENVIRONMENT=<ENVIRONMENT_NAME>
- Then you can apply the rest of the changes normally with:
make infra-update-app-database APP_NAME=app ENVIRONMENT=<ENVIRONMENT_NAME>
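For context, TF_CLI_ARGS_apply is an environment variable that terraform itself reads and appends to any terraform apply it runs, which is how the make target above ends up performing a targeted apply. A roughly equivalent direct invocation would be (a sketch; it assumes the database layer's backend has already been initialized for the environment):
terraform -chdir=infra/app/database apply -target='module.database.aws_rds_cluster.db'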
Full Changelog: v0.3.0...v0.4.0
v0.3.0 🚨 Breaking changes
Summary
Layer | Has changes | Needs migration |
---|---|---|
Account | | |
Network | | |
Build repository | ✅ | |
Database | ✅ | 🚚 |
Service | ✅ | 🚚 |
🚨 Breaking changes
- Separate database access policy into separate policies for application (which has read/write access) and migrations (which has create access) by @charles-nava and @lorenyu in #422, #411, #426, #429, #434, #440, #442, #443, #444 (Note: This change only affects applications that have `has_database` set to `true`)
See migration notes below for how to migrate from v0.2.0
Architecture design changes
- ADR: Consolidate infra configuration from .tfvars files into config module
- Consolidate build-repository config by @lorenyu in #432
- Consolidate database and service configs by @lorenyu in #433, #435, #436, and #441
Note: With this architecture change, you can now delete the `infra/<APP_NAME>/database/*.tfvars` and `infra/<APP_NAME>/service/*.tfvars` files (see /docs/decisions/infra/0008-consolidate-infra-config-from-tfvars-files-into-config-module.md for more info)
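Concretely, the cleanup is just the following (replace <APP_NAME> as usual; only run this after upgrading to the consolidated config module):
rm infra/<APP_NAME>/database/*.tfvars infra/<APP_NAME>/service/*.tfvars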
New and updated functionality
- Separate database access policy into separate policies for application (which has read/write access) and migrations (which has create access) by @charles-nava and @lorenyu in #422, #411, #426, #429, #434, #440, #442, #443, #444
- Tail CloudWatch logs when running ECS tasks using run-command.sh by @lorenyu in #413 and #423
- Add GitHub action linter by @lorenyu in #419
- Add linter for scripts by @lorenyu in #420
- Add ability to check database roles configuration by @lorenyu in #416
Tech debt and maintenance
- Update Makefile .PHONY list by @lorenyu in #410
- Bump github.com/hashicorp/go-getter from 1.6.1 to 1.7.0 in /infra/test by @dependabot in #225
- Use v3 of aws-actions/configure-aws-credentials by @lorenyu in #417
- Group logs in infra CI service tests by @lorenyu in #430
- Remove cd.yml and ci-infra.yml from update-template exclude list by @lorenyu in #414
Documentation
- Fix broken link; improve template application usage instructions by @NavaTim in #395
- Add update template instructions by @lorenyu in #409
- Fix hadolint typo in vulnerability-management.md by @yoomlam in #396
Migration notes
Instructions for migrating from v0.2.0
Do the following for each environment:
Step 1: Create new database policies
Do a targeted terraform apply to create the two new policies for the application service's access to the database and the migrator's access to the database. Replace `<APP_NAME>` and `<ENVIRONMENT>` with the name of your app and the environment you are updating.
TF_CLI_ARGS_apply='-target module.database.aws_iam_policy.app_db_access -target module.database.aws_iam_policy.migrator_db_access' make infra-update-app-database APP_NAME="<APP_NAME>" ENVIRONMENT="<ENVIRONMENT>"
Verify that the roles and policies are properly configured by running the database role check:
make infra-check-app-database-roles APP_NAME="<APP_NAME>" ENVIRONMENT="<ENVIRONMENT>"
Note: If the lambda function times out, try running it a second time.
Step 2: Update the service layer to use the new policies
make infra-update-app-service APP_NAME="<APP_NAME>" ENVIRONMENT="<ENVIRONMENT>"
Note: If you see the following error, try re-running `make infra-update-app-service`:
│ Error: creating IAM Role (portal-dev-migrator): EntityAlreadyExists: Role with name portal-dev-migrator already exists.
│ status code: 409, request id: 729ac1c2-4972-4da6-b88a-6d2aab546b7c
│
│ with module.service.aws_iam_role.migrator_task[0],
│ on ../../modules/service/access-control.tf line 15, in resource "aws_iam_role" "migrator_task":
│ 15: resource "aws_iam_role" "migrator_task" {
│
Verify that the service is still healthy in the AWS console.
Verify that the migrator can still run migrations by running the database migrations:
make release-run-database-migrations APP_NAME="<APP_NAME>" ENVIRONMENT="<ENVIRONMENT>"
Note: This step assumes that you have deployed using the current commit, which doesn't work if you are testing on a yet-to-be-deployed branch. You can fix this by passing in the IMAGE_TAG argument with the latest tag in ECR:
make release-run-database-migrations APP_NAME="<APP_NAME>" ENVIRONMENT="<ENVIRONMENT>" IMAGE_TAG="<latest tag in ECR>"
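One way to look up the most recent tag is to ask ECR directly (a sketch; the repository name here is hypothetical, so substitute your app's actual ECR repository):
aws ecr describe-images --repository-name <APP_NAME>-repo \
  --query 'sort_by(imageDetails, &imagePushedAt)[-1].imageTags[0]' --output text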
Step 3: Apply the rest of the database layer changes to remove the old policy
make infra-update-app-database APP_NAME="<APP_NAME>" ENVIRONMENT="<ENVIRONMENT>"
New Contributors
- @NavaTim made their first contribution in #395
- @dependabot made their first contribution in #225
- @yoomlam made their first contribution in #396
Full Changelog: v0.2.0...v0.3.0
v0.2.0 🚨 Breaking changes
🚨 Breaking changes
See migration notes below for how to migrate from v0.1.0
New and updated functionality
- Lock down PostgreSQL database's public schema by @aplybeah in #387
- Remove ALL privileges from PostgreSQL database's PUBLIC role by @aplybeah in #387
- Require APP_NAME in relevant make commands by @daphnegold in #392, #394, and #398
- Tweak CD workflow name by @lorenyu in #399
- Remove versioning on access log bucket by @charles-nava in #400
- Store template version in a file on project by @lorenyu in #402
- Use git patch for applying template changes by @lorenyu in #403 and #404
Fixes
- Fix support for multiple environments on projects that have a database (moving VPC endpoints from the database module to the network module fixes this)
Tech debt
- Organize the database module into logical tf files by @aplybeah in #384
- Organize the service module into logical tf files by @aplybeah in #386
- Refactor infra tests to prepare for db tests by @lorenyu in #388
Documentation
- [ISSUE 389] Update set-up-ci.md by @daphnegold in #390
- Add instructions for setting up monitoring alerts by @gingeririna in #359
Migration notes
Instructions for migrating from v0.1.0
Step 0: Initialize the terraform module to work with the correct environment
First, initialize the correct backend for the environment you are working with. If you are working with the dev environment, run:
terraform -chdir=infra/app/database init -backend-config=dev.s3.tfbackend
Step 1: Configure the new network module and initialize terraform
make infra-configure-network
terraform -chdir=infra/networks init -backend-config=default.s3.tfbackend
At this point you may check in the tfbackend file:
git add infra/networks/default.s3.tfbackend
git commit
Step 2: Get the VPC endpoint IDs from the database module
KMS_VPC_ENDPOINT_ID=$(terraform -chdir=infra/app/database show -json | jq -r '.values.root_module.child_modules[] | .resources[] | select(.address == "module.database.aws_vpc_endpoint.kms").values.id')
echo $KMS_VPC_ENDPOINT_ID
SSM_VPC_ENDPOINT_ID=$(terraform -chdir=infra/app/database show -json | jq -r '.values.root_module.child_modules[] | .resources[] | select(.address == "module.database.aws_vpc_endpoint.ssm").values.id')
echo $SSM_VPC_ENDPOINT_ID
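As an aside, the jq filter used here (and throughout these migration notes) walks terraform's JSON state: it iterates the child modules under the root module, flattens their resources, selects the one whose address matches, and prints one of its attribute values. The same pattern works for any attribute; for example, to get the database cluster's ARN (an illustrative sketch):
terraform -chdir=infra/app/database show -json | jq -r '.values.root_module.child_modules[] | .resources[] | select(.address == "module.database.aws_rds_cluster.db").values.arn'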
Step 3: Import the VPC endpoint IDs into the network module
terraform -chdir=infra/networks import 'aws_vpc_endpoint.aws_service["kms"]' $KMS_VPC_ENDPOINT_ID
terraform -chdir=infra/networks import 'aws_vpc_endpoint.aws_service["ssm"]' $SSM_VPC_ENDPOINT_ID
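You can optionally confirm the imports landed in the network layer's state before moving on:
terraform -chdir=infra/networks state list | grep aws_vpc_endpoint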
Step 4: Finally, create the rest of the network resources, i.e. the vpc_endpoints security group
make infra-update-network
Step 5: Remove the VPC endpoints from the database module's state file now that they have been migrated to the network module
terraform -chdir=infra/app/database state rm module.database.aws_vpc_endpoint.kms
terraform -chdir=infra/app/database state rm module.database.aws_vpc_endpoint.ssm
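Optionally verify the endpoints are gone from the database layer's state (the grep should print nothing if the removals succeeded):
terraform -chdir=infra/app/database state list | grep aws_vpc_endpoint || true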
Step 6: Apply the rest of the changes to the database module, first updating the egress rule to no longer reference the vpc_endpoints security group so that it can be destroyed properly
terraform -chdir=infra/app/database apply -var-file=dev.tfvars -target=module.database.aws_vpc_security_group_egress_rule.role_manager_egress_to_vpc_endpoints
make infra-update-app-database APP_NAME=app ENVIRONMENT=dev
New Contributors
- @daphnegold made their first contribution in #390
Full Changelog: v0.1.0...v0.2.0
v0.1.0
New functionality
- Add load balancer access logs by @charles-nava in #362 and #380
Documentation
- Update set-up-database.md to clarify that APP_NAME can be changed by @KevinJBoyer in #372
- Add instruction to update account name in app config by @lorenyu in #374 and #375
- Explain how to fix default table permissions in app by @lorenyu and @sawyerh in #377
Fixes
- Fix triggering of HTTP 500 alerts when app is quiet by @gingeririna in #369
New Contributors
- @KevinJBoyer made their first contribution in #372
- @charles-nava made their first contribution in #362
Full Changelog: 2023-08-08...2023-08-18
2023-08-08
2023-08-03
Notable recent changes
- ADR - Database Migration Method by @Nava-JoshLong in #295
- Clarify PR template instructions by @lorenyu in #304
- 329 add email and external incident management tools integrations by @gingeririna in #337
- Group verbose logs by @aplybeah in #360
- Document template development workflow by @lorenyu in #363
- Update instructions based on an initial project test run by @sawyerh in #364
- Remove workaround for aws_s3_bucket_logging race condition by @lorenyu in #361
- Fix error when re-running account setup by @lorenyu in #365
New Contributors
- @rocketnova made their first contribution in #1
- @aplybeah made their first contribution in #2
- @karinamzalez made their first contribution in #4
- @shawnvanderjagt made their first contribution in #16
- @lorenyu made their first contribution in #47
- @sawyerh made their first contribution in #78
- @danielleswyhart made their first contribution in #172
- @Nava-JoshLong made their first contribution in #203
- @navams made their first contribution in #246
- @krishnaduttPanchagnula made their first contribution in #289
- @gingeririna made their first contribution in #336
Full Changelog: https://github.com/navapbc/template-infra/commits/2023-08-03