feat!: Implement a data copy command that copies data cross-VPC #565

Merged: 44 commits, Oct 16, 2024
1c4dc7c
Added temporary new dir with new data copy functionality
antroy-madetech Sep 16, 2024
116a9a0
Tidy up commands in the entrypoint
antroy-madetech Sep 16, 2024
3fbdca8
WIP
antroy-madetech Sep 27, 2024
ee7307d
add validation for database copy configuration
ksugden Oct 3, 2024
2fe11ba
Added checks to ensure that you can only copy to and from existing en…
antroy-madetech Oct 3, 2024
8777278
Do not allow database copy to and from the same environment
ksugden Oct 3, 2024
4985f21
Linting
ksugden Oct 3, 2024
e5c2674
Added prod env copy restriction
antroy-madetech Oct 3, 2024
e655443
Fix the structure of the to and from list
antroy-madetech Oct 3, 2024
c704a4f
Account for multiple database_copy sections
antroy-madetech Oct 3, 2024
83c3649
Add breaking tests for the multi-postgres case
antroy-madetech Oct 3, 2024
7d25ba8
Multi-extension support added.
antroy-madetech Oct 4, 2024
944812b
Wired in the new validation function into the main validate_platform_…
antroy-madetech Oct 4, 2024
80fd015
renamed test
antroy-madetech Oct 4, 2024
3e5fada
Refactoring get_connection_string out of the database command. Also s…
antroy-madetech Oct 8, 2024
2327324
Added a function to get a Vpc
antroy-madetech Oct 8, 2024
255f815
Handle the case where no matching vpcs found
antroy-madetech Oct 8, 2024
45d5ce5
Failure case around no VPC Id
antroy-madetech Oct 8, 2024
ec3ea54
Missing subnets case
antroy-madetech Oct 8, 2024
fad95bc
No matching security groups case
antroy-madetech Oct 8, 2024
eaffc82
Added a first stab at the run_database_copy_task function
antroy-madetech Oct 10, 2024
b30b436
Added load case to test/run_database_copy_task function
antroy-madetech Oct 10, 2024
0fdcdd4
Add tests for database_load and database_dump
ksugden Oct 10, 2024
738c19a
Refactor: extract private execute function
ksugden Oct 10, 2024
30bf812
Fix import and patching weirdness by moving database helper into own …
ksugden Oct 10, 2024
75c6e11
Wired the new command class through into click
antroy-madetech Oct 10, 2024
f0570f0
Dump command complete
antroy-madetech Oct 10, 2024
8eb81db
Add load command
ksugden Oct 11, 2024
10a4075
Now getting subnets from the route_tables rather than the vpc
antroy-madetech Oct 11, 2024
532446f
Add function for confirming database is ready to load
ksugden Oct 11, 2024
2d8a836
Refactored method signatures
antroy-madetech Oct 11, 2024
163c167
Add test for is_confirmed_ready_to_load negative case
ksugden Oct 11, 2024
4d99e22
Moved input_fn into class constructor
antroy-madetech Oct 11, 2024
84d9a05
Added in user confirmation
antroy-madetech Oct 11, 2024
0bc402e
Using click.prompt instead of builtin input
antroy-madetech Oct 11, 2024
1c7fb38
Added wait functionality into the db copy commands
antroy-madetech Oct 11, 2024
8561611
Merge branch 'main' into DBTP-1109-DataCopy
antroy-madetech Oct 11, 2024
c217a44
Updated entrypoint to use a text format for the dump file rather than…
antroy-madetech Oct 14, 2024
645d7aa
fix: make bucket and task names non-clashing (#598)
acodeninja Oct 14, 2024
34b4046
Fix test for new task name
antroy-madetech Oct 14, 2024
25d44bd
Wired in logs
antroy-madetech Oct 14, 2024
6ef0af6
Better messaging
antroy-madetech Oct 14, 2024
d0f8ed3
Tidy up
antroy-madetech Oct 14, 2024
65d5466
Removed commented out old database copy command
antroy-madetech Oct 16, 2024
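
Several of the commits above describe validation of the database copy configuration: both environments must exist, the source and target must differ, and there is a production restriction. A minimal sketch of checks of that kind follows. The function name, the `from`/`to` keys and the error wording are illustrative assumptions, not taken from this PR's code. The sketch also assumes the restriction blocks copying into `prod`, as the removed `copy` command did.

```python
from typing import Dict, Iterable, List


def validate_database_copy(
    copies: Iterable[Dict[str, str]], known_envs: Iterable[str]
) -> List[str]:
    """Return a list of error messages; an empty list means the config passed.

    Hypothetical helper: the name, keys and messages are assumptions.
    """
    envs = set(known_envs)
    errors = []
    for copy in copies:
        from_env, to_env = copy.get("from"), copy.get("to")
        # Both ends of the copy must be environments that actually exist.
        for label, env in (("from", from_env), ("to", to_env)):
            if env not in envs:
                errors.append(f"'{label}' environment {env!r} does not exist")
        # Copying an environment onto itself is rejected.
        if from_env == to_env:
            errors.append(f"cannot copy {from_env!r} to itself")
        # Assumption: the production restriction blocks copying *into* prod.
        if to_env == "prod":
            errors.append("cannot copy into 'prod'")
    return errors
```

Returning a list rather than raising on the first problem lets a caller report every misconfigured section at once, which fits the "multiple database_copy sections" case mentioned in the commits.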
56 changes: 47 additions & 9 deletions dbt_platform_helper/COMMANDS.md
@@ -40,7 +40,8 @@
 - [platform-helper notify environment-progress](#platform-helper-notify-environment-progress)
 - [platform-helper notify add-comment](#platform-helper-notify-add-comment)
 - [platform-helper database](#platform-helper-database)
-- [platform-helper database copy](#platform-helper-database-copy)
+- [platform-helper database dump](#platform-helper-database-dump)
+- [platform-helper database load](#platform-helper-database-load)
 - [platform-helper version](#platform-helper-version)
 - [platform-helper version get-platform-helper-for-project](#platform-helper-version-get-platform-helper-for-project)

@@ -964,7 +965,7 @@ platform-helper notify add-comment <slack_channel_id> <slack_token>
 ## Usage

 ```
-platform-helper database copy
+platform-helper database (dump|load)
 ```

 ## Options
@@ -974,27 +975,64 @@ platform-helper database copy

 ## Commands

-- [`copy`](#platform-helper-database-copy)
+- [`dump`](#platform-helper-database-dump)
+- [`load`](#platform-helper-database-load)

-# platform-helper database copy
+# platform-helper database dump

 [↩ Parent](#platform-helper-database)

-Copy source database to target database.
+Dump a database into an S3 bucket.

 ## Usage

 ```
-platform-helper database copy <source_db> <target_db>
+platform-helper database dump --account-id <account_id> --app <application>
+--env <environment> --database <database>
+--vpc-name <vpc_name>
 ```

-## Arguments
+## Options
+
+- `--account-id <text>`
+
+- `--app <text>`
+
+- `--env <text>`
+
+- `--database <text>`

-- `source_db <text>`
-- `target_db <text>`
+- `--vpc-name <text>`

+- `--help <boolean>` _Defaults to False._
+- Show this message and exit.
+
+# platform-helper database load
+
+[↩ Parent](#platform-helper-database)
+
+Load a database from an S3 bucket.
+
+## Usage
+
+```
+platform-helper database load --account-id <account_id> --app <application>
+--env <environment> --database <database>
+--vpc-name <vpc_name>
+```
+
 ## Options

+- `--account-id <text>`
+
+- `--app <text>`
+
+- `--env <text>`
+
+- `--database <text>`
+
+- `--vpc-name <text>`
+
 - `--help <boolean>` _Defaults to False._
 - Show this message and exit.
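
The documented `dump` and `load` commands take the same five required options, so a copy between environments is presumably a `dump` from the source followed by a `load` into the target. The sketch below only builds the argument lists for those two invocations. The helper name and every value (account IDs, application, environment, database and VPC names) are placeholders, and the idea that source and target use different accounts and VPCs is an assumption drawn from the PR title.

```python
import shlex
from typing import List


def database_command(
    action: str, account_id: str, app: str, env: str, database: str, vpc_name: str
) -> List[str]:
    """Build the argv for `platform-helper database dump` or `load`.

    Hypothetical helper; only the command and option names come from COMMANDS.md.
    """
    if action not in ("dump", "load"):
        raise ValueError(f"unsupported action: {action}")
    return [
        "platform-helper", "database", action,
        "--account-id", account_id,
        "--app", app,
        "--env", env,
        "--database", database,
        "--vpc-name", vpc_name,
    ]


# Placeholder values: dump from one environment, then load into another.
dump_cmd = database_command(
    "dump", "000000000000", "my-app", "staging", "my-postgres", "staging-vpc"
)
load_cmd = database_command(
    "load", "111111111111", "my-app", "dev", "my-postgres", "dev-vpc"
)

# shlex.join renders each list as a copy-pasteable shell command line.
print(shlex.join(dump_cmd))
print(shlex.join(load_cmd))
```

Either list can be passed to `subprocess.run(cmd, check=True)` to run the command; keeping the arguments as a list avoids shell quoting problems.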

6 changes: 4 additions & 2 deletions dbt_platform_helper/commands/conduit.py
@@ -11,7 +11,9 @@

 from dbt_platform_helper.utils.application import Application
 from dbt_platform_helper.utils.application import load_application
-from dbt_platform_helper.utils.aws import update_postgres_parameter_with_master_secret
+from dbt_platform_helper.utils.aws import (
+    get_postgres_connection_data_updated_with_master_secret,
+)
 from dbt_platform_helper.utils.click import ClickDocOptCommand
 from dbt_platform_helper.utils.messages import abort_with_error
 from dbt_platform_helper.utils.platform_config import is_terraform_project
@@ -171,7 +173,7 @@ def create_postgres_admin_task(
         Name=master_secret_name, WithDecryption=True
     )["Parameter"]["Value"]
     connection_string = json.dumps(
-        update_postgres_parameter_with_master_secret(
+        get_postgres_connection_data_updated_with_master_secret(
             session, read_only_secret_name, master_secret_arn
         )
     )
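
The renamed helper is called with a session, a parameter name and a master secret ARN, and its result is serialised with `json.dumps`, so it evidently returns connection data rather than updating anything in place. Its body is not part of this diff. The sketch below is an assumption about the general idea: overlay the master credentials onto the stored connection data, then format a connection string in the shape used by the removed `get_connection_string`. Both function names are hypothetical, the AWS lookups are replaced by plain JSON strings, and the percent-encoding of the credentials is an addition of this sketch.

```python
import json
from typing import Any, Dict
from urllib.parse import quote


def connection_data_with_master_secret(
    parameter_value: str, master_secret_value: str
) -> Dict[str, Any]:
    """Overlay the master username/password onto stored connection data.

    Hypothetical stand-in: the real helper fetches both values from AWS.
    """
    data = json.loads(parameter_value)
    secret = json.loads(master_secret_value)
    data["username"] = secret["username"]
    data["password"] = secret["password"]
    return data


def connection_string(conn: Dict[str, Any]) -> str:
    """Format a postgres:// URL from the connection data."""
    # Percent-encode credentials so characters like '@' or '/' keep the URL valid.
    user = quote(str(conn["username"]), safe="")
    password = quote(str(conn["password"]), safe="")
    return f"postgres://{user}:{password}@{conn['host']}:{conn['port']}/{conn['dbname']}"
```

Separating the merge from the formatting keeps both steps testable without an AWS session.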
141 changes: 26 additions & 115 deletions dbt_platform_helper/commands/database.py
@@ -1,122 +1,33 @@
-import subprocess
-from typing import List
-
 import click

-from dbt_platform_helper.commands.conduit import add_stack_delete_policy_to_task_role
-from dbt_platform_helper.commands.conduit import addon_client_is_running
-from dbt_platform_helper.commands.conduit import connect_to_addon_client_task
-from dbt_platform_helper.commands.conduit import get_cluster_arn
-from dbt_platform_helper.commands.conduit import normalise_secret_name
-from dbt_platform_helper.utils.application import load_application
-from dbt_platform_helper.utils.aws import get_aws_session_or_abort
-from dbt_platform_helper.utils.aws import update_postgres_parameter_with_master_secret
+from dbt_platform_helper.commands.database_helpers import DatabaseCopy
 from dbt_platform_helper.utils.click import ClickDocOptGroup
-from dbt_platform_helper.utils.versioning import (
-    check_platform_helper_version_needs_update,
-)


 @click.group(chain=True, cls=ClickDocOptGroup)
 def database():
-    check_platform_helper_version_needs_update()
-
-
-@database.command(name="copy")
-@click.argument("source_db", type=str, required=True)
-@click.argument("target_db", type=str, required=True)
-def copy(source_db: str, target_db: str):
-    """Copy source database to target database."""
-    app = None
-    source_env = None
-    target_env = None
-
-    for tag in get_database_tags(source_db):
-        if tag["Key"] == "copilot-application":
-            app = tag["Value"]
-        if tag["Key"] == "copilot-environment":
-            source_env = tag["Value"]
-        if app is not None:
-            break
-
-    for tag in get_database_tags(target_db):
-        if tag["Key"] == "copilot-environment":
-            target_env = tag["Value"]
-            break
-
-    if not app or not source_env or not target_env:
-        click.secho(f"""Required database tags not found.""", fg="red")
-        exit(1)
-
-    if target_env == "prod":
-        click.secho(f"""The target database cannot be a production database.""", fg="red")
-        exit(1)
-
-    if source_db == target_db:
-        click.secho(f"""Source and target databases are the same.""", fg="red")
-        exit(1)
-
-    if not click.confirm(
-        click.style("Copying data from ", fg="yellow")
-        + click.style(f"{source_db} ", fg="white", bold=True)
-        + click.style(f"in environment {source_env} to ", fg="yellow", bold=True)
-        + click.style(f"{target_db} ", fg="white", bold=True)
-        + click.style(f"in environment {target_env}\n", fg="yellow", bold=True)
-        + click.style("Do you want to continue?", fg="yellow"),
-    ):
-        exit()
-
-    click.echo(f"""Starting task to copy data from {source_db} to {target_db}""")
-
-    source_db_connection = get_connection_string(app, source_env, source_db)
-    target_db_connection = get_connection_string(app, target_env, target_db)
-
-    application = load_application(app)
-    cluster_arn = get_cluster_arn(application, source_env)
-    task_name = f"database-copy-{app}-{source_env}-{app}-{target_env}"
-
-    if not addon_client_is_running(application, source_env, cluster_arn, task_name):
-        subprocess.call(
-            f"copilot task run --app {app} --env {source_env} "
-            f"--task-group-name {task_name} "
-            f"--image public.ecr.aws/uktrade/tunnel:database-copy "
-            f"--env-vars SOURCE_DB_CONNECTION='{source_db_connection}',TARGET_DB_CONNECTION='{target_db_connection}' "
-            "--platform-os linux "
-            "--platform-arch arm64",
-            shell=True,
-        )
-        add_stack_delete_policy_to_task_role(application, source_env, task_name)
-    connect_to_addon_client_task(application, source_env, cluster_arn, task_name)
-
-
-def get_database_tags(db_identifier: str) -> List[dict]:
-    session = get_aws_session_or_abort()
-    rds = session.client("rds")
-
-    try:
-        db_instance = rds.describe_db_instances(DBInstanceIdentifier=db_identifier)["DBInstances"][
-            0
-        ]
-
-        return db_instance["TagList"]
-    except rds.exceptions.DBInstanceNotFoundFault:
-        click.secho(
-            f"""Database {db_identifier} not found. Check the database identifier.""", fg="red"
-        )
-        exit(1)
-
-
-def get_connection_string(app: str, env: str, db_identifier: str) -> str:
-    session = get_aws_session_or_abort()
-    addon_name = normalise_secret_name(db_identifier.split(f"{app}-{env}-", 1)[1])
-    connection_string_parameter = f"/copilot/{app}/{env}/secrets/{addon_name}_READ_ONLY_USER"
-    master_secret_name = f"/copilot/{app}/{env}/secrets/{addon_name}_RDS_MASTER_ARN"
-    master_secret_arn = session.client("ssm").get_parameter(
-        Name=master_secret_name, WithDecryption=True
-    )["Parameter"]["Value"]
-
-    conn = update_postgres_parameter_with_master_secret(
-        session, connection_string_parameter, master_secret_arn
-    )
-
-    return f"postgres://{conn['username']}:{conn['password']}@{conn['host']}:{conn['port']}/{conn['dbname']}"
+    pass
+
+
+@database.command(name="dump")
+@click.option("--account-id", type=str, required=True)
+@click.option("--app", type=str, required=True)
+@click.option("--env", type=str, required=True)
+@click.option("--database", type=str, required=True)
+@click.option("--vpc-name", type=str, required=True)
+def dump(account_id, app, env, database, vpc_name):
+    """Dump a database into an S3 bucket."""
+    data_copy = DatabaseCopy(account_id, app, env, database, vpc_name)
+    data_copy.dump()
+
+
+@database.command(name="load")
+@click.option("--account-id", type=str, required=True)
+@click.option("--app", type=str, required=True)
+@click.option("--env", type=str, required=True)
+@click.option("--database", type=str, required=True)
+@click.option("--vpc-name", type=str, required=True)
+def load(account_id, app, env, database, vpc_name):
+    """Load a database from an S3 bucket."""
+    data_copy = DatabaseCopy(account_id, app, env, database, vpc_name)
+    data_copy.load()
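
Both commands delegate to `DatabaseCopy`, whose implementation lives in `database_helpers` and is not shown in this diff. The commit messages mention a confirmation step before loading and an `input_fn` passed through the constructor. The sketch below illustrates that injection pattern only: the class name, the `input_fn` default, the prompt wording and the `tasks_run` list are assumptions, and no AWS task is started.

```python
from typing import Callable, List


class DatabaseCopySketch:
    """Hypothetical stand-in for DatabaseCopy; the real class is not shown here."""

    def __init__(
        self,
        account_id: str,
        app: str,
        env: str,
        database: str,
        vpc_name: str,
        input_fn: Callable[[str], str] = input,
    ):
        self.account_id = account_id
        self.app = app
        self.env = env
        self.database = database
        self.vpc_name = vpc_name
        # Injecting the prompt function lets tests answer without a terminal.
        self.input_fn = input_fn
        self.tasks_run: List[str] = []

    def _run(self, action: str) -> None:
        # The real command would start a task in AWS; this only records the request.
        self.tasks_run.append(f"{action}:{self.app}:{self.env}:{self.database}")

    def dump(self) -> None:
        self._run("dump")

    def is_confirmed_ready_to_load(self) -> bool:
        answer = self.input_fn(
            f"Ready to load data into {self.database} in {self.env}? (y/n) "
        )
        return answer.strip().lower() in ("y", "yes")

    def load(self) -> None:
        # Loading only goes ahead after an explicit confirmation.
        if self.is_confirmed_ready_to_load():
            self._run("load")
```

A test can pass `input_fn=lambda _: "n"` and assert that nothing was run, which is the kind of negative case the `is_confirmed_ready_to_load` commits describe.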