Merge branch 'reset-modes' adding different reset modes during import
This introduces support for "reset modes" -- different approaches for
ensuring a clean database before importing the fresh data. Initial
support is for three modes:

- `drop-database`: the default; drops the database & re-creates it.
- `drop-tables`: drops the tables the Django codebase is aware of,
  useful if the Django database user doesn't have access to drop the
  entire database.
- `none`: no attempt to reset the database, useful if the user has
  already manually configured the database or otherwise wants more
  control over setup.

Using the second of these in a staging-type environment is the main
motivation for this change.

This also updates the existing `PostgresSequences` extra to cope with
a sequence of a given name already being present by overwriting it.
This doesn't feel like the best solution (the result if the import
somehow lists the same sequence twice may be unexpected), but seemed
probably the simplest for now given what devdata is typically used for.

Fixes #23
PeterJCLaw committed Jan 8, 2024
2 parents 1d7afd1 + 729a19f commit d4a90d0
Showing 9 changed files with 499 additions and 70 deletions.
31 changes: 30 additions & 1 deletion README.md
@@ -104,6 +104,10 @@ Exporting, anonymising, and importing, are all configurable, and

#### Exporting

``` console
$ python manage.py devdata_export [dest] [app_label.ModelName ...]
```

This step allows a sync strategy to persist some data that will be used to
create a new development database. For example, the `QuerySetStrategy` can
export data from a table to a filesystem for later import.
@@ -124,12 +128,37 @@ step.

#### Importing

-This step is responsible for creating a new database and filling it. If any
``` console
$ python manage.py devdata_import [src]
```

This step is responsible for preparing the database and filling it. If any
exporting strategies have been used those must have run first, or their outputs
must have been downloaded if they are being shared/hosted somewhere.

Factory-based strategies generate data during this process.

##### Reset modes

``` console
$ python manage.py devdata_import --reset-mode=$MODE [src]
```

By default any existing database will be removed, ensuring that a fresh database
is created for the imported data. This is expected to be the most common case
for local development, but may not always be suitable.

The following modes are offered:

- `drop-database`: the default; drops the database & re-creates it.
- `drop-tables`: drops the tables the Django codebase is aware of, useful if the
  Django database user doesn't have access to drop the entire database.
- `none`: no attempt to reset the database, useful if the user has already
  manually configured the database or otherwise wants more control over setup.

See the docstrings in [`src/devdata/reset_modes.py`](src/devdata/reset_modes.py)
for more details.
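
For scripted use, such as the staging-type environment that motivated this change, the flags above might be assembled as follows. This is a minimal sketch and not part of the committed README: `build_import_command` and the `./devdata` source path are hypothetical, while `--reset-mode` and `--no-input` are the options defined by `devdata_import`.

```python
import sys
from typing import List, Optional


def build_import_command(
    src: str,
    reset_mode: Optional[str] = None,
    no_input: bool = False,
) -> List[str]:
    """Build an argv list for `manage.py devdata_import` (hypothetical helper)."""
    argv = [sys.executable, "manage.py", "devdata_import"]
    if reset_mode is not None:
        # Omitting the flag leaves the command's default, `drop-database`.
        argv.append(f"--reset-mode={reset_mode}")
    if no_input:
        # Skip the interactive confirmation, e.g. in a deploy script.
        argv.append("--no-input")
    argv.append(src)
    return argv


# "./devdata" is an assumed export directory, used purely for illustration.
command = build_import_command("./devdata", reset_mode="drop-tables", no_input=True)
print(" ".join(command[1:]))
```

The resulting list could then be handed to `subprocess.run(command, check=True)` from the project root.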

## Customising

#### Strategies
16 changes: 0 additions & 16 deletions src/devdata/engine.py
@@ -13,7 +13,6 @@
     disable_migrations,
     get_all_models,
     migrations_file_path,
-    nodb_cursor,
     progress,
     sort_model_strategies,
     to_app_model_label,
@@ -116,23 +115,8 @@ def export_extras(django_dbname, dest, no_update=False):


 def import_schema(src, django_dbname):
-    db_conf = settings.DATABASES[django_dbname]
-    pg_dbname = db_conf["NAME"]
-
     connection = connections[django_dbname]

-    with nodb_cursor(connection) as cursor:
-        cursor.execute("DROP DATABASE IF EXISTS {}".format(pg_dbname))
-
-        creator = connection.creation
-        creator._execute_create_test_db(
-            cursor,
-            {
-                "dbname": pg_dbname,
-                "suffix": creator.sql_table_creation_suffix(),
-            },
-        )
-
     with disable_migrations():
         call_command(
             "migrate",
25 changes: 25 additions & 0 deletions src/devdata/extras.py
@@ -64,6 +64,18 @@ def ensure_dir_exists(self, dest: Path) -> None:


 class PostgresSequences(ExtraExport, ExtraImport):
+    """
+    Export & import Postgres sequences.
+
+    This provides support for reproducing sequences of the same type and at the
+    same value in an imported database.
+
+    During import any existing sequence of the same name is silently removed and
+    replaced. This simplifies the interaction with each of the possible reset
+    modes and approximately matches how `loaddata` treats importing rows with
+    matching primary keys.
+    """
+
     def __init__(self, *args, name="postgres-sequences", **kwargs):
         super().__init__(*args, name=name, **kwargs)

@@ -155,6 +167,19 @@ def check_simple_value(mapping: Dict[str, str], *, key: str) -> str:
                 name = check_simple_value(sequence, key="sequencename")
                 data_type = check_simple_value(sequence, key="data_type")

+                # Support reset modes which don't drop the database. At some
+                # point it might be nice to be able to hook into the reset mode
+                # to remove sequences too, however that's likely complicated and
+                # it's easy enough to handle here.
+                #
+                # Sequences don't nicely fit into one of just schema or data,
+                # they're somewhat inherently both. Given that Django's
+                # "loaddata" over-writes existing rows in tables, it seems
+                # reasonable to do something similar for sequences -- even if
+                # that means we actually drop the sequence and fully re-create
+                # it.
+                cursor.execute(f"DROP SEQUENCE IF EXISTS {name}")
+
                 query = textwrap.dedent(
                     f"""
                     CREATE SEQUENCE {name}
15 changes: 13 additions & 2 deletions src/devdata/management/commands/devdata_import.py
@@ -12,6 +12,7 @@
     import_extras,
     validate_strategies,
 )
+from ...reset_modes import MODES, DropDatabaseReset
 from ...settings import settings


@@ -30,22 +31,30 @@ def add_arguments(self, parser):
             help="The database name to import to.",
             default=DEFAULT_DB_ALIAS,
         )
+        parser.add_argument(
+            "--reset-mode",
+            choices=MODES.values(),
+            type=MODES.__getitem__,
+            help="How to ensure the database is empty before importing the new schema (default: %(default)s).",
+            default=DropDatabaseReset.slug,
+        )
         parser.add_argument(
             "--no-input",
             help="Disable confirmations before danger actions.",
             action="store_true",
         )

-    def handle(self, src, database, no_input=False, **options):
+    def handle(self, src, database, reset_mode, no_input=False, **options):
         try:
             validate_strategies()
         except AssertionError as e:
             raise CommandError(e)

         if not no_input and (
             input(
-                "You're about to delete the database {} ({}) from the host {}. "
+                "You're about to {} {} ({}) from the host {}. "
                 "Are you sure you want to continue? [y/N]: ".format(
+                    reset_mode.description_for_confirmation,
                     self.style.WARNING(database),
                     self.style.WARNING(settings.DATABASES[database]["NAME"]),
                     self.style.WARNING(socket.gethostname()),
@@ -55,6 +64,8 @@ def handle(self, src, database, no_input=False, **options):
         ):
             raise CommandError("Aborted")

+        reset_mode.reset_database(database)
+
         src = (Path.cwd() / src).absolute()

         import_schema(src, database)
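
The way `--reset-mode` turns a slug into a mode object, and how that object feeds the confirmation prompt, can be sketched without Django. This is a minimal illustration under stated assumptions: `Mode` is a hypothetical stand-in for the `Reset` instances held in `MODES`, and the database alias, database name and host name passed to the prompt are made up.

```python
import argparse


class Mode:
    """Hypothetical stand-in for a Reset instance from reset_modes.py."""

    def __init__(self, slug: str, description_for_confirmation: str) -> None:
        self.slug = slug
        self.description_for_confirmation = description_for_confirmation

    def __str__(self) -> str:
        # argparse calls str() on each choice when building help text.
        return self.slug


MODES = {
    mode.slug: mode
    for mode in (
        Mode("drop-database", "delete the database"),
        Mode("drop-tables", "delete all tables in the database"),
        Mode("none", "merge into the existing database"),
    )
}

parser = argparse.ArgumentParser(prog="devdata_import")
parser.add_argument(
    "--reset-mode",
    choices=MODES.values(),
    # The raw string is converted first, then checked against `choices`.
    type=MODES.__getitem__,
    # A string default is passed through `type` as well.
    default="drop-database",
)

explicit = parser.parse_args(["--reset-mode=drop-tables"]).reset_mode
implicit = parser.parse_args([]).reset_mode

# Same template as the command's prompt, with made-up values for illustration.
prompt = "You're about to {} {} ({}) from the host {}.".format(
    explicit.description_for_confirmation, "default", "devdata", "staging-1"
)
print(prompt)
```

With standard `argparse`, the `type` callable runs before `choices` is consulted and only `TypeError`, `ValueError` and `ArgumentTypeError` are turned into usage errors, so an unknown slug appears to surface here as a `KeyError` rather than an "invalid choice" message.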
132 changes: 132 additions & 0 deletions src/devdata/reset_modes.py
@@ -0,0 +1,132 @@
+"""
+Alternative implementations for ensuring a clean database before import.
+"""
+
+import abc
+
+from django.db import connections
+from django.db.migrations.recorder import MigrationRecorder
+
+from .settings import settings
+from .utils import nodb_cursor
+
+MODES = {}
+
+
+class Reset(abc.ABC):
+    def __init_subclass__(cls) -> None:
+        super().__init_subclass__()
+        MODES[cls.slug] = cls()
+
+    @property
+    @abc.abstractclassmethod
+    def slug(self) -> str:
+        raise NotImplementedError
+
+    @property
+    @abc.abstractclassmethod
+    def description_for_confirmation(self) -> str:
+        raise NotImplementedError
+
+    def reset_database(self, django_dbname: str) -> None:
+        raise NotImplementedError
+
+    def __str__(self) -> str:
+        # Use the slug as the str for easier integration into the CLI
+        return self.slug
+
+
+class DropDatabaseReset(Reset):
+    """
+    Drop the entire database and re-create it using Django's test utils.
+
+    This is suitable in cases where Django is configured with a database
+    superuser account, which is likely to be the case in local development.
+    """
+
+    slug = "drop-database"
+
+    description_for_confirmation = "delete the database"
+
+    def reset_database(self, django_dbname: str) -> None:
+        db_conf = settings.DATABASES[django_dbname]
+        pg_dbname = db_conf["NAME"]
+
+        connection = connections[django_dbname]
+
+        with nodb_cursor(connection) as cursor:
+            cursor.execute("DROP DATABASE IF EXISTS {}".format(pg_dbname))
+
+            creator = connection.creation
+            creator._execute_create_test_db(
+                cursor,
+                {
+                    "dbname": pg_dbname,
+                    "suffix": creator.sql_table_creation_suffix(),
+                },
+            )
+
+
+class DropTablesReset(Reset):
+    """
+    Drop all the tables which Django knows about, including migration history.
+
+    This is suitable in cases where the current state of the database can be
+    assumed to be similar enough to the new state that removing the tables alone
+    is sufficient to clear out the data. For databases which support other forms
+    of data (e.g: Postgres sequences decoupled from tables) this mode will not
+    touch those data and the user must ensure they are handled suitably.
+
+    This is expected to be useful in cases where Django is configured with
+    administrative privileges within a database, but may not have access to drop
+    the entire database.
+
+    Note: this will not touch other database entities (e.g: Postgres sequences &
+    views) which may be present but are not managed by Django models -- even if
+    they were created by running migrations (e.g: via `RunSQL`).
+    """
+
+    slug = "drop-tables"
+
+    description_for_confirmation = "delete all tables in the database"
+
+    def reset_database(self, django_dbname: str) -> None:
+        connection = connections[django_dbname]
+
+        with connection.cursor() as cursor:
+            table_names = connection.introspection.table_names(cursor)
+
+        models = connection.introspection.installed_models(table_names)
+
+        if MigrationRecorder(connection).has_table():
+            models.add(MigrationRecorder.Migration)
+
+        with connection.schema_editor() as editor:
+            for model in models:
+                editor.delete_model(model)
+
+
+class NoReset(Reset):
+    """
+    Perform no resetting against the database.
+
+    This is suitable in cases where the user has already manually configured the
+    target database or otherwise wants more control over the setup. The user is
+    responsible for ensuring that the database is in a state ready to have the
+    schema migrated into it.
+
+    Notes:
+    * As `loaddata` is used to import data, this mode may result in a merging
+      of the new and existing data (if there is any).
+    * Django's migrations table does not have any uniqueness constraints,
+      meaning that even identical rows may be reinserted and resulting in
+      apparently duplicate rows in that table. The effects of this on Django
+      are unknown. You have been warned.
+    """
+
+    slug = "none"
+
+    description_for_confirmation = "merge into the existing database"
+
+    def reset_database(self, django_dbname: str) -> None:
+        pass
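
The registration mechanism in `reset_modes.py` relies on `__init_subclass__`: defining a subclass is enough to add an instance of it to `MODES`. The sketch below reproduces that pattern in isolation. It is a simplified illustration, not the module itself: the base class drops the `abc` machinery, and `RecordingReset` with its `recording` slug is a hypothetical mode that only records which database it was asked to reset.

```python
from typing import Dict, List

MODES: Dict[str, "Reset"] = {}


class Reset:
    """Simplified base class; the real one also derives from abc.ABC."""

    slug: str
    description_for_confirmation: str

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        # Runs once per subclass definition: register a shared instance by slug.
        MODES[cls.slug] = cls()

    def reset_database(self, django_dbname: str) -> None:
        raise NotImplementedError

    def __str__(self) -> str:
        return self.slug


class NoReset(Reset):
    slug = "none"
    description_for_confirmation = "merge into the existing database"

    def reset_database(self, django_dbname: str) -> None:
        pass


class RecordingReset(Reset):
    """Hypothetical mode: records requests instead of touching a database."""

    slug = "recording"
    description_for_confirmation = "record a reset request for"

    def __init__(self) -> None:
        self.calls: List[str] = []

    def reset_database(self, django_dbname: str) -> None:
        self.calls.append(django_dbname)


mode = MODES["recording"]
mode.reset_database("default")
print(sorted(MODES), mode.calls)
```

One consequence of this design is that a mode only becomes available once the module defining it has been imported, and a subclass that omits `slug` fails at class-definition time with an `AttributeError`.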
20 changes: 19 additions & 1 deletion tests/conftest.py
@@ -3,7 +3,8 @@
 from pathlib import Path

 import pytest
-from django.db import connection
+from django.db import connection, connections
+from django.db.migrations.recorder import MigrationRecorder

 ALL_TEST_STRATEGIES = (
     ("admin.LogEntry", "default"),
@@ -46,6 +47,23 @@ def default_export_data(test_data_dir):
     (test_data_dir / "postgres-sequences.json").write_text(empty_model)


+@pytest.fixture()
+def ensure_migrations_table():
+    # Ensure there's an existing django_migrations table, as there would be
+    # for a real database.
+    for conn in connections.all():
+        MigrationRecorder(conn).ensure_schema()
+
+    yield
+
+    # Remove the table at the end of the test, back to how the test database
+    # would normally be.
+    for conn in connections.all():
+        if MigrationRecorder(conn).has_table():
+            with conn.schema_editor() as editor:
+                editor.delete_model(MigrationRecorder.Migration)
+
+
 @pytest.fixture(autouse=True)
 def cleanup_test_data(test_data_dir):
     yield