Gov PaaS to DBT PaaS migration branch (#769)

* Upgrading postgres to 16 (#737)

Co-authored-by: santinomolinaro <santino.molinaro@mail>

* TSS-1612: Prepare s3 client for migration (#751)

* tss-1612: S3 copilot switch

* Copilot DB URL

* format

---------

Co-authored-by: abarolo <[email protected]>

* TSS-1615: Pingdom healthcheck (#752)

* tss-1612: S3 copilot switch

* tss-1615: Pingdom API healthcheck

* url update

* format

---------

Co-authored-by: abarolo <[email protected]>

* chore: add .copilot directory with files required for image building

* TSS-1416: Xray & ASIM logging (#755)

* TSS-1416: Xray

* ASIM logging

* dbt-copilot-python dep

* Switch logging based on platform

* Handle all 3 loggers

---------

Co-authored-by: abarolo <[email protected]>

* chore: force pipeline build

* chore: update paketo builder version

* TSS-1615: celery config (#768)

* tss-1615: celery config

* format

---------

Co-authored-by: abarolo <[email protected]>

* Add sample trace rate for APM

* Revert IAM roles to access public buckets

* remove unused import

* Update S3 client for documents

* Trigger deploy

* Restore boto session config when on DBT Platform

* Debug download url for av scan

* Add more debug to see where this is failing

* Trigger deploy

* Remove debugging

* chore: update poetry.lock and requirements files

* tss-1802: check history exists in publish flow (#797)

* check history exists in publish

* log

---------

Co-authored-by: abarolo <[email protected]>

* update (#798)

Co-authored-by: abarolo <[email protected]>

* Increase gunicorn worker timeout (#799)

Co-authored-by: abarolo <[email protected]>

* Upgrade gevent (#800)

Co-authored-by: abarolo <[email protected]>

* Reduce worker connection count (#801)

Co-authored-by: abarolo <[email protected]>

* (TEST) revert Procfile + remove temp file creation for s3 upload

* Use in memory data upload to s3 (#802)

Co-authored-by: abarolo <[email protected]>

* PAAS Migration branch cleanup (#803)

* Logging tasks and local celery + flower (#789)

* format

---------

Co-authored-by: abarolo <[email protected]>

* tss-1783: Add set_to_allowed_on to data workspace serializer (#790)

* tss-1783: Add set_to_allowed_on to data workspace serializer

---------

Co-authored-by: abarolo <[email protected]>

* rc/maybe-maybe-not-2 (#792)

* Bump tqdm from 4.66.2 to 4.66.3

Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.2 to 4.66.3.
- [Release notes](https://github.com/tqdm/tqdm/releases)
- [Commits](tqdm/tqdm@v4.66.2...v4.66.3)

---
updated-dependencies:
- dependency-name: tqdm
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

* Regenerating requirements.txt file

* Bump jinja2 from 3.1.3 to 3.1.4 (#788)

* Bump jinja2 from 3.1.3 to 3.1.4

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.4.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](pallets/jinja@3.1.3...3.1.4)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* Regenerating requirements.txt file

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* Bump werkzeug from 3.0.2 to 3.0.3 (#787)

* Bump werkzeug from 3.0.2 to 3.0.3

Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.2 to 3.0.3.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
- [Commits](pallets/werkzeug@3.0.2...3.0.3)

---
updated-dependencies:
- dependency-name: werkzeug
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* Regenerating requirements.txt file

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>

* TSS-1791 Updating Readme (#791)

* Updating README to include updates for mock SSO to the hosts file

* Removing whitespace

---------

Co-authored-by: Uka Osim <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
Co-authored-by: Elizabeth Pedley <[email protected]>

* feature/TSS-1790-move-notification-to-signals-to-celery-background-tasks (#794)

* feat: moving email to celery task

* feat: moving email to celery task

* feat: moving email to celery task

* lint fix

* lint fix

* feat: moving signal handlers to barrier tasks

* removing non required imports

* Extra logging for authorised requests (#793)

* Formatting and test fixes

* format

* test

* Fix last test

* format

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: abarolo <[email protected]>
Co-authored-by: Uka Osim <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
Co-authored-by: Elizabeth Pedley <[email protected]>
Co-authored-by: Feroze Rub <[email protected]>

* Elastic APM is a boolean

* Trigger deploy

* Trigger deploy

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Santino Molinaro <[email protected]>
Co-authored-by: santinomolinaro <santino.molinaro@mail>
Co-authored-by: abarolo <[email protected]>
Co-authored-by: abarolo <[email protected]>
Co-authored-by: Gareth Pitt-Nash <[email protected]>
Co-authored-by: Emile Swarts <[email protected]>
Co-authored-by: Yusuf <[email protected]>
Co-authored-by: Uka Osim <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
Co-authored-by: Elizabeth Pedley <[email protected]>
12 people authored Jun 17, 2024
1 parent 192173e commit bb19989
Showing 34 changed files with 1,140 additions and 425 deletions.
2 changes: 1 addition & 1 deletion .circleci/config.yml
@@ -17,7 +17,7 @@ jobs:
# Specify service dependencies here if necessary
# CircleCI maintains a library of pre-built images
# documented at https://circleci.com/docs/2.0/circleci-images/
-      - image: postgres:13
+      - image: postgres:16
environment:
POSTGRES_DB: market_access
POSTGRES_USER: postgres # pragma: allowlist secret
4 changes: 4 additions & 0 deletions .copilot/config.yml
@@ -0,0 +1,4 @@
repository: dmas/dmas-backend
builder:
name: paketobuildpacks/builder-jammy-full
version: 0.3.339
6 changes: 6 additions & 0 deletions .copilot/image_build_run.sh
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

# Exit early if something goes wrong
set -e

# Add commands below to run inside the container after all the other buildpacks have been applied
6 changes: 6 additions & 0 deletions .copilot/phases/build.sh
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

# Exit early if something goes wrong
set -e

# Add commands below to run as part of the build phase
6 changes: 6 additions & 0 deletions .copilot/phases/install.sh
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

# Exit early if something goes wrong
set -e

# Add commands below to run as part of the install phase
6 changes: 6 additions & 0 deletions .copilot/phases/post_build.sh
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

# Exit early if something goes wrong
set -e

# Add commands below to run as part of the post_build phase
6 changes: 6 additions & 0 deletions .copilot/phases/pre_build.sh
@@ -0,0 +1,6 @@
#!/usr/bin/env bash

# Exit early if something goes wrong
set -e

# Add commands below to run as part of the pre_build phase
2 changes: 1 addition & 1 deletion Procfile
@@ -1,3 +1,3 @@
-web: python manage.py migrate --noinput && gunicorn config.wsgi:application --bind 0.0.0.0:$PORT --worker-class gevent --worker-connections 1000 --timeout 120 --log-file -
+web: python manage.py migrate --noinput && opentelemetry-instrument gunicorn config.wsgi:application --bind 0.0.0.0:$PORT --worker-class gevent --worker-connections 1000 --timeout 240 --log-file -
celeryworker: celery -A config.celery worker -l info -Q celery
celerybeat: celery -A config.celery beat -l info -S django
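
The web process now runs under opentelemetry-instrument, which bootstraps OpenTelemetry tracing around gunicorn at startup (the worker timeout also rises from 120 to 240 seconds, per #799). As a rough sketch of what the wrapper automates (the service name and console exporter below are illustrative assumptions, not this repo's configuration):

from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Manual equivalent of the auto-instrumentation bootstrap: register a tracer
# provider before the app starts serving traffic.
provider = TracerProvider(
    resource=Resource.create({"service.name": "market-access-api"})  # assumed name
)
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("example-request"):
    pass  # spans opened here are batched and exported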
19 changes: 1 addition & 18 deletions api/barrier_downloads/service.py
@@ -1,5 +1,3 @@
-import csv
-import io
import logging
from typing import List

@@ -10,7 +8,6 @@

from api.barrier_downloads import tasks
from api.barrier_downloads.constants import BARRIER_FIELD_TO_COLUMN_TITLE
-from api.barrier_downloads.csv import _transform_csv_row
from api.barrier_downloads.exceptions import (
BarrierDownloadDoesNotExist,
BarrierDownloadNotificationError,
@@ -25,6 +22,7 @@
ProgrammeFundProgressUpdate,
)
from api.collaboration.models import TeamMember
+from api.core.utils import serializer_to_csv_bytes
from api.documents.utils import get_bucket_name, get_s3_client_for_bucket
from api.user.constants import USER_ACTIVITY_EVENT_TYPES
from api.user.models import UserActvitiyLog
@@ -37,21 +35,6 @@ def get_s3_client_and_bucket_name():
return get_s3_client_for_bucket(bucket_id), get_bucket_name(bucket_id)


-def serializer_to_csv_bytes(serializer, field_names) -> bytes:
-    output = io.StringIO()
-    writer = csv.DictWriter(
-        output,
-        extrasaction="ignore",
-        fieldnames=field_names.keys(),
-        quoting=csv.QUOTE_MINIMAL,
-    )
-    writer.writerow(field_names)
-    for row in serializer.data:
-        writer.writerow(_transform_csv_row(row))
-    content = output.getvalue().encode("utf-8")
-    return content


def create_barrier_download(user, filters: dict, barrier_ids: List) -> BarrierDownload:
filename = f"csv/{user.id}/DMAS_{now().strftime('%Y-%m-%d-%H-%M-%S')}.csv"
default_name = filename.split("/")[2]
14 changes: 12 additions & 2 deletions api/barriers/models.py
@@ -1061,15 +1061,25 @@ def add_new_version(self):
self.published_versions["versions"].setdefault(new_version, entry)

def get_published_version(self, version):
"""
Whole publish flow needs to be cleaned up, optimised, simplified.
ie When publishing a barrier this function is called ~20 times and it is querying db.
"""
version = str(version)

if self.published_versions:
logger.info(
f"self.published_versions: type({type(self.published_versions)}) value({self.published_versions})"
)
logger.info(f"(PublicBarrier): history exists - {self.history.exists()}")

timestamp = self.published_versions["versions"][version]["published_on"]
historic_public_barrier = self.history.as_of(
datetime.datetime.fromisoformat(timestamp)
)

return historic_public_barrier
else:
return None

@property
def latest_published_version(self):
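The docstring added to get_published_version flags a known hotspot: a single publish calls this method roughly 20 times, and every call re-reads published_versions and queries django-simple-history. One possible follow-up, shown here as a hypothetical sketch that is not part of this commit, is to memoise resolved versions per instance:

import datetime

def get_published_version_cached(self, version):
    # Hypothetical PublicBarrier method: cache historic versions on the instance
    # so repeated lookups within one publish cycle query history once per version.
    version = str(version)
    if not self.published_versions:
        return None
    cache = self.__dict__.setdefault("_published_version_cache", {})
    if version not in cache:
        timestamp = self.published_versions["versions"][version]["published_on"]
        cache[version] = self.history.as_of(
            datetime.datetime.fromisoformat(timestamp)
        )
    return cache[version]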
44 changes: 25 additions & 19 deletions api/barriers/public_data.py
@@ -2,11 +2,11 @@
import logging

from django.conf import settings
-from django.core.files.temp import NamedTemporaryFile
from django.utils import timezone

-from api.barriers.serializers.public_barriers import public_barriers_to_json
-from api.core.utils import list_s3_public_data_files, upload_to_s3
+from api.barriers.helpers import get_published_public_barriers
+from api.barriers.serializers.public_barriers import PublicPublishedVersionSerializer
+from api.core.utils import list_s3_public_data_files, s3_client

logger = logging.getLogger(__name__)

@@ -86,16 +86,20 @@ def versioned_folder(version=None):
return f"{settings.PUBLIC_DATA_KEY_PREFIX}{version}"


-def public_barrier_data_json_file_content(public_barriers=None):
-    data = {"barriers": public_barriers_to_json(public_barriers)}
-    return data


def metadata_json_file_content():
data = {"release_date": str(timezone.now().date())}
return data


+def get_public_data_content():
+    public_barriers = [
+        pb.latest_published_version for pb in get_published_public_barriers()
+    ]
+    return {
+        "barriers": PublicPublishedVersionSerializer(public_barriers, many=True).data
+    }


def public_release_to_s3(public_barriers=None, force_publish=False):
"""
Generate a new JSON file and upload it to S3 along with metadata info.
@@ -114,14 +118,16 @@
# To make sure all files use the same version
next_version = latest_file().next_version

-    with NamedTemporaryFile(mode="w+t") as tf:
-        json.dump(public_barrier_data_json_file_content(public_barriers), tf, indent=4)
-        tf.flush()
-        s3_filename = f"{versioned_folder(next_version)}/data.json"
-        upload_to_s3(tf.name, settings.PUBLIC_DATA_BUCKET, s3_filename)
-
-    with NamedTemporaryFile(mode="w+t") as tf:
-        json.dump(metadata_json_file_content(), tf, indent=4)
-        tf.flush()
-        s3_filename = f"{versioned_folder(next_version)}/metadata.json"
-        upload_to_s3(tf.name, settings.PUBLIC_DATA_BUCKET, s3_filename)
+    barrier_json = json.dumps(get_public_data_content())
+    metadata_json = json.dumps(metadata_json_file_content())
+    s3_filename = f"{versioned_folder(next_version)}/data.json"
+    metadata_filename = f"{versioned_folder(next_version)}/metadata.json"
+
+    s3 = s3_client()
+
+    s3.put_object(
+        Bucket=settings.PUBLIC_DATA_BUCKET, Body=barrier_json, Key=s3_filename
+    )
+    s3.put_object(
+        Bucket=settings.PUBLIC_DATA_BUCKET, Body=metadata_json, Key=metadata_filename
+    )
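
The rewrite drops the NamedTemporaryFile staging step, which assumed a writable local filesystem, and hands the JSON straight to S3, since put_object accepts str or bytes as Body. A self-contained sketch of the same technique, with placeholder bucket and key names:

import json

import boto3

s3 = boto3.client("s3")  # credentials resolved from the environment
payload = json.dumps({"release_date": "2024-06-17"})  # any JSON-serialisable content

s3.put_object(
    Bucket="example-public-data-bucket",  # placeholder, not this project's bucket
    Key="v1.0.1/metadata.json",  # placeholder key
    Body=payload,  # str/bytes uploaded directly from memory
)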
38 changes: 0 additions & 38 deletions api/barriers/serializers/public_barriers.py
@@ -12,7 +12,6 @@
ReadOnlyTradingBlocField,
SectorField,
)
-from api.barriers.helpers import get_published_public_barriers
from api.barriers.models import PublicBarrier, PublicBarrierLightTouchReviews
from api.barriers.serializers.mixins import LocationFieldMixin
from api.core.serializers.mixins import AllowNoneAtToRepresentationMixin
@@ -328,40 +327,3 @@ def get_sectors(self, obj):
return ReadOnlySectorsField(
to_repr_keys=("name",), sort=False
).to_representation(sectors)


-def public_barriers_to_json(public_barriers=None):
-    """
-    Helper to serialize latest published version of published barriers.
-    Public Barriers in the flat file should look similar.
-    {
-        "barriers": [
-            {
-                "id": "kjdfhkzx",
-                "title": "Belgian chocolate...",
-                "summary": "Lorem ipsum",
-                "status": {"name": "Open",}
-                "country": {"name": "Belgium",}
-                "caused_by_trading_bloc": false,
-                "trading_bloc": null,
-                "location": "Belgium"
-                "sectors: [
-                    {"name": "Automotive"}
-                ],
-                "categories": [
-                    {"name": "Goods and Services"}
-                ],
-                "last_published_on: "date",
-                "reported_on": "date"
-            }
-        ]
-    }
-    If all sectors is true, use the sectors key to represent that as follows:
-        "sectors: [{"name": "All sectors"}],
-    """
-    if public_barriers is None:
-        public_barriers = (
-            pb.latest_published_version for pb in get_published_public_barriers()
-        )
-    serializer = PublicPublishedVersionSerializer(public_barriers, many=True)
-    return serializer.data
18 changes: 18 additions & 0 deletions api/core/utils.py
@@ -1,9 +1,12 @@
+import csv
+import io
import operator

import boto3
from botocore.exceptions import NoCredentialsError
from django.conf import settings

+from api.barrier_downloads.csv import _transform_csv_row
from api.core.exceptions import S3UploadException


@@ -115,3 +118,18 @@ def list_s3_public_data_files(client=None):
)
for content in response.get("Contents", []):
yield content.get("Key")


+def serializer_to_csv_bytes(serializer, field_names) -> bytes:
+    output = io.StringIO()
+    writer = csv.DictWriter(
+        output,
+        extrasaction="ignore",
+        fieldnames=field_names.keys(),
+        quoting=csv.QUOTE_MINIMAL,
+    )
+    writer.writerow(field_names)
+    for row in serializer.data:
+        writer.writerow(_transform_csv_row(row))
+    content = output.getvalue().encode("utf-8")
+    return content
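
serializer_to_csv_bytes moves here from api/barrier_downloads/service.py: it writes a header row taken from the values of field_names, then one row per serialized record, and returns the whole CSV as UTF-8 bytes ready for an in-memory S3 upload. A usage sketch (the serializer variable and column map below are illustrative, not from this repo):

field_names = {"id": "Barrier ID", "title": "Title"}  # CSV column key -> heading
csv_bytes = serializer_to_csv_bytes(barrier_serializer, field_names)  # barrier_serializer assumed

# Upload without touching the filesystem, matching the migration's in-memory approach.
s3_client, bucket_name = get_s3_client_and_bucket_name()  # helper in api/barrier_downloads/service.py
s3_client.put_object(Bucket=bucket_name, Key="csv/example.csv", Body=csv_bytes)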
25 changes: 16 additions & 9 deletions api/documents/utils.py
@@ -1,7 +1,7 @@
from functools import lru_cache
from logging import getLogger

import boto3
+from dbt_copilot_python.utility import is_copilot
from django.apps import apps
from django.conf import settings
from django.core.exceptions import ObjectDoesNotExist
@@ -40,17 +40,24 @@ def get_bucket_name(bucket_id):
return get_bucket_credentials(bucket_id)["bucket_name"]


@lru_cache()
def get_s3_client_for_bucket(bucket_id):
"""Get S3 client for bucket id."""
credentials = get_bucket_credentials(bucket_id)
-    return boto3.client(
-        "s3",
-        aws_access_key_id=credentials["aws_access_key_id"],
-        aws_secret_access_key=credentials["aws_secret_access_key"],
-        region_name=credentials["aws_region"],
-        config=boto3.session.Config(signature_version="s3v4"),
-    )

+    if is_copilot():
+        return boto3.client(
+            "s3",
+            region_name=credentials["aws_region"],
+            config=boto3.session.Config(signature_version="s3v4"),
+        )
+    else:
+        return boto3.client(
+            "s3",
+            aws_access_key_id=credentials["aws_access_key_id"],
+            aws_secret_access_key=credentials["aws_secret_access_key"],
+            region_name=credentials["aws_region"],
+            config=boto3.session.Config(signature_version="s3v4"),
+        )


def sign_s3_url(bucket_id, key, method="get_object", expires=3600):
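get_s3_client_for_bucket now branches on is_copilot() from dbt-copilot-python: on DBT Platform the static key pair is omitted, so boto3 falls back to its default credential chain (the ECS task role), while other environments keep explicit credentials. The same pattern generalises to any AWS client; a sketch in which the settings attribute names are assumptions:

import boto3
from dbt_copilot_python.utility import is_copilot

def build_aws_client(service_name, settings):
    # On DBT Platform, let boto3 discover credentials from the task role;
    # elsewhere, pass the explicitly configured key pair.
    if is_copilot():
        return boto3.client(service_name, region_name=settings.AWS_REGION)  # assumed setting
    return boto3.client(
        service_name,
        region_name=settings.AWS_REGION,
        aws_access_key_id=settings.AWS_ACCESS_KEY_ID,  # assumed setting
        aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY,  # assumed setting
    )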
(The remaining changed files in this commit are not shown here.)
