Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement region condition data export #3079

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

david-venhoff
Copy link
Member

Short description

Adds a new admin page with a button to download a region condition export.
The page itself is very minimal and will be improved with #3012.

Speed is not great, unfortunately. An export using production data takes about 6 minutes.

Proposed changes

  • Add a new view
  • Add a new endpoint to create a data export
  • Move some methods from the dashboard to the region, so that they can be shared by the dashboard and the region condition export

Side effects

None

Resolved issues

Fixes: #3053


Pull Request Review Guidelines

Speed is not great, unfortunately. An export using production data
takes about 6 minutes
Copy link

codeclimate bot commented Sep 27, 2024

Code Climate has analyzed commit 56d8a2e and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 72.8% (50% is the threshold).

This pull request will bring the total coverage in the repository to 82.6% (0.0% change).

View more on Code Climate.

Very hacky fix, but we are already using it at feedback_resource
@permission_required("cms.view_region")
def export_region_conditions(request: HttpRequest, file_format: str) -> HttpResponse:
"""
Creates a data export summarizing the condition of all region
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Creates a data export summarizing the condition of all region
Creates a data export summarizing the condition of all regions

@JoeyStk JoeyStk added the ‼️ prio: high Needs to be resolved ASAP. label Oct 4, 2024
@MizukiTemma
Copy link
Member

@david-venhoff Thank you for the PR, looks very good 😸

There is one question: shouldn't the feature only be available for Service Team (+ CMS)? it doesn't harm that App Team and Marketing Team can use this functionality, though it was so specified in the issue 🤔

@PeterNerlich
Copy link
Collaborator

PeterNerlich commented Oct 4, 2024

Had a quick look and for the stock dev test data it takes 43 queries (6 regions). I am confident that this can be significantly reduced, best case it could be one single humongous db query that we just dump as is into the desired format. I think a reduction to one quarter of the time might be realistic.

EDIT: I am no longer confident that we can significantly reduce this without major refactoring of the rest of our system to do less with python logic and more passing and building on querysets – which would also be an interest of mine, but not reasonable within the scope of this issue

Copy link
Collaborator

@PeterNerlich PeterNerlich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this was a disappointing investigation of mine 😁 Take my approve and a few over-perfectionist nitpicks and ideas I'm not even sure of myself, since I cannot contribute anything more helpful 😛

Comment on lines +7795 to +7797
#: cms/templates/region_condition/region_condition.html
msgid "This page is under construction"
msgstr "Diese Seite ist unter Konstruktion"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The least important of this PR, but I believe this is the proper phrase in German

Suggested change
#: cms/templates/region_condition/region_condition.html
msgid "This page is under construction"
msgstr "Diese Seite ist unter Konstruktion"
#: cms/templates/region_condition/region_condition.html
msgid "This page is under construction"
msgstr "Diese Seite ist noch im Bau"

It represents the to-be exported status of all regions.
"""

name = fields.Field(column_name=_("Region name"), attribute="name")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think, does it make more sense to just say "Region" in this column?

Suggested change
name = fields.Field(column_name=_("Region name"), attribute="name")
name = fields.Field(column_name=_("Region"), attribute="name")

Comment on lines +49 to +50
has_translation_package_been_booked = fields.Field(
column_name=_("Has translation package been booked"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding a question mark makes it slightly better at conveying the meaning

Suggested change
has_translation_package_been_booked = fields.Field(
column_name=_("Has translation package been booked"),
has_translation_package_been_booked = fields.Field(
column_name=_("Has translation package been booked?"),

@david-venhoff
Copy link
Member Author

@PeterNerlich Thanks for the review :)

I think a reduction to one quarter of the time might be realistic.

I share this sentiment, and I went on to exactly the same investigation as you 🥲
I think that by far the largest bottleneck is in mapping Links to Regions and I wasted spent quite a bit of time trying to improve its performance, without much success. For example, it is possible to completely get rid of the filter_urls function and perform the calculations completely in the database. However my experiment created such an awful query that it crashed postgres ^^
My current hope is that we can create some indexes in the right places to speed this up, and I created #3077 for it, by I did not start investigating this yet.

Copy link
Contributor

@JoeyStk JoeyStk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much! This looks quite good already. However I noticed a few things when reviewing this with our test data.

Number of missing or outdated translations: In our test data I have eight pages that have a missing or outdated translation, the CSV says seven. I have the same error in Augsburg where I had 25 outdated translations for not empty pages, the CSV only says 19. Maybe it's worth to have a look at this branch, where attempted at implementing something similar :)

The rest seemed to be accurate. I think however that thorough testing after merging would be a great benefit :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
‼️ prio: high Needs to be resolved ASAP.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Data export for traffic light system
4 participants