-
Notifications
You must be signed in to change notification settings - Fork 528
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update: added migration script for migration of historic data #3299
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
Left a few comments below.
|
||
for translation in Translation.objects.filter(date__lt=end_date): | ||
for attr, action_type in traslation_info.items(): | ||
if getattr(translation, attr) is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also check that any date is before end_date
in order to avoid duplicate entries. It could be that translation was created before we started using ActionLog, but it could have been approved afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
researched on this and saw that django.db.models.Q
is a good way to do this
You'll see it in my latest update
Let me know what you think about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the Translation.date
is before end_date
and Translation.approved_date
isn't, we'll still create the (duplicate) ActionLog
item.
It should be sufficient to check Translation.date
only in the QuerySet (as we did before) and then make sure that getattr(translation, attr)
is before end_date
.
Hello @mathjazz I was trying a way not to hardcode the |
| Q(unrejected_date__lt=end_date) | ||
) | ||
|
||
for translation in translations: | ||
for attr, action_type in traslation_info.items(): | ||
if getattr(translation, attr) is not None: | ||
actions_to_log.append( | ||
ActionLog( | ||
action_type=action_type, | ||
created_at=translation.date, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On line 42 I also noticed that since this is actionlog the creation date is not neccessarilly the creation of the translation date but it varies with the action type:
- approved date
- unapproved date
- rejected date
- unrejected date
- creation date
Should I also change this to match that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, created_at
should indeed be the corresponding action date.
Hello @mathjazz made the changes that you recommended, Looking forward to your review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
traslation_info = {
translation_info = {
Would've gone unnotice for centuries and beyond 😅 thanks for that... Vs code auto giving variables made me miss that, might use vim next time 😅
Fix changes you requested :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, last two comments, then we're ready to test this with real data.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested the code on stage and it turns out it was too slow to actually complete.
Inline I provided an idea how to improve performance and another edge case that we need to take into account.
Fixed them 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll need to optimize the code further, because it still doesn't complete. I have some ideas and I'll do a bit of testing on stage first, so we don't run in circles for too long.
date = getattr(translation, attr) | ||
user_id = getattr(translation, action_user) | ||
|
||
# Skip logging 'translation:approved' if approved_date is the same as the date |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Comments are more useful when they describe the whys, not the whats.
This comment just says what the code does in an English language, but it's not clear why we are not logging such actions. Better comments would be something along the lines of If approved_date
and date
values are the same, translation was directly submitted as approved, in which case we only log translation:created
action.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the feedback. I agree that comments should be more focused on explaining the reasoning behind the code rather than just describing its functionality.
I’ll update the comments to clarify why we handle logging in this way. It should make the code’s intent clearer and provide better context for future readers.
Thanks again for pointing this out!
I was thinking a potential reason why it may be slow is due to creating the objects in bulk at the end instead of single processing each? Though this is just a theory, I'll have to prove it first, then I get back to you |
I'm curious: what is killing the migration? Is it a setting on Heroku, a setting in Django? Can it be overridden? |
Could be a setting, but I don't think this is the right approach, at least not right now. It took at least 30 minutes before the process was killed, and that's with We should try to make the code faster first. |
), | ||
} | ||
|
||
# date to end migration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# date to end migration | |
# Timestamp of the first Translation created after the introduction of ActionLog |
Co-authored-by: Matjaž Horvat <[email protected]>
Co-authored-by: Matjaž Horvat <[email protected]>
Using a more concise variable name Co-authored-by: Matjaž Horvat <[email protected]>
Using a more concise variable name for clarity Co-authored-by: Matjaž Horvat <[email protected]>
5657180
to
f33cb61
Compare
Without this change, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea this fixes it 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r+
Fix #2196
It migrates data from the previous years up to
(2020/1/ 7, 9:25:11:829125)