-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates Unicode data files in a background task #7581
Updates Unicode data files in a background task #7581
Conversation
5703295
to
39ebf3b
Compare
39ebf3b
to
51f89e4
Compare
Signed-off-by: Jon Stovell <[email protected]>
Signed-off-by: Jon Stovell <[email protected]>
Signed-off-by: Jon Stovell <[email protected]>
51f89e4
to
a7e01e0
Compare
I figured out a way to make it even more proof against timeouts. Not that I have actually encountered any timeouts in testing, but extra defences don't hurt. |
Signed-off-by: Jon Stovell <[email protected]>
43f9e4f
to
61ee78c
Compare
Signed-off-by: Jon Stovell <[email protected]>
61ee78c
to
293b8c3
Compare
Signed-off-by: Jon Stovell <[email protected]>
Latest commit adds compatibility with #7574. |
Signed-off-by: Jon Stovell <[email protected]> # Conflicts: # other/update_unicode_data.php
Signed-off-by: Jon Stovell <[email protected]>
4585459
to
9a89607
Compare
I would love to see this merged before the next version of Unicode is released in September. Otherwise we will have to make another huge patch. (And then another the next September, and again the next, and the next, and the next...) |
Unicode 15.1 has been released now. I really, really, really don't think we should put the raw data directly in a patch like we did last time. So, could someone please review this? The release of a new version of Unicode actually makes testing easier; I've updated the instructions in the original post accordingly. |
My only concern is that it appears to write to the sources directory. We have instructed forums in the past that they can keep that readonly except for updates, which we supported ftp to make it writeable. |
I could add a notification in the error log if any of the Sources/Unicode/*.php files are not writable, with instructions about how to fix it. Then the admin can just follow those instructions to make them writable and let the update run again. |
That seems to be the easiest option at least for 2.1. |
I didn't actually give instructions about how to make files writable, because that's too long for an error message. But this should do the job. |
Signed-off-by: Jon Stovell <[email protected]>
3a5db74
to
60580a1
Compare
Hmmm.... As noted earlier, I did not want to do this in this fashion. Our pacman approach has served us well for quite some time. I see no reason to deviate from it. The cons, as I see them, are that any changes aren't backed out if a patch causes problems. It assumes the code & all changes are perfect & never need to be backed out via our normal pacman method. So now we have two paths to changes, and no path for reverting these particular changes should issues arise. |
I thought this was clear in the last discussion. Now I will need to consider reverting this... |
You are incorrect in that regard. These are data files, pure and simple. The inputs are extremely well documented and stable, and all this does is process the data into a more convenient format for PHP usage. Seriously, look at the data files. They contain nothing but arrays of strings. There is no logic in them. |
You are also incorrect to think that the updated Unicode data should be revertable. It emphatically should not. Not ever. Never. Why? Security. If a forum updates to a new version of Unicode and starts accepting characters that were added in that versions, but then downgrades to a previous version of Unicode, those characters become broken, illegal data that SMF no longer expects to be present in user supplied strings. A carefully crafted attack could definitely exploit this. The Unicode annex reports about security talk about this issue explicitly, and say that Unicode versions should never be downgraded after they've been updated. In case it's not clear, I say this all with the utmost respect and the friendliest of tones, but I will go to mat on this one. |
Based on an idea that came up in previous discussion, this PR refactors the code from update_unicode_data.php into a background task so that SMF can update its Unicode data files itself.
Pros:
Cons:
Testing
I've tried everything I can think of to break this code, and it just keeps working no matter what I do.
So now it is your turn! 😉
Notes