Slow queries caused by large QueuedJobDescriptor table, and lack of indexes #336

mateusz · 2021-02-19T00:08:41Z

SiteTreeContentReview extension does a global order by query filtered by a where on a non-indexed column (see example query (1) a the end). This query can take minutes to run if the descriptor table is large (in our case, 184k rows), and is triggered by dev/build on all classess with this extension applied, making dev/build take half an hour.

This behaviour can be potentially triggered from other, unexpected places, taking sites down. Our situation occurred, because the CleanupJob was not ran, as it has to be manually triggered. Manual trigger is not very good because the person deploying must be aware they need to go to the CMS and trigger it, which often doesn't happen.

We'd like to reduce the likelihood for this common failure scenario from occuring and requiring support to step in :)

Suggested solutions:

add indexes to non-indexed columns to make the problem less frequent/less impacting
default CleanupJob to on, and ensure it self-schedules

(1) example query we have seen in the wild, on top of a non-indexed Implementation column. I can confirm the issue is caused by order by. Running similar query on an indexed column is fast. Running similar query without order by is fast as well.

SELECT DISTINCT "QueuedJobDescriptor"."ClassName", "QueuedJobDescriptor"."LastEdited", "QueuedJobDescriptor"."Created", "QueuedJobDescriptor"."JobTitle", "QueuedJobDescriptor"."Signature", "QueuedJobDescriptor"."Implementation", "QueuedJobDescriptor"."StartAfter", "QueuedJobDescriptor"."JobStarted", "QueuedJobDescriptor"."JobRestarted", "QueuedJobDescriptor"."JobFinished", "QueuedJobDescriptor"."TotalSteps", "QueuedJobDescriptor"."StepsProcessed", "QueuedJobDescriptor"."LastProcessedCount", "QueuedJobDescriptor"."ResumeCounts", "QueuedJobDescriptor"."SavedJobData", "QueuedJobDescriptor"."SavedJobMessages", "QueuedJobDescriptor"."JobStatus", "QueuedJobDescriptor"."JobType", "QueuedJobDescriptor"."Worker", "QueuedJobDescriptor"."Expiry", "QueuedJobDescriptor"."NotifiedBroken", "QueuedJobDescriptor"."WorkerCount", "QueuedJobDescriptor"."RunAsID", "QueuedJobDescriptor"."ID", CASE WHEN "QueuedJobDescriptor"."ClassName" IS NOT NULL THEN "QueuedJobDescriptor"."ClassName" ELSE 'Symbiote\\QueuedJobs\\DataObjects\\QueuedJobDescriptor' END AS "RecordClassName" FROM "QueuedJobDescriptor" WHERE ("QueuedJobDescriptor"."Implementation" = 'SilverStripe\\ContentReview\\Jobs\\ContentReviewNotificationJob') ORDER BY "QueuedJobDescriptor"."Created" DESC LIMIT 1;

Internal problem record references PRB-160, PRB-161.

The text was updated successfully, but these errors were encountered:

Cheddam · 2021-02-22T01:52:56Z

Related issue: silverstripe/silverstripe-contentreview#143

emteknetnz · 2021-02-22T02:10:10Z

Putting an index on Implementation is pretty trivial @mateusz can you see any downside to this?

brettt89 · 2021-02-22T03:33:28Z

Putting an index on Implementation is pretty trivial @mateusz can you see any downside to this?

An index on a column that is 255 character limit could potentially cause the Database index size to balloon significantly.

For a better performance approach, an alternative "identifier" which is smaller should be considered (e.g. varchar(10)). This could be achieved through hashing the currently Implementation into a 10character string, or perhaps some other type of identification method.

emteknetnz · 2021-02-22T04:33:35Z

That ClassName in the original description is 63 chars long, it's probably a pretty typical length. This doesn't seem like it's a cause for concern at all?

We do index ClassName on every other table, though that's stored as an enum instead of a varchar, not sure if that affects the index size.

an alternative "identifier" which is smaller should be considered (e.g. varchar(10))

I not too keen on this at all. We could add a new column for this, though then there's the issue of all the existing data that would have null values for this. We'd end up having to do something like

CASE WHEN HashColumn IS NULL THEN Implementation ELSE HashColumn

And HashColumn will still be null lots of the time, so it's kind of solved nothing

Also I'm not aware of this pattern being followed anywhere else in Silverstripe, and I'm not keen to add in extra complexity for the sake of a problem (oversized indexes) that may not even exist

mateusz · 2021-02-22T20:30:32Z

For me the main thing is to hear and see reflected in docs, if hundreds of thousands of QueuedJobDescriptor records is considered "normal" for this module.

If it's normal - then need to find ways to get the module actually perform, and index is some kind of a solution.

If it's not normal and the module can't handle it - I'd like to see the cleanup job enabled, and running by default.

Brett might be right about the index size ballooning, although I don't know enough to offer readily usable advice. For comparison, URLSegment index (b-tree) on SiteTree (varchar(255)) on 20k records in this particular database is 1.5MB. That would suggest 75 MB for 1 mil records. Theoretical maximum however is 255b * 3 (for UTF) * 1 mil = 730 MB, indeed that's a lot.

For non-unique indexes, I've seen somewhere that you can create indexes that are shorter than the column, e.g. create a 10-byte index on top of a VARCHAR(255), which will only take first ten bytes into account. This could be good enough for sorting, or then maybe not - if all entries in the table have the same Implementation, mysql might need to go through all of them anyway to resolve ties.

In the end it's back to the main point - is QueuedJobDescriptor table a historical record, or a working table that's supposed to be relatively short?

emteknetnz · 2021-02-23T03:20:25Z

My understanding is that it's a working table, not a historical table. That said I'd like a 2nd opinion - @chillu Do you think QueuedJobDescriptor would ever be used for auditing purposes?

Cheddam mentioned this issue Feb 22, 2021

QueuedJob query using 'Implementation' column without index silverstripe/silverstripe-contentreview#143

Closed

Cheddam added affects/v4 impact/medium type/bug impact/high and removed impact/medium labels Feb 22, 2021

emteknetnz added type/enhancement impact/medium and removed type/bug impact/high labels Jun 15, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slow queries caused by large QueuedJobDescriptor table, and lack of indexes #336

Slow queries caused by large QueuedJobDescriptor table, and lack of indexes #336

mateusz commented Feb 19, 2021 •

edited

Loading

Cheddam commented Feb 22, 2021

emteknetnz commented Feb 22, 2021

brettt89 commented Feb 22, 2021

emteknetnz commented Feb 22, 2021

mateusz commented Feb 22, 2021

emteknetnz commented Feb 23, 2021

Slow queries caused by large QueuedJobDescriptor table, and lack of indexes #336

Slow queries caused by large QueuedJobDescriptor table, and lack of indexes #336

Comments

mateusz commented Feb 19, 2021 • edited Loading

Cheddam commented Feb 22, 2021

emteknetnz commented Feb 22, 2021

brettt89 commented Feb 22, 2021

emteknetnz commented Feb 22, 2021

mateusz commented Feb 22, 2021

emteknetnz commented Feb 23, 2021

mateusz commented Feb 19, 2021 •

edited

Loading