Reduce number of DB queries #118
Conversation
Thoughts on this method? I'm seeing multiple db queries on an initial population of the jobs, then a single query for most page views thereafter.
Branch updated from ba3705b to 8ac9687.
I'd say it makes more sense to be querying this data within MySQL; I'm not sure why doing this in PHP is preferable?
Would love to see some benchmarks, as well as peak memory usage with large jobs tables.
@rmccue this is how I interpreted your comment here: fetching / populating the full array of jobs once, then querying from the non-persistent cache. I'll try and figure out a way to benchmark this.
Yes, but I wouldn't necessarily implement that within this function, since this is a generic querying function.
I've updated it to just do the generic query on the …
It's not really a solid benchmark test, but using …
@@ -382,16 +382,39 @@ function pre_get_scheduled_event( $pre, $hook, $args, $timestamp ) {
	}

	$jobs = Job::get_jobs_by_query( [
		'args'  => null,
		'hook'  => null,
		'limit' => 100,
What happens when there are 100 jobs in the table with hook `foo` and we're looking for the hook `bar` that happens to be at position 101?
I was wondering the same, but it's a problem we'd see in other situations. I guess that number was chosen because by default we're only looking for the `waiting` and `running` statuses.

See also:
https://github.com/humanmade/Cavalcade/blob/master/inc/connector/namespace.php#L294-L306
https://github.com/humanmade/Cavalcade/blob/master/inc/connector/namespace.php#L420-L430
@rmccue able to comment on this? There is at least a filter, so if a site has more than 100 combinations of hooks & args that are upcoming or running, you could filter the limit via `cavalcade.get_jobs_by_query.args`.
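For example (a hedged sketch only, not code from this PR): a site with many distinct hook & args combinations could raise that ceiling through the filter. This assumes `cavalcade.get_jobs_by_query.args` passes the query arguments array, including the `limit` key, to the callback.

```php
// Illustrative only: bump the default limit of 100 for sites that expect more
// than 100 upcoming/running hook & args combinations. Assumes the filter
// receives the query args array as its first parameter.
add_filter( 'cavalcade.get_jobs_by_query.args', function ( $args ) {
	$args['limit'] = 500; // Arbitrary higher ceiling for illustration.
	return $args;
} );
```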
Thanks for looking into this. Would be curious if there's a way to profile the change before and after but that sounds positive!
@kadamwhite I've done a basic benchmark locally but I'm not completely sure on the best way to do this. Would be good to figure out some test with a baseline performance fixture to compare changes against... Something I need to look into; benchmarking isn't something I see done that often in WP plugin land.
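One possible shape for such a fixture, sketched under the assumption that it runs inside a WordPress bootstrap (e.g. via `wp eval-file`); the hook name and iteration count are placeholders, and this is not the benchmark referred to above:

```php
// Rough timing harness for repeated scheduled-event lookups.
// 'example_hook' is a placeholder hook name.
$start = microtime( true );
for ( $i = 0; $i < 1000; $i++ ) {
	wp_next_scheduled( 'example_hook' );
}
printf(
	"1000 lookups: %.4fs, %d DB queries, %.2f MB peak memory\n",
	microtime( true ) - $start,
	get_num_queries(),
	memory_get_peak_usage() / 1024 / 1024
);
```

Comparing those numbers before and after the change would give a crude baseline to track in CI.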
Keen to get back to this one, especially while we're looking for every performance improvement we can get.
Looks fine since I last reviewed. I don't understand the benchmarks without comment: is the first one the current setup, and if so, does it take longer but process fewer requests?
Apart from whether it's an improvement, the code is OK.
@svandragt it was the same number of requests for each run; the first one is indeed before making the change, so with this change it completed 100 requests 9 seconds faster. I'll have a look into how we can put some benchmarks into the CI tests.
Had a chat through with @kovshenin, who ran some deeper analysis of the DB queries and edge cases we could expect. There's a balance to be had between the indexes used, the number of queries run, and the amount of data a query may return over the network. Front-loading all the results in the way this PR does could hit problems in the following scenarios:
On balance the current approach is the best we'll get. Given the primary problem we're aiming to have an impact on is TTFB for end-user page requests, there is an alternative approach we could take, which would be to use the …
Ah, nuts, had been hoping this would scale.
That makes sense, and feels like a good compromise.
It is a good compromise, though I'd be careful returning `true` for all events. Instead maybe have an explicit list of events that you expect to be scheduled at all times, and let the checks for those only run within wp-admin.
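A hedged sketch of that suggestion, assuming WordPress core's `pre_get_scheduled_event` filter; the hook names, priority, and stub event object are illustrative and not part of this PR:

```php
// Short-circuit scheduled-event lookups for an explicit allowlist of hooks on
// front-end requests only; wp-admin keeps doing real checks. Registered at an
// early priority on the assumption that later callbacks respect a non-null $pre.
add_filter( 'pre_get_scheduled_event', function ( $pre, $hook, $args, $timestamp ) {
	if ( is_admin() ) {
		return $pre;
	}

	// Events we expect to always be scheduled (illustrative list).
	$always_scheduled = [ 'example_cleanup_hook', 'example_sync_hook' ];

	if ( ! in_array( $hook, $always_scheduled, true ) ) {
		return $pre;
	}

	// Return a stub event so wp_next_scheduled() reports a timestamp
	// without a database lookup.
	return (object) [
		'hook'      => $hook,
		'timestamp' => $timestamp ?: time() + HOUR_IN_SECONDS,
		'schedule'  => 'hourly',
		'args'      => $args,
		'interval'  => HOUR_IN_SECONDS,
	];
}, 5, 4 );
```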
Will note on the Altis Cloud issue I raised here: humanmade/altis-cloud#702
Fixes #117
The DB query fetches pretty much everything, just varying on statuses and site ID, which won't change unless someone needs to query the jobs directly.
The resulting array is then filtered in PHP instead, so the trade-off is an O(n) process in PHP rather than additional requests to the DB each time `wp_next_scheduled()` is called.
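To make that trade-off concrete, here is a simplified, non-authoritative sketch of the pattern rather than the actual Cavalcade implementation: the query keys and `Job::get_jobs_by_query()` come from the diff above, while the wrapper function, namespace context, and job field names are assumptions.

```php
// Simplified illustration: one query per request, then an O(n) scan in PHP per
// lookup. Assumes this runs where the Job class resolves and that job objects
// expose hook/args properties.
function example_find_scheduled_job( $hook, $args ) {
	static $jobs = null;

	// Single DB query to fetch the upcoming/running jobs for this request.
	if ( null === $jobs ) {
		$jobs = Job::get_jobs_by_query( [
			'args'  => null,
			'hook'  => null,
			'limit' => 100,
		] );
	}

	// Linear scan instead of an extra query per wp_next_scheduled() call.
	foreach ( $jobs as $job ) {
		if ( $job->hook === $hook && $job->args === $args ) {
			return $job;
		}
	}

	return null;
}
```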