Consider pagination of canned queries #856

simonw · 2020-06-20T03:15:59Z

The new canned_queries() plugin hook from #852 combined with plugins like https://github.com/simonw/datasette-saved-queries could mean that some installations end up with hundreds or even thousands of canned queries. I should consider pagination or some other way of ensuring that this doesn't cause performance problems for Datasette.

The text was updated successfully, but these errors were encountered:

bram2000 · 2021-05-18T10:49:11Z

Hi Simon, I'm using a canned query to do some geospatial stuff, but it maxes out at 1000 rows returned. I can't see any Link headers to follow to get the next page of data. Is there any way currently to work around this 1000 row limit for canned queries?

Thanks,
Jon

simonw · 2021-05-18T15:56:45Z

Tables and views get "stream all rows" at the moment, so one workaround is to define a SQL view for your query - this only works for queries that don't take any parameters though (although you may be able to define a view and then pass it extra fields using the Datasette table interface, like on https://latest.datasette.io/fixtures/paginated_view?content_extra__contains=9)

I've explored this problem in a bit more detail in https://githu.com/simonw/django-sql-dashboard and I think I have a pattern that could work.

For your canned query, you could implement the pattern yourself by setting up two canned queries that look something like this:

https://github-to-sqlite.dogsheep.net/github?sql=select+rowid%2C+sha%2C+author_date+from+commits+order+by+rowid+limit+1000

select rowid, sha, author_date from commits order by rowid limit 1000

That gets you the first set of 1,000 results. The important thing here is to order by a unique column, in this case rowid - because then subsequent pages can be loaded by a separate canned query that looks like this:

select rowid, sha, author_date from commits where rowid > :after order by rowid limit 1000

https://github-to-sqlite.dogsheep.net/github?sql=select+rowid%2C+sha%2C+author_date+from+commits+where+rowid+%3E+%3Aafter+order+by+rowid+limit+1000&after=1000

You then need to write code which knows how to generate these queries - start with the first query with no where clause (or if you are using rowid you can just use the second query and pass it ?after=0 for the first call) - then keep calling the query passing in the last rowid you recieved as the after parameter.

Basically this is an implementation of keyset pagination with a smart client. When Datasette grows the ability to do this itself it will work by executing this mechanism inside the Python code, which is how the "stream all rows" option for tables works at the moment.

bram2000 · 2021-05-21T14:22:41Z

Thanks Simon this is working very well.

simonw added the research label Jun 20, 2020

This was referenced Nov 2, 2020

"View all" option for facets, to provide a (paginated) list of ALL of the facet counts plus a link to view them #1080

Open

Advanced CSV export for arbitrary queries #1083

Open

simonw mentioned this issue Dec 30, 2020

JavaScript plugin hooks mechanism similar to pluggy #983

Open

simonw mentioned this issue May 23, 2021

"More" link for facets that shows _facet_size=max results #1337

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider pagination of canned queries #856

Consider pagination of canned queries #856

simonw commented Jun 20, 2020

bram2000 commented May 18, 2021 •

edited

Loading

simonw commented May 18, 2021

bram2000 commented May 21, 2021

Consider pagination of canned queries #856

Consider pagination of canned queries #856

Comments

simonw commented Jun 20, 2020

bram2000 commented May 18, 2021 • edited Loading

simonw commented May 18, 2021

bram2000 commented May 21, 2021

bram2000 commented May 18, 2021 •

edited

Loading