Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seq scan for janitor login/consent flows cleanup SQL query #3822

Open
5 tasks done
David-Wobrock opened this issue Aug 26, 2024 · 2 comments
Open
5 tasks done

Seq scan for janitor login/consent flows cleanup SQL query #3822

David-Wobrock opened this issue Aug 26, 2024 · 2 comments
Labels
bug Something is not working.

Comments

@David-Wobrock
Copy link

Preflight checklist

Ory Network Project

No response

Describe the bug

The Hydra janitor is running this query to cleanup inactive login/consent requests:

SELECT login_challenge
FROM hydra_oauth2_flow
WHERE (
    (state != 6)
        OR (login_error IS NOT NULL AND login_error <> '{}' AND login_error <> '')
        OR (consent_error IS NOT NULL AND consent_error <> '{}' AND consent_error <> '')
    )
  AND requested_at < '2024-08-25 08:00:00'
  AND nid = '326c59d4-e93c-47fc-96d4-40c5522f17c1'
ORDER BY login_challenge
    LIMIT 50000;

(with some example values).
See https://github.com/ory/hydra/blob/v2.2.0/persistence/sql/persister_consent.go#L732-L792

This is running a sequential query on PostgreSQL:

 Limit  (cost=2552485.01..2552485.01 rows=1 width=33)
   ->  Sort  (cost=2552485.01..2552485.01 rows=1 width=33)
         Sort Key: login_challenge
         ->  Gather  (cost=1000.00..2552485.00 rows=1 width=33)
               Workers Planned: 2
               ->  Parallel Seq Scan on hydra_oauth2_flow  (cost=0.00..2551484.90 rows=1 width=33)
                     Filter: ((requested_at < '2024-08-25 08:00:00'::timestamp without time zone) AND (nid = '326c59d4-e93c-47fc-96d4-40c5522f17c1'::uuid) AND ((state <> 6) OR ((login_error IS NOT NULL) AND (login_error <> '{}'::text) AND (login_error <> ''::text)) OR ((consent_error IS NOT NULL) AND (consent_error <> '{}'::text) AND (consent_error <> ''::text))))
(7 rows)

On a ~18 million row database, this query can take between 15 and 40 seconds, depending on the load.

The expected behaviour to have an index for this query by default in Hydra.

Reproducing the bug

Run the janitor SQL above.

Relevant log output

No response

Relevant configuration

No response

Version

v2.2.0

On which operating system are you observing this issue?

None

In which environment are you deploying?

Kubernetes with Helm

Additional Context

We tried indexing manually requested_at or state, but since the query has many AND/ORs, PostgreSQL 15.5 is not able to pick up the index properly and keeps running the sequential scan.

@David-Wobrock David-Wobrock added the bug Something is not working. label Aug 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is not working.
Projects
None yet
Development

No branches or pull requests

3 participants
@David-Wobrock and others