Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feed DAO query performance issue with high feed volume #18227

Open
elguero6 opened this issue Oct 10, 2024 · 0 comments
Open

Feed DAO query performance issue with high feed volume #18227

elguero6 opened this issue Oct 10, 2024 · 0 comments

Comments

@elguero6
Copy link

elguero6 commented Oct 10, 2024

Affected module
Backend Error (specifically FeedDAO)causing the UI unable to show feeds

Describe the bug

When there are a large number of feeds, the listThreadsByEntityLink SQL query times out and throws a PSQLException. The error message indicates that the statement is being canceled due to a statement timeout.

[org.postgresql.util.PSQLException: ERROR: canceling statement due to statement timeout [statement:"/* FeedDAO.listThreadsByEntityLink */ SELECT json FROM thread_entity WHERE type = 'Conversation' AND resolved = false AND updatedAt < 9223372036854775807 AND hash_id in (SELECT fromFQNHash FROM field_relationship WHERE (:fqnPrefixHash IS NULL OR toFQNHash LIKE CONCAT(:fqnPrefixHash, '.%') OR toFQNHash=:fqnPrefixHash) AND fromType='THREAD' AND (:toType IS NULL OR toType LIKE CONCAT(:toType, '.%') OR toType=:toType) AND relation= :relation) AND (:userName IS NULL OR MD5(id) in (SELECT toFQNHash FROM field_relationship WHERE ((fromType='user' AND fromFQNHash= :userName) OR (fromType='team' AND fromFQNHash IN (:__teamNames_0))) AND toType='THREAD' AND relation= :filterRelation) )ORDER BY createdAt DESC LIMIT :limit", arguments:{positional:{1:table,2:11,3:3,6:-1}, named:{toType:table,filterRelation:-1,limit:11,userName:null,__teamNames_0:,fqnPrefixHash:b18959b82c335dd364f7fd8833fc8d16.54b56e667c796ad3c0d55c1774b754f7.6d3c5993ca017d0ff169b425d3193f02.9c0570fa882856fa23042e0cc0dc15ef,relation:3}, finder:[]}]] was thrown while processing request.

To Reproduce

Unfortunately, reproducing this issue requires having a large number of feeds in the database. However, here are some possible steps:

Create a large number of threads/entities with relationships to other threads/entities
Run the listThreadsByEntityLink query with a large limit value (e.g. 10000)

Observe the error message and stack trace

Expected behavior

The query should be able to handle a large number of feeds without timing out.

Version:

  • OS: Kubernetes
  • OpenMetadata version: 1.5.6
  • OpenMetadata Ingestion package version: 1.5.6

Additional context
This issue only occurs when there are a large number of feeds in the database. The listThreadsByEntityLink query is used to retrieve related threads for a given entity link. It's possible that the query needs to be optimized or that the timeout value should be increased.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant