
CascadeTracker creates veeery big indexes when using secondary assocs on folders #415

Open
hi-ko opened this issue Dec 7, 2022 · 2 comments

Comments

@hi-ko

hi-ko commented Dec 7, 2022

Creating secondary parent/child associations for nodes is a fairly common use case. Alfresco itself uses this feature for workflow folders, for example.
Unfortunately, this does not play well with CascadeTracker. For example, when a folder is "linked" to the companyhome (as Alfresco does for workflows) by adding the companyhome as a secondary parent, CascadeTracker walks the entire repo again and adds the resulting paths as a list to the index documents.
This results in a very large index, which may even exceed the size of the entire repo.

Secondary child associations worked fine in the old solr4 implementation until ASS introduced CascadeTracker. The only options we have today are to disable PATH tracking completely (alfresco.cascade.tracker.enabled=false) or to remove all secondary child associations on folders/containers, which may break use cases and functionality.

As a compromise, we should introduce something like alfresco.cascade.tracksecondaries.enabled, so that PATH queries keep working without multiplying index size, tracking time, and resources.
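
To illustrate why secondary parents on folders multiply the number of indexed paths, here is a toy sketch. It is not ASS code: the node names, the `PARENTS` map, and the `track_secondaries` flag (loosely mirroring the proposed property) are all hypothetical, and the real tracker works differently in detail.

```python
from functools import lru_cache

# Toy model (hypothetical names): each node maps to its parent associations
# as (parent, is_primary) pairs. "root" has no parents.
PARENTS = {
    "root": [],
    "companyhome": [("root", True)],
    "sites": [("companyhome", True)],
    "projectA": [("sites", True), ("companyhome", False)],  # secondary link
    "docs": [("projectA", True), ("companyhome", False)],   # secondary link
    "report.pdf": [("docs", True)],
}


def count_paths(node, track_secondaries=True):
    """Count the distinct root-to-node paths that would end up in the index."""

    @lru_cache(maxsize=None)
    def walk(n):
        # Keep primary parents always; keep secondary parents only if enabled.
        parents = [p for p, primary in PARENTS[n] if primary or track_secondaries]
        if not parents:
            return 1  # reached the root
        return sum(walk(p) for p in parents)

    return walk(node)


def total_paths(track_secondaries=True):
    """Sum of path counts over all nodes, a rough proxy for PATH index size."""
    return sum(count_paths(n, track_secondaries) for n in PARENTS)
```

In this model every node below a folder with a secondary parent inherits the extra paths, so two secondary links on nested folders already triple the path count of each leaf underneath. Ignoring secondary associations keeps exactly one path per node.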

@hi-ko
Author

hi-ko commented Dec 7, 2022

As a proof of concept, I removed the ~20 secondary child assocs on folders in a repo with ~40 million nodes.
After reindexing, the index shrank from 1 TB to < 50 GB.

@Fikili

Fikili commented Dec 11, 2022

I agree, this issue is really annoying.
