Caching file tree (Feature suggestion) #265

Open
Day0Dreamer opened this issue Aug 10, 2024 · 0 comments

Day0Dreamer commented Aug 10, 2024

So, there I was, trying to build a Google Drive file tree using "anytree" as the backbone.

You get a list of files from Google and have to work out what is what and what belongs where. The list is not ordered, so sometimes you get a file from a subfolder before the subfolder itself.

Each file (folders are files there too) refers to its parent by ID.

The search functions don't cache anything, and on big trees it takes hours to find the parent of yet another file, the millionth or so.
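For what it's worth, one way to avoid searching the tree at all is to keep a dict from file ID to node and park children whose parent has not shown up yet. Below is a minimal sketch of that idea, not something anytree provides: `TreeNode`, `build_tree`, and the `(id, name, parent_id)` record shape are all hypothetical stand-ins, and the same dict could presumably hold anytree `Node` objects instead, since their `parent` attribute can be assigned after creation.

```python
class TreeNode:
    """Hypothetical stand-in for a tree node (not anytree's Node)."""

    def __init__(self, file_id, name):
        self.id = file_id
        self.name = name
        self.parent = None
        self.children = []


def build_tree(records):
    """Build a tree from (id, name, parent_id) records given in any order.

    Returns a dict mapping id -> node, so finding a parent is a dict lookup.
    """
    nodes = {}
    # Children that arrived before their parent, grouped by the parent's ID.
    orphans = {}

    for file_id, name, parent_id in records:
        node = TreeNode(file_id, name)
        nodes[file_id] = node

        # Attach to the parent if it already exists, otherwise park the node.
        if parent_id is not None:
            parent = nodes.get(parent_id)
            if parent is None:
                orphans.setdefault(parent_id, []).append(node)
            else:
                node.parent = parent
                parent.children.append(node)

        # Adopt any children that were waiting for this node.
        for child in orphans.pop(file_id, []):
            child.parent = node
            node.children.append(child)

    return nodes


# Assumed sample data: the file arrives before its folder, the folder before the root.
records = [
    ("f1", "a.txt", "d2"),
    ("d2", "sub", "d1"),
    ("d1", "root", None),
]
nodes = build_tree(records)
```

Each record is handled once, so the work grows with the number of files rather than with repeated searches over the tree. Anything left in `orphans` at the end would be a file whose parent never arrived.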

The caching approach suggested in the documentation (and that suggestion seems to be getting outdated too) employs an lru_cache.

The problem: it caches the None result for a parent when Google hands us the child first and we have not created the parent yet. So when the parent arrives and we rerun the search for the orphaned file, the stale None comes straight back out of the cache.
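The stale None can be reproduced without anytree. In this small sketch a plain dict stands in for the tree; `known_folders` and `find_parent` are made-up names for illustration only.

```python
from functools import lru_cache

# Hypothetical stand-in for the tree: maps a folder ID to a folder name.
known_folders = {}


@lru_cache(maxsize=None)
def find_parent(parent_id):
    """Return the folder for parent_id, or None if it has not arrived yet."""
    return known_folders.get(parent_id)


first = find_parent("folder-1")        # parent not created yet, so None gets cached
known_folders["folder-1"] = "Photos"   # the parent arrives later
second = find_parent("folder-1")       # answered from the cache, still None
```

The second call never reaches `known_folders`, because `lru_cache` keys only on the arguments and has no idea the underlying data changed.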

So I ended up with the following chunk of code:

from anytree import Node, RenderTree, search
from functools import lru_cache, wraps


def cache_non_none(func):
    cached_func = lru_cache(maxsize=None)(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        result = cached_func(*args, **kwargs)
        if result is None:
            # lru_cache cannot evict a single entry, so clear the whole
            # cache to make sure the None is not served again
            cached_func.cache_clear()
            return None
        return result

    return wrapper


@cache_non_none
def find_by_attribute(node, value, name="name", maxlevel=None):
    return search.find_by_attr(node, value, name=name, maxlevel=maxlevel)

It throws the cache away whenever a search returns None (lru_cache has no way to drop a single entry, so the whole cache goes) and keeps cached results when something was actually found.
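If losing every cached hit on each miss turns out to hurt, a hand-rolled memo dict that simply never stores None might be worth a try. This is a sketch under assumptions, not a drop-in for anything in anytree: `memoize_non_none`, `lookup`, `table`, and `calls` are hypothetical names, all arguments must be hashable, and the cache is unbounded.

```python
from functools import wraps


def memoize_non_none(func):
    """Cache results of func, except None, without evicting other entries."""
    cache = {}

    @wraps(func)
    def wrapper(*args, **kwargs):
        # Keyword arguments are sorted so their order does not affect the key.
        key = (args, tuple(sorted(kwargs.items())))
        if key in cache:
            return cache[key]
        result = func(*args, **kwargs)
        if result is not None:
            cache[key] = result  # only real hits are remembered
        return result

    wrapper.cache = cache  # exposed so callers can inspect or clear it
    return wrapper


# Hypothetical usage: a dict lookup standing in for the tree search.
table = {"a": 1}
calls = []


@memoize_non_none
def lookup(key):
    calls.append(key)  # records every real (uncached) lookup
    return table.get(key)
```

Like any cache over a mutable tree, hits stored here go stale if a node is later moved or removed, so `wrapper.cache.clear()` would still be needed after such changes.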

Now, if anybody feels like making a pull request, I've got nothing against it.
Just wanted to share a corner solution to a corner case of mine.

Love <3
