Blobbers can become unresponsive when there is a large number of files/directories. #117

lpoli · 2021-06-20T16:09:20Z

I came across the list-all subcommand in zboxcli, which uses the getRemoteFilesAndDirs function: https://github.com/0chain/gosdk/blob/master/zboxcore/sdk/sync.go#L44

It requests the list of files/directories from blobbers recursively, traversing into each child directory until it reaches the end of the tree. So if there are, say, 100 subdirectories, it will make at least 100 such requests.
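To make the request count concrete, here is a minimal sketch of that recursive pattern. The `lister` type, `walk`, and `fakeTree` are hypothetical stand-ins for illustration and are not the gosdk API; the fake tree simply has 100 subdirectories under the root.

```go
package main

import "fmt"

// lister abstracts one "list this directory" round trip to a blobber.
// Hypothetical stand-in for illustration; not the gosdk API.
type lister func(path string) (dirs []string, files []string)

// walk lists path, then recurses into every child directory. It returns
// the total number of list requests issued and the number of files found.
func walk(list lister, path string) (requests, files int) {
	dirs, fs := list(path)
	requests, files = 1, len(fs)
	for _, d := range dirs {
		r, f := walk(list, d)
		requests += r
		files += f
	}
	return requests, files
}

// fakeTree returns an in-memory lister with n subdirectories under "/",
// each holding a single file.
func fakeTree(n int) lister {
	return func(path string) ([]string, []string) {
		if path == "/" {
			dirs := make([]string, n)
			for i := range dirs {
				dirs[i] = fmt.Sprintf("/dir%d", i)
			}
			return dirs, nil
		}
		return nil, []string{path + "/file"}
	}
}

func main() {
	requests, files := walk(fakeTree(100), "/")
	// One request for the root plus one per subdirectory.
	fmt.Println(requests, files) // prints "101 100"
}
```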

Another option is to request the ObjectTree from blobbers, i.e. to make an HTTP request to the blobbers as described in the docs: https://api.0chain.net/#402a1367-2f35-430b-9eaa-42917ead886b
For instance, if I request the ObjectTree for the root path, the JSON response contains the whole file hierarchy of that allocation.

The above call is fine for a small number of files, but we need to consider that an allocation can contain thousands of files. For about 5 directories and 5 files the response size was about 60KB, so for a large number of files the response will be large. The metadata could be, for example, 20MB, which keeps the blobber busy serving the request for some time and stalls it.
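As a rough back-of-the-envelope check, the sketch below extrapolates from the one measurement above. It assumes the response size grows linearly with the number of entries, which is an assumption and not something measured here.

```go
package main

import "fmt"

// bytesPerEntry estimates the average metadata size per file or directory
// from one observed response. Linear scaling is an assumption.
func bytesPerEntry(responseBytes, entries int) int {
	return responseBytes / entries
}

// entriesFor estimates how many entries would produce a response of
// targetBytes at the given per-entry size.
func entriesFor(targetBytes, perEntry int) int {
	return targetBytes / perEntry
}

func main() {
	const kb, mb = 1024, 1024 * 1024
	per := bytesPerEntry(60*kb, 10) // 5 directories + 5 files, about 60KB
	n := entriesFor(20*mb, per)     // entries needed for a 20MB response
	fmt.Println(per, n)             // prints "6144 3413"
}
```

Under that assumption, a few thousand entries would already be enough to reach a response of about 20MB.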

And that is just the single-allocation scenario. Blobbers are not confined to a single allocation, and there can be a multitude of client requests.

So the solution can be to provide a paginated response or a partial tree response (say, only a few levels of tree depth in the response).
There are two other options, i.e. ObjectPath and ReferencePath requests. However, both can grow large in response size and have the same issue as ObjectTree requests.
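The partial tree idea could look roughly like the sketch below. The `Node` type, its `Truncated` flag, and `prune` are hypothetical names for illustration; they are not the blobber's actual data model.

```go
package main

import "fmt"

// Node is a hypothetical file-tree entry, not the blobber's actual type.
type Node struct {
	Name      string
	Children  []*Node
	Truncated bool // true when children exist but were cut off by the depth limit
}

// prune returns a copy of the tree rooted at n that keeps at most depth
// levels below n. Nodes at the cut-off are marked Truncated so the
// client knows it has to ask for that subtree separately.
func prune(n *Node, depth int) *Node {
	out := &Node{Name: n.Name}
	if depth <= 0 {
		out.Truncated = len(n.Children) > 0
		return out
	}
	for _, c := range n.Children {
		out.Children = append(out.Children, prune(c, depth-1))
	}
	return out
}

func main() {
	tree := &Node{Name: "/", Children: []*Node{
		{Name: "a", Children: []*Node{
			{Name: "b", Children: []*Node{{Name: "c"}}},
		}},
	}}
	partial := prune(tree, 1)
	a := partial.Children[0]
	fmt.Println(a.Name, len(a.Children), a.Truncated) // prints "a 0 true"
}
```

The client can then request the subtree of any node marked as truncated only when it actually needs it.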

Kishan-Dhakan · 2021-06-26T19:25:44Z

list-all and list both call the function NewListRequest in gosdk/zboxcore/zboxutil/http.go. It makes an HTTP GET request to the endpoint /v1/file/list/, whose response is the list (docs here).

Therefore, one way to go is for the list and list-all commands to accept offset and limit params (e.g. show 100 items starting from the 51st item). Then we extract and show the items requested by the client. This is not efficient, as it would still mean fetching the entire list, but the current API doesn't have pagination implemented (as per the docs).
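A minimal sketch of that client-side approach, assuming the full list has already been fetched into a slice. The `page` helper is hypothetical and not part of gosdk.

```go
package main

import "fmt"

// page returns up to limit items starting at the zero-based offset.
// It slices an already-fetched list, so the full list is still
// transferred from the blobber.
func page(items []string, offset, limit int) []string {
	if offset < 0 {
		offset = 0
	}
	if offset >= len(items) || limit <= 0 {
		return nil
	}
	end := offset + limit
	if end > len(items) {
		end = len(items)
	}
	return items[offset:end]
}

func main() {
	items := make([]string, 200)
	for i := range items {
		items[i] = fmt.Sprintf("item%d", i+1)
	}
	// 100 items starting from the 51st item: offset 50, limit 100.
	p := page(items, 50, 100)
	fmt.Println(len(p), p[0], p[len(p)-1]) // prints "100 item51 item150"
}
```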

The other way is to update the 0chain API to support pagination as well, i.e. accept param(s) at the endpoint /v1/file/list/ that provide context for a paginated response.
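For illustration only, a paginated request to that endpoint might be built as below. The query parameter names `path`, `offset`, and `limit` and the host `blobber.example` are assumptions made for this sketch; the current API defines no such pagination parameters.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// listURL builds a request URL for the list endpoint with hypothetical
// pagination parameters. "path", "offset" and "limit" are assumed names.
func listURL(base, path string, offset, limit int) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/v1/file/list/"
	q := url.Values{}
	q.Set("path", path)
	q.Set("offset", strconv.Itoa(offset))
	q.Set("limit", strconv.Itoa(limit))
	u.RawQuery = q.Encode() // Encode sorts the parameters by key
	return u.String(), nil
}

func main() {
	s, err := listURL("http://blobber.example", "/docs", 50, 100)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
	// prints "http://blobber.example/v1/file/list/?limit=100&offset=50&path=%2Fdocs"
}
```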

Cc: @iamrz1

guruhubb · 2021-06-27T04:06:44Z

Just need to make the change at the blobber end

lpoli · 2021-06-27T06:44:00Z

Hello Andrei,
I am working on 0fs, which lets users mount their allocation to a directory and access files using system commands, the same as local files.

It will be a good user experience if they can "cd" into a directory and "ls" all the files in it quickly. Calling blobbers for each such request will be slow, and making frequent requests for each operation is also costly for the blobber.

So to minimize this issue I need a paginated view of the ObjectTree. Currently, a request for the ObjectTree of some path returns the whole file tree under that path. It would be good to have a paginated response in both directions (paginated breadth and depth).
For example, a user can have 1000 files in the same directory, so returning all the file info in one response is infeasible. A similar issue arises when going to the full depth of the tree.

So I think we should paginate in both directions: breadth and depth.

0fs would construct the full tree and update it only when the allocation root changes (which is the hash of the combined path and file changes). That way 0fs can also save already-read files to disk temporarily, providing easy access.
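The caching idea can be sketched as follows. `treeCache` and its fields are hypothetical names for illustration, and the tree is reduced to a slice of paths; how 0fs actually obtains the current allocation root is not shown.

```go
package main

import "fmt"

// treeCache keeps the last fetched tree together with the allocation
// root it was built for. Hypothetical type, for illustration only.
type treeCache struct {
	root    string   // allocation root the cached tree corresponds to
	tree    []string // cached paths, standing in for the full tree
	valid   bool     // false until the first fetch
	fetches int      // number of times the tree was rebuilt
}

// get returns the cached tree and calls fetch only when nothing is
// cached yet or the allocation root has changed.
func (c *treeCache) get(currentRoot string, fetch func() []string) []string {
	if !c.valid || c.root != currentRoot {
		c.tree = fetch()
		c.root = currentRoot
		c.valid = true
		c.fetches++
	}
	return c.tree
}

func main() {
	c := &treeCache{}
	fetch := func() []string { return []string{"/a", "/a/b.txt"} }

	c.get("root-1", fetch) // first access: fetches
	c.get("root-1", fetch) // same root: served from the cache
	c.get("root-2", fetch) // root changed: fetches again
	fmt.Println(c.fetches) // prints "2"
}
```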

Kishan-Dhakan · 2021-06-27T06:56:24Z

> Just need to make the change at the blobber end

Right. I only mentioned the other way because this issue was opened in gosdk instead of blobber.

moldis · 2021-10-27T08:58:02Z

Need to add this ticket to clients @lpoli

lpoli · 2021-10-27T11:22:50Z

What do you mean by clients?

lpoli · 2022-04-15T08:03:15Z

With 64GB, a blobber can handle even such large files.
There is a GetRefs endpoint in the blobber, which should be used, and the other methods of getting metadata should be replaced.
With GetRefs, consensus is also calculated among common fields.

lpoli added the refactor label Jun 20, 2021

lpoli assigned cnlangzi, guruhubb and bbist Jun 20, 2021

guruhubb assigned Kishan-Dhakan and unassigned guruhubb and bbist Jun 21, 2021

Kishan-Dhakan assigned lpoli Jun 26, 2021

guruhubb assigned andrenerd Jun 27, 2021

guruhubb unassigned cnlangzi, lpoli and Kishan-Dhakan Jun 27, 2021

guruhubb assigned Kishan-Dhakan and unassigned andrenerd Jun 28, 2021

guruhubb assigned lpoli Jul 7, 2021

lpoli added this to the m1 milestone Jul 13, 2021

lpoli mentioned this issue Jul 18, 2021

Add pagination to avoid Blobbers being unresponsive 0chain/blobber#228

Closed

lpoli mentioned this issue Jul 28, 2021

File References Pagination 0chain/blobber#273

Merged

lpoli mentioned this issue Aug 11, 2021

Refs gets updated date when its siblings gets updated. 0chain/blobber#301

Closed

moldis added the mainnet label Oct 27, 2021

lpoli added post-mainnet and removed mainnet labels Apr 15, 2022

Kishan-Dhakan removed their assignment Apr 16, 2022
