000108: new errors never seen before #324

Open
yarikoptic opened this issue Feb 1, 2023 · 2 comments
Comments

@yarikoptic
Member

The run finished with what I believe are some new errors:

$ time python -m tools.backups2datalad -l WARNING --backup-root /mnt/backup/dandi --config tools/backups2datalad.cfg.yaml update-from-backup  --workers 3 000108
...
... some known problems were spotted like https://github.com/dandi/dandisets/issues/298
...
whereis: 351 failed
whereis: 6494 failed
whereis: 2904 failed
fatal: Unable to write new index file
fatal: Unable to write new index file
fatal: Unable to write new index file
fatal: Unable to write new index file
fatal: Unable to write new index file
fatal: Unable to write new index file
2023-01-31T10:56:38-0500 [ERROR   ] backups2datalad: Job failed on input <Dandiset 000108/draft>:
Traceback (most recent call last):
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/aioutil.py", line 168, in dowork
    outp = await func(inp)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 145, in update_dandiset
    changed = await self.sync_dataset(dandiset, ds, dmanager)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 188, in sync_dataset
    await syncer.sync_assets(error_on_change)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/syncer.py", line 36, in sync_assets
    self.report = await async_assets(
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/asyncer.py", line 500, in async_assets
    nursery.start_soon(dm.read_addurl)
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 662, in __aexit__
    raise exceptions[0]
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 537, in sync_zarr
    await zsync.run()
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 139, in run
    if not await self.needs_sync(client, last_sync, local_paths):
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 333, in needs_sync
    async for obj in ao:
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 407, in aiter_objects
    async for page in client.get_paginator("list_objects_v2").paginate(
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/aiobotocore/paginate.py", line 32, in __anext__
    response = await self._make_request(current_kwargs)
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/aiobotocore/client.py", line 265, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (SlowDown) when calling the ListObjectsV2 operation (reached max retries: 4): Please reduce your request rate.
Traceback (most recent call last):
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/__main__.py", line 513, in <module>
    main(_anyio_backend="asyncio")
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/asyncclick/core.py", line 1157, in __call__
    return anyio.run(self._main, main, args, kwargs, **opts)
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/anyio/_core/_eventloop.py", line 70, in run
    return asynclib.run(func, *args, **backend_options)
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 292, in run
    return native_run(wrapper(), debug=debug)
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 287, in wrapper
    return await func(*args)
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/asyncclick/core.py", line 1160, in _main
    return await main(*args, **kwargs)
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/asyncclick/core.py", line 1076, in main
    rv = await self.invoke(ctx)
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/asyncclick/core.py", line 1687, in invoke
    return await _process_result(await sub_ctx.command.invoke(sub_ctx))
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/asyncclick/core.py", line 1434, in invoke
    return await ctx.invoke(self.callback, **ctx.params)
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/asyncclick/core.py", line 780, in invoke
    rv = await rv
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/__main__.py", line 186, in update_from_backup
    await datasetter.update_from_backup(dandisets, exclude=exclude)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 97, in update_from_backup
    raise RuntimeError(
RuntimeError: Backups for 1 Dandiset failed

real    1071m35.912s
user    976m43.011s
sys     402m1.266s

So we have two errors that are new, AFAIK:

  • botocore.exceptions.ClientError: An error occurred (SlowDown) when calling the ListObjectsV2 operation (reached max retries: 4): Please reduce your request rate.
  • fatal: Unable to write new index file

I will now check for dirty zarrs, do resets, pop the stashed fix for #298, and try again

@yarikoptic
Member Author

I think we have been fine recently.

@yarikoptic
Member Author

yarikoptic commented Oct 2, 2023

We hit it again, so ideally we should add or extend retrying there so that the service remains robust.
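For illustration, here is a minimal sketch of what extended retrying could look like: exponential backoff with jitter around a throttled call. This is not the current backups2datalad code. The helper name `retry_on_slowdown` and the `ThrottledError` class are hypothetical stand-ins; in the real code the exception would be `botocore.exceptions.ClientError`, whose error code is available as `e.response["Error"]["Code"]`.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class ThrottledError(Exception):
    """Hypothetical stand-in for a botocore ClientError with an error code."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


async def retry_on_slowdown(
    func: Callable[[], Awaitable[T]],
    max_attempts: int = 8,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
) -> T:
    """Await func(), retrying with exponential backoff and jitter on SlowDown."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await func()
        except ThrottledError as e:
            # Re-raise anything that is not throttling, or the final failure.
            if e.code != "SlowDown" or attempt == max_attempts:
                raise
            # Cap the backoff window, then sleep a random fraction of it.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await asyncio.sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

With a paginator, the wrapper would probably have to go around fetching a single page rather than around the whole listing loop, so that a retry does not yield objects twice. Separately, botocore itself accepts a `retries` option in its `Config` (`max_attempts` and `mode`), which might be enough to get past the "reached max retries: 4" limit without custom code; I have not checked how that interacts with aiobotocore here.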

Details from the email:
>> python -m tools.backups2datalad -l WARNING --backup-root /mnt/backup/dandi --config tools/backups2datalad.cfg.yaml update-from-backup 000108                                                                                                                    
2023-09-29T15:30:10-0400 [WARNING ] dandi: A newer version (0.56.2) of dandi/dandi-cli is available. You are using 0.55.1                                                                                                                                          
whereis: 3017 failed                                                                                                                                                                                                                                               
whereis: 3014 failed                                                                                                                                                                                                                                               
whereis: 3022 failed                                                                                                                                                                                                                                               
whereis: 3015 failed                                                                                                                                                                                                                                               
whereis: 2997 failed                                                                                                                                                                                                                                               
2023-09-30T08:21:24-0400 [ERROR   ] backups2datalad: Job failed on input <Dandiset 000108/draft>:                                                                                                                                                                  
Traceback (most recent call last):                                                                                                                                                                                                                                 
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/aioutil.py", line 173, in dowork                                                                                                                                                                         
    outp = await func(inp)                                                                                                                                                                                                                                         
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 143, in update_dandiset                                                                                                                                                             
    changed = await self.sync_dataset(dandiset, ds, dmanager)                                                                                                                                                                                                      
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 188, in sync_dataset                                                                                                                                                                
    await syncer.sync_assets(error_on_change)                                                                                                                                                                                                                      
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/syncer.py", line 36, in sync_assets                                                                                                                                                                      
    self.report = await async_assets(                                                                                                                                                                                                                              
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/asyncer.py", line 522, in async_assets                                                                                                                                                                   
    async with AsyncAnnex(ds.pathobj) as annex, httpx.AsyncClient(                                                                                                                                                                                                 
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 662, in __aexit__                                                                                                                                  
    raise exceptions[0]                                                                                                                                                                                                                                            
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 560, in sync_zarr                                                                                                                                                                         
    await zsync.run()                                                                                                                                                                                                                                              
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 140, in run                                                                                                                                                                               
    if not await self.needs_sync(client, last_sync, local_paths):                                                                                                                                                                                                  
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 356, in needs_sync                                                                                                                                                                        
    async for obj in ao:                                                                                                                                                                                                                                           
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 430, in aiter_objects                                                                                                                                                                     
    async for page in client.get_paginator("list_objects_v2").paginate(                                                                                                                                                                                            
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/aiobotocore/paginate.py", line 30, in __anext__                                                                                                                                       
    response = await self._make_request(current_kwargs)                                                                                                                                                                                                            
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/aiobotocore/client.py", line 358, in _make_api_call                                                                                                                                   
    raise error_class(parsed_response, operation_name)                                                                                                                                                                                                             
botocore.exceptions.ClientError: An error occurred (SlowDown) when calling the ListObjectsV2 operation (reached max retries: 4): Please reduce your request rate.                                                                                                  
2023-09-30T08:21:24-0400 [ERROR   ] backups2datalad: An error occurred:                                                                                                                                                                                            
Traceback (most recent call last):                                                                                                                                                                                                                                 
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/__main__.py", line 111, in wrapped                                                                                                                                                                       
    await f(datasetter, *args, **kwargs)                                                                                                                                                                                                                           
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/__main__.py", line 213, in update_from_backup                                                                                                                                                            
    await datasetter.update_from_backup(dandisets, exclude=exclude)                                                                                                                                                                                                
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 97, in update_from_backup                                                                                                                                                           
    raise RuntimeError(                                                                                                                                                                                                                                            
RuntimeError: Backups for 1 Dandiset failed                                                                                                                                                                                                                        
Logs saved to /mnt/backup/dandi/dandisets/.git/dandi/backups2datalad/2023.09.29.19.30.09Z.log   
edit 1: the attempt to rerun failed... I guess I needed to do some cleanup in the zarrs first; I only cleaned up in 000108 itself, and that was only git reset --hard, IIRC.
dandi@drogon:/mnt/backup/dandi/dandisets$ flock -E 0 -e -n /home/dandi/.run/backup2datalad-cron.lock bash -c '/mnt/backup/dandi/dandisets/tools/backups2datalad-update-cron-108'

> eval python -m tools.backups2datalad -l WARNING --backup-root /mnt/backup/dandi --config tools/backups2datalad.cfg.yaml update-from-backup 000108
>> python -m tools.backups2datalad -l WARNING --backup-root /mnt/backup/dandi --config tools/backups2datalad.cfg.yaml update-from-backup 000108
2023-10-02T15:07:36-0400 [WARNING ] dandi: A newer version (0.56.2) of dandi/dandi-cli is available. You are using 0.55.1
create_sibling_github(ok): [sibling repository 'github' created at https://github.com/dandizarrs/a99f05b9-c7a2-455e-ba70-acfd9a4a7e55]
create_sibling_github(ok): [sibling repository 'github' created at https://github.com/dandizarrs/540960ca-8d34-4780-8fad-23c53771fd19]
create_sibling_github(ok): [sibling repository 'github' created at https://github.com/dandizarrs/91bca37f-9bcc-4673-a68d-e4168fb31043]
create_sibling_github(ok): [sibling repository 'github' created at https://github.com/dandizarrs/61f7be27-2d51-4787-8964-c6c338ea255a]
create_sibling_github(ok): [sibling repository 'github' created at https://github.com/dandizarrs/ffe1a8c8-f9c5-4405-b755-566516319471]
create_sibling_github(ok): [sibling repository 'github' created at https://github.com/dandizarrs/c424cbb8-d41c-446e-bf1f-cd3b4fd64d79]
configure-sibling(ok): . (sibling)
action summary:
  configure-sibling (ok: 1)
  create_sibling_github (ok: 1)
configure-sibling(ok): . (sibling)
action summary:
  configure-sibling (ok: 1)
  create_sibling_github (ok: 1)
configure-sibling(ok): . (sibling)
action summary:
  configure-sibling (ok: 1)
  create_sibling_github (ok: 1)
configure-sibling(ok): . (sibling)
action summary:
  configure-sibling (ok: 1)
  create_sibling_github (ok: 1)
configure-sibling(ok): . (sibling)
action summary:
  configure-sibling (ok: 1)
  create_sibling_github (ok: 1)
configure-sibling(ok): . (sibling)
action summary:
  configure-sibling (ok: 1)
  create_sibling_github (ok: 1)
Traceback (most recent call last):
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/__main__.py", line 516, in <module>
    main(_anyio_backend="asyncio")
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/asyncclick/core.py", line 1157, in __call__
    return anyio.run(self._main, main, args, kwargs, **opts)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/anyio/_core/_eventloop.py", line 70, in run
    return asynclib.run(func, *args, **backend_options)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 292, in run
    return native_run(wrapper(), debug=debug)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/asyncio/base_events.py", line 641, in run_until_complete
    return future.result()
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 287, in wrapper
    return await func(*args)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/asyncclick/core.py", line 1160, in _main
    return await main(*args, **kwargs)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/asyncclick/core.py", line 1076, in main
    rv = await self.invoke(ctx)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/asyncclick/core.py", line 1687, in invoke
    return await _process_result(await sub_ctx.command.invoke(sub_ctx))
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/asyncclick/core.py", line 1434, in invoke
    return await ctx.invoke(self.callback, **ctx.params)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/asyncclick/core.py", line 780, in invoke
    rv = await rv
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/__main__.py", line 111, in wrapped
    await f(datasetter, *args, **kwargs)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/__main__.py", line 213, in update_from_backup
    await datasetter.update_from_backup(dandisets, exclude=exclude)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 80, in update_from_backup
    report = await pool_amap(
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/aioutil.py", line 180, in pool_amap
    async with anyio.create_task_group() as tg:
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 662, in __aexit__
    raise exceptions[0]
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/aioutil.py", line 173, in dowork
    outp = await func(inp)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 143, in update_dandiset
    changed = await self.sync_dataset(dandiset, ds, dmanager)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 188, in sync_dataset
    await syncer.sync_assets(error_on_change)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/syncer.py", line 36, in sync_assets
    self.report = await async_assets(
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/asyncer.py", line 522, in async_assets
    async with AsyncAnnex(ds.pathobj) as annex, httpx.AsyncClient(
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 660, in __aexit__
    raise ExceptionGroup(exceptions)
anyio._backends._asyncio.ExceptionGroup: 5 exceptions were raised in the task group:
----------------------------
Traceback (most recent call last):
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 540, in sync_zarr
    raise RuntimeError(
RuntimeError: Zarr 1330bacc-6a54-4a14-b2db-6b4ec86d428e in Dandiset 000108 is dirty; clean or save before running
----------------------------
Traceback (most recent call last):
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 540, in sync_zarr
    raise RuntimeError(
RuntimeError: Zarr 449c91c7-aec2-418e-a232-2cdd16d9546c in Dandiset 000108 is dirty; clean or save before running
----------------------------
Traceback (most recent call last):
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 540, in sync_zarr
    raise RuntimeError(
RuntimeError: Zarr 99582f18-2ef9-4505-83d7-e00be54136a2 in Dandiset 000108 is dirty; clean or save before running
----------------------------
Traceback (most recent call last):
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 540, in sync_zarr
    raise RuntimeError(
RuntimeError: Zarr bd8ad6cf-c8a6-4d9f-bd91-6301a2bab092 in Dandiset 000108 is dirty; clean or save before running
----------------------------
Traceback (most recent call last):
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 540, in sync_zarr
    raise RuntimeError(
RuntimeError: Zarr e914512d-0842-408e-b37c-b5104954de71 in Dandiset 000108 is dirty; clean or save before running
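To find all Zarrs that need cleaning before the next rerun, something like the following sketch could help. It assumes each Zarr backup is an ordinary git repository sitting directly under one parent directory, and it treats any output from `git status --porcelain` as "dirty"; the check backups2datalad itself performs may differ. The function names and the `parent` argument are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def is_dirty(repo: Path) -> bool:
    """Return True if `git status --porcelain` reports any change in repo."""
    r = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo,
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return bool(r.stdout.strip())


def dirty_repos(parent: Path) -> List[Path]:
    """List immediate subdirectories of parent that are git repos with changes."""
    return sorted(
        d for d in parent.iterdir() if (d / ".git").exists() and is_dirty(d)
    )
```

Calling `dirty_repos()` on the directory that holds the Zarr repositories returns the paths that still need a reset or a save. Untracked files count as dirty here as well.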

@yarikoptic yarikoptic reopened this Oct 2, 2023