✨ AsyncIterDataPipe for running concurrent tasks
An asynchronous iterable-style DataPipe for processing tasks concurrently! Composition over inheritance. It subclasses collections.abc.AsyncIterable from Python's standard library. Added a basic API docstring, and set up extlinks and intersphinx mappings in the docs/_config.yml file for linking to terms in the Python glossary.
weiji14 committed Jul 29, 2023
1 parent fdff5d5 commit 34c99c7
Showing 4 changed files with 47 additions and 1 deletion.
9 changes: 9 additions & 0 deletions bambooflow/datapipes/__init__.py
@@ -0,0 +1,9 @@
"""
An asynchronous-style DataPipe is one that implements the
:py:meth:`__aiter__ <object.__aiter__>` protocol, and represents an
:py-term:`asynchronous iterable <asynchronous-iterable>` over data samples.
This is well-suited for cases when I/O latency is high, e.g. when waiting on
network connections, or performing read operations on multiple files at once.
"""

from bambooflow.datapipes.aiter import AsyncIterDataPipe
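
To illustrate why the docstring above singles out I/O-bound work, here is a minimal sketch using only the standard library. The `fake_read` coroutine and the file names are hypothetical stand-ins for real network or disk reads, and nothing here is part of bambooflow's API.

```python
import asyncio
import time


async def fake_read(name: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in for a slow network or file read."""
    await asyncio.sleep(delay)  # yields control while "waiting on I/O"
    return f"{name}: done"


async def read_all(names):
    # Start every read at once and wait for all of them to finish.
    return await asyncio.gather(*(fake_read(name) for name in names))


start = time.perf_counter()
results = asyncio.run(read_all(["a.tif", "b.tif", "c.tif"]))
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Because the three simulated reads overlap, the total wall time should be close to a single delay rather than the sum of all three.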
23 changes: 23 additions & 0 deletions bambooflow/datapipes/aiter.py
@@ -0,0 +1,23 @@
"""
Base classes for Asynchronous Iterable DataPipes.
"""
import collections.abc


class AsyncIterDataPipe(collections.abc.AsyncIterable):
    """
    Asynchronous iterable-style DataPipes.

    All DataPipes that represent an asynchronous iterable of data samples
    should subclass this. This style of DataPipes is particularly useful for
    performing I/O-bound tasks such as streaming data from a network disk drive
    or reading multiple files concurrently. ``AsyncIterDataPipe`` is
    initialized in a lazy fashion, and its elements are computed only when
    :py:meth:`__anext__ <object.__anext__>` is called on the async iterator of
    an ``AsyncIterDataPipe``.
    """

    def __repr__(self) -> str:
        # Instead of showing <bambooflow. ... .AsyncIterableWrapper at 0x.....>,
        # return just the class name, e.g. AsyncIterableWrapper
        return str(self.__class__.__qualname__)
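
As a rough sketch of how a concrete DataPipe might build on this base class: the `SquareAsyncPipe` subclass and the `collect` helper below are hypothetical and not part of this commit, and `AsyncIterDataPipe` is redefined locally as a stand-in so the snippet runs without bambooflow installed.

```python
import asyncio
import collections.abc


class AsyncIterDataPipe(collections.abc.AsyncIterable):
    """Local stand-in mirroring the base class added in this commit."""

    def __repr__(self) -> str:
        return str(self.__class__.__qualname__)


class SquareAsyncPipe(AsyncIterDataPipe):
    """Hypothetical subclass that lazily yields the square of each input."""

    def __init__(self, values):
        self.values = values

    async def __aiter__(self):
        # Written as an async generator, so each element is only computed
        # when __anext__ is awaited on the returned async iterator.
        for value in self.values:
            await asyncio.sleep(0)  # placeholder for real awaitable I/O
            yield value * value


async def collect(pipe):
    # Drain the async iterable into a list.
    return [item async for item in pipe]


pipe = SquareAsyncPipe([1, 2, 3])
squares = asyncio.run(collect(pipe))
print(repr(pipe), squares)  # SquareAsyncPipe [1, 4, 9]
```

Since `collections.abc.AsyncIterable` declares `__aiter__` as abstract, the base class itself cannot be instantiated; only subclasses that implement `__aiter__` can.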
10 changes: 10 additions & 0 deletions docs/_config.yml
@@ -31,5 +31,15 @@ sphinx:
  config:
    myst_all_links_external: true
    html_show_copyright: false
    extlinks:
      py-term:
        - 'https://docs.python.org/3/glossary.html#term-%s'
        - '%s'
    intersphinx_mapping:
      python:
        - 'https://docs.python.org/3/'
        - null
  extra_extensions:
    - 'sphinx.ext.autodoc'
    - 'sphinx.ext.extlinks'
    - 'sphinx.ext.intersphinx'
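
For readers unfamiliar with `extlinks`, the snippet below sketches how the `py-term` entry above turns a role target into a glossary URL. It is a simplified illustration using plain `%` substitution, not Sphinx's actual implementation, and the `resolve` helper is a hypothetical name.

```python
# Mirrors the extlinks entry in docs/_config.yml: (URL template, caption template).
extlinks = {
    "py-term": ("https://docs.python.org/3/glossary.html#term-%s", "%s"),
}


def resolve(role: str, target: str) -> str:
    """Hypothetical helper: substitute the role target into the URL template."""
    url_template, _caption = extlinks[role]
    return url_template % target


# :py-term:`asynchronous iterable <asynchronous-iterable>` uses this target:
url = resolve("py-term", "asynchronous-iterable")
print(url)  # https://docs.python.org/3/glossary.html#term-asynchronous-iterable
```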
6 changes: 5 additions & 1 deletion docs/api.md
@@ -1,6 +1,10 @@
# API Reference

## Asynchronous-style DataPipes

```{eval-rst}
.. automodule:: bambooflow
.. automodule:: bambooflow.datapipes
    :members:
.. autoclass:: bambooflow.datapipes.AsyncIterDataPipe
    :show-inheritance:
```
