RxRay

RxPY operator to distribute a computation with Ray.

Get Started

The distribute operator can be used directly in an existing pipeline to parallelize computations:

import ray
import rx
from rx import operators as ops
import rxray

data = range(200)
ray.init()

rx.from_(data).pipe(
    rxray.distribute(
        # factory building the pipeline that each Ray actor runs
        lambda: rx.pipe(ops.map(lambda i: i*2)),
    ),
).subscribe()
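Conceptually, distribute fans items out to a pool of workers, each running the pipeline built by the factory, and merges the results back into a single stream. A minimal local sketch of that idea using only the standard library (this is an analogy, not rxray's implementation):

```python
from concurrent.futures import ThreadPoolExecutor

def double(i):
    # stands in for the rx.pipe(ops.map(...)) pipeline each worker runs
    return i * 2

data = range(200)

# each item goes to whichever worker is free; results come back in order
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(double, data))
```

The real operator does this across Ray actors, so the workers can live on other machines in the cluster.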

When the distributed computation is stateful, items can be pinned to an actor with a key-based selector:

import random

import ray
import rx
from rx import operators as ops
import rxray

data = [(i, j) for i in range(17) for j in range(100)]
random.shuffle(data)
ray.init()

rx.from_(data).pipe(
    rxray.distribute(
        # each actor averages the values of the groups routed to it
        lambda: rx.pipe(
            ops.group_by(lambda i: i[0]),
            ops.flat_map(lambda g: g.pipe(
                ops.map(lambda i: i[1]),
                ops.average(),
                ops.map(lambda i: (g.key, i)),
            ))
        ),
        # pin all items sharing a key to the same actor
        actor_selector=rxray.partition_by_key(lambda i: i[0]),
    ),
).subscribe()
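The routing behind key-based pinning can be sketched as a deterministic mapping from an item's key to an actor index, so every item with the same key lands on the same stateful actor. A hypothetical stand-in for rxray.partition_by_key (the names and hashing scheme here are illustrative assumptions, not rxray's actual code):

```python
def partition_by_key_sketch(key_fn, n_actors):
    """Illustrative selector: route an item to an actor index derived
    from its key, so items sharing a key always hit the same actor."""
    def select(item):
        return hash(key_fn(item)) % n_actors
    return select

# same selector shape as the example above: key is the first tuple field
select = partition_by_key_sketch(lambda i: i[0], n_actors=4)

# items with equal keys are always routed to the same actor index
assert select((7, "a")) == select((7, "b"))
```

Deterministic routing is what makes stateful operators such as group_by safe to distribute: an actor only ever sees complete groups.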

Installation

RxRay is available on PyPI and can be installed with pip:

python3 -m pip install rxray