feat: to/from PyTorch JaggedTensor #3246

maxymnaumchyk · 2024-09-17T21:22:34Z

No description provided.

codecov · 2024-09-17T21:34:12Z

Codecov Report

Attention: Patch coverage is 23.25581% with 66 lines in your changes missing coverage. Please review.

Project coverage is 81.85%. Comparing base (b749e49) to head (c68321c).
Report is 162 commits behind head on main.

Files with missing lines	Patch %	Lines
src/awkward/operations/ak_to_jaggedtensor.py	20.00%	40 Missing ⚠️
src/awkward/operations/ak_from_jaggedtensor.py	23.52%	26 Missing ⚠️

Additional details and impacted files

Files with missing lines	Coverage Δ
src/awkward/operations/__init__.py	`100.00% <100.00%> (ø)`
src/awkward/operations/ak_from_jaggedtensor.py	`23.52% <23.52%> (ø)`
src/awkward/operations/ak_to_jaggedtensor.py	`20.00% <20.00%> (ø)`

... and 105 files with indirect coverage changes

maxymnaumchyk · 2024-09-18T17:21:04Z

@jpivarski should I also leave out the "keep_regular" parameter, since it does the same as ak.from_regular()? It's kind of the same situation we talked about today (about the "padded" parameter).

jpivarski · 2024-09-18T17:52:42Z

You're right: it is. The situation is that we should be providing an interface to the user that's like Lego bricks that they can put together however they like. If there's an alternative way of doing something, it shouldn't be a feature of the new functions, because then we'd have to explain why someone would use one or the other.

I agree that the padded and keep_regular arguments are more convenient if that's exactly what someone wants; passing padded=True or keep_regular=True is easier than the multi-step process it would be with the other method. However, these shortcuts would only work in the cases that these functions apply to, which are limited by the capabilities of TensorFlow and PyTorch's ragged array implementations.

maxymnaumchyk · 2024-09-18T18:13:14Z

Thanks for such a detailed answer!

jpivarski

This is great work! But it might be solving the wrong problem. As we discussed at the meeting, this fbgemm-gpu-cpu module is not something ML users seem to be familiar with, so adding to/from functions wouldn't help them. It doesn't seem to be the interface that they use to implement DeepSets and GNNs, the ML models that might actually involve ragged data. So we're going to follow up with ML experts to find out what interfaces they really do need.

Meanwhile, as discussed at the meeting, you'll be adding

- ak.to_torch using Content.to_backend_array
- ak.from_torch

for rectilinear arrays only (allow_record=False and allow_missing=False, following ak_to_numpy.py).

Another two functions,

would be needed to pre-process an idiomatic Awkward Array into the kind of interface that PyTorch-Geometric needs, which isn't a single RaggedTensor object as in TensorFlow; it's a few separate, completely rectilinear arrays. "The way to do it" needs to be explained in a User Guide that puts all of these functions together, rather than in a single function that tries to do everything in one call.

Although I'm setting this to "request changes," we'll likely be closing this PR and following up with new ones.

maxymnaumchyk and others added 6 commits September 17, 2024 23:51

add new to_jaggedtensor function

9e28cb1

update __init__.py file

14b09d2

add imports

70a2630

style: pre-commit fixes

9feb758

add style changes

4674108

Merge remote-tracking branch 'origin/maxymnaumchyk/add-to-from-jaggedtensor-functions' into maxymnaumchyk/add-to-from-jaggedtensor-functions # Conflicts: # src/awkward/operations/ak_to_jaggedtensor.py

c0da418

maxymnaumchyk added 7 commits September 18, 2024 00:44

add style changes

000e10a

check if fbgemm is installed

b820b51

correct checking if fbgemm is installed

ade482e

add cuda support

fc97f37

add comments

0f48a9a

style changes

5e68ab5

fix tests

2cdc0db

maxymnaumchyk and others added 8 commits September 19, 2024 16:59

leave out the "padded" and "keep_regular" arguments

5d2eaad

add error for non existing backend

81fd78a

*better* handle backend stated by user

4244642

separately handle cupy arrays

86d5922

delete "backend" argument since the same can be done with ak.to_backend()

b316181

add new "to_jaggedtensor" function

d1d723b

style fixes

940b466

Merge branch 'main' into maxymnaumchyk/add-to-from-jaggedtensor-functions

c68321c

ianna marked this pull request as ready for review September 25, 2024 14:03

ianna requested a review from jpivarski September 25, 2024 14:03

jpivarski requested changes Sep 25, 2024

maxymnaumchyk mentioned this pull request Oct 3, 2024

Add interoperability between Awkward Array and ML libraries #3267

Open

7 tasks
