
Tensor deepcopy does not copy raw_tensor #1541

Open

albertz opened this issue Jun 17, 2024 · 1 comment

Comments

albertz (Member) commented Jun 17, 2024

deepcopy on Tensor will not copy the raw_tensor:

def __getstate__(self):
    d = {k: getattr(self, k) for k in self.__slots__}
    d["_raw_tensor"] = None  # do not store the TF tensors
    return d

def __setstate__(self, state):
    # restore all attributes, including _raw_tensor (which is None in the state)
    for k, v in state.items():
        setattr(self, k, v)
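
Note that copy.deepcopy goes through the same __getstate__/__setstate__ pickle protocol, so the raw tensor is dropped on deepcopy just as it would be on pickling. A minimal self-contained demonstration of that mechanism (generic Python, not RETURNN code):

import copy

class _Holder:
    __slots__ = ("payload",)

    def __getstate__(self):
        return {"payload": None}  # drop the payload, like Tensor drops _raw_tensor

    def __setstate__(self, state):
        for k, v in state.items():
            setattr(self, k, v)

h = _Holder()
h.payload = [1, 2, 3]
assert copy.deepcopy(h).payload is None  # deepcopy uses the pickle protocol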

Why is that?

First of all, the reason should be documented, because I don't exactly remember anymore why we do it this way.

But further, this causes problems in RF modules that keep auxiliary constant tensors around, e.g. rf.PiecewiseLinear. If there is a good reason to keep the current behavior, we need to think about how to solve it in rf.PiecewiseLinear. One solution is to just use rf.Parameter (with auxiliary=True) instead (see the sketch below). But a developer might run into this problem again, and the error was very confusing: in __init__, I was double-checking that raw_tensor was set, but then in __call__, raw_tensor was not set anymore, which caused raw_backend to be None, so many RF functions fail with AttributeError: 'NoneType' object has no attribute ....
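
A rough sketch of that rf.Parameter workaround (the exact Parameter/initial API details here are assumptions):

class _MyPiecewise(rf.Module):
    def __init__(self):
        super().__init__()
        dim = Dim(3)
        # auxiliary=True marks a non-trainable helper parameter; the intent is
        # that its init (values or scheme) survives a deepcopy even though the
        # raw values are dropped (see the follow-up comment below).
        self.points = rf.Parameter([dim], auxiliary=True)
        self.points.initial = rf.convert_to_tensor(
            np.array([1.0, 2.0, 3.0], dtype=np.float32), dims=[dim]
        )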

Maybe it makes sense to make the behavior controllable (e.g. via a context scope), to switch between copying raw_tensor and not copying it. We first should understand the reason why we do not copy it.
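
Such a scope could look roughly like this (a sketch only; the context variable and context manager here are hypothetical, not existing API):

import contextlib
import contextvars

# hypothetical module-level switch; the default matches the current behavior
_copy_raw_tensor = contextvars.ContextVar("copy_raw_tensor", default=False)

@contextlib.contextmanager
def deepcopy_with_raw_tensor():
    token = _copy_raw_tensor.set(True)
    try:
        yield
    finally:
        _copy_raw_tensor.reset(token)

# Tensor.__getstate__ would then check the switch:
def __getstate__(self):
    d = {k: getattr(self, k) for k in self.__slots__}
    if not _copy_raw_tensor.get():
        d["_raw_tensor"] = None  # do not store the TF tensors
    return d

Usage would then be something like: with deepcopy_with_raw_tensor(): mod_ = copy.deepcopy(mod). (Note that this would also affect pickling, since __getstate__ is shared.)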

(Maybe we can just make a dummy draft PR where we remove this line of code, i.e. where we do copy it, and see which tests fail...)

Test case (e.g. for test_torch_frontend.py):

def test_module_deepcopy_tensor():
    import copy

    import numpy as np
    import torch

    import returnn.frontend as rf
    from returnn.tensor import Dim

    # assumes the torch backend is selected, as in test_torch_frontend.py

    class _MyModule(rf.Module):
        def __init__(self):
            super().__init__()
            self.tensor = rf.convert_to_tensor(np.array([1.0, 2.0, 3.0], dtype=np.float32), dims=[Dim(3)])

    mod = _MyModule()
    assert isinstance(mod.tensor.raw_tensor, torch.Tensor)
    mod_ = copy.deepcopy(mod)
    # currently fails: deepcopy drops raw_tensor (this issue)
    assert isinstance(mod_.tensor.raw_tensor, torch.Tensor)
albertz added a commit that referenced this issue Jun 17, 2024
albertz (Member, Author) commented Jul 15, 2024

Why is that?

I assume when I wrote this, I thought mostly about rf.Parameter, where you never want to copy the content but instead the ParamInit (which could be raw values, in which case they would be copied, but usually it's some random init scheme, so only the scheme is copied, not the values).

Also see for example this: pytorch/pytorch#86274
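
A rough illustration of that intended semantics (a sketch only; whether Parameter currently behaves exactly like this, and the rf.init.Glorot spelling, are assumptions):

import copy

import returnn.frontend as rf
from returnn.tensor import Dim

param = rf.Parameter([Dim(512)])
param.initial = rf.init.Glorot()  # random init scheme (assumed API)

param_copy = copy.deepcopy(param)
# the init scheme should survive the copy, while raw values do not:
assert param_copy.initial is not None
assert param_copy.raw_tensor is None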
