Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Deallocation Pass #1167

Open
wants to merge 48 commits into
base: main
Choose a base branch
from
Open

Update Deallocation Pass #1167

wants to merge 48 commits into from

Conversation

tzunghanjuang
Copy link
Collaborator

@tzunghanjuang tzunghanjuang commented Oct 1, 2024

Context:
Replace bufferization-deallocation with bufferization-deallocation-pipeline, which contains ownership-based-deallocation from MLIR. The pass requires each catalyst operation have a getEffect method.

This PR is open for review to reproduce the enzyme double free problem.

Related GitHub Issues:
Follow up of #1027

LLVM discord

@tzunghanjuang
Copy link
Collaborator Author

tzunghanjuang commented Oct 3, 2024

This code snippet breaks enzyme.

    %base_buffer, %offset = memref.extract_strided_metadata %1 : memref<f64> -> memref<f64>, index
    %alloc_5 = memref.alloc() : memref<6xindex>
    %intptr = memref.extract_aligned_pointer_as_index %base_buffer : memref<f64> -> index
    memref.store %intptr, %alloc_5[%c0] : memref<6xindex>

Error:

val:   %58 = ptrtoint ptr %13 to i64 origin=  %58 = ptrtoint ptr %13 to i64

Might be relevant? llvm/llvm-project#64334

@dime10
Copy link
Contributor

dime10 commented Oct 3, 2024

This code snippet breaks enzyme.

    %base_buffer, %offset = memref.extract_strided_metadata %1 : memref<f64> -> memref<f64>, index
    %alloc_5 = memref.alloc() : memref<6xindex>
    %intptr = memref.extract_aligned_pointer_as_index %base_buffer : memref<f64> -> index
    memref.store %intptr, %alloc_5[%c0] : memref<6xindex>

Error:

val:   %58 = ptrtoint ptr %13 to i64 origin=  %58 = ptrtoint ptr %13 to i64

I wonder if @rmoyard has seen something similar?

@rmoyard
Copy link
Contributor

rmoyard commented Oct 4, 2024

@tzunghanjuang Can you share how to reproduce the Enzyme bug?

@tzunghanjuang
Copy link
Collaborator Author

tzunghanjuang commented Oct 4, 2024

@rmoyard . This test will trigger the error.

import jax.numpy as jnp
import pennylane as qml

from catalyst import jacobian, qjit

def test_one_to_many(backend, diff_method):
    """Test a tall Jacobian (one input to many outputs)"""

    @qml.qnode(qml.device(backend, wires=1), diff_method=diff_method)
    def workflow(x):
        qml.RX(x, wires=0)
        return qml.expval(qml.PauliZ(wires=0))

    def postprocess(x):
        w = workflow(x)
        return jnp.array([jnp.cos(w), w, w * 2])

    @qjit(keep_intermediate=True)
    def jac_postprocess(x):
        return jacobian(postprocess, method="auto")(x)

    return jac_postprocess(0.5)

test_one_to_many("lightning.qubit", "parameter-shift")

@rmoyard
Copy link
Contributor

rmoyard commented Oct 4, 2024

@tzunghanjuang Thanks I will take a look at it later today

@tzunghanjuang
Copy link
Collaborator Author

tzunghanjuang commented Oct 4, 2024

I think this is related to the enzyme bug (_mlir_memref_to_llvm_alloc). I will try hacking the mlir side to avoid generating this kind of code for now.

Edit: The error is not directly linked to this part.

  %17 = call ptr @_mlir_memref_to_llvm_alloc(i64 72)
  %18 = ptrtoint ptr %17 to i64
  %19 = add i64 %18, 63
  %20 = and i64 %19, -64
  %21 = inttoptr i64 %20 to ptr 
Illegal updateAnalysis prev:{[-1]:Integer} new: {[-1]:Pointer}
val:   %20 = and i64 %19, -64 origin=  %20 = and i64 %19, -6

@tzunghanjuang
Copy link
Collaborator Author

Small code snippet that triggers the Enzyme error.

define internal void @loss.cloned(ptr %0, ptr %1, i64 %2, i64 %3, i64 %4, ptr nocapture readnone %5, ptr nocapture writeonly %6, i64 %7) {
  %9 = tail call ptr @_mlir_memref_to_llvm_alloc(i64 0)
  %10 = call ptr @_mlir_memref_to_llvm_alloc(i64 24)
  %11 = ptrtoint ptr %9 to i64
  store i64 %11, ptr %10, align 4, !tbaa !4
  ret void
}

Error:

error: <unknown>:0:0: in function loss.cloned void (ptr, ptr, i64, i64, i64, ptr, ptr, i64): Enzyme: <analysis>
ptr %0: {[-1]:Pointer}, intvals: {}
ptr %1: {[-1]:Pointer}, intvals: {}
i64 %2: {[-1]:Integer}, intvals: {}
i64 %3: {[-1]:Integer}, intvals: {}
i64 %4: {[-1]:Integer}, intvals: {}
ptr %5: {[-1]:Pointer}, intvals: {}
ptr %6: {[-1]:Pointer}, intvals: {}
i64 %7: {[-1]:Integer}, intvals: {}
  %10 = call ptr @_mlir_memref_to_llvm_alloc(i64 24): {[-1]:Pointer, [-1,0]:Integer, [-1,1]:Integer, [-1,2]:Integer, [-1,3]:Integer, [-1,4]:Integer, [-1,5]:Integer, [-1,6]:Integer, [-1,7]:Integer}, intvals: {}
  %11 = ptrtoint ptr %9 to i64: {[-1]:Integer}, intvals: {}
  %9 = tail call ptr @_mlir_memref_to_llvm_alloc(i64 0): {[-1]:Pointer}, intvals: {}
</analysis>
Illegal updateAnalysis prev:{[-1]:Integer} new: {[-1]:Pointer}
val:   %11 = ptrtoint ptr %9 to i64 origin=  %11 = ptrtoint ptr %9 to i64

@tzunghanjuang
Copy link
Collaborator Author

Even after sorting out tbaa issue, we still have free(): double free detected in tcache 2 related to Enzyme .
The produced code is sth like the following. It is clear that %53 is freed twice.

diffepostprocess.cloned.exit:                     ; preds = %._crit_edge.i.1.i, %82
  %85 = icmp ne i64 %64, %79
  %86 = icmp ne i64 %62, %79
  %87 = and i1 %85, %86
  %88 = icmp ne i64 %68, %79
  %89 = and i1 %88, %87
  %90 = and i1 %80, %89
  %91 = getelementptr inbounds i8, ptr %60, i64 2
  store i1 %90, ptr %91, align 1, !tbaa !53
  tail call void @_mlir_memref_to_llvm_free(ptr nonnull %53) #11
  tail call void @_mlir_memref_to_llvm_free(ptr nonnull %61) #11
  %92 = load double, ptr %16, align 8, !tbaa !2, !alias.scope !75, !noalias !78
  store double 0.000000e+00, ptr %16, align 8, !tbaa !2, !alias.scope !75, !noalias !78
  %93 = load double, ptr %"'ipc13.i", align 64, !tbaa !2, !alias.scope !49, !noalias !46
  %94 = fadd fast double %93, %92
  store double %94, ptr %"'ipc13.i", align 64, !tbaa !2, !alias.scope !49, !noalias !46
  tail call void @_mlir_memref_to_llvm_free(ptr nonnull %60)
  tail call void @_mlir_memref_to_llvm_free(ptr nonnull %54)
  tail call void @_mlir_memref_to_llvm_free(ptr nonnull %53)
  tail call void @_mlir_memref_to_llvm_free(ptr nonnull %52)

@tzunghanjuang tzunghanjuang marked this pull request as ready for review October 10, 2024 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants