Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forward declarations are incomplete entities and we should skip them. #6

Merged

Conversation

vgvassilev
Copy link

The work on the InterOp library has shown that we try to build proxy classes for entities which are only forward declared which leads to crashes. This patch tries to protect against these cases by saving some work and exiting early.

@wlav, can you take a look? This change is perhaps useful upstream.

@vgvassilev vgvassilev requested a review from wlav March 11, 2023 21:25
@wlav
Copy link

wlav commented Mar 11, 2023

No: opaque handles should be supported, eg. for GUI libraries and GPUs.

@vgvassilev
Copy link
Author

No: opaque handles should be supported, eg. for GUI libraries and GPUs.

Then in that case we enter in a grey area. What should an interface such as IsAbstract return for an incomplete class?

@wlav
Copy link

wlav commented Mar 11, 2023

I don't have a good answer for that (other then that all property queries could return yes/no/maybe), but it's not relevant in the cppyy implementation as the IsAbstract query is only for whether or not to have a valid constructor, so you can ask IsComplete first, then never ask IsAbstract.

@vgvassilev
Copy link
Author

I don't have a good answer for that (other then that all property queries could return yes/no/maybe), but it's not relevant in the cppyy implementation as the IsAbstract query is only for whether or not to have a valid constructor, so you can ask IsComplete first, then never ask IsAbstract.

In that case could you help narrow which operations in that code need to be wrapped in IsComplete and which should be left out to make the opaque entities still work?

@wlav
Copy link

wlav commented Mar 11, 2023

I think to get the right semantics, this ordering could be changed:

    if (!hasConstructor) {
        PyCallable* defctor = nullptr;
        if (isAbstract)
            defctor = new CPPAbstractClassConstructor(scope, (Cppyy::TCppMethod_t)0);
        else if (isNamespace)
            defctor = new CPPNamespaceConstructor(scope, (Cppyy::TCppMethod_t)0);
        else if (!Cppyy::IsComplete(Cppyy::GetScopedFinalName(scope))) {
            ((CPPScope*)pyclass)->fFlags |= CPPScope::kIsInComplete;
            defctor = new CPPIncompleteClassConstructor(scope, (Cppyy::TCppMethod_t)0);
        } else
            defctor = new CPPAllPrivateClassConstructor(scope, (Cppyy::TCppMethod_t)0);
        cache["__init__"].push_back(defctor);
    }

For the other uses that I see, the two mean the same thing.

Perhaps an alternate API could be IsConstructable() returning reasons as to why or why not said scope is constructable.

@vgvassilev vgvassilev force-pushed the skip-incomplete-entities branch from 50812f4 to 4e3ad10 Compare March 12, 2023 06:18
@vgvassilev
Copy link
Author

I think to get the right semantics, this ordering could be changed:

    if (!hasConstructor) {
        PyCallable* defctor = nullptr;
        if (isAbstract)
            defctor = new CPPAbstractClassConstructor(scope, (Cppyy::TCppMethod_t)0);
        else if (isNamespace)
            defctor = new CPPNamespaceConstructor(scope, (Cppyy::TCppMethod_t)0);
        else if (!Cppyy::IsComplete(Cppyy::GetScopedFinalName(scope))) {
            ((CPPScope*)pyclass)->fFlags |= CPPScope::kIsInComplete;
            defctor = new CPPIncompleteClassConstructor(scope, (Cppyy::TCppMethod_t)0);
        } else
            defctor = new CPPAllPrivateClassConstructor(scope, (Cppyy::TCppMethod_t)0);
        cache["__init__"].push_back(defctor);
    }

For the other uses that I see, the two mean the same thing.

Perhaps an alternate API could be IsConstructable() returning reasons as to why or why not said scope is constructable.

I am not sure, but we might get away with an interface like: GetAllConstructors and insert the hooks. Then the rest should go on demand, we shouldn't require GetMethods calls at that stage...

I have added some changes which partially capture our discussion here. Can you take a look?

@vgvassilev vgvassilev force-pushed the skip-incomplete-entities branch 3 times, most recently from 7bf461a to 0cd1046 Compare March 12, 2023 12:01
@wlav
Copy link

wlav commented Mar 12, 2023

we might get away with an interface like: GetAllConstructors and insert the hooks.

no, as said, we also need the reasons for why there may not be constructors to generate clear error messages. E.g. an abstract class can be used in inheritance, but the derived class may still be abstract if some pure virtual method was missed. An incomplete class can not be used like that at all.

@vgvassilev
Copy link
Author

we might get away with an interface like: GetAllConstructors and insert the hooks.

no, as said, we also need the reasons for why there may not be constructors to generate clear error messages. E.g. an abstract class can be used in inheritance, but the derived class may still be abstract if some pure virtual method was missed. An incomplete class can not be used like that at all.

Yes, but that would be reflection information on the whole class rather than collecting all members. I am suggesting that GetAllConstructors would only collect a set of constructors and the questions on the scope can be asked. In the case of incomplete type we won't even need to collect constructors, etc.

@vgvassilev vgvassilev force-pushed the skip-incomplete-entities branch 2 times, most recently from 4620d0c to 4bf5fa1 Compare March 13, 2023 07:48
@vgvassilev
Copy link
Author

@wlav, could you take a look if I am doing something crazy here. I'd like to merge this PR with the hope it will make it upstream.

@wlav
Copy link

wlav commented Mar 13, 2023

There's a whole bunch of things mixed together in here, which I'd rather see disentangled (and I'm about to cut a release, so it'll have to wait a week anyway).

I'm not touching CI at this point, as I'm getting a contribution (starting with cppyy-backend) to get this done and the CI will be a build for the various pythons on various platforms, followed by an upload to PyPI, so something very different from what you have here.

I'm not understanding the changes from CppType_t to CppScope_t as these cases only involve types, not scopes.

I'm also not understanding the other renaming and API change. That definitely needs to wait for after the release, as it introduces incompatibilities that aren't easy to handle with pip.

@vgvassilev
Copy link
Author

There's a whole bunch of things mixed together in here, which I'd rather see disentangled (and I'm about to cut a release, so it'll have to wait a week anyway).

Indeed, I can open a few other PRs, however the commits seem pretty disentangled.

I'm not touching CI at this point, as I'm getting a contribution (starting with cppyy-backend) to get this done and the CI will be a build for the various pythons on various platforms, followed by an upload to PyPI, so something very different from what you have here.

I agree, let's see what you will get out of that.

I'm not understanding the changes from CppType_t to CppScope_t as these cases only involve types, not scopes.

Throughout the codebase we've seen that CppType_t and CppScope_t are sometimes used interchargeable. That's probably due to passing TClass and TClassRef which I think derive from each other. There if you expect a CppType_t and CppScope_t the casts are not problematic. However, we pass around a clang::Decl* and a clang::Type* which end up in random crashes as they are not directly convertible. I believe these changes are useful mainstream for consistency reasons.

If we take a step back, we should probably have some strong typing for CppScope_t and CppType_t so that the compiler can check if we are passing the wrong thing.

I'm also not understanding the other renaming and API change. That definitely needs to wait for after the release, as it introduces incompatibilities that aren't easy to handle with pip.

Which change renamed an API?

@wlav
Copy link

wlav commented Mar 13, 2023

CppScope_t and CppType_t are used sometimes interchangeably b/c certain questions pertain to both, and types are scopes, so that's valid. How do you resolve that if they have become completely distinct types? (I'd figure rather opaque handles that derive from each other might work.)

(Aside, TClassRef wraps a TClass, but yes, ROOT/meta does not distinguish between classes, structs, unions, and namespaces; only enums are an odd ball.)

Which change renamed an API?

Cppyy::GetMethodArgType() renamed to Cppyy::GetMethodArgTypeAsString(). Not touching such this close to a release. :}

@vgvassilev
Copy link
Author

CppScope_t and CppType_t are used sometimes interchangeably b/c certain questions pertain to both, and types are scopes, so that's valid.

Hm... ok. The changes that are here seem not to fall into this category as we had interfaces saying CppType_t but always being called with a scope, for instance.

How do you resolve that if they have become completely distinct types? (I'd figure rather opaque handles that derive from each other might work.)

I don't think we need to. There are certain operations that can be performed on Decls and other on Types. We have interface GetScopeFromType and GetTypeFromScope if we need to convert but I'd prefer to have these explicit. This is more aligned to the underlying compiler API. I'd be happy if we had a special opaque entity for both. Eg:
struct ScopeCursor {void* fPtr; }; and struct TypeCursor {void* fPtr; }; where the compiler could enforce some checks if we try to call an interface with the unexpected parameter.

(Aside, TClassRef wraps a TClass, but yes, ROOT/meta does not distinguish between classes, structs, unions, and namespaces; only enums are an odd ball.)

Which change renamed an API?

Cppyy::GetMethodArgType() renamed to Cppyy::GetMethodArgTypeAsString(). Not touching such this close to a release. :}

Ah, yes, the new design uses the opaque handles so Cppyy::GetMethodArgType() returns a CppType_t and Cppyy::GetMethodArgTypeAsString() a std::string. We needed that change because the Dispatcher tried to compile hex pointers :)

I was not suggesting to merge that in upstream before the release. It was more towards getting your opinion if that's mergeable upstream at all or what could have made it mergeable. These changes are needed for us to make progress with the tests.

@wlav
Copy link

wlav commented Mar 13, 2023

struct ScopeCursor {void* fPtr; }; and struct TypeCursor {void* fPtr; }; where the compiler could enforce some checks if we try to call an interface with the unexpected parameter.

I like that. :) Then either inheritance or custom cast operators to go between the two and in particular, we should only ever need to go from type to scope, with scope to type (likely?) to be an error.

@vgvassilev
Copy link
Author

struct ScopeCursor {void* fPtr; }; and struct TypeCursor {void* fPtr; }; where the compiler could enforce some checks if we try to call an interface with the unexpected parameter.

I like that. :) Then either inheritance or custom cast operators to go between the two and in particular, we should only ever need to go from type to scope, with scope to type (likely?) to be an error.

However, that's a major undertaking - maybe good for gsoc...

@wlav
Copy link

wlav commented Mar 13, 2023

However, that's a major undertaking - maybe good for gsoc...

I don't think so. :) And there seems to be a real benefit there to prevent crashes.

struct ScopeCursor {
    void* fDecl;
};

typedef ScopeCursor* CppScope_t;

struct TypeCursor {
    void* fType;
    operator CppScope_t() { return GetScopeFromType(*this); }
};

typedef TypeCursor* CppType_t;

No? Seems simple enough and seems less work to get it right from the start then try to recover later from changes such as the one proposed where a scope is now accepted that should ever only be a type?

@vgvassilev
Copy link
Author

However, that's a major undertaking - maybe good for gsoc...

I don't think so. :) And there seems to be a real benefit there to prevent crashes.

struct ScopeCursor {
    void* fDecl;
};

typedef ScopeCursor* CppScope_t;

struct TypeCursor {
    void* fType;
    operator CppScope_t() { return GetScopeFromType(*this); }
};

typedef TypeCursor* CppType_t;

No? Seems simple enough and seems less work to get it right from the start then try to recover later from changes such as the one proposed where a scope is now accepted that should ever only be a type?

Yes, that might work, @sudo-panda was looking into updating these and that might be a good way forward. However, I wanted to look into better ways to strong typing as used here:
https://github.com/llvm/llvm-project/blob/4875e0670926b0a40ce0dd2c3d490b1866889f54/clang/include/clang-c/Index.h#L2243-L2247

When we have a forward declaration this means that the entity is opaque to the
compiler. Asking further details generally should work, however, there is a lot
of gray zones. For example, if we ask if a forward declared entity is abstract,
there is no yes/no answer.

The C++ type system has the notion of complete/incomplete types where we can
do our best with the little information we have. This patch introduces this
into the way we generate proxy calls for the C++ enities. This needs quite a bit
more refactoring.
@vgvassilev vgvassilev force-pushed the skip-incomplete-entities branch from 3f58da9 to 7b46ae9 Compare March 13, 2023 20:41
@vgvassilev
Copy link
Author

I'd like to merge this PR. Let's move this discussion here: #8

@vgvassilev vgvassilev merged commit 1ac78a8 into compiler-research:master Mar 14, 2023
@vgvassilev vgvassilev deleted the skip-incomplete-entities branch March 14, 2023 05:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants