Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support type cast detection #1081

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ratoaq2
Copy link

@ratoaq2 ratoaq2 commented Mar 12, 2023

  • Detect checkcast asm instructions as TypeCast (similar to InstanceofCheck)
  • Implicit checkcast instructions generated by compiler due to method invocations are ignored

I decided to implement this feature myself since I have a real use case which I need this and the other PR that started implementing it has no activity since almost 1 year.

Just a brief info about the use case that I need:

Detect certain type casts that should be forbidden in our code because it could lead to ClassCastException due to use of hibernate proxies and polymorphic entities

Some notes to add:

  1. This feature detect type casts statements in java. It doesn't consider calls to the cast method of a given class as a cast statement. So FooBar foo = FooBar.class.cast(obj) is not considered as a type cast. It's just a call to a method which does the cast.
  2. Some checkcast instructions are ignored deliberately. In java bytecode we might encounter checkcast after a call to a method, as the following example in com.tngtech.archunit.example.onionarchitecture.adapter.cli.AdministrationCLI:
    private static void handle(String[] args, AdministrationPort port) {
        // violates the pairwise independence of adapters
        ProductRepository repository = port.getInstanceOf(ProductRepository.class);
        long count = repository.getTotalCount();
        // parse arguments and re-configure application according to count through port
    }
  private static void handle(java.lang.String[], com.tngtech.archunit.example.onionarchitecture.application.AdministrationPort);
    Code:
       0: aload_1
       1: ldc           #19                 // class com/tngtech/archunit/example/onionarchitecture/adapter/persistence/ProductRepository
       3: invokeinterface #21,  2           // InterfaceMethod com/tngtech/archunit/example/onionarchitecture/application/AdministrationPort.getInstanceOf:(Ljava/lang/Class;)Ljava/lang/Object;
       8: checkcast     #19                 // class com/tngtech/archunit/example/onionarchitecture/adapter/persistence/ProductRepository
      11: astore_2
      12: aload_2
      13: invokeinterface #27,  1           // InterfaceMethod com/tngtech/archunit/example/onionarchitecture/adapter/persistence/ProductRepository.getTotalCount:()J
      18: lstore_3
      19: return

Java compiler adds a checkcast after the method call port.getInstanceOf(ProductRepository.class). While implementing this feature, I was detecting this as an implicit cast, but that proved of no use and brought only more confusion.

Resolves #710

@ratoaq2 ratoaq2 force-pushed the feature/type-cast-main branch 2 times, most recently from 2730507 to bc7c038 Compare March 12, 2023 18:05
@ratoaq2
Copy link
Author

ratoaq2 commented Mar 14, 2023

I just changed my mind about not considering FooBar.class.cast(obj) as a type cast.

Using existing archunit API, it's not possible to detect which class cast FooBar.class.cast(obj) refers to, since the MethodCallTarget refers only to the java.lang.Class as you can see its toString() -> target{java.lang.Class.cast(java.lang.Object)}

So, I believe we should consider FooBar.class.cast(obj) as type cast statements as if they were (FooBar) obj

Detect checkcast asm instructions as TypeCast (similar to InstanceofCheck)
Implicit checkcast instructions generated by compiler due to method invocations are ignored

Resolves TNG#710

Signed-off-by: Allan Jones <[email protected]>
@codecholeric
Copy link
Collaborator

Thanks a lot for your contribution and sorry for the long silence!! I was super overloaded 🙈 It looks widely very nice, just that it also seems to be much more complex than I anticipated with these "implicit" checkcasts 🙈 I'm not completely sure what is the right way to move forward here. At the moment, I'm not sure this "implict" vs "explicit" logic works reliable. For example I added some crazy code

@Test
public void example() {
    class Foo {
        String a;
        void call() {
            long l = 42L;
            int x = (int) l;
            a = (String) new Object();
            System.out.println(x);
            System.out.println((ArrayList) (List) ImmutableList.of(""));
            String foo = (String) new Object();
        }
    }

    System.out.println(new ClassFileImporter().importClass(Foo.class).getTypeCasts());
}

And with the current logic I see zero TypeCasts. That doesn't seem like we want (even though the code is a little crazy), right?

I'm also not sure how (int) l should be handled. I think it will never be a checkcast, but there are conversions like l2i that could be used. But I also think it's fair to say this is no type cast, even thought the syntax looks the same, so we ignore it here completely (as it is now). Just wanted to discuss it real quick.

I see 2 options to fix the problem, a) we invest more time to find what criteria we really need to distinguish explicit from implicit checkcast or b) we don't support TypeCast, but Checkcast in the domain (with respective Javadoc) and just record all the checkcast statements without any complex extra logic. Because for your use case for example the latter would likely suffice, because it would be a smell no matter if there is a checkcast for an explicit or implicit reason, no? (because implict would still mean that you used incompatible types somewhere, it might even better to find this also?)

*/
package com.tngtech.archunit.core.importer;

class TypeCastRecorder {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name was a little confusing to me on the first read, because technically this doesn't "record type casts", no? It just stores some meta-info that's queried later on when creating a type case. But it's not recording the actual checkcast statement, etc. I think it should either get a different name or also handle what to do when we encounter the checkcast, i.e. create the RawTypeCast then when we encounter the respective place 🤷‍♂️

@ratoaq2
Copy link
Author

ratoaq2 commented Aug 27, 2023

And with the current logic I see zero TypeCasts. That doesn't seem like we want (even though the code is a little crazy), right?

I'm revisiting the code now and indeed there's a basic bug. Most likely TypeCastRecorder.java (please, if possible, suggest a better name 🙏) should only rely on the opcode being invokeinterface:

    void registerMethodInstruction(int opcode) {
        implicit = Opcodes.INVOKEINTERFACE == opcode;
    }

That way it will correctly detect for the given example. Of course I'll need to check if nothing else gets broken or the tests which might need to be updated.

I'm also not sure how (int) l should be handled. I think it will never be a checkcast, but there are conversions like l2i that could be used. But I also think it's fair to say this is no type cast, even thought the syntax looks the same, so we ignore it here completely (as it is now). Just wanted to discuss it real quick.

indeed the (int) doesn't generate a checkcast, but l2i. Probably we could handle it inside the scope of this feature if it's decided to keep TypeCast as a name, instead of Checkcast.

Also in the code snippet, the (List) cast can't be detected because it's already a List and the compiler won't generate any instruction for that. For the (ArrayList) the compiler do generate a checkcast, so we're fine.

I see 2 options to fix the problem, a) we invest more time to find what criteria we really need to distinguish explicit from implicit checkcast or b) we don't support TypeCast, but Checkcast in the domain (with respective Javadoc) and just record all the checkcast statements without any complex extra logic. Because for your use case for example the latter would likely suffice, because it would be a smell no matter if there is a checkcast for an explicit or implicit reason, no? (because implict would still mean that you used incompatible types somewhere, it might even better to find this also?)

For me, option a seems more reasonable, unless it turns out to not be feasible or reliable. My main point is the initial scenario that I tried to cover: I need to be able to create an ArchUnit rule to not allow casting of certain classes, and if ArchUnit also returns me implicit detections, my rule will most likely always fails. We could still detect implicit checkcasts generated by the compiler, but some more info would need to be present so I can correctly filter them out.

I believe this implicit check using only the opcode == invokeinterface has a good chance to be the only thing needed for knowing if this was a implicit checkcast generated by the compiler.
When I get some time, let me try to change the code with that logic and check the tests.

PS: Just noticed the java 8 tests failed. I could fix that as well in a next push

@ratoaq2
Copy link
Author

ratoaq2 commented Aug 27, 2023

Yes, it's not that simple. Maybe option b is the only way

@codecholeric
Copy link
Collaborator

Maybe option b) is also alright, no? 🤔 Only bad thing about is that users won't know what "check cast" is supposed to be, so we definitely would have to explain it well in Javadoc. Regarding your use case, I would think that any CheckCast actually is a smell in that Hibernate case, no? 🤔 Or what would be the scenario where a check cast is fine for that case, as opposed to an "explicit" one? 🤔

@ratoaq2
Copy link
Author

ratoaq2 commented Oct 22, 2023

Probably option B is the only option.
Unfortunately that won't help my case, because I need to avoid that developers do a explicit cast of "certain" entities (which should use Hibernate.unproxy or Hibernate.getClass(..))

If the developer writes the following "compliant" code:

List<TargetEntity> results = entities.stream().map(Hibernate::unproxy).toList();

The compiler might still generate a checkcast and here there's nothing wrong with the code. It will be a false positive

@codecholeric
Copy link
Collaborator

You might be able to test if CheckCast goes together with one of the allowed Hibernate methods though, no? 🤔 I.e. checking all valid Hibernate "safecast" method calls and then checking for every CheckCast if the line number matches one of the "safecast" method call line numbers? 🤔 Not bulletproof for sure, but maybe better than nothing? 🤷

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for casts
2 participants