Bytecode‐level invocation
When you invoke methods in Java, the calls get compiled to one of four(*) types of instructions.
*: There is also invokedynamic, but it isn't directly related to usual method calling and is quite the beast to explain.
The main instructions for invoking methods are invokestatic, invokevirtual, invokeinterface and invokespecial. Each of these basically does the same thing, with some minor differences and restrictions.
invokestatic invokes a static method which you have access to. For this to work, you need to provide:
- Owner class name
- Method name
- Method descriptor (for disambiguation against other methods with the same name)
Method invocation will fail if the method is either not static, not accessible or the owner class couldn't be loaded.
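The owner / name / descriptor triple can also be played with from plain Java through the java.lang.invoke API. The following is only a sketch of the idea, not what the JVM does internally: the class StaticLookupDemo and its helper methods are made up for illustration. It prints the descriptor string for int add(int, int), resolves the method by owner, name and type, and shows that asking for a non-static method as if it were static is rejected.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo class; none of these names come from the original text.
class StaticLookupDemo {
    static int add(int a, int b) {
        return a + b;
    }

    // Deliberately NOT static, to show the failure case.
    int addInstance(int a, int b) {
        return a + b;
    }

    // Builds the descriptor for int add(int, int) the way it appears in bytecode.
    static String descriptor() {
        return MethodType.methodType(int.class, int.class, int.class)
                .toMethodDescriptorString();
    }

    // Resolves owner + name + type to a static method and calls it.
    static int callAdd(int a, int b) {
        try {
            MethodType type = MethodType.methodType(int.class, int.class, int.class);
            MethodHandle handle =
                    MethodHandles.lookup().findStatic(StaticLookupDemo.class, "add", type);
            return (int) handle.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Returns true if looking up the non-static method as a static one fails.
    static boolean nonStaticIsRejected() {
        try {
            MethodType type = MethodType.methodType(int.class, int.class, int.class);
            MethodHandles.lookup().findStatic(StaticLookupDemo.class, "addInstance", type);
            return false;
        } catch (ReflectiveOperationException e) {
            // The Javadoc of findStatic documents IllegalAccessException for this case.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(descriptor());          // (II)I
        System.out.println(callAdd(5, 5));         // 10
        System.out.println(nonStaticIsRejected());
    }
}
```

The descriptor is what tells two overloads with the same name apart, which is why the lookup needs the full type and not just "add".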
All arguments in the method descriptor will be popped from the stack IN REVERSE, so the arguments are to be provided in order.
Example: Calling a method add(long, String) will expect the following stack:
0: long
1: (long top)
2: ref (type String)
which can, for example, be obtained like so:
lconst_0 ; push 0L to the stack
ldc "hello world" ; push "hello world" (string) to the stack
After method invocation, if the method has a return type (it isn't void), the return value is pushed to the stack.
Here is some code that invokes add in the same class:
public static int add(int a, int b) {
return a + b;
}
public static void main(String[] args) {
int result = add(5, 5);
}
And here are the instructions main compiles to:
iconst_5 ; push 5
iconst_5 ; push 5
invokestatic Main.add (II)I ; invoke int add(int, int)
istore_1 ; store the return value in local 1 (result in the code above)
iconst_5 appears twice, so two values (5, 5) are pushed onto the stack. Both are consumed by the invocation of our add method, which pushes its return value; istore_1 then stores that value in a local variable.
Invokevirtual is the exact counterpart to invokestatic. It invokes non-static (aka "virtual") methods.
The general way of acquiring argument values is the same as for invokestatic, with one exception: invokevirtual expects an instance on the stack.
The instance has to be the exact type or a child type of the owner of the method provided to invokevirtual. If you were to call String.toLowerCase(), the owner stack element would have to be either a String or a descendant of String (not practically possible since String is final).
The instance is to be provided before any arguments. That is, it has to be one stack element below the first argument. Example: Abc.hi(int) will expect the following stack:
0: Abc or child
1: int
If the instance is null, a NullPointerException is thrown.
Which method ends up being invoked depends on the runtime type of the instance. invokevirtual starts at the class of the instance and goes up through its superclasses until there are none left. The first class that contains a definition for the requested method wins, and that method is invoked. In the worst case the search ends at the method we gave to invokevirtual, since it has to exist for the instruction to even begin this search.
Example: consider class A defining method abc() and class B extends A overriding abc() from A. "Calling" A.abc() with the instance stack element being of type B will begin the search at B, find that there is a method abc() defined, and invoke that, despite us mentioning A.abc(). If this method did not exist in B, the search would continue in A, and the method is defined there anyway.
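The A / B scenario can be reproduced in a few lines of Java. This is a small sketch; the wrapper class VirtualDispatchDemo, the extra method onlyInA() and the helper names are assumptions added for the demonstration. Both helpers take a parameter declared as A, so the call sites name A as the owner, yet the runtime type of the instance decides which abc() runs.

```java
// Hypothetical demo class mirroring the A / B example from the text.
class VirtualDispatchDemo {
    static class A {
        String abc() {
            return "A.abc";
        }

        // Not overridden in B, so the search falls through to A.
        String onlyInA() {
            return "A.onlyInA";
        }
    }

    static class B extends A {
        @Override
        String abc() {
            return "B.abc";
        }
    }

    // The call site mentions A.abc(), but the instance's class is searched first.
    static String callAbc(A instance) {
        return instance.abc();
    }

    static String callOnlyInA(A instance) {
        return instance.onlyInA();
    }

    // A null instance makes the invocation throw NullPointerException.
    static boolean nullInstanceThrows() {
        try {
            callAbc(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(callAbc(new A()));     // A.abc
        System.out.println(callAbc(new B()));     // B.abc
        System.out.println(callOnlyInA(new B())); // A.onlyInA
        System.out.println(nullInstanceThrows()); // true
    }
}
```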
Let's say we have code similar to the invokestatic example, except that the add() method is no longer static:
public int add(int a, int b) {
return a + b;
}
public void entry() {
int result = add(5, 5);
}
The compiled bytecode for entry():
aload 0 ; load `this` (since we're in a nonstatic method, our first variable is always `this`). This is our instance
iconst_5 ; 5
iconst_5 ; 5
invokevirtual Main.add (II)I ; Call int Main.add(int, int), with the instance being `this` and the two arguments being 5 and 5
istore_1 ; store result in local variable 1 (aka "result" in the above code)
The additional aload 0 loads the very first local variable as the instance. Since we are in a non-static method, that variable holds this, which is an object reference (that's why aload is used).
invokeinterface and invokevirtual are largely the same, with the same resolution algorithm. The difference is that invokeinterface is used when the owner of the method is an interface, and that it throws an error (for example AbstractMethodError or IncompatibleClassChangeError) when no appropriate implementation is found at runtime, since this is a thing that can theoretically happen with interfaces.
Here's an example:
public interface MyInterface {
void performAction();
}
public class MyClass implements MyInterface {
// the implementation of performAction()
@Override
public void performAction() {
System.out.println("Action performed!");
}
}
and here is an example of how this would be used:
public class Main {
public static void main(String[] args) {
MyInterface object = new MyClass(); // create an object through MyClass
object.performAction(); // invoke the implementation
}
}
Here, object.performAction() is compiled to invokeinterface because object is declared as MyInterface. At runtime the call ends up in the implementation provided by MyClass, the class that implements the interface.
Basically, invokeinterface is used to invoke an implementation of a method that is declared in an interface.
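Whether javac emits invokeinterface or invokevirtual depends on the declared type you call through, not on the object itself. The sketch below is a variation of the example above: the wrapper class InterfaceCallDemo and the two helper methods are assumptions, and performAction() returns a String here instead of printing so the result is easy to check. Running javap -c on the compiled class should show one instruction of each kind.

```java
// Hypothetical variation of the MyInterface / MyClass example.
class InterfaceCallDemo {
    interface MyInterface {
        String performAction();
    }

    static class MyClass implements MyInterface {
        @Override
        public String performAction() {
            return "Action performed!";
        }
    }

    // Declared type is the interface: this call site uses invokeinterface.
    static String viaInterface(MyInterface object) {
        return object.performAction();
    }

    // Declared type is the class: this call site uses invokevirtual.
    static String viaClass(MyClass object) {
        return object.performAction();
    }

    public static void main(String[] args) {
        MyClass object = new MyClass();
        System.out.println(viaInterface(object)); // Action performed!
        System.out.println(viaClass(object));     // Action performed!
    }
}
```

Both calls reach the same implementation; only the instruction at the call site differs.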
Invokespecial is a bit weird, since it does not follow the typical polymorphism that invokevirtual and invokeinterface go through, yet it still requires an instance on the stack.
Invokespecial does not follow the type inheritance up to find the first implementing member. Instead, it allows direct access to specific method implementations on specific classes in the type hierarchy, no matter what the actual type on the stack is. This, in theory, could allow for A.abc() from above being called directly even if the type on the stack is B, and B defines its own implementation.
Since this is practically a nuclear bomb in the eyes of the normal polymorphism Java employs, invokespecial usage is heavily restricted to only allow for child classes to invoke methods of parent classes. That is, it allows B to call abc() of A. No one else may do that.
This is used to implement super.abc() calls. Any super.abc() you see in class B is actually an invokespecial A.abc(). This is also true for super(); constructor calls.
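A short Java sketch of the super.abc() case, reusing the A / B naming from above (the wrapper class SuperCallDemo and the run() helper are assumptions for the demo). The outer call is dispatched virtually and lands in B, while the super.abc() inside B goes straight to A's implementation without any search.

```java
// Hypothetical demo class for the super.abc() case.
class SuperCallDemo {
    static class A {
        String abc() {
            return "A";
        }
    }

    static class B extends A {
        @Override
        String abc() {
            // super.abc() is an invokespecial A.abc(): it skips B's own override.
            return "B then " + super.abc();
        }
    }

    static String run() {
        A instance = new B();
        // Virtual call: ends up in B.abc(), which then calls A.abc() directly.
        return instance.abc();
    }

    public static void main(String[] args) {
        System.out.println(run()); // B then A
    }
}
```

If super.abc() were dispatched virtually, it would find B.abc() again and recurse forever, which is why a non-polymorphic instruction is needed here.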
Invokespecial has another use: constructors.
Any constructor call you see, such as new Abc(), is actually a combination of (mostly) 3 instructions:
new Abc ; allocates a new instance of the class Abc. This instance can't yet be used for anything and is uninitialized; the constructor hasn't been executed yet
dup ; duplicate the uninitialized instance we just made. Now we have 2 references to the same object on the stack.
; any arguments to the constructor would be pushed or loaded in here
invokespecial Abc.<init> ()V ; invoke the constructor defined in Abc. This consumes one reference, initializes the object, and leaves us with the second reference we just made using dup, which now points to a valid, initialized object.
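That a constructor is just a method returning void can be seen from Java as well. The sketch below is an illustration under assumptions: the wrapper class ConstructorDemo, the int field on Abc and the helper methods are invented here. findConstructor takes a method type with a void return type, matching the V at the end of the constructor descriptor, and the resulting handle performs the allocate-then-initialize sequence for us.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo class; Abc gets an int argument to make the descriptor visible.
class ConstructorDemo {
    static class Abc {
        final int value;

        Abc(int value) {
            this.value = value;
        }
    }

    // Descriptor of the constructor Abc(int): arguments in parentheses, void return.
    static String constructorDescriptor() {
        return MethodType.methodType(void.class, int.class).toMethodDescriptorString();
    }

    // Looks the constructor up by its (void-returning) type and creates an instance.
    static int construct(int value) {
        try {
            MethodHandle constructor = MethodHandles.lookup()
                    .findConstructor(Abc.class, MethodType.methodType(void.class, int.class));
            Abc created = (Abc) constructor.invokeExact(value);
            return created.value;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(constructorDescriptor()); // (I)V
        System.out.println(construct(42));           // 42
    }
}
```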
The invokedynamic instruction is the newest addition to the JVM's invocation instructions. It was introduced in Java 7, and since Java 8 it is mainly used for implementing lambdas and method references.
Unlike other invocation instructions, invokedynamic doesn't specify a method to call directly. Instead, it uses a bootstrap method to determine what to call at runtime. This makes it very flexible but also harder to analyze statically.
Here's how it works:
1. The first time the instruction runs, the JVM calls its bootstrap method.
2. The bootstrap method returns a CallSite object.
3. The JVM links the instruction to this CallSite.
4. On every later execution, it just invokes the target of that CallSite.
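The steps above can be imitated in plain Java with the java.lang.invoke API. This is a hand-written simulation, not a real invokedynamic instruction: the class BootstrapDemo, the greet() target, the linked field (standing in for the JVM's per-instruction link state) and the call counter are all assumptions made for the sketch. The bootstrap method has the usual shape of taking a Lookup, a name and a MethodType and returning a CallSite.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical simulation of how an invokedynamic call site gets linked.
class BootstrapDemo {
    // Counts bootstrap invocations so we can see it only runs once.
    static int bootstrapCalls = 0;

    // Stand-in for the JVM's link state of one invokedynamic instruction.
    private static CallSite linked;

    // The method the call site will end up pointing at.
    static String greet(String name) {
        return "hello " + name;
    }

    // Bootstrap method: decides at runtime what the call site should call.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws NoSuchMethodException, IllegalAccessException {
        bootstrapCalls++;
        MethodHandle target = lookup.findStatic(BootstrapDemo.class, name, type);
        return new ConstantCallSite(target);
    }

    // Imitates executing the instruction: link on first use, then reuse the CallSite.
    static String callSiteInvoke(String argument) {
        try {
            if (linked == null) {
                MethodType type = MethodType.methodType(String.class, String.class);
                linked = bootstrap(MethodHandles.lookup(), "greet", type);
            }
            return (String) linked.dynamicInvoker().invokeExact(argument);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callSiteInvoke("a"));
        System.out.println(callSiteInvoke("b"));
        System.out.println(bootstrapCalls);
    }
}
```

Nothing in callSiteInvoke mentions greet as a direct call; the target only becomes known once the bootstrap method has run, which is exactly what makes static analysis harder.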
This is great for the JVM but it's a pain in the ass for reverse engineers. You can't just look at the bytecode and know what's being called - you have to figure out what the bootstrap method is doing.
You might also see the bootstrap method being referred to as the "bsm".