Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transpiled autoboxing of byte to Byte does not work with unsigned byte values in JavaScript anymore #260

Open
rbsdc opened this issue Jan 17, 2025 · 4 comments

Comments

@rbsdc
Copy link

rbsdc commented Jan 17, 2025

Describe the bug
I am not sure if this is a bug or something that was intentionally changed.

The transpiled version of Byte.valueOf(byte) accepts now only signed byte values (-128 to 127). If a value > 127 or a value < -128 is given, undefined is returned.

In previous versions of j2cl, the transpiled value Byte.valueOf(byte) always returned a Byte object, regardless of the value (values > 255 also worked). I am not sure, when this change was introduced.

This change breaks our application. We have a Java library that parses PDF files. Reading the PDF file is done via an interface. In Java, we use e.g. RandomAccessFile, and in the transpiled JavaScript version we use FileReader (returning unsigned byte values 0-255, therefore the error). In Java, the code works fine. In JavaScript, the code also worked, but because of this change not anymore. (at least, where Byte.valueOf(byte)is used in the transpiled version; most other parts seem to work as before).

More specifcally, in Java, I have the following code:

List<Byte> bytes = new ArrayList<>();
while (! parser.isEndReached()) {
   byte charCode = parser.getCurrentByte(); // the reader implementation is used here
   ....
  bytes.add(charCode);   // here happens the autoboxing, in the transpiled version with Byte.valueOf(byte), where the error happens.
}

Transpiled version.

for (var b = module$exports$java$util$ArrayList$impl.$create__(), c = 1; !a.isEndReached();) {
        var d = a.getCurrentByte();
        ...
        b.add(module$exports$java$lang$Byte$impl.m_valueOf__byte__java_lang_Byte(d))
    }

I recently upgraded bazel (and it seems j2cl too), because it seems there was a problem with the new String(byte[]) constructor, which, however, seems to fixed now. This is how I stumbled on this error.

Bazel version
7.4.0 (I was previously on 5.4.1)

Expected behavior
The transpiled JavaScript version of Byte.valueOf(byte) should handle unsigned bytes correctly.

@gkdn
Copy link
Member

gkdn commented Jan 17, 2025

In Java, we use e.g. RandomAccessFile, and in the transpiled JavaScript version we use FileReader (returning unsigned byte values 0-255, therefore the error). In Java, the code works fine. In JavaScript, the code also worked, but because of this change not anymore.

A byte value in Java is signed and could be only between values '-128' and '127'. You can never get a value out of this range in JDK.

And only reason if you would get a value out of this range in J2CL would be incorrect JS boundary code that declares byte but value returned from JS is not guaranteed to be byte. When such a value leaks in the Java type system, the behavior would be undefined.

You mentioned the value is coming from FileReader; given J2CL doesn't provide that class, I recommended checking the implementation of the class and make sure it conforms to boundaries of byte values.

@rbsdc
Copy link
Author

rbsdc commented Jan 20, 2025

@gkdn thank you very much for the explanation!

I am aware that Java uses signed bytes. In JavaScript, it is possible to work with unsigned or signed bytes (although the type "byte" does not exist). In our case, our JavaScript code that is using the transpiled Java library was working with unsigned arrays in the form of Uint8Array (see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint8Array), returned from our usage of the FileReader. FileReader refers to the standard JavaScript API file reader (https://developer.mozilla.org/en-US/docs/Web/API/FileReader) .

Since the code was working fine until now, I was under the assumption, that the transpiled Java code is designed to both function when using both signed or unsigned JavaScript bytes. However, it looks like the transpiled Java code should only be called with signed bytes in JavaScript, or else there will be runtime errors? Since this change of the transpiled Byte.valueOf method seems intentional, are there more relevant methods that have been changed in this way recently? May I ask if there is a change log or similar?

My workaround for now, without changing too much of our code, is the following: Instead of the Java code bytes.add(charCode); (see above), I use bytes.add((byte) charCode & 0xff); . In Java, (byte) charCode & 0xff does not make too much sense. However, the resulting JavaScript is then bytes.add(module$exports$java$lang$Byte$impl.m_valueOf__byte__java_lang_Byte(module$exports$vmbootstrap$Primitives$impl.narrowIntToByte(charCode & 255))) instead of b.add(module$exports$java$lang$Byte$impl.m_valueOf__byte__java_lang_Byte(d)). Like this, it is working. Since it is known at compile time that charCode is a byte, would it not make sense to always use the narrowIntToByte-method when autoboxing?

@gkdn
Copy link
Member

gkdn commented Jan 22, 2025

However, it looks like the transpiled Java code should only be called with signed bytes in JavaScript, or else there will be runtime errors?

Yes, you need to adhere to the rules of Java otherwise you would be leaking incorrect values into type system (e.g. if type is byte, the value is assumed to be correct). The way the error will manifest itself could be in surprising; might be runtime error (if you are lucky), might just some math error or number printing error etc.

Since this change of the transpiled Byte.valueOf method seems intentional, are there more relevant methods that have been changed in this way recently? May I ask if there is a change log or similar?

As I mentioned these are errors distant to the actual source of the problem; the changes there are not intended to be no-op and should be no-op can expose such typing errors (it might have showed up in any change - that's why I said incorrect values has undefined behavior).

The way to address the issue is; when you expose such API from JavaScript, you should use a Java type at the JS boundary that actually correctly encapsulates the JS value. For example, if the value is 0-255, you can use short, int etc in Java. After that (i.e. when the value and Java type are consistent), you don't need any hacks: regular Java operation like the narrowing cast from int to byte will take care of the coercion.

Another example of that would be; let's say the JS value can be (undefined | null | number). In that case you would need to use Double or any of the its super types (e.g. j.l.Object) as the corresponding Java type. That will enforce you to correctly handle it within Java type-system without any hacks (e.g. null-check, intValue() etc.).

I noticed that we never open-sourced the documentation around type representations. I just did that which might help a bit here.

@gkdn
Copy link
Member

gkdn commented Jan 22, 2025

In our case, our JavaScript code that is using the transpiled Java library was working with unsigned arrays in the form of Uint8Array

Just note that if you simply change the view to Int8Array, this will have correct ranges and values could be treated as byte.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants