Specify charset to read resources
daniellansun committed Dec 30, 2023
1 parent c8bd556 commit aa7cb8e
Showing 1 changed file with 2 additions and 1 deletion.
@@ -26,6 +26,7 @@
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
+import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.LinkedHashSet;
import java.util.Set;
@@ -44,7 +45,7 @@ public static Set<String> getRegisteredExtensions(ClassLoader loader) {
globalServices = loader.getResources("META-INF/services/org.codehaus.groovy.source.Extensions");
}
for (URL service : DefaultGroovyMethods.toSet(globalServices)) {
-try (BufferedReader svcIn = new BufferedReader(new InputStreamReader(URLStreams.openUncachedStream(service)))) {
+try (BufferedReader svcIn = new BufferedReader(new InputStreamReader(URLStreams.openUncachedStream(service), StandardCharsets.UTF_8))) {


eric-milles Jan 3, 2024

Member

Is there a standard somewhere that specifies the encoding of these service resources? For example, .properties files are ISO-8859-1 by default.


daniellansun Jan 3, 2024

Author Contributor

AFAIK, UTF-8 is compatible with ISO-8859-1, so it's OK even if files are ISO-8859-1 by default.


eric-milles Jan 3, 2024

Member

I don't know whether the services files are ISO-8859-1, or whether UTF-8 is fully compatible. The idea is to find out whether any specification says which charset these files are expected to use, then use that and add a link to the spec. Otherwise you just swap one bug for another.


blackdrag Jan 4, 2024

Contributor

I think the answer in https://stackoverflow.com/questions/59479438/really-true-that-properties-files-must-be-encoded-in-iso-8859-1-before-java-9 has a good overview for this. There is not much of a spec.
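
On the compatibility question raised in this thread: UTF-8 and ISO-8859-1 agree byte for byte only on the ASCII range (U+0000 to U+007F). The sketch below illustrates where they diverge. The class name `CharsetOverlapDemo`, the helper `latin1BytesReadAsUtf8`, and the sample strings are hypothetical and not part of the commit.

```java
import java.nio.charset.StandardCharsets;

class CharsetOverlapDemo {
    // Encodes the text as ISO-8859-1, then decodes those bytes as UTF-8.
    // The round trip is lossless only when every character is ASCII.
    static String latin1BytesReadAsUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String ascii = "org.example.Ext"; // hypothetical ASCII-only entry
        String accented = "caf\u00e9";    // contains U+00E9, outside ASCII

        // ASCII bytes are identical in both charsets, so this prints true.
        System.out.println(ascii.equals(latin1BytesReadAsUtf8(ascii)));
        // The lone byte 0xE9 is malformed UTF-8, so this prints false.
        System.out.println(accented.equals(latin1BytesReadAsUtf8(accented)));
    }
}
```

So a services file that contains only ASCII reads the same under either charset, while a file with non-ASCII characters does not.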

String extension = svcIn.readLine();
while (extension != null) {
extension = extension.trim();

3 comments on commit aa7cb8e

@daniellansun
Contributor Author


try (BufferedReader svcIn = new BufferedReader(new InputStreamReader(URLStreams.openUncachedStream(service), StandardCharsets.UTF_8))) {

This is existing code that already uses UTF-8 to read services files, so I assume services files are UTF-8; I cannot find any spec for now.

/cc @blackdrag @paulk-asert

@blackdrag
Contributor


> try (BufferedReader svcIn = new BufferedReader(new InputStreamReader(URLStreams.openUncachedStream(service), StandardCharsets.UTF_8))) {
>
> This is existing code that already uses UTF-8 to read services files, so I assume services files are UTF-8; I cannot find any spec for now.

That is our file; we define everything for it. If nothing says that it is UTF-8, then we should define that, and that is fine. And actually, I think we are not documenting this very well.

@daniellansun
Contributor Author


> > try (BufferedReader svcIn = new BufferedReader(new InputStreamReader(URLStreams.openUncachedStream(service), StandardCharsets.UTF_8))) {
> >
> > This is existing code that already uses UTF-8 to read services files, so I assume services files are UTF-8; I cannot find any spec for now.
>
> That is our file; we define everything for it. If nothing says that it is UTF-8, then we should define that, and that is fine. And actually, I think we are not documenting this very well.

Understood ;-)
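
To make the convention discussed above concrete, here is a minimal sketch of reading a services-style resource with an explicit charset. The class `ServiceLinesDemo`, the method `readEntries`, and the skipping of `#` comment lines are illustrative assumptions, not the Groovy implementation, and a plain `InputStream` stands in for `URLStreams.openUncachedStream`. For comparison, the `java.util.ServiceLoader` documentation requires its own provider-configuration files to be encoded in UTF-8.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;

class ServiceLinesDemo {
    // Reads one entry per line, decoding bytes as UTF-8 regardless of the
    // platform default charset. Blank lines are skipped; skipping lines that
    // start with '#' is an assumption made for this sketch.
    static Set<String> readEntries(InputStream in) throws IOException {
        Set<String> entries = new LinkedHashSet<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty() && !line.startsWith("#")) {
                    entries.add(line);
                }
            }
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file content, encoded as UTF-8.
        byte[] data = "# sample\ngroovy\n\n gvy \n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readEntries(new ByteArrayInputStream(data))); // [groovy, gvy]
    }
}
```

Passing the charset explicitly means the result no longer depends on the `file.encoding` of the JVM that happens to load the resource.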
