Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update builder (Docker and package files for better java support) #113

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

michaelmayer2
Copy link
Contributor

This pull request will add better java support to our opinionated R builds.

Main issue is that the build of R from sources will auto-detect the java version present on the system. This is great to start with. The problem however that it detects the JAVA_HOME and sets it to the physically existing path. Such a path not only contains the major version of Java but also any patch level or minor release versions as defined by the linux distro. In this pull request all the files in the directory builder are updated to set JAVA_HOME to a folder that is independent of any security patches or updates applied to the underlying linux distro.

Accepting this pull request will fix any issue that leads to the urban myth that using Java in R is difficult.

An article on solutions.rstudio.com is in preparation https://github.com/rstudio/solutions.rstudio.com/pull/210

@@ -67,6 +66,9 @@ ENV CONFIGURE_OPTIONS="\
--with-tk-config=/usr/lib64/tkConfig.sh \
--enable-prebuilt-html"

# Make sure that patching the OS does not break Java
ENV JAVA_HOME=/usr/lib/jvm/jre-11-openjdk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While working on RHEL 9 support and trying to incorporate these improvements there, I learned that the default R from EPEL/Fedora uses a completely unversioned path that's managed by the alternatives system:

R CMD config JAVA_HOME
# /usr/lib/jvm/jre

# To change the major version of Java at /usr/lib/jvm/jre:
update-alternatives --config java

From quick testing, using JAVA_HOME=/usr/lib/jvm/jre seems to support even major Java version switches without having to run R CMD javareconf. Switching from Java 11 to 17 seemed to just work, although switching from Java 11/17 to 8 still required a R CMD javareconf since some paths changed from 8 to 11:

# Java 8
JAVA_LIBS = -L$(JAVA_HOME)/lib/amd64/server -ljvm
JAVA_LD_LIBRARY_PATH = $(JAVA_HOME)/lib/amd64/server

# Java 11/17
JAVA_LIBS = -L$(JAVA_HOME)/lib/server -ljvm
JAVA_LD_LIBRARY_PATH = $(JAVA_HOME)/lib/server

What do you think of using this completely unversioned alternatives path? It doesn't work for all major version switches, but would at least accommodate users going from 11 to 17. And would also match EPEL/Fedora R's behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found similar behavior with the Java packages on openSUSE/SLES 15.3. The path they use there is /usr/lib64/jvm:

zypper -n install java-11-openjdk-devel java-1_8_0-openjdk-devel
ls /usr/lib64/jvm -la
# lrwxrwxrwx 1 root root   26 Aug  8 23:17 java -> /etc/alternatives/java_sdk
# lrwxrwxrwx 1 root root   29 Aug  8 23:17 java-11 -> /etc/alternatives/java_sdk_11
# lrwxrwxrwx 1 root root   18 Apr 21 08:49 java-11-openjdk -> java-11-openjdk-11
# drwxr-xr-x 7 root root 4096 Aug  8 23:17 java-11-openjdk-11
# lrwxrwxrwx 1 root root   34 Aug  8 23:17 java-openjdk -> /etc/alternatives/java_sdk_openjdk
# lrwxrwxrwx 1 root root   21 Aug  8 23:17 jre -> /etc/alternatives/jre
# lrwxrwxrwx 1 root root   24 Aug  8 23:17 jre-11 -> /etc/alternatives/jre_11
# lrwxrwxrwx 1 root root   18 Apr 21 08:49 jre-11-openjdk -> java-11-openjdk-11
# lrwxrwxrwx 1 root root   29 Aug  8 23:17 jre-openjdk -> /etc/alternatives/jre_openjdk
/usr/lib64/jvm/jre/bin/java -version
# openjdk version "11.0.15" 2022-04-19

update-alternatives --config java

/usr/lib64/jvm/jre/bin/java -version
# openjdk version "1.8.0_332"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest we keep the version number encoded in JAVA_HOME, e.g. /usr/lib/jvm/jre-11-openjdk. As you point out, moving between java 11 and 17 seem to work but not when moving to 8 (not without a javareconf).

The idea of this PR is to make Java integration a bit more seamless. I am not sure if we are creating more issues by moving to the extremely generic path compared to the issues that we are trying to solve.

@@ -17,6 +17,9 @@ RUN chmod 0777 /opt
# Pin fpm to avoid git dependency in 1.12.0
RUN gem install fpm:1.11.0

# Make sure that patching the OS does not break Java
ENV JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Ubuntu/Debian, the R builds here appear to already use this major-versioned location:

R CMD config JAVA_HOME
# /usr/lib/jvm/java-11-openjdk-amd64

Checking the default Ubuntu/Debian R packages, they appear to all use /usr/lib/jvm/default-java from default-jdk or default-jre:

R CMD config JAVA_HOME
# /usr/lib/jvm/default-java

Unfortunately it doesn't look like you can control the JRE location using the alternatives system in Ubuntu/Debian. /usr/lib/jvm/default-java is put there by default-jdk/default-jre, and not present if you install Java 11 or Java 8 specifically.

ls /usr/lib/jvm -la 
# lrwxrwxrwx 1 root root   25 Jul 17  2019 default-java -> java-1.11.0-openjdk-amd64
# lrwxrwxrwx 1 root root   21 Jul 20 22:37 java-1.11.0-openjdk-amd64 -> java-11-openjdk-amd64
# lrwxrwxrwx 1 root root   21 Jul 22 21:02 java-1.17.0-openjdk-amd64 -> java-17-openjdk-amd64
# drwxr-xr-x 9 root root 4096 Aug  8 23:34 java-11-openjdk-amd64
# drwxr-xr-x 9 root root 4096 Aug  8 23:34 java-17-openjdk-amd64
# drwxr-xr-x 2 root root 4096 Aug  8 23:34 openjdk-11
# drwxr-xr-x 2 root root 4096 Aug  8 23:34 openjdk-17

But still, maybe it's better to use /usr/lib/jvm/default-java anyway to match the distro default R and because it's common for users to install default-jdk (not sure, might be wrong about this)? At least for RSPM, default-jdk is what we've been using for instructions on installing Java.

@@ -59,6 +59,7 @@ fpm \
-d xz-devel \
-d zip \
-d zlib-devel \
-d java-11-openjdk-devel \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Java is often an optional dependency and somewhat heavy, could we leave it out of the DEB/RPM package dependencies? That would better support users who want to switch between Java 8, 11, 17, etc. as well. I think the default R packages from the various distros also leave Java as an optional dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can leave java out but why do we then spend effort on configuring Java in the first place ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if there was a specific reason for configuring Java in the first place, and would guess that it was just borrowed from how Debian and Fedora packaged R. It is nice to be able to just install Java and have it work out of the box without R CMD javareconf. But I still do think Java is too heavy to add as a DEB/RPM dependency since these are billed as "minimal" R builds and Java is an optional dependency. It'll add a few hundred MBs to many consumers who don't necessarily need Java, like the RStudio R Docker images, the RStudio Pro Product images, the GitHub Actions for R, and others. And those on Java 8 or 17 will have an extra unused installation of Java.

Copy link
Contributor Author

@michaelmayer2 michaelmayer2 Aug 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I definitely agree with your sentiment about the size of the R build if we go with the Java integration. So far I only thought that this was an "opinionated R build" but not a "minimal build". Once probably could segregate the Java related things into a separate DEB/RPM but then the installation process becomes too complex again.

For now we can always refer to https://solutions.rstudio.com/r/using-rjava/ as a reference document for users wishing to use Java with R.

The idea of the PR was to make our R builds a bit more holistic and consistent but based on the arguments made above I agree it may be one or two steps too much in the wrong direction.

Let's make sure we can get OpenBLAS into Ubuntu 20.04 R builds and then we should close this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's kind of both opinionated and minimal. We do some opinionated things that aren't completely minimal, like including OpenBLAS and the -devel packages of a few libraries that are required to install packages that link against the R shared library (which would only be included in r-base-dev, not r-base for example). A separate DEB/RPM that includes Java would be nice, but agreed that it'd complicate installation. Maybe a middle ground would be to provide users with a preconfigured Java where possible, like in the various Docker images. That article also looks great, and I'll try to reference it when any Java configuration questions come up.

Do you still want to explicitly set JAVA_HOME to at least make a first time Java installation more seamless? I've done that with the RHEL 9 R builds: https://github.com/rstudio/r-builds/pull/133/files#diff-7c6c557e074afff5ac851dbf63d384a5deaea84626a0f6945f41767510c08c39R77-R83. I think it's still better than letting R default it to something like /usr/lib/jvm/java-11-openjdk-11.0.16.0.8-1.el9_0.x86_64.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, setting JAVA_HOME for the RHEL9 builds to the more generic folder I would strongly recommend. Also agree with the notion of extra DEB/RPM packages for Java where we even could be in the business of providing packages for any JDK offered, e.g. 1.8.0, 11, 17, ...

@michaelmayer2 michaelmayer2 mentioned this pull request Aug 15, 2022
@glin
Copy link
Contributor

glin commented Aug 16, 2022

@michaelmayer2 I went ahead and resolved the merge conflicts, just clearing out some of the removed EOL distros.

@glin glin mentioned this pull request Aug 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants