TransformIndirectLoadChain at JITServer #20767

Merged

luke-li-2003

Implement TransformIndirectLoadChain partially for the JITServer so it can employ the Vector API during optimization.

@luke-li-2003 luke-li-2003 force-pushed the TransformIndirectLoadChainAtServer branch 2 times, most recently from ed4300a to 8eef209 Compare December 5, 2024 21:38
@luke-li-2003

@mpirvu

@mpirvu mpirvu self-assigned this Dec 5, 2024
@mpirvu mpirvu added comp:jit comp:jitserver Artifacts related to JIT-as-a-Service project labels Dec 5, 2024
@mpirvu mpirvu left a comment

Is it possible to do some refactoring to avoid code duplication? The risk is that fixes to dereferenceStructPointerChain() or verifyFieldAccess() will not be ported to transformIndirectLoadChainServerImpl.

@luke-li-2003

Is it possible to do some refactoring to avoid code duplication?

It is possible. The main challenge is that the non-server implementation assumes VM access throughout the code, so it would need some major rewriting.

mpirvu commented Dec 10, 2024

How about moving all the JITServer changes down into dereferenceStructPointerChain?
That is, create a new, much simpler variant that only handles some cases for remote compilations. Something like this:

static void *dereferenceStructPointer(TR::KnownObjectTable::Index baseKnownObjectIndex, TR::Node *baseNode, bool isBaseStableArray, TR::Node *curNode, TR::Compilation *comp, void *valuePtr)
   {
   if (baseNode == curNode)
      {
      TR_ASSERT(false, "dereferenceStructPointerChain has no idea what to dereference");
      traceMsg(comp, "Caller has already dereferenced node %p, returning NULL as dereferenceStructPointerChain has no idea what to dereference\n", curNode);
      return NULL;
      }
   else
      {
      TR_ASSERT(curNode != NULL, "Field node is NULL");
      TR_ASSERT(curNode->getOpCode().hasSymbolReference(), "Node must have a symref");

      TR::SymbolReference *symRef = curNode->getSymbolReference();
      TR::Symbol *symbol = symRef->getSymbol();
      TR::Node *addressChildNode = symbol->isArrayShadowSymbol() ? curNode->getFirstChild()->getFirstChild() : curNode->getFirstChild();

      // The addressChildNode must have a symRef so that we can verify it
      if (!addressChildNode->getOpCode().hasSymbolReference())
         return NULL;

      if (isBaseStableArray)
         TR_ASSERT_FATAL(addressChildNode == baseNode, "We should have only one level of indirection for stable arrays\n");

      if (addressChildNode == baseNode)
         {
        // If baseStruct/baseNode are deemed verifiable by the caller, do we still need to verify them?
        // Part of the verification. Only Java fields are considered
        if (isJavaField(symRef, comp)) // symbol->isShadow() && (cpIndex >= 0 || symbol->getRecognizedField() != TR::Symbol::UnknownField)
            {
            TR_J9VMBase *fej9 = comp->fej9(); // front end used for the class queries below
            TR_OpaqueClassBlock *fieldClass = NULL;
            // Fabricated fields don't have valid cp index
            if (symRef->getCPIndex() < 0 &&
                symRef->getSymbol()->getRecognizedField() != TR::Symbol::UnknownField)
                {
                const char* className;
                int32_t length;
                className = symRef->getSymbol()->owningClassNameCharsForRecognizedField(length);
                fieldClass = fej9->getClassFromSignature(className, length, symRef->getOwningMethod(comp));
                }
            else
                fieldClass = symRef->getOwningMethod(comp)->getDeclaringClassFromFieldOrStatic(comp, symRef->getCPIndex());

            if (fieldClass == NULL)
                return NULL;
            TR_OpaqueClassBlock *objectClass = fej9->getObjectClassFromKnownObjectIndex(comp, baseKnownObjectIndex);
            TR_YesNoMaybe objectContainsField = fej9->isInstanceOf(objectClass, fieldClass, true);
            if (objectContainsField != TR_yes)
                return NULL;
            if (TR::TransformUtil::avoidFoldingInstanceField(baseStruct, symRef, comp)) // This needs some work
                {
                if (comp->getOption(TR_TraceOptDetails))
                    {
                    traceMsg(comp, "avoid folding load of field #%d from object at index %d\n", symRef->getReferenceNumber(), baseKnownObjectIndex);
                    }
                return NULL;
                }
            
            // (void*)comp->getKnownObjectTable()->getPointer(baseKnownObjectIndex)
            // value stored at fieldAddress from object and offset  fieldAddress = curStruct + symRef->getOffset(); if (!fieldAddress) return NULL
            // if (isCollectedReference())
            // value = fej9->getReferenceFieldAtAddress((uintptr_t)fieldAddress);  //==> Needs VM access
            uint64_t data = sendMessageToServer(baseKnownObjectIndex, symRef->getOffset()); // Should be big enough to hold a double, a 64-bit thing and a pointer; also should have a representation for noData
            // pass value back to the caller
            TR::DataType loadType = curNode->getDataType();
            switch (loadType)
               {
               case TR::Int32:
                 *(int32_t*)valuePtr = (int32_t)data;
                 break;
             .....
               case TR::Address:
                  *(uintptr_t*)valuePtr = (uintptr_t)data;
                  break;
               }
            return valuePtr;
            }
         }
      }
   return NULL;
   }

On failure, this variant returns NULL, much like the original dereferenceStructPointerChain. On success, it returns a pointer to where the data can be found.
The caller then uses a union that covers all possible data types:

union {
    int32_t i;
    int64_t l;
    float f;
    double d;
    void *p;
} value;

The caller then invokes void *fieldAddress = dereferenceStructPointerChain(baseAddress, baseExpression, isBaseStableArray, node, comp, &value);
On success, fieldAddress and &value are the same.
The caller then continues with the original code, except that in the isCollectedReference case it should skip uintptr_t value = fej9->getReferenceFieldAtAddress((uintptr_t)fieldAddress); because the client has already obtained that value.
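
The union-as-out-parameter idea can be illustrated in isolation. The following is a minimal, self-contained sketch (C++17) and not the code that was merged: LoadType, FieldValue, bitCopy and decodeFieldValue are hypothetical names, std::optional is just one possible representation of "noData", and the sketch assumes the value's bit pattern arrives in the low-order bits of a 64-bit payload.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <optional>

// Hypothetical stand-in for TR::DataType, limited to the union's members.
enum class LoadType { Int32, Int64, Float, Double, Address };

// Same shape as the union proposed in the comment above.
union FieldValue {
    int32_t i;
    int64_t l;
    float f;
    double d;
    void *p;
};

// Reinterpret the bits of `from` as a value of type To (sizes must match).
template <typename To, typename From>
To bitCopy(From from)
{
    static_assert(sizeof(To) == sizeof(From), "size mismatch");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// Hypothetical helper: unpack a raw 64-bit payload into `out`.
// `raw` is empty when the client could not supply a value ("noData").
// Returns `out` on success and nullptr on failure, mirroring the
// NULL-on-failure convention described in the comment above.
void *decodeFieldValue(std::optional<uint64_t> raw, LoadType type, FieldValue *out)
{
    assert(out != nullptr);
    if (!raw)
        return nullptr;

    const uint64_t bits = *raw;
    const uint32_t low = static_cast<uint32_t>(bits); // low-order 32 bits
    switch (type) {
        case LoadType::Int32:   out->i = bitCopy<int32_t>(low);  break;
        case LoadType::Int64:   out->l = bitCopy<int64_t>(bits); break;
        case LoadType::Float:   out->f = bitCopy<float>(low);    break;
        case LoadType::Double:  out->d = bitCopy<double>(bits);  break;
        case LoadType::Address: out->p = reinterpret_cast<void *>(static_cast<uintptr_t>(bits)); break;
    }
    return out;
}
```

A caller would declare a FieldValue, pass its address, and treat a non-null return as the field address. Copying bits instead of casting values keeps floating-point payloads intact, which a plain (double)data conversion would not.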

@luke-li-2003

I have yet to implement the union, but I just want to make sure the overall structure of the code is correct.

@luke-li-2003 luke-li-2003 force-pushed the TransformIndirectLoadChainAtServer branch from e4c608a to eeebb5d Compare December 10, 2024 23:36
@luke-li-2003

I have yet to organise the commits. Vector API now works for both server and non-server mode. I also had to bypass transformIndirectChainAt because I am not sure how to implement it yet.

@luke-li-2003 luke-li-2003 force-pushed the TransformIndirectLoadChainAtServer branch from eeebb5d to c7311bc Compare December 10, 2024 23:38
@luke-li-2003 luke-li-2003 force-pushed the TransformIndirectLoadChainAtServer branch 2 times, most recently from ea629a8 to 8da6728 Compare December 11, 2024 22:28
@luke-li-2003 luke-li-2003 force-pushed the TransformIndirectLoadChainAtServer branch 9 times, most recently from c39d9a2 to 62b720a Compare December 13, 2024 18:11
@luke-li-2003 luke-li-2003 force-pushed the TransformIndirectLoadChainAtServer branch from 62b720a to 176c759 Compare December 16, 2024 19:03
isArrayWithConstantElements(symRef, comp));
}

if (knotIndex != TR::KnownObjectTable::UNKNOWN)
Contributor

The original code has a test on value being non-null. Is this test equivalent?

Contributor Author

Yes, it should be

@luke-li-2003

Because addArrayWithConstantElements is protected, this also requires a change in OMR:

eclipse-omr/omr#7594

mpirvu commented Dec 17, 2024

I am thinking that we can delete OMR::KnownObjectTable::updateKnownObjectTableAtServer(Index index, uintptr_t *objectReferenceLocation) from OMR since it's only used in OpenJ9

mpirvu commented Dec 17, 2024

jenkins compile xlinux jdk21

mpirvu commented Dec 17, 2024

jenkins compile all jdk8,jdk21

@luke-li-2003 luke-li-2003 force-pushed the TransformIndirectLoadChainAtServer branch from 150fa69 to ee700f1 Compare December 17, 2024 18:23
@luke-li-2003 luke-li-2003 marked this pull request as ready for review December 17, 2024 18:52

mpirvu commented Dec 17, 2024

jenkins test sanity all jdk23

mpirvu commented Dec 18, 2024

cmdLineTester_callsitedbgddrext_openj9_0 failed on xlinux and zlinux.
jdk_lang_j9_0 failed on openjdk zlinux
many failures on openjdk xlinux and mac

mpirvu commented Dec 18, 2024

I started a 10x grinder on x86 with targets that failed: https://openj9-jenkins.osuosl.org/job/Grinder/4042/

@luke-li-2003 luke-li-2003 force-pushed the TransformIndirectLoadChainAtServer branch from ee700f1 to 3a7a696 Compare December 18, 2024 19:54

mpirvu commented Dec 18, 2024

jenkins test sanity all jdk23

mpirvu commented Dec 19, 2024

mac fails cmdLineTester_jython_0

22:12:07  Test start time: 2024/12/18 22:12:06 Eastern Standard Time
22:12:07  Running command: "/Users/jenkins/workspace/Test_openjdk23_j9_sanity.functional_x86-64_mac_Personal_testList_0/jdkbinary/j2sdk-image/bin/java"   -XshowSettings:vm -Dpython.options.showJavaExceptions=true -Dpython.options.includeJavaStackInExceptions=true -Dpython.options.showPythonProxyExceptions=true -cp "/Users/jenkins/workspace/Test_openjdk23_j9_sanity.functional_x86-64_mac_Personal_testList_0/../../testDependency/lib/jython-standalone.jar:/Users/jenkins/workspace/Test_openjdk23_j9_sanity.functional_x86-64_mac_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/jython/cmdLineTester_jython.jar" JythonHello
22:12:07  Time spent starting: 26 milliseconds
22:12:08  Time spent executing: 776 milliseconds
22:12:08  Test result: FAILED
22:12:08  Output from test:
22:12:08   [ERR] VM settings:
22:12:08   [ERR]     Max. Heap Size (Estimated): 4.00G
22:12:08   [ERR]     Using VM: Eclipse OpenJ9 VM
22:12:08   [ERR] 
22:12:08   [ERR] Exception in thread "main" java.lang.NoClassDefFoundError: org.python.core.ThreadStateMapping
22:12:08   [ERR] 	at org.python.core.Py.<clinit>(Py.java:1734)
22:12:08   [ERR] 	at org.python.util.PythonInterpreter.<init>(PythonInterpreter.java:99)
22:12:08   [ERR] 	at org.python.util.PythonInterpreter.<init>(PythonInterpreter.java:93)
22:12:08   [ERR] 	at org.python.util.InteractiveInterpreter.<init>(InteractiveInterpreter.java:39)
22:12:08   [ERR] 	at org.python.util.InteractiveInterpreter.<init>(InteractiveInterpreter.java:28)
22:12:08   [ERR] 	at org.python.util.InteractiveInterpreter.<init>(InteractiveInterpreter.java:18)
22:12:08   [ERR] 	at JythonHello.main(JythonHello.java:29)
22:12:08   [ERR] Caused by: java.lang.ClassNotFoundException: org.python.core.ThreadStateMapping
22:12:08   [ERR] 	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:827)
22:12:08   [ERR] 	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
22:12:08   [ERR] 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:1098)
22:12:08   [ERR] 	... 7 more
22:12:08  >> Success condition was not found: [Output match: Hello Python World!]

zlinux fails jdk_lang_j9_0, and that is a crash in the compiler:

21:31:10  Unhandled exception
21:31:10  Type=Segmentation error vmState=0x000532ff
21:31:10  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=c4eb3ed5 Signal_Code=00000001
21:31:10  Handler1=000003FF8C8C7CA8 Handler2=000003FF8C7B1B18 InaccessibleAddress=0000000000000000
21:31:10  gpr0=000003FF0948C2EA gpr1=000003FF81A03140 gpr2=000003FF3C002200 gpr3=0000000000000000
21:31:10  gpr4=0000000000000000 gpr5=0000000000000000 gpr6=000003FF08400000 gpr7=000003FF444EF448
21:31:10  gpr8=0000000000000002 gpr9=000003FF090A6170 gpr10=0000000000000000 gpr11=000003FF3C00A5D0
21:31:10  gpr12=000003FF81A34000 gpr13=000003FF00000004 gpr14=000003FF80D98F02 gpr15=000003FF444EF1E8
...
21:31:10  Method_being_compiled=MethodTypeDescTest.testMethodTypeDesc(Ljava/lang/constant/MethodTypeDesc;Ljava/lang/invoke/MethodType;)V
21:31:10  Target=2_90_20241218_43 (Linux 3.10.0-1160.118.1.el7.s390x)
21:31:10  CPU=s390x (4 logical CPUs) (0x1ec1b1000 RAM)
21:31:10  ----------- Stack Backtrace -----------
21:31:10  _ZN11TR_J9VMBase14getObjectClassEm+0x5c (0x000003FF80D98F44 [libj9jit29.so+0x298f44])
21:31:10  _ZN32TR_InvariantArgumentPreexistence7performEv+0xcfc (0x000003FF81436234 [libj9jit29.so+0x936234])
21:31:10  _ZN3OMR9Optimizer19performOptimizationEPK20OptimizationStrategyiii.localalias+0x89c (0x000003FF8144714C [libj9jit29.so+0x94714c])
21:31:10  _ZN3OMR9Optimizer8optimizeEv+0x1e6 (0x000003FF81448F2E [libj9jit29.so+0x948f2e])
21:31:10  _ZN3OMR20ResolvedMethodSymbol5genILEP11TR_FrontEndPN2TR11CompilationEPNS3_20SymbolReferenceTableERNS3_12IlGenRequestE+0x37c (0x000003FF811FEFD4 [libj9jit29.so+0x6fefd4])
21:31:10  _ZN18TR_J9InlinerPolicy25_tryToGenerateILForMethodEPN2TR20ResolvedMethodSymbolES2_P13TR_CallTarget+0x180 (0x000003FF80F39170 [libj9jit29.so+0x439170])
21:31:10  _ZN14TR_InlinerBase17inlineCallTarget2EP12TR_CallStackP13TR_CallTargetPPN2TR7TreeTopEbi+0x364 (0x000003FF81302254 [libj9jit29.so+0x802254])
21:31:10  _ZN14TR_InlinerBase16inlineCallTargetEP12TR_CallStackP13TR_CallTargetbP14TR_PrexArgInfoPPN2TR7TreeTopE+0x1a6 (0x000003FF80F6A11E [libj9jit29.so+0x46a11e])
21:31:10  _ZN14TR_InlinerBase15inlineFromGraphEP12TR_CallStackP13TR_CallTargetP24TR_InnerPreexistenceInfo+0x356 (0x000003FF8130190E [libj9jit29.so+0x80190e])
21:31:10  _ZN14TR_InlinerBase17inlineCallTarget2EP12TR_CallStackP13TR_CallTargetPPN2TR7TreeTopEbi+0x1e96 (0x000003FF81303D86 [libj9jit29.so+0x803d86])
21:31:10  _ZN14TR_InlinerBase16inlineCallTargetEP12TR_CallStackP13TR_CallTargetbP14TR_PrexArgInfoPPN2TR7TreeTopE+0x1a6 (0x000003FF80F6A11E [libj9jit29.so+0x46a11e])
21:31:10  _ZN14TR_InlinerBase15inlineFromGraphEP12TR_CallStackP13TR_CallTargetP24TR_InnerPreexistenceInfo+0x356 (0x000003FF8130190E [libj9jit29.so+0x80190e])
21:31:10  _ZN14TR_InlinerBase17inlineCallTarget2EP12TR_CallStackP13TR_CallTargetPPN2TR7TreeTopEbi+0x1e96 (0x000003FF81303D86 [libj9jit29.so+0x803d86])
21:31:10  _ZN14TR_InlinerBase16inlineCallTargetEP12TR_CallStackP13TR_CallTargetbP14TR_PrexArgInfoPPN2TR7TreeTopE+0x1a6 (0x000003FF80F6A11E [libj9jit29.so+0x46a11e])
21:31:10  _ZN14TR_InlinerBase15inlineFromGraphEP12TR_CallStackP13TR_CallTargetP24TR_InnerPreexistenceInfo+0x356 (0x000003FF8130190E [libj9jit29.so+0x80190e])
21:31:10  _ZN14TR_InlinerBase17inlineCallTarget2EP12TR_CallStackP13TR_CallTargetPPN2TR7TreeTopEbi+0x1e96 (0x000003FF81303D86 [libj9jit29.so+0x803d86])
21:31:10  _ZN14TR_InlinerBase16inlineCallTargetEP12TR_CallStackP13TR_CallTargetbP14TR_PrexArgInfoPPN2TR7TreeTopE+0x1a6 (0x000003FF80F6A11E [libj9jit29.so+0x46a11e])
21:31:10  _ZN28TR_MultipleCallTargetInliner17inlineCallTargetsEPN2TR20ResolvedMethodSymbolEP12TR_CallStackP24TR_InnerPreexistenceInfo+0x1018 (0x000003FF80F455C8 [libj9jit29.so+0x4455c8])
21:31:10  _ZN14TR_InlinerBase15performInliningEPN2TR20ResolvedMethodSymbolE+0xd0 (0x000003FF813052A8 [libj9jit29.so+0x8052a8])
21:31:10  _ZN10TR_Inliner7performEv+0x16a (0x000003FF80F3B0D2 [libj9jit29.so+0x43b0d2])
21:31:10  _ZN3OMR9Optimizer19performOptimizationEPK20OptimizationStrategyiii.localalias+0x89c (0x000003FF8144714C [libj9jit29.so+0x94714c])

I am concerned about this one

Implement TransformIndirectLoadChain partially for the JITServer
so it can employ the Vector API during optimization.

Signed-off-by: Luke Li <[email protected]>
@luke-li-2003 luke-li-2003 force-pushed the TransformIndirectLoadChainAtServer branch from 3a7a696 to 3a96621 Compare December 19, 2024 22:28

mpirvu commented Dec 20, 2024

jenkins test sanity all jdk23

mpirvu commented Dec 20, 2024

jenkins test sanity all jdk23

@luke-li-2003

The aarch64 failure can also be reproduced with the nightly build:

https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/46011/console

More testing is required for the Power failure.

mpirvu commented Dec 30, 2024

On plinux we have one failure: cmdLineTester_criu_jitserverAcrossCheckpoint_0

21:25:56  Testing: Portable CRIU Mode: Enable JITServer specified Pre-Checkpoint but not explicitly enabled Post-Restore
21:25:56  Test start time: 2024/12/20 02:25:56 Coordinated Universal Time
21:25:56  Running command: bash /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuJitServerScript.sh /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/bin "  -XX:+UseJITServer -XX:-CRIURestoreNonPortableMode -Xjit:verbose={compilePerformance},verbose={JITServer},verbose={JITServerConns},vlog=preCheckpointVlog" org.openj9.criu.OptionsFileTest "JitOptionsTest -Xjit:verbose={compilePerformance},verbose={CheckpointRestore},verbose={JITServer},verbose={JITServerConns},vlog=postRestoreVlog" 1 false true
21:25:56  Time spent starting: 3 milliseconds
21:26:08  Time spent executing: 10483 milliseconds
21:26:08  Test result: FAILED
21:26:08  Output from test:
21:26:08   [OUT] start running script
21:26:08   [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
21:26:08   [OUT] export LD_BIND_NOT=on
21:26:08   [OUT] Starting /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/bin/jitserver -XX:JITServerPort=50744 -XX:JITServerHealthProbePort=44107 
21:26:08   [OUT] 2913942 ?        00:00:00 jitserver
21:26:08   [OUT] JITSERVER EXISTS
21:26:08   [OUT] Pre-checkpoint
21:26:08   [OUT] main: Fri Dec 20 02:25:58 UTC 2024, Performing CRIUSupport.checkpointJVM(), System.currentTimeMillis(): 1734661558559, System.nanoTime(): 7242242184539594
21:26:08   [OUT] JVMJITM048W AOT load and compilation disabled pre-checkpoint and post-restore.
21:26:08   [OUT] Post-checkpoint
21:26:08   [OUT] JITSERVER NO LONGER EXISTS
21:26:08   [OUT] Terminating /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/bin/jitserver -XX:JITServerPort=50744 -XX:JITServerHealthProbePort=44107 
21:26:08   [OUT] finished script
21:26:08   [ERR] /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/jitserverconfig.sh: line 30: lsof: command not found
21:26:08   [ERR] /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/jitserverconfig.sh: line 30: lsof: command not found
21:26:08   [ERR] 
21:26:08   [ERR] JITServer is ready to accept incoming requests
21:26:08   [ERR] /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuJitServerScript.sh: line 114: 2914020 Killed                  $TEST_JDK_BIN/java -XX:+EnableCRIUSupport -XX:JITServerPort=$JITSERVER_PORT $JVM_OPTIONS -cp "$TEST_ROOT/criu.jar" $MAINCLASS $APP_ARGS -XX:JITServerPort=$JITSERVER_PORT $NUM_CHECKPOINT > testOutput 2>&1
21:26:08   [ERR] Assertion failed at /home/jenkins/workspace/Build_JDK23_ppc64le_linux_Personal/openj9/runtime/compiler/env/JITServerPersistentCHTable.cpp:172: classInfo
21:26:08   [ERR] 	subclass info cannot be null: ensure subclasses are loaded before superclass
21:26:08   [ERR] #0: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9jit29.so(+0xb86b80) [0x7fffb6d86b80]
21:26:08   [ERR] #1: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9jit29.so(+0xb9859c) [0x7fffb6d9859c]
21:26:08   [ERR] #2: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9jit29.so(+0x7702dc) [0x7fffb69702dc]
21:26:08   [ERR] #3: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9jit29.so(+0x771f8c) [0x7fffb6971f8c]
21:26:08   [ERR] #4: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9jit29.so(+0x2dfe10) [0x7fffb64dfe10]
21:26:08   [ERR] #5: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9jit29.so(+0x2e033c) [0x7fffb64e033c]
21:26:08   [ERR] #6: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9jit29.so(+0x203848) [0x7fffb6403848]
21:26:08   [ERR] #7: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9jit29.so(+0x176560) [0x7fffb6376560]
21:26:08   [ERR] #8: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9jit29.so(+0x176b78) [0x7fffb6376b78]
21:26:08   [ERR] #9: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9jit29.so(+0x176c30) [0x7fffb6376c30]
21:26:08   [ERR] #10: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9prt29.so(+0x39cf4) [0x7fffb8169cf4]
21:26:08   [ERR] #11: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9jit29.so(+0x177198) [0x7fffb6377198]
21:26:08   [ERR] #12: /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/lib/default/libj9thr29.so(+0xcc00) [0x7fffb7cccc00]
21:26:08   [ERR] #13: /lib64/glibc-hwcaps/power10/libc.so.6(+0xa3130) [0x7fffb82a3130]
21:26:08   [ERR] #14: /lib64/glibc-hwcaps/power10/libc.so.6(clone+0xa0) [0x7fffb834c848]
21:26:08   [ERR] 
21:26:08   [ERR] Unhandled exception
21:26:08   [ERR] Type=Unhandled trap vmState=0x00000000
21:26:08   [ERR] J9Generic_Signal_Number=00000108 Signal_Number=00000005 Error_Value=00000000 Signal_Code=fffffffa
21:26:08   [ERR] Handler1=00007FFFB7A43B40 Handler2=00007FFFB8168840
21:26:08   [ERR] R0=00000000000000FA R1=00007FFF9AEEB420 R2=00007FFFB8437D00 R3=0000000000000000
21:26:08   [ERR] R4=00000000002C769A R5=0000000000000005 R6=0000000000000000 R7=00007FFF9AEF68E0
21:26:08   [ERR] R8=000000000000004E R9=0000000000000000 R10=0000000000000000 R11=0000000000000000
21:26:08   [ERR] R12=0000000000000000 R13=00007FFF9AEF68E0 R14=0000000000000001 R15=0000000000CDD300
21:26:08   [ERR] R16=00007FFFB71C0778 R17=00007FFFB71EBB68 R18=0000000000000001 R19=00007FFF06529168
21:26:08   [ERR] R20=0000000000000000 R21=00007FFF9AEEB5D0 R22=00007FFF9AEEB618 R23=00007FFF9AEEB5E8
21:26:08   [ERR] R24=0000000000000006 R25=00007FFF8C01B7A0 R26=00007FFFB64DDD90 R27=00007FFF8C029B50
21:26:08   [ERR] R28=0000000000000005 R29=00007FFF8C029B78 R30=0000000000000001 R31=00000000002C769A
21:26:08   [ERR] NIP=00007FFFB82A5B68 MSR=800000000280D033 ORIG_GPR3=00000000002C7696 CTR=0000000000000000
21:26:08   [ERR] LINK=0000000000000000 XER=0000000000000000 CCR=0000000048884204 SOFTE=0000000000000001
21:26:08   [ERR] TRAP=0000000000003000 DAR=00007FFFB6D86B48 dsisr=0000000040000000 RESULT=0000000000000000
21:26:08   [ERR] FPR0=0000000000e59b00 (f: 15047424.000000, d: 7.434415e-317)
21:26:08   [ERR] FPR1=3fe655e520000000 (f: 536870912.000000, d: 6.979852e-01)
21:26:08   [ERR] FPR2=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR3=0000002e00000021 (f: 33.000000, d: 9.761181e-313)
21:26:08   [ERR] FPR4=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR5=0000002600000027 (f: 39.000000, d: 8.063584e-313)
21:26:08   [ERR] FPR6=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR7=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR8=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR9=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR10=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR11=306178302b656e6f (f: 728067712.000000, d: 1.206955e-75)
21:26:08   [ERR] FPR12=646a2f305f747369 (f: 1601467264.000000, d: 5.180945e+175)
21:26:08   [ERR] FPR13=6d692d6b6473326a (f: 1685271168.000000, d: 1.110959e+219)
21:26:08   [ERR] FPR14=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR15=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR16=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR17=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR18=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR19=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR20=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR21=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR22=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR23=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR24=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR25=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR26=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR27=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR28=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR29=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR30=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] FPR31=0000000000000000 (f: 0.000000, d: 0.000000e+00)
21:26:08   [ERR] Module=/lib64/glibc-hwcaps/power10/libc.so.6
21:26:08   [ERR] Module_base_address=00007FFFB8200000
21:26:08   [ERR] Target=2_90_20241220_33 (Linux 5.14.0-511.el9.ppc64le)
21:26:08   [ERR] CPU=ppc64le (4 logical CPUs) (0x1dade0000 RAM)

mpirvu commented Dec 30, 2024

I started a 50x grinder with failed test targets here: https://openj9-jenkins.osuosl.org/job/Grinder/4050/

mpirvu commented Dec 30, 2024

jenkins test sanity xlinuxjit,plinuxjit,zlinuxjit,alinux64jit jdk21

@luke-li-2003

The vector tests are failing because they are checking for the #VECTOR API message in the client's vlog, rather than the JITServer's.

mpirvu commented Dec 31, 2024

plinuxjit failed jdk_lang_0 because of a timeout
aarch64 failed cmdLineTester_jfr_0 with a timeout. I assume this is a new test.

21:08:56  ===============================================
21:08:56  Running test cmdLineTester_jfr_0 ...
21:08:56  ===============================================
21:08:56  cmdLineTester_jfr_0 Start Time: Fri Dec 20 02:08:56 2024 Epoch Time (ms): 1734660536648
21:08:56  variation: NoOptions
...
21:12:11  Testing: VM API Test - approx 2mins
21:12:11  Test start time: 2024/12/20 02:11:59 Coordinated Universal Time
21:12:11  Running command: "/home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_aarch64_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/java"   --add-exports java.base/com.ibm.oti.vm=ALL-UNNAMED -cp /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/jfr/jfr.jar org.openj9.test.VMAPITest
21:12:11  Time spent starting: 6 milliseconds
21:17:04  ***[TEST INFO 2024/12/20 02:16:59] ProcessKiller detected a timeout after 300000 milliseconds!***
21:17:04  ***[TEST INFO 2024/12/20 02:16:59] executing /usr/bin/gdb -batch -x /tmp/debugger15092537380659966925.txt /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_aarch64_linux_Personal_testList_1/jdkbinary/j2sdk-image/bin/java 3312066***

plinuxjit sanity has failed cmdLineTester_criu_jitserverAcrossCheckpoint_0 with a fatal assert. This is something we used to see from time to time on zLinux, but it crops up on plinux as well.

21:25:56  Testing: Portable CRIU Mode: Enable JITServer specified Pre-Checkpoint but not explicitly enabled Post-Restore
21:25:56  Test start time: 2024/12/20 02:25:56 Coordinated Universal Time
21:25:56  Running command: bash /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuJitServerScript.sh /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk23_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/bin "  -XX:+UseJITServer -XX:-CRIURestoreNonPortableMode -Xjit:verbose={compilePerformance},verbose={JITServer},verbose={JITServerConns},vlog=preCheckpointVlog" org.openj9.criu.OptionsFileTest "JitOptionsTest -Xjit:verbose={compilePerformance},verbose={CheckpointRestore},verbose={JITServer},verbose={JITServerConns},vlog=postRestoreVlog" 1 false true
21:25:56  Time spent starting: 3 milliseconds
21:26:08  Time spent executing: 10483 milliseconds
21:26:08  Test result: FAILED
...
21:26:08   [ERR] _ZN2TR15fatal_assertionEPKciS1_S1_z+0x30 (0x00007FFFB6971F90 [libj9jit29.so+0x771f90])
21:26:08   [ERR] _ZN26JITServerPersistentCHTable19commitModificationsERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x2e0 (0x00007FFFB64DFE10 [libj9jit29.so+0x2dfe10])
21:26:08   [ERR] _ZN26JITServerPersistentCHTable8doUpdateEP11TR_J9VMBaseRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES9_+0x37c (0x00007FFFB64E033C [libj9jit29.so+0x2e033c])
21:26:08   [ERR] _ZN2TR30CompilationInfoPerThreadRemote12processEntryER21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0xa18 (0x00007FFFB6403848 [libj9jit29.so+0x203848])
21:26:08   [ERR] _ZN2TR24CompilationInfoPerThread14processEntriesEv+0x410 (0x00007FFFB6376560 [libj9jit29.so+0x176560])
21:26:08   [ERR] _ZN2TR24CompilationInfoPerThread3runEv+0xa8 (0x00007FFFB6376B78 [libj9jit29.so+0x176b78])
21:26:08   [ERR] _Z30protectedCompilationThreadProcP13J9PortLibraryPN2TR24CompilationInfoPerThreadE+0xa0 (0x00007FFFB6376C30 [libj9jit29.so+0x176c30])
21:26:08   [ERR] omrsig_protect+0x3e4 (0x00007FFFB8169CF4 [libj9prt29.so+0x39cf4])
21:26:08   [ERR] _Z21compilationThreadProcPv+0x1a8 (0x00007FFFB6377198 [libj9jit29.so+0x177198])
21:26:08   [ERR] thread_wrapper+0x190 (0x00007FFFB7CCCC00 [libj9thr29.so+0xcc00])
21:26:08   [ERR] start_thread+0x170 (0x00007FFFB82A3130 [libc.so.6+0xa3130])
21:26:08   [ERR] clone+0xa0 (0x00007FFFB834C848 [libc.so.6+0x14c848])

mpirvu commented Dec 31, 2024

My 50x grinder on plinux has passed. The failures that appeared in testing are known (except for the JFR on aarch64 which is a new test), hence this PR can be merged.

@mpirvu mpirvu merged commit e4332e9 into eclipse-openj9:master Dec 31, 2024
26 of 36 checks passed