I have TensorFlow 2.6 installed according to the recommendations given in #835, i.e. together with reticulate and Keras for R under RStudio with R 4.0.5. My notebook uses an Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz processor.
Prior to that, however, I had to update my NVIDIA GeForce RTX 2060 from CUDA 10.1 to CUDA 10.2. This finally succeeded, albeit with numerous problems caused by apparently incomplete DLLs in the NVIDIA installation files that are nevertheless requested by TF 2.6. Details are given in #577.
A test run under TF 2.6 using the Iris toy program shown in #1172 produced the warning message
I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
at the beginning of the first training epoch, but despite this warning the training process continued smoothly with the usual results. Apart from that, during the first run the GPU produced the usual messages confirming that all required DLLs were found, from which I conclude that the GPU is working as it should.
However, if I try to run my own, slightly more elaborate program (4x Conv2D and 2x Dense layers with x/y dimensions (samples, 5, 5, 1)/(samples, 4) and a very low signal-to-noise ratio), the R session is aborted (!) during the first epoch, right after the same MLIR warning shown above is produced. This is extremely inconvenient, as no hints about the cause can be found after the crash.
The program worked correctly with the same GPU under CUDA 10.1 and TensorFlow 2.4. In the meantime, as I am unable to find any deficiencies in the whole setup, I have uninstalled the GPU completely. The MLIR warning still appears, but the program now runs flawlessly, though considerably slower, of course 😢.
I have the impression that the control functions surrounding the network part of my program are perhaps somehow incompatible with the MLIR optimization process (I do very early stopping to prevent overfitting; depending on the temporal data, sometimes even single epochs have to be, and are, executed regularly). This raises the question of whether it is possible to deactivate the MLIR optimization completely.
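For reference, TensorFlow does expose experimental switches for its MLIR machinery, reachable from R through the `tf` handle of the tensorflow package. A minimal sketch (untested on this setup; note also that the log line above is informational and actually reports that *no* MLIR passes are enabled, so these toggles may not affect the crash):

```r
# Sketch: explicitly disable TensorFlow's experimental MLIR machinery.
# Must be called before any model is built or trained.
library(tensorflow)

# Disable the MLIR graph optimization passes (experimental API)
tf$config$experimental$disable_mlir_graph_optimization()

# Disable the experimental MLIR/TF bridge as well
tf$config$experimental$disable_mlir_bridge()
```

These map to `tf.config.experimental.disable_mlir_graph_optimization()` and `tf.config.experimental.disable_mlir_bridge()` on the Python side.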
Any hints are welcome.
faltinl changed the title from "Tensorflow 2.6 with GPU aborts program with reference to MLIR optimization" to "Tensorflow 2.6 with GPU aborts R session with reference to MLIR optimization" on Sep 21, 2021.