ML keras #4172

bikagit · 2024-08-22T12:57:36Z

Integrating Keras capabilities into OPM.
Draft pull request to test/discuss implications of the changes in OPM.
This enables straightforward, flexible integration of neural networks into OPM scripts. The models are first trained with the Keras library in Python, then stored in a format readable by the OPM framework, and finally deployed within OPM.
When the user initializes and loads a stored Keras model inside an OPM script, an automated deployment process handles all the translation. The process runs a series of steps (model interpretation, layer conversion, optimization, and code generation) to adapt the Keras model to a native OPM function.

totto82 · 2024-08-22T13:00:14Z

jenkins build this please

daavid00 · 2024-08-22T13:20:40Z

jenkins build this please

atgeirr · 2024-08-22T13:56:05Z

This does not add a dependency on a third party library, so then I assume it instead embeds Keras in some way? Or have I misunderstood the purpose of this PR?

bska · 2024-08-22T15:01:20Z

Is there a reason this is added to opm-common instead of being a separate repository? Do we, somehow, need to make (selected) objects in this repository, or any of its downstream repositories for that matter, "aware" of Keras?

bikagit · 2024-08-22T15:31:11Z

This does not add a dependency on a third party library, so then I assume it instead embeds Keras in some way? Or have I misunderstood the purpose of this PR?

We use Keras for the training process (it doesn't need to be done in OPM). The generated models are subsequently embedded and run in OPM. I have updated the description to provide some context.

totto82 · 2024-08-26T13:35:36Z

jenkins build this please

daavid00 · 2024-08-28T07:11:41Z

jenkins build this please

bska · 2024-08-28T10:53:10Z

Maybe I'm missing something, but as far as I can tell no one has answered my question from last week:

Is there a reason this is added to opm-common instead of being a separate repository? Do we, somehow, need to make (selected) objects in this repository, or any of its downstream repositories for that matter, "aware" of Keras?

I would really like an answer to this before I consider the details of the PR.

totto82 · 2024-08-28T12:42:05Z

I would really like an answer to this before I consider the details of the PR.

Sorry for not answering earlier. The idea is to apply ML-Keras inside OPM for different tasks. This is only the first PR, which adds the Keras ML model; the applications will follow. For an example of an ML near-well model using ML-Keras, see https://github.com/cssr-tools/ML_near_well. Since the ML-Keras model framework is general, we hope it will be useful for the OPM community and therefore suggest adding it to opm-common.

bska · 2024-08-28T12:51:38Z

The idea is to apply the ML-Keras inside OPM for different tasks [...] Since the ML-Keras model framework is general, we hope it would be useful for the OPM community and therefore suggest to add it to opm-common

Okay, utility/convenience is clearly one reason for adding it here. Would it be impossible to make [your/certain use cases] work if it were located elsewhere? Do you, for instance, need access to the internals/private data members or member functions of Well or Connection objects or similar in your use cases?

totto82 · 2024-08-28T13:06:18Z

jenkins build this please

bikagit · 2024-09-01T05:45:28Z

The idea is to apply the ML-Keras inside OPM for different tasks [...] Since the ML-Keras model framework is general, we hope it would be useful for the OPM community and therefore suggest to add it to opm-common

Okay, utility/convenience is clearly one reason for adding it here. Would it be impossible to make [your/certain use cases] work if it were located elsewhere? Do you, for instance, need access to the internals/private data members or member functions of Well or Connection objects or similar in your use cases?

Exactly! For instance, we need access to the automatic differentiation tools within OPM.

totto82 · 2024-09-06T13:11:49Z

jenkins build this please

atgeirr

There are a lot of changes needed here. I have only looked at the C++ code, and I probably missed some. I have not really checked that the activation functions or the layers do what a user of Keras would expect. I have not looked at any of the Python code, someone else must do that.

I have requested many changes, but I hope it provides a useful learning experience. Feel free to ask about anything that is unclear!

opm/ml/ml_tools/__init__.py

opm/ml/keras_model.hpp

opm/ml/keras_model.cpp

kjetilly

Some comments on the Python bits. I think a major point is that the folder opm/ml_tools, which contains only Python code, should probably be moved to, say, the python folder or similar. And since these scripts use external libraries (tf, numpy, keras), I would really like to see a requirements.txt file specifying the versions used. TensorFlow in particular is known to be problematic here.

opm/ml/ml_tools/kerasify.py

opm/ml/ml_tools/__init__.py

opm/ml/ml_tools/scaler_layers.py

bikagit · 2024-09-27T14:37:55Z

Thanks all for the valuable comments and suggestions.
We have started adding most of the modifications.

daavid00 · 2024-10-11T15:05:26Z

jenkins build this please

daavid00 · 2024-10-11T15:30:58Z

jenkins build this please

atgeirr

I have looked at the c++ parts, which have been improved a lot. Still quite a few things to address, but I think this will converge eventually!

opm/ml/ml_model.hpp

atgeirr · 2024-10-15T11:41:32Z

opm/ml/keras_model.hpp

+        data_.resize(i * j * k * l);
+    }
+
+    inline void Flatten() {


This is still a problem.

atgeirr · 2024-10-15T11:46:02Z

opm/ml/ml_model.hpp

+
+
+  template <typename Type>
+    void resizeI(std::vector<Type> c) {


No reason to take a copy of c, use a const reference. Also, you do not need to force this to be a vector:

template <typename Sizes> void resizeI(const Sizes& sizes) { ... }
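
The suggested signature can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `TensorSketch` is a hypothetical stand-in for the PR's Tensor class, and the members `dims_` and `data_` are simplified to plain vectors. It only shows that a `const Sizes&` parameter accepts any range of sizes without copying it.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the Tensor class under review; only the
// members needed for the illustration are included.
template <class T>
class TensorSketch
{
public:
    // Accepts any range of sizes (std::vector, std::array, ...) by const
    // reference, so the caller's container is not copied on the way in.
    template <typename Sizes>
    void resizeI(const Sizes& sizes)
    {
        dims_.assign(std::begin(sizes), std::end(sizes));
        assert(std::all_of(dims_.begin(), dims_.end(),
                           [](const int d) { return d >= 0; }));
        // Total element count is the product of all dimensions.
        const std::size_t total = std::accumulate(dims_.begin(), dims_.end(),
                                                  std::size_t{1},
                                                  std::multiplies<std::size_t>());
        data_.resize(total);
    }

    std::vector<int> dims_;
    std::vector<T> data_;
};
```

One design consequence worth noting: a template parameter cannot be deduced from a bare braced list, so a call such as `resizeI({i, j, k, l})` would need an explicit container, for example `resizeI(std::array<int, 4>{i, j, k, l})`.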

opm/ml/ml_model.hpp

opm/ml/ml_model.cpp

atgeirr · 2024-10-15T12:21:03Z

opm/ml/ml_model.cpp

+template<class Evaluation>
+bool NNLayerScaling<Evaluation>::loadLayer(std::ifstream& file) {
+    OPM_ERROR_IF(!readFile<float>(file, data_min), "Failed to read min");
+    OPM_ERROR_IF(!readFile<float>(file, data_max), "Failed to read min");


The error messages are wrong, except for the first.

atgeirr · 2024-10-15T12:23:34Z

opm/ml/ml_model.cpp

+        break;
+    case kHardSigmoid:
+        for (size_t i = 0; i < out.data_.size(); i++) {
+            Evaluation x = (out.data_[i] * sigmoid_scale) + 0.5;


Should be const.

atgeirr · 2024-10-15T12:24:11Z

opm/ml/ml_model.cpp

+        break;
+    case kSigmoid:
+        for (size_t i = 0; i < out.data_.size(); i++) {
+            Evaluation& x = out.data_[i];


Should be const. Always make variables const unless they have to be mutable.

atgeirr

This review block just contains a single comment, from a long time ago, but that is still relevant. I will make a new review block now.

opm/ml/keras_model.cpp

atgeirr

Apart from reformatting, I see none of the earlier requested code changes here.

opm/ml/ml_model.hpp

atgeirr · 2025-01-02T12:59:09Z

opm/ml/ml_model.hpp

+ */
+template <class T> class Tensor {
+ public:
+  Tensor() {}


Indentation and formatting are not in line with common OPM practice, although they seem internally consistent now! I assume you used clang-format to process this? Please do so again, using the .clang-format file at the top level of opm-common. (That would give 4-space indents, for example.)

You could consider setting a slightly wider allowed width to make the fmt::format() calls look nicer (i.e. on a single line) such as on line 78.

atgeirr · 2025-01-02T13:00:46Z

opm/ml/ml_model.hpp

+        std::accumulate(begin(c), end(c), 1.0, std::multiplies<Type>()));
+  }
+
+  inline void flatten() {


Remove inline in front of functions that are defined inline in the class. It is unnecessary.

This was commented on already!

atgeirr · 2025-01-02T13:02:20Z

opm/ml/ml_model.hpp

+
+  Tensor(int i, int j, int k, int l) { resizeI<int>({i, j, k, l}); }
+
+  template <typename Type> void resizeI(std::vector<Type> c) {


See earlier comment on this.

If I recall correctly, you asked to use the dims_, not c?

This was the comment:

No reason to take a copy of c, use a const reference. Also, you do not need to force this to be a vector:

template <typename Sizes> void resizeI(const Sizes& sizes) { ... }

The use of dims_ was further down in an accumulate call.

We now use a const ref.

opm/ml/ml_model.hpp

atgeirr · 2025-01-02T13:11:15Z

opm/ml/ml_model.hpp

+                 "Cannot add tensors with different dimensions");
+    Tensor result;
+    result.dims_ = dims_;
+    result.data_.reserve(data_.size());


This should be a resize(), and then not use a back_inserter below, as discussed earlier! Same with the multiply() further down.
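
A minimal sketch of the requested pattern, under stated assumptions: `MiniTensor` is a hypothetical, cut-down tensor, and a plain `assert` stands in for the dimension check that the PR performs with its own error macro. The point is only the difference between `reserve()` plus `std::back_inserter` and `resize()` plus writing in place.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical, cut-down tensor used only for this illustration.
template <class T>
struct MiniTensor
{
    std::vector<int> dims_;
    std::vector<T> data_;

    MiniTensor add(const MiniTensor& other) const
    {
        // Stand-in for the PR's own dimension check.
        assert(dims_ == other.dims_
               && "Cannot add tensors with different dimensions");
        MiniTensor result;
        result.dims_ = dims_;
        // resize() creates the elements, so they can be written in place...
        result.data_.resize(data_.size());
        // ...and the output iterator is simply begin(), not a back_inserter.
        std::transform(data_.begin(), data_.end(), other.data_.begin(),
                       result.data_.begin(), std::plus<T>());
        return result;
    }
};
```

The same shape would apply to an element-wise `multiply()`, with `std::multiplies<T>()` in place of `std::plus<T>()`.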

opm/ml/ml_model.hpp

atgeirr · 2025-01-02T13:15:03Z

opm/ml/ml_model.hpp

+template <class Evaluation>
+class NNLayerActivation : public NNLayer<Evaluation> {
+ public:
+  enum ActivationType {


It was requested to make this an enum class, and you commented that this was done, but it is not done.

atgeirr · 2025-01-02T13:16:59Z

opm/ml/ml_model.cpp

+template <class Evaluation>
+bool NNLayerActivation<Evaluation>::apply(const Tensor<Evaluation>& in,
+                                          Tensor<Evaluation>& out) {
+  constexpr double sigmoid_scale = 0.2;


Do not define this up here, but just before it is used (line 96). As already commented...
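
Several of the comments above (scoped enum, const locals, constants defined next to their use) can be seen together in one small sketch. It is not the PR's code: `applyActivation` is a hypothetical free function operating on plain doubles rather than `Evaluation` objects, and the enumerator names are assumptions. The hard-sigmoid constants 0.2 and 0.5 are the ones shown in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Scoped enum: enumerators must be qualified (ActivationType::Relu) and do
// not convert implicitly to int.
enum class ActivationType { Linear, Relu, HardSigmoid, Sigmoid };

// Hypothetical free function standing in for NNLayerActivation::apply().
void applyActivation(const ActivationType type, std::vector<double>& data)
{
    for (double& value : data) {
        switch (type) {
        case ActivationType::Linear:
            break;
        case ActivationType::Relu:
            value = std::max(value, 0.0);
            break;
        case ActivationType::HardSigmoid: {
            // Defined right where it is used, not at the top of the function.
            constexpr double sigmoid_scale = 0.2;
            // Local that is never modified, hence const.
            const double x = (value * sigmoid_scale) + 0.5;
            value = std::min(std::max(x, 0.0), 1.0);
            break;
        }
        case ActivationType::Sigmoid: {
            const double x = value;
            value = 1.0 / (1.0 + std::exp(-x));
            break;
        }
        default:
            assert(false && "unhandled activation type");
        }
    }
}
```

With a scoped enum, a call reads `applyActivation(ActivationType::HardSigmoid, values)`, and passing a raw integer no longer compiles.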

atgeirr · 2025-01-02T13:21:34Z

Note that in addition to re-commenting on several of the unchanged problems, I un-resolved several more since the existing comment was still valid. Please make code changes first, in a separate commit so I can easily see what changes you made, then after that add a separate commit to do the auto-reformatting.

bikagit · 2025-01-03T09:04:04Z

Note that in addition to re-commenting on several of the unchanged problems, I un-resolved several more since the existing comment was still valid. Please make code changes first, in a separate commit so I can easily see what changes you made, then after that add a separate commit to do the auto-reformatting.

Thanks for spotting the missing points. The changes were done in November. However, unfortunately, during re-basing we removed some commits inadvertently. The requested changes will all be added back.

bikagit force-pushed the mlKeras branch from aa1dceb to 1edf188 Compare August 22, 2024 13:16

bikagit force-pushed the mlKeras branch from 1edf188 to 230460c Compare August 22, 2024 14:47

bikagit force-pushed the mlKeras branch from 230460c to 0eb5e3d Compare August 26, 2024 13:34

bikagit force-pushed the mlKeras branch from fdc8a27 to 6e135b9 Compare August 27, 2024 16:33

bikagit force-pushed the mlKeras branch from 6e135b9 to e2f0a97 Compare August 28, 2024 11:12

bikagit marked this pull request as ready for review August 28, 2024 16:43

bikagit marked this pull request as draft August 29, 2024 10:38

bikagit marked this pull request as ready for review September 6, 2024 13:41

atgeirr requested changes Sep 24, 2024

kjetilly reviewed Sep 25, 2024

bikagit requested a review from atgeirr October 11, 2024 14:55

atgeirr reviewed Oct 15, 2024

bikagit force-pushed the mlKeras branch from 238a4a7 to ee43d11 Compare November 5, 2024 17:36

fractalmanifold: Integrating Keras capabilities to OPM (9dabbe9)

bikagit force-pushed the mlKeras branch from ee43d11 to 9dabbe9 Compare November 5, 2024 17:44

bikagit requested a review from atgeirr November 15, 2024 14:13

atgeirr reviewed Jan 2, 2025

opm/ml/keras_model.cpp (outdated)

atgeirr requested changes Jan 2, 2025

fractalmanifold added 3 commits January 4, 2025 04:41

Adding changes requested for PR (d06f95f)

Adding changes requested for PR part 2, fixing typos (4f311e2)

Adding changes requested for PR Auto-reformatting (b6ba901)

bikagit requested a review from atgeirr January 6, 2025 06:15

fractalmanifold: Adding minor change in resizeI (d3fa551)

