Convert a folder with PyTorch Jupyter Notebooks #100

Open
GeorgeS2019 opened this issue Mar 1, 2023 · 3 comments

@GeorgeS2019

Currently I use PyToCs.Gui to convert a folder of PyTorch .py files to TorchSharp .cs files.

I wonder whether users here would have a use case for converting a folder that consists NOT of Python files BUT of Python Jupyter notebooks.

I could imagine a pre-parsing step in which the cells of each notebook are extracted and concatenated into a regular .py file, which PyToCs can then convert into C# code.
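A minimal sketch of such a pre-parsing step, assuming the notebooks use the standard nbformat 4 layout (a top-level "cells" list whose entries carry "cell_type" and "source"). The function names `notebook_to_source` and `export_folder` are made up for illustration and are not part of PyToCs; invoking PyToCs on the resulting files is left out.

```python
import json
from pathlib import Path
from typing import List


def notebook_to_source(nb_path: Path) -> str:
    """Concatenate the source of every code cell in one .ipynb file."""
    nb = json.loads(nb_path.read_text(encoding="utf-8"))
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        src = cell.get("source", "")
        # nbformat stores "source" either as one string or as a list of lines.
        chunks.append(src if isinstance(src, str) else "".join(src))
    return "\n\n".join(chunks) + "\n"


def export_folder(folder: Path) -> List[Path]:
    """Write name.py next to each name.ipynb and return the new paths."""
    written = []
    for nb_path in sorted(folder.rglob("*.ipynb")):
        if ".ipynb_checkpoints" in nb_path.parts:
            continue  # ignore Jupyter's autosave copies
        py_path = nb_path.with_suffix(".py")
        py_path.write_text(notebook_to_source(nb_path), encoding="utf-8")
        written.append(py_path)
    return written
```

Cells that contain IPython magics or shell escapes (lines starting with `%` or `!`) are not valid Python and would probably need to be filtered or commented out before conversion; this sketch does not handle them.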

Feedback appreciated.

@uxmal
Owner

uxmal commented Mar 1, 2023

I'm not that familiar with Jupyter Notebooks and their file formats, but the format seems to be JSON. It shouldn't be that hard to write a front-end to pytocs that parses the JSON, extracts all the Python cells, and then either saves those to a file or invokes pytocs directly.

The Unix philosophy would favor the first approach: a preprocessing tool that extracts the cells into a notebook.py file, whereupon pytocs is invoked on the result. This is probably the least invasive approach. It should be simple enough to write a small program that parses the JSON, locates the appropriate cells by inspecting their metadata, and emits them.
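As a rough sketch of the metadata inspection described above: the keys `polyglot_notebook.kernelName` and `dotnet_interactive.language` are the ones that appear in the polyglot notebook cells quoted later in this thread. Treating a code cell without either key as Python is an assumption (it matches plain Jupyter notebooks with a Python kernel), and the helper names are hypothetical.

```python
import json


def cell_language(cell, default="python"):
    """Best-effort language of a cell, read from polyglot notebook metadata.

    Assumption: cells without polyglot metadata are written in `default`.
    """
    meta = cell.get("metadata", {})
    kernel = meta.get("polyglot_notebook", {}).get("kernelName")
    language = meta.get("dotnet_interactive", {}).get("language")
    return kernel or language or default


def python_cells(notebook):
    """Yield the source text of each Python code cell, in notebook order."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code" and cell_language(cell) == "python":
            src = cell.get("source", "")
            # "source" may be a single string or a list of lines.
            yield src if isinstance(src, str) else "".join(src)


def emit(notebook_json: str) -> str:
    """Turn the JSON text of a notebook into the text of a notebook.py file."""
    return "\n\n".join(python_cells(json.loads(notebook_json))) + "\n"
```

Writing the result of `emit` to a file and then running pytocs on that file keeps the converter itself untouched.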

@GeorgeS2019
Author

GeorgeS2019 commented Mar 1, 2023

Below is PyToCs-converted code that works, with minimal changes needed to get it running.

@uxmal
@toolgood

Suggestions

Perhaps PyToCs.GUI could take each Jupyter notebook and, for each cell that contains Python code, e.g.

   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "dotnet_interactive": {
     "language": "python"
    },
    "polyglot_notebook": {
     "kernelName": "python"
    }

Insert a NEW CELL below the Python cell, containing the PyToCs OUTPUT for that cell's code.

=> This will work once .NET polyglot notebooks support mixing Python and C# kernels.
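The cell insertion suggested above could be sketched roughly as follows. The `convert` argument is a hypothetical stand-in for whatever runs PyToCs on a piece of Python source and returns C# text; how PyToCs would actually be invoked is not shown here. The metadata written into the new cell copies the csharp cell shown in this thread.

```python
import copy


def make_csharp_cell(csharp_source):
    """Build a polyglot notebook code cell that holds C# source."""
    return {
        "cell_type": "code",
        "execution_count": None,  # not executed yet
        "metadata": {
            "dotnet_interactive": {"language": "csharp"},
            "polyglot_notebook": {"kernelName": "csharp"},
        },
        "outputs": [],
        # Notebooks conventionally store source as a list of lines.
        "source": csharp_source.splitlines(keepends=True),
    }


def is_python_cell(cell):
    """True for code cells whose polyglot metadata names the python kernel."""
    meta = cell.get("metadata", {})
    kernel = meta.get("polyglot_notebook", {}).get("kernelName")
    return cell.get("cell_type") == "code" and kernel == "python"


def interleave(notebook, convert):
    """Return a copy of `notebook` with a C# cell after each Python cell.

    `convert` is a hypothetical callable: Python source in, C# source out.
    """
    result = copy.deepcopy(notebook)
    cells = []
    for cell in result.get("cells", []):
        cells.append(cell)
        if is_python_cell(cell):
            src = cell.get("source", "")
            text = src if isinstance(src, str) else "".join(src)
            cells.append(make_csharp_cell(convert(text)))
    result["cells"] = cells
    return result
```

Because the original notebook is deep-copied, the input stays unchanged and the interleaved version can be saved under a new file name with `json.dump`.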

Agile, iterative refinement of PyToCs accuracy.

With notebooks converted this way, it becomes easier to work collaboratively towards more context-accurate PyToCs conversion results.

public static class PyTorch {
    
    public static ndarray x;
    
    public static ndarray y;
    
    public static double a;
    
    public static double b;
    
    public static double c;
    
    public static double d;
    
    public static double learning_rate;
    
    public static void PyTorchRun() {
        // -*- coding: utf-8 -*-
        // Create random input and output data

        double retstep = 2000;

        x = np.linspace(- (long)System.Math.PI, (long)System.Math.PI, ref retstep);
        y = np.sin(x);
        // Randomly initialize weights
        a = new np.random().randn(); 
        b = new np.random().randn(); 
        c = new np.random().randn(); 
        d = new np.random().randn(); 

        learning_rate = 1E-06;
        foreach (var t in Enumerable.Range(0, 2000)) {
            // Forward pass: compute predicted y
            // y = a + b x + c x^2 + d x^3
            var y_pred = a + b * x + c * Math.Pow((double)x, 2) + d * Math.Pow((double)x, 3);
            // Compute and print loss
            var loss = np.square(y_pred - y).Sum();
            if (t % 100 == 99) {
                Console.WriteLine(t.ToString(), loss);
            }
            // Backprop to compute gradients of a, b, c, d with respect to loss
            var grad_y_pred = 2.0 * (y_pred - y);
            var grad_a = (double)grad_y_pred.Sum();
            var grad_b = (double)(grad_y_pred * x).Sum();
            var grad_c = (double)(grad_y_pred * Math.Pow((double)x, 2)).Sum();
            var grad_d = (double)(grad_y_pred * Math.Pow((double)x, 3)).Sum();
            // Update weights
            a -= learning_rate * grad_a;
            b -= learning_rate * grad_b;
            c -= learning_rate * grad_c;
            d -= learning_rate * grad_d;
        }
        Console.WriteLine($"Result: y = {a} + {b} x + {c} x^2 + {d} x^3");
    }
}
{
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "dotnet_interactive": {
     "language": "csharp"
    },
    "polyglot_notebook": {
     "kernelName": "csharp"
    }
   },
   "outputs": [],
   "source": [
    "public static class PyTorch {\n",
    "    \n",
    "    public static ndarray x;\n",
    "    \n",
    "    public static ndarray y;\n",
    "    \n",
    "    public static double a;\n",
    "    \n",
    "    public static double b;\n",
    "    \n",
    "    public static double c;\n",
    "    \n",
    "    public static double d;\n",
    "    \n",
    "    public static double learning_rate;\n",
    "    \n",
    "    public static void PyTorchRun() {\n",
    "        // -*- coding: utf-8 -*-\n",
    "        // Create random input and output data\n",
    "\n",
    "        double retstep = 2000;\n",
    "\n",
    "        x = np.linspace(- (long)System.Math.PI, (long)System.Math.PI, ref retstep);\n",
    "        y = np.sin(x);\n",
    "        // Randomly initialize weights\n",
    "        a = new np.random().randn(); \n",
    "        b = new np.random().randn(); \n",
    "        c = new np.random().randn(); \n",
    "        d = new np.random().randn(); \n",
    "\n",
    "        learning_rate = 1E-06;\n",
    "        foreach (var t in Enumerable.Range(0, 2000)) {\n",
    "            // Forward pass: compute predicted y\n",
    "            // y = a + b x + c x^2 + d x^3\n",
    "            var y_pred = a + b * x + c * Math.Pow((double)x, 2) + d * Math.Pow((double)x, 3);\n",
    "            // Compute and print loss\n",
    "            var loss = np.square(y_pred - y).Sum();\n",
    "            if (t % 100 == 99) {\n",
    "                Console.WriteLine(t.ToString(), loss);\n",
    "            }\n",
    "            // Backprop to compute gradients of a, b, c, d with respect to loss\n",
    "            var grad_y_pred = 2.0 * (y_pred - y);\n",
    "            var grad_a = (double)grad_y_pred.Sum();\n",
    "            var grad_b = (double)(grad_y_pred * x).Sum();\n",
    "            var grad_c = (double)(grad_y_pred * Math.Pow((double)x, 2)).Sum();\n",
    "            var grad_d = (double)(grad_y_pred * Math.Pow((double)x, 3)).Sum();\n",
    "            // Update weights\n",
    "            a -= learning_rate * grad_a;\n",
    "            b -= learning_rate * grad_b;\n",
    "            c -= learning_rate * grad_c;\n",
    "            d -= learning_rate * grad_d;\n",
    "        }\n",
    "        Console.WriteLine($\"Result: y = {a} + {b} x + {c} x^2 + {d} x^3\");\n",
    "    }\n",
    "}"
   ]
  }

@GeorgeS2019
Author

@claudiaregio

I think some of the users following this discussion could be good candidates for providing feedback on the Python integration being added to .NET polyglot notebooks.

We are actually in the process of testing our Python and R integration right now! If you're willing to sign an NDA and provide us some feedback, you can sign up here to get the VSIX and install instructions to try it out: https://forms.office.com/r/UQchfQSGa5
