diff --git a/notebooks/bibliography.bib b/notebooks/bibliography.bib index 4bd594a..306ebea 100644 --- a/notebooks/bibliography.bib +++ b/notebooks/bibliography.bib @@ -285,6 +285,13 @@ @article{hewing_learningbased_2020 langid = {english} } +@article{kia_lecture_2019, + title = {Lecture Notes for {{MAE}} 274 ({{Optimal Control}}) {{Mechanical}} and {{Aerospace Eng}}. {{Dept}}., {{University}} of {{California Irvine}}}, + author = {Kia, Solmaz}, + date = {2019-04}, + langid = {english} +} + @article{korda_linear_2018, title = {Linear Predictors for Nonlinear Dynamical Systems: {{Koopman}} Operator Meets Model Predictive Control}, shorttitle = {Linear Predictors for Nonlinear Dynamical Systems}, @@ -504,6 +511,13 @@ @online{tedrake_underactuated_2023 urldate = {2022-12-08} } +@article{tits_notes_2020, + title = {Notes for {{ENEE}} 664: {{Optimal Control}}}, + author = {Tits, Andre L}, + date = {2020-08}, + langid = {english} +} + @article{tu_dynamic_2014, title = {On {{Dynamic Mode Decomposition}}: {{Theory}} and {{Applications}}}, shorttitle = {On {{Dynamic Mode Decomposition}}}, diff --git a/notebooks/nb_10_introduction.ipynb b/notebooks/nb_10_introduction.ipynb index 07e94e6..4b18930 100644 --- a/notebooks/nb_10_introduction.ipynb +++ b/notebooks/nb_10_introduction.ipynb @@ -2,10 +2,13 @@ "cells": [ { "cell_type": "code", - "execution_count": 5, + "execution_count": null, "metadata": { "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Default Scene": true, "slideshow": { "slide_type": "skip" @@ -23,9 +26,12 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "metadata": { "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Default Scene": true, "slideshow": { "slide_type": "skip" @@ -35,177 +41,7 @@ "ActiveScene" ] }, - "outputs": [ - { - "data": { - "text/html": [ - "" - ], - "text/plain": [ - "" - ] - }, - "execution_count": 6, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "%presentation_style" ] @@ -222,13 +58,11 @@ ] }, "source": [ - ":::{figure} ./_static/images/aai-institute-cover.png\n", + "```{figure} ./_static/images/aai-institute-cover.png\n", ":width: 90%\n", ":align: center\n", - "---\n", - "name: aai-institute\n", - "---\n", - ":::" + ":name: aai-institute\n", + "```" ] }, { @@ -301,7 +135,6 @@ "cell_type": "markdown", "metadata": { "editable": true, - "jp-MarkdownHeadingCollapsed": true, "slideshow": { "slide_type": "slide" } @@ -319,12 +152,13 @@ } }, "source": [ - ":::{figure} _static/images/10_control_theory_map.png\n", + "```{figure} _static/images/10_control_theory_map.png\n", + ":width: 90%\n", "---\n", "name: map-control-theory\n", "---\n", "[The Map of Control Theory](https://engineeringmedia.com/map-of-control)\n", - ":::" + "```" ] }, { @@ -379,7 +213,7 @@ } }, "source": [ - "# Control Systems Classification\n", + "## Control Systems Classification\n", "\n", "There are two types of control loops:\n", "\n", @@ -416,7 +250,7 @@ } }, "source": [ - "# Controller Design\n", + "## Controller Design\n", "\n", "1. Problem Formulation\n", "2. Modeling\n", @@ -513,22 +347,9 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "%%html\n", "" @@ -591,19 +412,19 @@ "cell_type": "markdown", "metadata": {}, "source": [ - ":::{figure} _static/images/10_feedback_block_diagram.svg\n", + "```{figure} _static/images/10_feedback_block_diagram.svg\n", "---\n", "name: feedback-control-block\n", "---\n", "Feedback Control in Control Engineering\n", - ":::\n", + "```\n", "\n", - ":::{figure} _static/images/10_reinforcement_learning_block_diagram.svg\n", + "```{figure} _static/images/10_reinforcement_learning_block_diagram.svg\n", "---\n", "name: rl-block\n", "---\n", "Feedback Control in Reinforcement Learning\n", - ":::" + "```" ] }, { @@ -612,11 +433,11 @@ "source": [ "The two are not mutually exclusive. For example, enforcing safety constraints when using reinforcement learning can be achieved by combining it with model-predictive control (MPC).\n", "\n", - ":::{figure} _static/images/70_safety_filter.svg\n", + "```{figure} _static/images/80_safety_filter.svg\n", ":width: 50%\n", "Based on the current state $x$, a learning-based controller provides an input\n", "$u_L = \\pi_L(x) \\in \\mathbb{R}^m$, which is processed by the safety filter $u = \\pi_S(x, u_S)$ and applied to the real system {cite}`hewing_learningbased_2020`.\n", - ":::" + "```" ] }, { @@ -684,7 +505,7 @@ } }, "source": [ - "### State-Transition Systems\n", + "## State-Transition Systems\n", "\n", "A state-transition system is a 3-tuple $\\Sigma = (S, U,\\gamma)$, where:\n", "\n", @@ -719,7 +540,7 @@ } }, "source": [ - "### Plans\n", + "## Plans\n", "\n", "- a plan $\\pi$ is a finite sequence of actions, $\\pi = \\{u_1, \\dots, u_n\\}$\n", "- a planning problem is a triple $P = (\\Sigma, s_0, S_g)$, where $\\Sigma$ is a\n", @@ -768,7 +589,7 @@ "Optimal control is one of the most useful systematic methods for controller design. It has several advantages:\n", "- It gives a systematic approach to the solution of control problems.\n", "- There are normally many possible solutions to a control problem. Some are good, others are poor.\n", - " Optimal control reduces this redundancy by selecting a controller that isoptimalbest according to some cost function.\n", + " Optimal control reduces this redundancy by selecting a controller that is optimal according to some cost function.\n", "\n", "The optimal control problem is to find a control $u^* \\in \\mathbf{U}$ which causes the system $\\dot{x}(t) = f(x(t), u(t))$ to follow a trajectory $x^* \\in \\mathbf{X}$ that minimizes the cost (performance measure)." ] @@ -782,7 +603,7 @@ } }, "source": [ - "### Continuous-time\n", + "## Continuous-time\n", "\n", "$$\n", "\\begin{array}{lll}\n", @@ -794,7 +615,7 @@ "\\end{array}\n", "$$\n", "\n", - "### Discrete-time\n", + "## Discrete-time\n", "\n", "$$\n", "\\begin{array}{lll}\n", @@ -811,7 +632,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Variants\n", + "## Variants\n", "\n", "There are many variants of the optimal control problem:\n", "\n", @@ -925,16 +746,15 @@ "source": [ "::::{exercise} RC-Circuit Exercise\n", ":label: rc-circuit-exercise\n", - ":width: 80%\n", + "\n", "\n", "Given the following RC circuit with an external voltage source:\n", "\n", - ":::{figure} _static/images/10_rc_circuit.svg\n", - "---\n", - "name: rc-circuit\n", - "---\n", + "```{figure} _static/images/10_rc_circuit.svg\n", + ":width: 60%\n", + ":name: rc-circuit\n", "Schematic created using [CircuitLab](https://www.circuitlab.com)\n", - ":::\n", + "```\n", "\n", "The differential equation governing the charge of capacitor is given by:\n", "\n", @@ -948,6 +768,51 @@ "\n", "- Suppose our only goal is to charge the capacitor as quickly as possible without worrying about anything else. What would be your choice for a cost function?\n", "- Suppose that we additionally want to limit the current running throught the circuit. What would then be your choice for a cost function?\n", + "\n", + ":::{tip} State-space representation of circuit (For the curious ones)\n", + ":class: dropdown\n", + "\n", + "If we wanted to represent the system in a state-space form, we could define the following state vector:\n", + "\n", + "$$\n", + "\\mathbf{X} = \\begin{bmatrix}\n", + "x_1 \\\\ x_2\n", + "\\end{bmatrix} \n", + "= \\begin{bmatrix}\n", + "y(t) \\\\ \\dot{y}(t)\n", + "\\end{bmatrix} \n", + "$$\n", + "\n", + "Taking the derivate of the state vector gives us:\n", + "\n", + "$$\n", + "\\dot{\\mathbf{X}} = \\begin{bmatrix}\n", + "\\dot{x}_1 \\\\ \\dot{x}_2\n", + "\\end{bmatrix} \n", + "= \\begin{bmatrix}\n", + "x_2 \\\\ \\frac{1}{R} u - \\frac{1}{RC} x_1 \n", + "\\end{bmatrix} \n", + "$$\n", + "\n", + "Which is a linear system with the following matrices:\n", + "\n", + "$$\n", + "\\dot{\\mathbf{X}} = \\begin{bmatrix}\n", + "\\dot{x}_1 \\\\ \\dot{x}_2\n", + "\\end{bmatrix} \n", + "= \\begin{bmatrix}\n", + "0 & 1 \\\\ - \\frac{1}{RC} & 0\n", + "\\end{bmatrix} \n", + "\\begin{bmatrix}\n", + "x_1 \\\\ x_2\n", + "\\end{bmatrix} \n", + "+\n", + "\\begin{bmatrix}\n", + "0 \\\\ \\frac{1}{R}\n", + "\\end{bmatrix} \n", + "u\n", + "$$\n", + ":::\n", "::::" ] }, @@ -957,8 +822,8 @@ "source": [ ":::{solution} rc-circuit-exercise\n", ":class: dropdown\n", - "- $J = - y(t)$\n", - "- $J = - y(t) + \\lambda i(t), \\quad \\lambda > 0$\n", + "- $c = - y(t)$\n", + "- $c = - y(t) + \\lambda i(t), \\quad \\lambda > 0$\n", ":::" ] } diff --git a/notebooks/nb_20_dynamic_programming.ipynb b/notebooks/nb_20_dynamic_programming.ipynb index 7b67c1e..85a3dbc 100644 --- a/notebooks/nb_20_dynamic_programming.ipynb +++ b/notebooks/nb_20_dynamic_programming.ipynb @@ -7,6 +7,9 @@ "editable": true, "hide_input": false, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -35,6 +38,9 @@ "metadata": { "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -53,6 +59,9 @@ "execution_count": null, "metadata": { "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "tags": [ "ActiveScene" @@ -71,6 +80,9 @@ "metadata": { "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -112,13 +124,13 @@ ] }, "source": [ - ":::{figure} ./_static/images/aai-institute-cover.png\n", + "```{figure} ./_static/images/aai-institute-cover.png\n", ":width: 90%\n", ":align: center\n", "---\n", "name: aai-institute\n", "---\n", - ":::" + "```" ] }, { @@ -129,7 +141,7 @@ } }, "source": [ - "## Dynamic Programming\n", + "# Dynamic Programming\n", "\n", "Dynamic programming (DP) is a method that in general solves optimization problems that involve making a sequence of decisions (multi-stage decision making problems) by determining, for each decision, subproblems that can be solved similarily, such that an optimal solution of the original problem can be found from optimal solutions of subproblems. This method is based on *Bellman’s Principle of Optimality*:\n", "\n", @@ -263,6 +275,10 @@ " $$\n", " V(\\mathbf{x}_k) = \\displaystyle \\min_{u \\in \\mathbf{U}} \\left[ c(\\mathbf{x}_k, \\mathbf{u}_k) + V(f(\\mathbf{x}_k, \\mathbf{u}_k)) \\right]\n", " $$\n", + "\n", + " $$\n", + " V(\\mathbf{x}_k) = \\displaystyle \\min_{u \\in \\mathbf{U}} \\left[ c(\\mathbf{x}_k, \\mathbf{u}_k) + V(\\mathbf{x}_{k+1}) \\right]\n", + " $$\n", " \n", "Once the values $V(\\mathbf{x}_0), \\dots , V(\\mathbf{x}_N)$ have been obtained, we can use a forward algorithm to construct an optimal control sequence $\\{u_0^*, \\dots, u_{N-1}^*\\}$ and corresponding state trajectory $\\{x_1^∗, \\dots, x_{N}^*\\}$ for the given initial state $x_0$.\n", "\n", @@ -380,7 +396,7 @@ "V(\\text{D}) &= \\min \\left[ c(\\text{D}, \\text{G}) + V({\\text{G}}), c(\\text{D}, \\text{F}) + V(\\text{F}) \\right]\n", "&= \\min \\left[ 8 + 0, 5 + 1 \\right] &= 6\n", "\\\\\n", - "V(\\text{C}) &= c(\\text{C}, \\text{F}) + V({\\text{F}}) &= 2 + 1 &= 1\n", + "V(\\text{C}) &= c(\\text{C}, \\text{F}) + V({\\text{F}}) &= 2 + 1 &= 3\n", "\\end{array}\n", "$$\n", "\n", @@ -448,21 +464,21 @@ "\n", "$$\n", "\\begin{array}{l}\n", - " \\textbf{Input}:\\ \\text{MDP}\\ M = \\langle S, s_0, U, c(s, u)\\rangle\\\\\n", + " \\textbf{Input}:\\ \\langle S, s_0, U, c(s, u)\\rangle\\\\\n", " \\textbf{Output}:\\ \\text{Value function}\\ V\\\\[2mm]\n", " \\text{Set}\\ V\\ \\text{to arbitrary value function; e.g., }\\ V(s) = 0\\ \\text{for all}\\ s\\\\[2mm]\n", " \\text{repeat}\\ \\\\\n", " \\quad\\quad \\Delta \\leftarrow 0 \\\\\n", " \\quad\\quad \\text{foreach}\\ s \\in S \\\\\n", " \\quad\\quad\\quad\\quad \\underbrace{V'(s) \\leftarrow \\min_{u \\in U(s)} \\left[c(s, u) + \\ V(s') \\right]}_{\\text{Bellman equation}} \\\\\n", - " \\quad\\quad\\quad\\quad \\Delta \\leftarrow \\min(\\Delta, |V'(s) - V(s)|) \\\\\n", + " \\quad\\quad\\quad\\quad \\Delta \\leftarrow \\max(\\Delta, |V'(s) - V(s)|) \\\\\n", " \\quad\\quad V \\leftarrow V' \\\\\n", " \\text{until}\\ \\Delta \\leq 0.0 \n", "\\end{array}\n", "$$\n", "\n", - ":::{note} Stochastic Value Iteration\n", - ":class:dropdown\n", + ":::{tip} Stochastic Value Iteration\n", + ":class: dropdown\n", "\n", "It can also be used in stochastic systems:\n", "\n", @@ -474,9 +490,9 @@ " \\text{repeat}\\ \\\\\n", " \\quad\\quad \\Delta \\leftarrow 0 \\\\\n", " \\quad\\quad \\text{foreach}\\ s \\in S \\\\\n", - " \\quad\\quad\\quad\\quad \\underbrace{V'(s) \\leftarrow \\min_{u \\in U(s)} \\sum_{s' \\in S} P_a(s' \\mid s)\\ [r(s,a,s') + \n", + " \\quad\\quad\\quad\\quad \\underbrace{V'(s) \\leftarrow \\min_{u \\in U(s)} \\sum_{s' \\in S} P_a(s' \\mid s)\\ [c(s,a,s') + \n", " \\gamma\\ V(s') ]}_{\\text{Bellman equation}} \\\\\n", - " \\quad\\quad\\quad\\quad \\Delta \\leftarrow \\min(\\Delta, |V'(s) - V(s)|) \\\\\n", + " \\quad\\quad\\quad\\quad \\Delta \\leftarrow \\max(\\Delta, |V'(s) - V(s)|) \\\\\n", " \\quad\\quad V \\leftarrow V' \\\\\n", " \\text{until}\\ \\Delta \\leq \\theta \n", "\\end{array}\n", @@ -497,9 +513,16 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "````{exercise-start} Grid World\n", + "### Exercise" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{exercise-start} Grid World\n", ":label: grid-world\n", - "````" + ":::" ] }, { @@ -576,6 +599,7 @@ "\n", ":::{tip} Hint 1\n", ":class: dropdown\n", + "\n", "Determine all possible paths first.\n", "\n", "You can use `plot_grid_all_paths_graph(G)` for that.\n", @@ -583,6 +607,7 @@ "\n", ":::{tip} Hint 2\n", ":class: dropdown\n", + "\n", "Compute the optimal cost-to-go at each node.\n", "\n", "You can use `dict(G.nodes(data=True))` to get a dictionary that maps the nodes to their attributes\n", @@ -594,7 +619,15 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "````{solution-start} grid-world\n", + ":::{exercise-end}\n", + ":::" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "````{solution} grid-world\n", "````" ] }, @@ -607,52 +640,162 @@ "# Your solution here" ] }, + { + "cell_type": "markdown", + "metadata": { + "editable": true, + "jp-MarkdownHeadingCollapsed": true, + "slideshow": { + "slide_type": "" + }, + "tags": [ + "remove-cell" + ] + }, + "source": [ + "#### Solution" + ] + }, { "cell_type": "markdown", "metadata": {}, "source": [ - ":::{solution} grid-world\n", + ":::{solution-start} grid-world\n", ":class: dropdown\n", - "\n", - "For this solution we first need to import some functions:\n", - "\n", - "```{code-cell}\n", + ":::" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For this solution we first need to import some functions:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "from training_ml_control.environments import (\n", " value_iteration,\n", " compute_best_path_and_actions_from_values,\n", ")\n", - "import networkx as nx\n", - "```\n", - "\n", - "To determine the number of paths from start node to target node we use:\n", - "\n", - "```{code-cell}\n", - "len(list(nx.all_simple_paths(G.to_undirected(), source=G.start_node, target=G.target_node, cutoff=len(G))))\n", - "```\n", - "\n", - "After that, to plot all paths from start to end we use:\n", - "\n", - "```{code-cell} python3\n", - "plot_grid_all_paths_graph(G)\n", - "```\n", - "\n", - "To compute the optimal cost-to-go we use:\n", - "\n", - "```{code-cell}\n", - "values = value_iteration(G)\n", - "```\n", - "\n", - "Once that's computed, we can determine the best path and correponding actions:\n", - "\n", - "```{code-cell}\n", - "best_path, actions = compute_best_path_and_actions_from_values(G, start_node=G.start_node, target_node=G.target_node, values=values)\n", + "import networkx as nx" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To determine the number of paths from start node to target node we use:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "len(\n", + " list(\n", + " nx.all_simple_paths(\n", + " G.to_undirected(), source=G.start_node, target=G.target_node, cutoff=len(G)\n", + " )\n", + " )\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After that, to plot all paths from start to end we use:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "plot_grid_all_paths_graph(G)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To compute the optimal cost-to-go we use:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "values = value_iteration(G)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once that's computed, we can determine the best path and correponding actions:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "best_path, actions = compute_best_path_and_actions_from_values(\n", + " G, start_node=G.start_node, target_node=G.target_node, values=values\n", + ")\n", "print(f\"{best_path=}\")\n", - "print(f\"{actions=}\")\n", - "```\n", - "\n", - "We finally apply the computed plan to the environment and check that it indeed works:\n", - "\n", - "```{code-cell} python3\n", + "print(f\"{actions=}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We finally apply the computed plan to the environment and check that it indeed works:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "env.reset()\n", "for action in actions:\n", " observation, _, terminated, truncated, _ = env.step(action)\n", @@ -661,9 +804,14 @@ " env.reset()\n", " break\n", "env.close()\n", - "show_video(frames, fps=3)\n", - "```\n", - "\n", + "show_video(frames, fps=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{solution-end}\n", ":::" ] }, diff --git a/notebooks/nb_30_systems.ipynb b/notebooks/nb_30_systems.ipynb index 3204a83..bdee839 100644 --- a/notebooks/nb_30_systems.ipynb +++ b/notebooks/nb_30_systems.ipynb @@ -7,6 +7,9 @@ "editable": true, "hide_input": false, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -35,6 +38,9 @@ "metadata": { "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -52,6 +58,9 @@ "cell_type": "code", "execution_count": null, "metadata": { + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "tags": [ "ActiveScene" @@ -70,6 +79,9 @@ "metadata": { "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -128,13 +140,13 @@ ] }, "source": [ - ":::{figure} ./_static/images/aai-institute-cover.png\n", + "```{figure} ./_static/images/aai-institute-cover.png\n", ":width: 90%\n", ":align: center\n", "---\n", "name: aai-institute\n", "---\n", - ":::" + "```" ] }, { @@ -352,12 +364,12 @@ "source": [ "## Inverted Pendulum\n", "\n", - ":::{figure} _static/images/10_inverted_pendulum_photo.png\n", + "```{figure} _static/images/10_inverted_pendulum_photo.png\n", "---\n", "name: inverted-pendulum\n", "---\n", "Balancing cart, a simple robotics system circa 1976 - [Wikipedia](https://en.wikipedia.org/wiki/Inverted_pendulum).\n", - ":::\n", + "```\n", "\n", "An inverted pendulum is a pendulum that has its center of mass above its pivot point. It is unstable and without additional help will fall over.\n", "\n", diff --git a/notebooks/nb_40_LQR.ipynb b/notebooks/nb_40_LQR.ipynb index f875554..9565f0a 100644 --- a/notebooks/nb_40_LQR.ipynb +++ b/notebooks/nb_40_LQR.ipynb @@ -7,6 +7,9 @@ "editable": true, "hide_input": false, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -35,6 +38,9 @@ "metadata": { "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -52,6 +58,9 @@ "cell_type": "code", "execution_count": null, "metadata": { + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "tags": [ "ActiveScene" @@ -68,8 +77,13 @@ "cell_type": "code", "execution_count": null, "metadata": { + "collapsed": true, "editable": true, "init_cell": true, + "jupyter": { + "outputs_hidden": true, + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -107,7 +121,7 @@ "from training_ml_control.control import (\n", " build_lqr_controller,\n", ")\n", - "from training_ml_control.model import (\n", + "from training_ml_control.models import (\n", " build_cart_model,\n", " build_inverted_pendulum_linear_model,\n", ")\n", @@ -133,13 +147,13 @@ ] }, "source": [ - ":::{figure} ./_static/images/aai-institute-cover.png\n", + "```{figure} ./_static/images/aai-institute-cover.png\n", ":width: 90%\n", ":align: center\n", "---\n", "name: aai-institute\n", "---\n", - ":::" + "```" ] }, { @@ -150,7 +164,7 @@ } }, "source": [ - "## Linear Quadratic Regulator\n", + "# Linear Quadratic Regulator\n", "\n", "While solving the optimal control problem (OCP) is very hard in general, there are a few very important special cases where the solutions are very accessible. Most of these involve variants on the case of linear dynamics and quadratic cost. The simplest case, called the linear quadratic regulator (LQR), is formulated as stabilizing a time-invariant linear system to the origin can be solved analytically." ] @@ -163,7 +177,7 @@ } }, "source": [ - "For a discrete-time linear invariant system (LTI) described by:\n", + "For a discrete-time linear time-invariant system (LTI) described by:\n", "\n", "$$\n", "\\mathbf{x}_{t+1} = A \\mathbf{x}_t + B \\mathbf{u}_t\n", @@ -192,6 +206,10 @@ "\\mathbf{u}_k = K_k \\mathbf{x}_k\n", "$$\n", "\n", + "$$\n", + "\\mathbf{u}_k = K_k (\\mathbf{x}_T - \\mathbf{x}_k)\n", + "$$\n", + "\n", "With: $K_k = -(R + B^T P_k B)^{-1} B^T P_k B$\n", "\n", "and $P_k$ is found by solving the discrete time dynamic Riccati equation:\n", @@ -270,7 +288,7 @@ } }, "source": [ - "### Cart" + "# Cart" ] }, { @@ -349,7 +367,7 @@ } }, "source": [ - "#### SImulation\n", + "## SImulation\n", "\n", "Now, we simulate the closed-loop system for 50 steps:" ] @@ -394,7 +412,7 @@ } }, "source": [ - "#### Evaluation\n", + "## Evaluation\n", "\n", "Finally, we evaluate the controller on the actual environment." ] @@ -453,6 +471,20 @@ ")" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Inverted Pendulum" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise" + ] + }, { "cell_type": "markdown", "metadata": { @@ -490,6 +522,15 @@ "# Your solution here" ] }, + { + "cell_type": "markdown", + "metadata": { + "jp-MarkdownHeadingCollapsed": true + }, + "source": [ + "## Solution" + ] + }, { "cell_type": "markdown", "metadata": { @@ -500,33 +541,77 @@ "solution2_first": true }, "source": [ - "::::{solution} lqr-controller\n", + "::::{solution-start} lqr-controller\n", ":class: dropdown\n", - "\n", - "We first need the environment, linear model and simulator:\n", - "\n", - ":::{code-cell} ipython3\n", + "::::" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We first need the environment, linear model and simulator:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "inverted_pendulum_env = create_inverted_pendulum_environment(max_steps=200)\n", "inverted_pendulum_model = build_inverted_pendulum_linear_model(inverted_pendulum_env)\n", "inverted_pendulum_simulator = Simulator(inverted_pendulum_model)\n", "inverted_pendulum_simulator.set_param(t_step=inverted_pendulum_env.dt)\n", - "inverted_pendulum_simulator.setup()\n", - ":::\n", - "\n", - "The goal is to keep the inverted pendulum upright. For that we define the following objective matrices and setpoint:\n", - "\n", - ":::{code-cell} ipython3\n", + "inverted_pendulum_simulator.setup()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The goal is to keep the inverted pendulum upright. For that we define the following objective matrices and setpoint:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "Q = np.diag([100, 1])\n", "R = np.diag([10])\n", "setpoint = np.zeros((2, 1))\n", "display_array(\"Q\", Q)\n", "display_array(\"R\", R)\n", - "display_array(\"Setpoint\", setpoint)\n", - ":::\n", - "\n", - "We then create the controller:\n", - "\n", - ":::{code-cell} ipython3\n", + "display_array(\"Setpoint\", setpoint)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We then create the controller:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "inverted_pendulum_lqr_controller = build_lqr_controller(\n", " model=inverted_pendulum_model,\n", " t_step=inverted_pendulum_env.dt,\n", @@ -534,12 +619,26 @@ " setpoint=setpoint,\n", " Q=Q,\n", " R=R,\n", - ")\n", - ":::\n", - "\n", - "**Simulation**\n", - "\n", - ":::{code-cell} ipython3\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Simulation**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "inverted_pendulum_lqr_controller.reset_history()\n", "inverted_pendulum_simulator.reset_history()\n", "x0 = np.zeros((2, 1))\n", @@ -548,35 +647,87 @@ "\n", "for k in range(50):\n", " u0 = inverted_pendulum_lqr_controller.make_step(x0)\n", - " x0 = inverted_pendulum_simulator.make_step(u0)\n", - "\n", - "animate_inverted_pendulum_simulation(inverted_pendulum_lqr_controller.data)\n", - ":::\n", - "\n", - "**Evaluation**\n", - "\n", - ":::{code-cell} ipython3\n", + " x0 = inverted_pendulum_simulator.make_step(u0)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "animate_inverted_pendulum_simulation(inverted_pendulum_lqr_controller.data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Evaluation**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "class LQRController:\n", - " def __init__(self, K: K) -> None:\n", + " def __init__(self, K: NDArray) -> None:\n", " self.K = K\n", "\n", " def act(self, observation: NDArray) -> NDArray:\n", - " return self.K @ observation[[2, 3]]\n", - ":::\n", - "\n", - ":::{code-cell} ipython3\n", + " return self.K @ observation[[2, 3]]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "lqr_controller = LQRController(inverted_pendulum_lqr_controller.K)\n", - "results = simulate_environment(inverted_pendulum_env, max_steps=200, controller=lqr_controller)\n", - "show_video(results.frames, fps=1 / inverted_pendulum_env.dt)\n", - ":::\n", - "\n", - ":::{code-cell} ipython3\n", + "results = simulate_environment(\n", + " inverted_pendulum_env, max_steps=200, controller=lqr_controller\n", + ")\n", + "show_video(results.frames, fps=1 / inverted_pendulum_env.dt)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "plt.close()\n", "T = np.arange(len(results.observations)) * inverted_pendulum_env.dt\n", - "plot_inverted_pendulum_results(T=T, observations=results.observations, actions=results.actions, reference=np.inf);\n", - ":::\n", - "\n", - "::::" + "plot_inverted_pendulum_results(\n", + " T=T, observations=results.observations, actions=results.actions, reference=np.inf\n", + ");" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{solution-end}\n", + ":::" ] }, { @@ -587,7 +738,7 @@ } }, "source": [ - "## Iterative Linear Quadratic Regulator\n", + "# Iterative Linear Quadratic Regulator\n", "\n", "Iterative Linear Quadratic Regulator (iLQR) is an extension of LQR control to non-linear system with non-linear quadratic costs.\n", "\n", diff --git a/notebooks/nb_50_MPC.ipynb b/notebooks/nb_50_MPC.ipynb index f7b33ba..2fa19fa 100644 --- a/notebooks/nb_50_MPC.ipynb +++ b/notebooks/nb_50_MPC.ipynb @@ -7,6 +7,9 @@ "editable": true, "hide_input": false, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -35,6 +38,9 @@ "metadata": { "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -53,6 +59,9 @@ "execution_count": null, "metadata": { "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "tags": [ "ActiveScene" @@ -69,8 +78,13 @@ "cell_type": "code", "execution_count": null, "metadata": { + "collapsed": true, "editable": true, "init_cell": true, + "jupyter": { + "outputs_hidden": true, + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -93,10 +107,6 @@ "from do_mpc.model import Model\n", "from numpy.typing import NDArray\n", "\n", - "from training_ml_control.control import (\n", - " build_mpc_controller,\n", - ")\n", - "\n", "from training_ml_control.plots import (\n", " plot_cart_results,\n", " plot_inverted_pendulum_results,\n", @@ -114,8 +124,9 @@ ")\n", "from training_ml_control.control import (\n", " build_lqr_controller,\n", + " build_mpc_controller,\n", ")\n", - "from training_ml_control.model import (\n", + "from training_ml_control.models import (\n", " build_cart_model,\n", " build_inverted_pendulum_linear_model,\n", " build_inverted_pendulum_nonlinear_model,\n", @@ -142,13 +153,13 @@ ] }, "source": [ - ":::{figure} ./_static/images/aai-institute-cover.png\n", + "```{figure} ./_static/images/aai-institute-cover.png\n", ":width: 90%\n", ":align: center\n", "---\n", "name: aai-institute\n", "---\n", - ":::" + "```" ] }, { @@ -319,6 +330,13 @@ "cart_simulator.setup()" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Controller" + ] + }, { "cell_type": "markdown", "metadata": { @@ -336,7 +354,7 @@ "metadata": {}, "outputs": [], "source": [ - "setpoint = np.array([cart_env.goal_position, 1.0])\n", + "setpoint = np.array([cart_env.goal_position, 10.0])\n", "distance_cost = 100 * casadi.norm_2(cart_model.x.cat - setpoint)\n", "terminal_cost = distance_cost\n", "stage_cost = distance_cost\n", @@ -366,7 +384,7 @@ }, "outputs": [], "source": [ - "force_penalty = 1e-2" + "force_penalty = 0.0" ] }, { @@ -402,7 +420,7 @@ "cart_mpc_controller = build_mpc_controller(\n", " model=cart_model,\n", " t_step=cart_env.dt,\n", - " n_horizon=10,\n", + " n_horizon=20,\n", " stage_cost=stage_cost,\n", " terminal_cost=terminal_cost,\n", " force_penalty=force_penalty,\n", @@ -419,7 +437,7 @@ } }, "source": [ - "#### Simulation\n", + "### Simulation\n", "\n", "We set the initial state and simulate the closed-loop for 100 steps." ] @@ -468,7 +486,7 @@ } }, "source": [ - "#### Evaluation\n", + "### Evaluation\n", "\n", "Finally, we evaluate the controller on the actual environment." ] @@ -539,6 +557,20 @@ ")" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Inverted Pendulum" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Exercise (Linear)" + ] + }, { "cell_type": "markdown", "metadata": { @@ -569,6 +601,14 @@ "::::" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{solution} inverted-pendulum-linear-mpc\n", + ":::" + ] + }, { "cell_type": "code", "execution_count": null, @@ -578,6 +618,22 @@ "# Your solution here" ] }, + { + "cell_type": "markdown", + "metadata": { + "editable": true, + "jp-MarkdownHeadingCollapsed": true, + "slideshow": { + "slide_type": "" + }, + "tags": [ + "remove-cell" + ] + }, + "source": [ + "#### Solution" + ] + }, { "cell_type": "markdown", "metadata": { @@ -588,73 +644,178 @@ "solution2_first": true }, "source": [ - "::::{solution} inverted-pendulum-linear-mpc\n", + ":::{solution-start} inverted-pendulum-linear-mpc\n", ":class: dropdown\n", - "\n", - "We first need the environment, linear model and simulator:\n", - "\n", - ":::{code-cell} ipython3\n", + ":::" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We first need the environment, linear model and simulator:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "inverted_pendulum_env = create_inverted_pendulum_environment(max_steps=200)\n", - "inverted_pendulum_lin_model = build_inverted_pendulum_linear_model(inverted_pendulum_env)\n", + "inverted_pendulum_lin_model = build_inverted_pendulum_linear_model(\n", + " inverted_pendulum_env\n", + ")\n", "inverted_pendulum_lin_simulator = Simulator(inverted_pendulum_lin_model)\n", "inverted_pendulum_lin_simulator.set_param(t_step=inverted_pendulum_env.dt)\n", - "inverted_pendulum_lin_simulator.setup()\n", - ":::\n", - "\n", - "The goal is to keep the inverted pendulum upright. For that we define the following costs, setpoint and force penalty:\n", - "\n", - ":::{code-cell} ipython3\n", + "inverted_pendulum_lin_simulator.setup()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The goal is to keep the inverted pendulum upright. For that we define the following costs, setpoint and force penalty:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "setpoint = np.zeros((2, 1))\n", - "distance_cost = casadi.bilin(np.diag([100, 1]), inverted_pendulum_lin.x.cat - setpoint)\n", + "distance_cost = casadi.bilin(\n", + " np.diag([100, 1]), inverted_pendulum_lin_model.x.cat - setpoint\n", + ")\n", "terminal_cost = distance_cost\n", "stage_cost = distance_cost\n", - "force_penalty = 1e-2\n", + "u_penalty = {\"force\": 1e-2}\n", "display_array(\"Setpoint\", setpoint)\n", "print(f\"{stage_cost=}\")\n", - "print(f\"{terminal_cost=}\")\n", - ":::\n", - "\n", - "We define as well upper and lower limits for the state and force\n", - "\n", - ":::{code-cell} ipython3\n", - "x_limits = np.array([-np.inf, np.inf])\n", - "u_limits = np.array([-inverted_pendulum_env.force_max, inverted_pendulum_env.force_max])\n", - ":::\n", - "\n", - "After that, we create the controller:\n", - "\n", - ":::{code-cell} ipython3\n", + "print(f\"{terminal_cost=}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We define as well upper and lower limits for the state and force" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "x_limits = {\"position\": np.array([-10, 10])}\n", + "u_limits = {\n", + " \"force\": np.array(\n", + " [-inverted_pendulum_env.force_max, inverted_pendulum_env.force_max]\n", + " )\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After that, we create the controller:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "inverted_pendulum_mpc_controller = build_mpc_controller(\n", " model=inverted_pendulum_lin_model,\n", " t_step=inverted_pendulum_env.dt,\n", " n_horizon=100,\n", " stage_cost=stage_cost,\n", " terminal_cost=terminal_cost,\n", - " force_penalty=force_penalty,\n", + " u_penalty=u_penalty,\n", " x_limits=x_limits,\n", " u_limits=u_limits,\n", - ")\n", - ":::\n", - "\n", - "#### Simulation\n", - " \n", - ":::{code-cell} ipython3\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Simulation**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "inverted_pendulum_mpc_controller.reset_history()\n", "inverted_pendulum_lin_simulator.reset_history()\n", "x0 = np.zeros((2, 1))\n", "x0[0] = 0.01\n", - "inverted_pendulum_simulator.x0 = x0\n", + "inverted_pendulum_lin_simulator.x0 = x0\n", "\n", "for k in range(50):\n", " u0 = inverted_pendulum_mpc_controller.make_step(x0)\n", - " x0 = inverted_pendulum_lin_simulator.make_step(u0)\n", - "\n", - "animate_inverted_pendulum_simulation(inverted_pendulum_mpc_controller.data)\n", - ":::\n", - "\n", - "#### Evaluation\n", - "\n", - ":::{code-cell} ipython3\n", + " x0 = inverted_pendulum_lin_simulator.make_step(u0)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "animate_inverted_pendulum_simulation(inverted_pendulum_mpc_controller.data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Evaluation**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "class MPCController:\n", " def __init__(self, mpc: do_mpc.controller.MPC) -> None:\n", " self.mpc = mpc\n", @@ -663,22 +824,69 @@ " self.mpc.set_initial_guess()\n", "\n", " def act(self, observation: NDArray) -> NDArray:\n", - " return mpc.make_step(observation[[2, 3]].reshape(-1, 1)).ravel()\n", - ":::\n", - "\n", - ":::{code-cell} ipython3\n", + " return self.mpc.make_step(observation[[2, 3]].reshape(-1, 1)).ravel()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "%%capture\n", "mpc_controller = MPCController(inverted_pendulum_mpc_controller)\n", - "results = simulate_environment(inverted_pendulum_mpc_controller, max_steps=200, controller=mpc_controller)\n", - "show_video(results.frames, fps=1 / inverted_pendulum_env.dt)\n", - ":::\n", - "\n", - ":::{code-cell} ipython3\n", + "results = simulate_environment(\n", + " inverted_pendulum_env, max_steps=200, controller=mpc_controller\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "show_video(results.frames, fps=1 / inverted_pendulum_env.dt)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "plt.close()\n", "T = np.arange(len(results.observations)) * inverted_pendulum_env.dt\n", - "plot_inverted_pendulum_results(T=T, observations=results.observations, actions=results.actions, reference=np.inf);\n", - ":::\n", - "\n", - "::::" + "plot_inverted_pendulum_results(\n", + " T=T, observations=results.observations, actions=results.actions, reference=np.inf\n", + ");" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{solution-end}\n", + ":::" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Exercise (Non-linear)" ] }, { @@ -701,7 +909,7 @@ ":::{hint}\n", "Use the following to create the environment with initial angle set to -np.pi and cutoff angle to np.inf for second > case.\n", "\n", - "```{code-cell} ipython3\n", + "```{code}\n", "env = create_inverted_pendulum_environment(theta_threshold=np.inf, initial_angle=-np.pi)\n", "```\n", ":::\n", @@ -730,46 +938,115 @@ "# Your solution here" ] }, + { + "cell_type": "markdown", + "metadata": { + "jp-MarkdownHeadingCollapsed": true + }, + "source": [ + "#### Solution" + ] + }, { "cell_type": "markdown", "metadata": {}, "source": [ - "::::{solution} non-linear-inverted-pendulum-mpc\n", + ":::{solution-start} non-linear-inverted-pendulum-mpc\n", ":class: dropdown\n", - "\n", - "#### Fast Rotation\n", - "\n", - "We first need the environment, non-linear model and simulator:\n", - "\n", - ":::{code-cell} ipython3\n", - "inverted_pendulum_env = create_inverted_pendulum_environment(max_steps=200)\n", - "inverted_pendulum_nonlin_model = build_inverted_pendulum_nonlinear_model(inverted_pendulum_env)\n", + ":::" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Fast Rotation**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We first need the environment, non-linear model and simulator:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "inverted_pendulum_env = create_inverted_pendulum_environment(\n", + " max_steps=200, theta_threshold=np.inf\n", + ")\n", + "inverted_pendulum_nonlin_model = build_inverted_pendulum_nonlinear_model(\n", + " inverted_pendulum_env\n", + ")\n", "inverted_pendulum_nonlin_simulator = Simulator(inverted_pendulum_nonlin_model)\n", "inverted_pendulum_nonlin_simulator.set_param(t_step=inverted_pendulum_env.dt)\n", - "inverted_pendulum_nonlin_simulator.setup()\n", - ":::\n", - "\n", - "The goal is to keep the inverted pendulum upright. For that we define the following costs and force penalty:\n", - "\n", - ":::{code-cell} ipython3\n", - "rotation_cost = - 1000 * inverted_pendulum_nonlin_model.x[3]\n", + "inverted_pendulum_nonlin_simulator.setup()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The goal is to keep the inverted pendulum upright. For that we define the following costs and force penalty:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "rotation_cost = -1000 * inverted_pendulum_nonlin_model.x[\"dtheta\"]\n", "terminal_cost = rotation_cost\n", "stage_cost = rotation_cost\n", "force_penalty = 0.0\n", "print(f\"{stage_cost=}\")\n", - "print(f\"{terminal_cost=}\")\n", - ":::\n", - "\n", - "We define as well upper and lower limits for the state and force\n", - "\n", - ":::{code-cell} ipython3\n", - "x_limits = np.array([-np.inf, np.inf])\n", - "u_limits = np.array([-inverted_pendulum_env.force_max, inverted_pendulum_env.force_max])\n", - ":::\n", - "\n", - "After that, we create the controller:\n", - "\n", - ":::{code-cell} ipython3\n", + "print(f\"{terminal_cost=}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We define as well upper and lower limits for the force:" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "u_limits = {\"force\": np.array([-inverted_pendulum_env.force_max, inverted_pendulum_env.force_max])}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After that, we create the controller:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "inverted_pendulum_mpc_controller = build_mpc_controller(\n", " model=inverted_pendulum_nonlin_model,\n", " t_step=inverted_pendulum_env.dt,\n", @@ -777,54 +1054,133 @@ " stage_cost=stage_cost,\n", " terminal_cost=terminal_cost,\n", " force_penalty=force_penalty,\n", - " x_limits=x_limits,\n", " u_limits=u_limits,\n", - ")\n", - ":::\n", - "\n", - "#### Simulation\n", - " \n", - ":::{code-cell} ipython3\n", - "inverted_pendulum_mpc_controller.reset_history()\n", - "inverted_pendulum_nonlin_simulator.reset_history()\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Simulation**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "%%capture\n", "x0 = np.zeros((4, 1))\n", "x0[2] = 0.01\n", + "inverted_pendulum_mpc_controller.reset_history()\n", + "inverted_pendulum_nonlin_simulator.reset_history()\n", "inverted_pendulum_nonlin_simulator.x0 = x0\n", "\n", "for k in range(50):\n", " u0 = inverted_pendulum_mpc_controller.make_step(x0)\n", - " x0 = inverted_pendulum_nonlin_simulator.make_step(u0)\n", - "\n", - "animate_inverted_pendulum_simulation(inverted_pendulum_mpc_controller.data)\n", - ":::\n", - "\n", - "#### Evaluation\n", - "\n", - ":::{code-cell} ipython3\n", + " x0 = inverted_pendulum_nonlin_simulator.make_step(u0)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "animate_full_inverted_pendulum_simulation(inverted_pendulum_mpc_controller.data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Evaluation**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "class MPCController:\n", " def __init__(self, mpc: do_mpc.controller.MPC) -> None:\n", " self.mpc = mpc\n", " self.mpc.reset_history()\n", - " self.mpc.x0 = np.zeros(2)\n", + " self.mpc.x0 = np.zeros(4)\n", " self.mpc.set_initial_guess()\n", "\n", " def act(self, observation: NDArray) -> NDArray:\n", - " return mpc.make_step(observation.reshape(-1, 1)).ravel()\n", - ":::\n", - "\n", - ":::{code-cell} ipython3\n", + " return self.mpc.make_step(observation.reshape(-1, 1)).ravel()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "%%capture\n", "mpc_controller = MPCController(inverted_pendulum_mpc_controller)\n", - "results = simulate_environment(inverted_pendulum_env, max_steps=200, controller=mpc_controller)\n", - "show_video(results.frames, fps=1 / inverted_pendulum_env.dt)\n", - ":::\n", - "\n", - ":::{code-cell} ipython3\n", + "results = simulate_environment(\n", + " inverted_pendulum_env, max_steps=200, controller=mpc_controller\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "show_video(results.frames, fps=1 / inverted_pendulum_env.dt)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ "plt.close()\n", "T = np.arange(len(results.observations)) * inverted_pendulum_env.dt\n", - "plot_inverted_pendulum_results(T=T, observations=results.observations, actions=results.actions, reference=np.inf);\n", - ":::\n", - "\n", - "::::" + "plot_inverted_pendulum_results(\n", + " T=T, observations=results.observations, actions=results.actions, reference=np.inf\n", + ");" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{solution-end}\n", + ":::" ] }, { @@ -853,7 +1209,7 @@ } }, "source": [ - "## Robust MPC\n", + "# Robust MPC\n", "\n", "Robust Model Predictive Control (RMPC) is related to a variety of methods designed to guarantee control performance using optimization algorithms while considering systems with uncertainties. It is concerned with the control of system that are only approximately known. Usually, it is assumed that the system lies in a set of possible systems and this set can be quantitatively characterized\n", "\n", @@ -932,7 +1288,7 @@ } }, "source": [ - "### Model" + "#### Model" ] }, { @@ -985,7 +1341,6 @@ "metadata": {}, "outputs": [], "source": [ - "x_limits = np.array([-np.inf, np.inf])\n", "u_limits = np.array([-inverted_pendulum_env.force_max, inverted_pendulum_env.force_max])" ] }, @@ -1002,7 +1357,6 @@ " stage_cost=stage_cost,\n", " terminal_cost=terminal_cost,\n", " force_penalty=force_penalty,\n", - " x_limits=x_limits,\n", " u_limits=u_limits,\n", " uncertainty_values=uncertainty_values,\n", ")" @@ -1016,7 +1370,7 @@ } }, "source": [ - "### Simulation" + "#### Evaluation" ] }, { diff --git a/notebooks/nb_60_MCTS.ipynb b/notebooks/nb_60_MCTS.ipynb index 37270a7..50e2e71 100644 --- a/notebooks/nb_60_MCTS.ipynb +++ b/notebooks/nb_60_MCTS.ipynb @@ -7,6 +7,9 @@ "editable": true, "hide_input": false, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -35,6 +38,9 @@ "metadata": { "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "slideshow": { "slide_type": "skip" @@ -52,6 +58,9 @@ "cell_type": "code", "execution_count": null, "metadata": { + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "tags": [ "ActiveScene" @@ -68,6 +77,9 @@ "cell_type": "code", "execution_count": null, "metadata": { + "jupyter": { + "source_hidden": true + }, "scene__Initialization": true, "tags": [ "ActiveScene" @@ -100,13 +112,13 @@ ] }, "source": [ - ":::{figure} ./_static/images/aai-institute-cover.png\n", + "```{figure} ./_static/images/aai-institute-cover.png\n", ":width: 90%\n", ":align: center\n", "---\n", "name: aai-institute\n", "---\n", - ":::" + "```" ] }, { @@ -246,9 +258,16 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "````{exercise-start} Grid World Again\n", - ":label: grid-world\n", - "````" + "## Exercise" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ":::{exercise-start} Grid World Again\n", + ":label: grid-world-again\n", + ":::" ] }, { @@ -284,30 +303,45 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We wish to travel to the goal cell in green. If the cost represents time then we want to find the shortest path to the goal.\n", + "We convert the graph to a directed graph with all possible paths from start to target that do not contain cycles" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "G = convert_graph_to_directed(G)\n", + "plot_grid_graph(G)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We wish for the car to travel from its starting cell in red to the target cell in green. If the cost represents time and each step has the same cost then we want to find the shortest path to the goal.\n", "\n", - "- Arrows (edges) indicate the possible movements. An action is associated with each edge.\n", + "- Arrows (edges) indicate the possible movements.\n", "- Numbers on edges indicate the cost of moving along an edge.\n", "\n", - "Use Dynamic Programming to solve this problem:\n", - "\n", - "- Compute the optimal cost-to-go for each state.\n", - "- Determine the optimal plan using the computed optimal cost-to-go.\n", - "- Execute the plan in the environment.\n", + "Use MCTS to solve this problem and then implement the plan in the environment.\n", "\n", ":::{tip} Hint 1\n", ":class: dropdown\n", + "\n", "Determine all possible paths first.\n", "\n", - "You can use `plot_grid_all_paths_graph(G)`.\n", + "You can use `plot_grid_all_paths_graph(G)` for that.\n", ":::\n", "\n", ":::{tip} Hint 2\n", ":class: dropdown\n", + "\n", "Compute the optimal cost-to-go at each node.\n", "\n", "You can use `dict(G.nodes(data=True))` to get a dictionary that maps the nodes to their attributes\n", - "and you can use `G.start_node` and `G.end_node` to access the start and end (i.e. goal) nodes, respectively.\n", + "and you can use `G.start_node` and `G.target_node` to access the start and end (i.e. goal) nodes, respectively.\n", ":::" ] }, @@ -315,15 +349,15 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "````{exercise-end}\n", - "````" + ":::{exercise-end}\n", + ":::" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "````{solution-start} grid-world\n", + "````{solution} grid-world-again\n", "````" ] }, @@ -335,16 +369,6 @@ "source": [ "# Your solution here" ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - ":::{solution} grid-world\n", - ":class: dropdown\n", - "Work in Progress\n", - ":::" - ] } ], "metadata": { diff --git a/notebooks/nb_70_machine_learning_control.ipynb b/notebooks/nb_70_machine_learning_control.ipynb index e57720b..449fe6b 100644 --- a/notebooks/nb_70_machine_learning_control.ipynb +++ b/notebooks/nb_70_machine_learning_control.ipynb @@ -7,6 +7,9 @@ "metadata": { "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Default Scene": true, "slideshow": { "slide_type": "skip" @@ -32,6 +35,9 @@ "id": "f7339d3d-eae4-4e05-a7fb-050df96782e7", "metadata": { "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Default Scene": true, "slideshow": { "slide_type": "skip" @@ -52,6 +58,9 @@ "id": "6e892149-098a-42c7-b554-bbb83918888d", "metadata": { "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Default Scene": true, "tags": [ "ActiveScene" @@ -69,7 +78,11 @@ "execution_count": null, "id": "cd78d373-34af-4b57-8d16-dbf6f662331d", "metadata": { + "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Default Scene": true, "slideshow": { "slide_type": "skip" @@ -90,6 +103,8 @@ "import pysindy as ps\n", "from scipy.signal import periodogram\n", "from scipy.fft import rfft, rfftfreq\n", + "from sklearn.metrics import mean_squared_error\n", + "from sklearn.model_selection import train_test_split\n", "\n", "from training_ml_control.control import (\n", " ConstantController,\n", @@ -123,13 +138,11 @@ ] }, "source": [ - ":::{figure} ./_static/images/aai-institute-cover.png\n", + "```{figure} ./_static/images/aai-institute-cover.png\n", ":width: 90%\n", ":align: center\n", - "---\n", - "name: aai-institute\n", - "---\n", - ":::" + ":name: aai-institute\n", + "```" ] }, { @@ -151,7 +164,7 @@ } }, "source": [ - "## Learning-Based Control" + "# Learning-Based Control" ] }, { @@ -178,7 +191,7 @@ } }, "source": [ - "## System Identification\n", + "# System Identification\n", "\n", "System Identification is the process of constructing a model of a (dynamical) system from observations (measurements) of its inputs and outputs. System identification also includes the optimal design of experiments for efficiently generating informative data for fitting such models as well as model reduction.\n", "\n", @@ -260,7 +273,12 @@ { "cell_type": "markdown", "id": "fb7dc38b-7042-4fdb-b6d9-afa4f53773a4", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ "### Cart" ] @@ -269,7 +287,12 @@ "cell_type": "code", "execution_count": null, "id": "fb805d78-a8c1-4dd6-9aa9-24acf6fa7c34", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "cart_env = create_cart_environment(max_steps=500, goal_position=20, max_position=30)" @@ -279,7 +302,12 @@ "cell_type": "code", "execution_count": null, "id": "0f0d34fb-9237-4150-8fdd-465358e510cd", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "cart_observations = {}\n", @@ -291,7 +319,7 @@ " np.asarray([cart_env.max_action / 2])\n", " ),\n", " \"Sinusoid @ 10Hz\": SineController(\n", - " cart_env, np.asarray([cart_env.max_action]), frequency=100\n", + " cart_env, np.asarray([cart_env.max_action]), frequency=10\n", " ),\n", " \"Sinusoid @ 0.5Hz\": SineController(\n", " cart_env, np.asarray([cart_env.max_action]), frequency=0.5\n", @@ -318,7 +346,12 @@ "cell_type": "code", "execution_count": null, "id": "cc1ea39d-5a37-49f2-bcca-d8d875ebcd36", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "fig, axes = plt.subplots(len(cart_actions), figsize=(12, 10))\n", @@ -336,7 +369,12 @@ "cell_type": "code", "execution_count": null, "id": "f2b4a5b6-ab5e-4d90-9c9e-fe4c6884badc", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "fig, axes = plt.subplots(1, 2, sharex=True)\n", @@ -355,7 +393,12 @@ { "cell_type": "markdown", "id": "cce69707-84b6-4fac-a296-6b60cdd6c2f8", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ "Let's look at the phase portrait[^*] instead.\n", "\n", @@ -366,7 +409,12 @@ "cell_type": "code", "execution_count": null, "id": "728e13ac-0b8e-4934-b1ab-bb83a442dfde", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "fig, ax = plt.subplots(1, 1, sharex=True)\n", @@ -383,7 +431,12 @@ { "cell_type": "markdown", "id": "96163380-18c3-49e0-81f9-e397c60e3fbc", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ "Let's visualize the Power Spectral Density (PSD) of the input and state signals" ] @@ -392,7 +445,12 @@ "cell_type": "code", "execution_count": null, "id": "e4af8b67-b339-45d6-bd78-aa15822c23f4", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "fig, axes = plt.subplots(1, 2, sharex=True)\n", @@ -409,12 +467,30 @@ "plt.show();" ] }, + { + "cell_type": "markdown", + "id": "a8318f8b-bee3-4715-9cbb-958444807a47", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "## Model Fitting" + ] + }, { "cell_type": "markdown", "id": "aefea19a-595f-428a-9358-1ae48cb91c4d", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ - "## Naive Deep Learning" + "### Naive Deep Learning" ] }, { @@ -445,7 +521,7 @@ "id": "f80e3b56-a968-46c9-810d-fca68eea3cce", "metadata": {}, "source": [ - "## Deep Learning with Nominal Model\n", + "### Deep Learning with Nominal Model\n", "\n", "A real-world dynamical system can be described as:\n", "\n", @@ -476,7 +552,8 @@ "source": [ ":::{figure} _static/images/60_neural_lander_diagram.png\n", ":width: 60%\n", - ":label: neural-lander-diagram\n", + ":name: neural-lander-diagram\n", + "\n", "Visual depiction of the ground effect, the complex aerodynamic effect between the drone and the ground, which is nonlinear, nonstationary, and very hard to model using standard system identification approaches. *Adapted from [Neural Control blog post](https://www.gshi.me/blog/NeuralControl/)*. \n", ":::" ] @@ -488,7 +565,8 @@ "source": [ ":::{figure} _static/images/60_neural_lander.gif\n", ":width: 60%\n", - ":label: neural-lander-gif\n", + ":name: neural-lander-gif\n", + "\n", "Comparison of drone landing with and without neural network modelling of residual ground effect. *Taken from [Neural Control blog post](https://www.gshi.me/blog/NeuralControl/)*. \n", ":::" ] @@ -509,7 +587,7 @@ "id": "7994e6e7-f2c7-426b-8f78-8d4897de313c", "metadata": {}, "source": [ - "## Sparse Identification of Nonlinear Dynamics (SINDy)\n", + "### Sparse Identification of Nonlinear Dynamics (SINDy)\n", "\n", "The SINDY algorithm identifies fully nonlinear dynamical systems from measurement data. This relies on the fact that many dynamical systems have relatively few terms in the right hand side of the governing equations:\n", "\n", @@ -597,7 +675,7 @@ "source": [ ":::{figure} _static/images/70_sindy_diagram.svg\n", ":width: 80%\n", - ":label: sindy-diagram\n", + ":name: sindy-diagram\n", "\n", "Schematic of the SINDy algorithm, demonstrated on the Lorenz equations. Data are collected from the system, including a time history of the states $\\mathbf{X}$ and derivatives $\\dot{\\mathbf{X}}$. Next, a library of nonlinear functions of the states, $\\Theta(\\mathbf{X})$, is constructed. This is then used to find the fewest terms needed to satisfy $\\dot{\\mathbf{X}} = \\Theta(\\mathbf{X})\\Xi$. The few entries in the vectors of $\\Xi$, solved for by sparse regression, denote the relevant\n", "terms in the right-hand side of the dynamics. *Taken from {cite}`brunton_discovering_2016`*\n", @@ -609,7 +687,7 @@ "id": "6390573f-cc23-4a2d-a6ea-b3a4dc6c63cc", "metadata": {}, "source": [ - "### SINDy with Control (SINDyc)\n", + "#### SINDy with Control (SINDyc)\n", "\n", "SINDYc generalizes the SINDY method to include inputs and control. In particular, it considers the nonlinear dynamical system with inputs $\\mathbf{u}$:\n", "\n", @@ -639,17 +717,16 @@ "id": "8bad92c6-e587-41a9-9370-11f3f0cc7329", "metadata": {}, "source": [ - "## Koopman Operator\n", + "### Koopman Operator\n", "\n", - ":::{figure} _static/images/60_koopman_operators_summary.svg\n", + "```{figure} _static/images/60_koopman_operators_summary.svg\n", ":width: 70%\n", ":align: center\n", - "---\n", - "name: Koopman Operators Summary\n", - "---\n", + ":alt: Koopman Operators Summary\n", + "\n", "Summary of the idea of Koopman operators. By lifting to a space of observables, we\n", "trade a nonlinear finite-dimensional system for a linear infinite-dimensional system. *Taken from {cite}`colbrook_multiverse_2023`*.\n", - ":::\n", + "```\n", "\n", "Given $\\mathcal{F}$ a space of functions $g: \\Omega \\rightarrow \\mathbb{C}$, and $\\Omega$ the state space of our dynamical system. The Koopman operator is defined on a suitable domain $\\mathcal{D}(\\mathcal{K}) \\subset \\mathcal{F}$ via the composition formula:\n", "\n", @@ -685,7 +762,7 @@ "id": "319c7c50-bb03-46ce-a268-2281cd9c1f0b", "metadata": {}, "source": [ - "## Dynamic Mode Decomposition (DMD)\n", + "### Dynamic Mode Decomposition (DMD)\n", "\n", "Dynamic Mode Decomposition (DMD) is a popular data-driven analysis technique used to decompose complex, nonlinear systems into a set of coherent structures (also called DMD modes), that grow, decay, and/or oscillate in time revealing underlying patterns and dynamics through spectral analysis. In other words, the DMD converts a dynamical system into a superposition of modes whose dynamics are governed by eigenvalues.\n", "\n", @@ -722,10 +799,10 @@ "\\underset{\\mathbf{K}_{\\text{DMD}} \\in \\mathbb{C}^{d\\times d} }{\\min} \\left\\lVert \\mathbf{Y} − \\mathbf{K}_{\\text{DMD}} \\mathbf{X} \\right\\rVert_F ,\n", "$$ (dmd-minimization)\n", "\n", - "where $\\left\\lVert . \\right\\rVert_F$ denotes the Frobenius norm[^*]. Similar optimization problems will be at the heart of the\n", + "where $\\left\\lVert . \\right\\rVert_F$ denotes the Frobenius norm[^frobenius]. Similar optimization problems will be at the heart of the\n", "various DMD-type algorithms we consider in this review. A solution to the problem in {eq}`dmd-minimization` is:\n", "\n", - "[^*]: The Frobenius norm of a matrix $\\mathbf{X} \\in \\mathbb{C}^{m\\times n}$ is defined as $\\left\\lVert \\mathbf{X} \\right\\rVert_F = \\sqrt{\\sum_{i=0}^{m} \\sum_{j=0}^{n} x_{i,j}}$\n", + "[^frobenius]: The Frobenius norm of a matrix $\\mathbf{X} \\in \\mathbb{C}^{m\\times n}$ is defined as $\\left\\lVert \\mathbf{X} \\right\\rVert_F = \\sqrt{\\sum_{i=0}^{m} \\sum_{j=0}^{n} x_{i,j}}$\n", "\n", "$$\n", "\\mathbf{K}_{\\text{DMD}} = \\mathbf{Y} \\mathbf{X}^{+} ∈ \\mathbb{C}^{d\\times d},\n", @@ -764,7 +841,7 @@ "id": "a349eefe-6ce2-4178-b76b-480e6f58936a", "metadata": {}, "source": [ - "### Variants\n", + "#### Variants\n", "\n", ":::{table} Summary of some DMD methods. *Adapted from {cite}`colbrook_multiverse_2023`*.\n", ":widths: auto\n", @@ -789,12 +866,12 @@ "id": "1e75e113-135f-4c73-bcb8-7083984f0e01", "metadata": {}, "source": [ - "### DMD with Control (DMDc)\n", + "#### DMD with Control (DMDc)\n", "\n", - ":::{figure} _static/images/70_dmdc_overview.svg\n", + "```{figure} _static/images/70_dmdc_overview.svg\n", ":width: 90%\n", "Overview of Dynamic Mode Decomposition with Control (DMDc). *Adapted from {cite}`proctor_dynamic_2016`*.\n", - ":::\n", + "```\n", "\n", "One of the most successful applications of the Koopman operator framework lies in control with demonstrated successes in various challenging appli-\n", "cations. These include fluid dynamics, robotics, power grids, biology, and chemical processes.\n", @@ -834,7 +911,7 @@ "id": "5341126d-f79f-40c6-9461-33e8d953b4d2", "metadata": {}, "source": [ - "## Comparison" + "### Comparison" ] }, { @@ -860,17 +937,27 @@ { "cell_type": "markdown", "id": "e2ec06ac-e9e8-4918-9ea8-8107a16f7104", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ "## Cart\n", "\n", - "We will use the DMDc method to fit a model on the data collected from the Cart environment. For that we will make sure of the [pykoopman](https://pykoopman.readthedocs.io/en/master/index.html) package." + "We will use the DMDc and the SINDYc methods to fit a model on the data collected from the Cart environment." ] }, { "cell_type": "markdown", "id": "b0213428-60d2-467a-ae31-c22e6dd14094", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ "### Data" ] @@ -879,7 +966,12 @@ "cell_type": "code", "execution_count": null, "id": "b8e8b5eb-62db-494c-95d1-1f1eda218655", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "cart_env = create_cart_environment(max_steps=500, goal_position=29, max_position=30)\n", @@ -918,28 +1010,34 @@ "cell_type": "code", "execution_count": null, "id": "78e0f204-0e80-4a8e-854e-1926ad6c3843", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "# We use the data from PRBS\n", "X = cart_observations[\"PRBS\"][:-1].copy()\n", "U = cart_actions[\"PRBS\"].copy()\n", "t = np.arange(0, len(X)) * cart_env.dt\n", - "# Train with first half of the data\n", - "train_size = len(X) // 2\n", - "X_train = X[:train_size]\n", - "U_train = U[:train_size]\n", - "t_train = t[:train_size]\n", - "# Test with rest of data\n", - "X_test = X[train_size:]\n", - "U_test = U[train_size:]\n", - "t_test = t[train_size:]" + "# Train with 80% of the data and test with 20%\n", + "test_size = 0.2\n", + "X_train, X_test, U_train, U_test, t_train, t_test = train_test_split(\n", + " X, U, t, test_size=test_size, shuffle=False\n", + ")" ] }, { "cell_type": "markdown", "id": "d3bb4af8-22b2-4857-98c4-81d9d978d2d8", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ "### SINDYc\n", "\n", @@ -948,98 +1046,203 @@ "Refer to its [introduction](https://pysindy.readthedocs.io/en/latest/examples/2_introduction_to_sindy/example.html) page for usage information." ] }, + { + "cell_type": "markdown", + "id": "4ffdd71c-3c18-452d-93ba-0e977dd2a188", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "We define an optimizer, a feature library and a differentiation method with appropriate hyper-parameters and then fit the model on the training data" + ] + }, { "cell_type": "code", "execution_count": null, "id": "af3d0377-fadd-4844-8217-a4da41d72d51", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ - "opt = ps.STLSQ(threshold=0.1, max_iter=100)\n", + "optimizer = ps.STLSQ(threshold=0.1, max_iter=100)\n", "feature_library = ps.IdentityLibrary()\n", "differentiation_method = ps.FiniteDifference(order=1)\n", "model = ps.SINDy(\n", - " optimizer=opt,\n", + " optimizer=optimizer,\n", " feature_library=feature_library,\n", " differentiation_method=differentiation_method,\n", ")\n", "model.fit(X_train, u=U_train, t=t_train)" ] }, + { + "cell_type": "markdown", + "id": "070bfc9f-97ad-4904-b7a8-ad3f4d586c30", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "Once that's done, we can print the fitted model" + ] + }, { "cell_type": "code", "execution_count": null, "id": "3f4878e9-df19-4541-93cc-e205d6fdc173", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "model.print()" ] }, + { + "cell_type": "markdown", + "id": "1be9978c-a37c-4a38-b336-314c0446fa81", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "We can also compute a score for the fitted model.\n", + "\n", + "By default this is computes the $R^2$ score[^R2], also called coefficient of determination, but we will instead using the Mean Squared Error metric.\n", + "\n", + "[^R2]: $R^2$ values range from 0 to 1. An R-Squared value of 0 means that the model explains or predicts 0% of the relationship between the dependent and independent variables. A value of 1 indicates that the model predicts 100% of the relationship." + ] + }, { "cell_type": "code", "execution_count": null, "id": "2f6a25d4-694a-4d3a-bf8a-10c3929692aa", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ - "print(\"Model score: %f\" % model.score(X_test, u=U_test, t=cart_env.dt))" + "print(\n", + " \"Model score: %f\"\n", + " % model.score(X_test, u=U_test, t=cart_env.dt, metric=mean_squared_error)\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "086d7ca1-42f3-4515-9a20-37ad1c470861", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "Let us now use the model to simulate the system's response with the test inputs and as initial state the first state in the test data" ] }, { "cell_type": "code", "execution_count": null, "id": "616e27df-8e26-4a16-b96b-211f8ed4646e", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ - "X_test_sim = model.simulate(X_test[0], t_test, u=U_test)" + "X_sindy = model.simulate(X_test[0], t_test, u=U_test)\n", + "X_sindy = np.vstack([X_test[0][np.newaxis, :], X_sindy])" ] }, { "cell_type": "code", "execution_count": null, "id": "05a43345-f83d-4a4c-9f96-1b202abab5a3", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "fig, axs = plt.subplots(1, X_test.shape[1], sharex=True)\n", "for i in range(X_test.shape[1]):\n", " axs[i].plot(t_test, X_test[:, i], \"k\", label=\"Measured\")\n", - " axs[i].plot(t_test[1:], X_test_sim[:, i], \"r--\", label=\"Model\")\n", + " axs[i].plot(t_test, X_sindy[:, i], \"r--\", label=\"Model\")\n", " axs[i].legend()\n", " axs[i].set(xlabel=\"t\", ylabel=\"$x_{}$\".format(i + 1))\n", "\n", - "fig = plt.figure()\n", - "ax1 = fig.add_subplot(1, 2, 1)\n", - "ax1.plot(X_test[:, 0], X_test[:, 1], \"k\")\n", - "ax1.set(xlabel=\"$x_1$\", ylabel=\"$x_2$\", title=\"true simulation\")\n", - "\n", - "ax2 = fig.add_subplot(1, 2, 2)\n", - "ax2.plot(X_test_sim[:, 0], X_test_sim[:, 1], \"r--\")\n", - "ax2.set(xlabel=\"$x_1$\", ylabel=\"$x_2$\", title=\"model simulation\")\n", - "\n", - "plt.tight_layout()\n", - "fig.show();" + "fig, ax = plt.subplots()\n", + "ax.plot(X_test[:, 0], X_test[:, 1], \"k\", label=\"Measured\")\n", + "ax.plot(X_sindy[:, 0], X_sindy[:, 1], \"r--\", label=\"Model\")\n", + "ax.legend()\n", + "fig.tight_layout()\n", + "plt.show();" ] }, { "cell_type": "markdown", "id": "82c5089b-49e7-49e6-8824-2fc1c71bd08b", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ "### DMDc\n", "\n", - "For the SINDYc method we use the [pykoopman](https://github.com/dynamicslab/pykoopman/) package.\n", + "For the DMDc method we use the [pykoopman](https://github.com/dynamicslab/pykoopman/) package.\n", "\n", "Refer to its [documentation](https://pykoopman.readthedocs.io/en/master/) page for usage information." ] }, + { + "cell_type": "markdown", + "id": "cad026b6-746d-49a7-8f4e-4524e703f756", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "We start by defining a regression method, DMDs in this case, with appropriate hyper-parameters.\n", + "\n", + "Afterwards we use that to create and then fit a model on the training data." + ] + }, { "cell_type": "code", "execution_count": null, "id": "4f0884d8-de8d-4129-a91a-ddc0342d6a52", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "DMDc = pk.regression.DMDc(svd_output_rank=4, svd_rank=6)\n", @@ -1050,7 +1253,12 @@ { "cell_type": "markdown", "id": "ef41acbf-7e9e-45b5-b366-0e041c51219d", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ "Once we fit the model we can access the linear state-space models matrices:" ] @@ -1059,7 +1267,12 @@ "cell_type": "code", "execution_count": null, "id": "c77d8a23-4f5a-4154-bbf9-446c274d893f", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "display_array(\"A\", model.A)\n", @@ -1068,91 +1281,156 @@ "display_array(\"W\", model.W)" ] }, + { + "cell_type": "markdown", + "id": "bd66461e-259d-4ade-a687-56f72ed0de93", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "These matrices are different from the ones that we used in our model previously because they are acting on measurements and not on the actual states." + ] + }, { "cell_type": "markdown", "id": "b488a1c9-cc85-44cc-a3fc-8d61aaef119a", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ - "After that we can use the model to simulate the system using the remaining data" + "After that we can use the model to simulate the system using the test data." ] }, { "cell_type": "code", "execution_count": null, "id": "ae184358-9169-4d5e-ae56-b96f162a884f", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ - "Xkoop = model.simulate(X_test[0], U_test, n_steps=X_test.shape[0] - 1)\n", - "Xkoop = np.vstack([X_test[0][np.newaxis, :], Xkoop])" + "X_dmd = model.simulate(X_test[0], U_test, n_steps=X_test.shape[0] - 1)\n", + "X_dmd = np.vstack([X_test[0][np.newaxis, :], X_dmd])" ] }, { - "cell_type": "code", - "execution_count": null, - "id": "b1342ac6-85b1-4d15-a95b-26a3ec0541d6", - "metadata": {}, - "outputs": [], - "source": [ - "fig, axs = plt.subplots(1, X_test.shape[1], sharex=True)\n", - "for i in range(X_test.shape[1]):\n", - " axs[i].plot(t_test, X_test[:, i], \"k\", label=\"Measured\")\n", - " axs[i].plot(t_test, Xkoop[:, i], \"r--\", label=\"Model\")\n", - " axs[i].legend()\n", - " axs[i].set(xlabel=\"t\", ylabel=\"$x_{}$\".format(i + 1))\n", - "\n", - "fig = plt.figure()\n", - "ax1 = fig.add_subplot(1, 2, 1)\n", - "ax1.plot(X_test[:, 0], X_test[:, 1], \"k\")\n", - "ax1.set(xlabel=\"$x_1$\", ylabel=\"$x_2$\", title=\"true simulation\")\n", - "\n", - "ax2 = fig.add_subplot(1, 2, 2)\n", - "ax2.plot(Xkoop[:, 0], Xkoop[:, 1], \"r--\")\n", - "ax2.set(xlabel=\"$x_1$\", ylabel=\"$x_2$\", title=\"model simulation\")\n", - "\n", - "plt.tight_layout()\n", - "fig.show();" + "cell_type": "markdown", + "id": "06d44fda-7c7d-49ce-a72f-a9b8dc4a81f6", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "We can also compute score similarily to what we did for SINDyc" ] }, { - "cell_type": "markdown", - "id": "b827634c-9cf7-47e9-bc54-393190770c2d", - "metadata": {}, + "cell_type": "code", + "execution_count": null, + "id": "cbc33e63-ec95-4d80-b3d8-bc373ac1e538", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], "source": [ - "#### Eigenvalues" + "print(\"Model score: %f\" % mean_squared_error(X_test, X_dmd))" ] }, { "cell_type": "code", "execution_count": null, - "id": "7739dae2-6e86-4d1c-8119-85463c2e2823", - "metadata": {}, + "id": "b1342ac6-85b1-4d15-a95b-26a3ec0541d6", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ - "eigval, eigvec = np.linalg.eig(model.A)\n", - "display_array(\"Eigenvalues\", eigval)\n", + "fig, axs = plt.subplots(1, X_test.shape[1], sharex=True)\n", + "for i in range(X_test.shape[1]):\n", + " axs[i].plot(t_test, X_test[:, i], \"k\", label=\"Measured\")\n", + " axs[i].plot(t_test, X_dmd[:, i], \"r--\", label=\"Model\")\n", + " axs[i].legend()\n", + " axs[i].set(xlabel=\"t\", ylabel=\"$x_{}$\".format(i + 1))\n", + "\n", "fig, ax = plt.subplots()\n", - "ax.plot(np.real(eigval), np.imag(eigval), \"o\", color=\"lightgrey\", label=\"DMDc\")\n", - "ax.set(title=\"Eigenvalues\")\n", - "ax.legend();" + "ax.plot(X_test[:, 0], X_test[:, 1], \"k\", label=\"Measured\")\n", + "ax.plot(X_dmd[:, 0], X_dmd[:, 1], \"r--\", label=\"Model\")\n", + "ax.set(xlabel=r\"$x_1$\", ylabel=r\"$x_2$\")\n", + "ax.legend()\n", + "fig.tight_layout()\n", + "plt.show();" + ] + }, + { + "cell_type": "markdown", + "id": "e06b7361-8f3c-43cc-807b-22e46c544abe", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "## Inverted Pendulum" + ] + }, + { + "cell_type": "markdown", + "id": "e7474239-a063-4b73-87cb-09287eab23cb", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "### Exercise" ] }, { "cell_type": "markdown", "id": "5d0f5a15-4544-4001-8a59-d280bc21f929", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ ":::{exercise} Inverted Pendulum\n", ":label: inverted-pendulum-data-model\n", "\n", - "Use the SINDYc or DMDc method to fit a model on the data collected from the inverted pendulum environment.\n", + "Use the SINDYc and/or DMDc method to fit a model on the data collected from the inverted pendulum environment.\n", ":::" ] }, { "cell_type": "markdown", "id": "391a6e1f-86c5-4511-b026-011f2385279e", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ ":::{solution} inverted-pendulum-data-model\n", ":::" @@ -1162,36 +1440,421 @@ "cell_type": "code", "execution_count": null, "id": "06b83c62-8f91-4006-bfc7-9ec691e4e995", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "outputs": [], "source": [ "# Your solution here" ] }, + { + "cell_type": "markdown", + "id": "c88f2aa8-9c4b-43d1-9880-748a74e9d3e2", + "metadata": { + "editable": true, + "jp-MarkdownHeadingCollapsed": true, + "slideshow": { + "slide_type": "" + }, + "tags": [ + "remove-cell" + ] + }, + "source": [ + "### Solution" + ] + }, { "cell_type": "markdown", "id": "3830d079-29d8-45ea-ad1b-c40d288ec7a7", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ - ":::{solution} inverted-pendulum-data-model\n", + ":::{solution-start} inverted-pendulum-data-model\n", ":class: dropdown\n", - "**Work in Progress**\n", - "\n", - "```{code-cell}\n", - "EDMDc = pk.regression.EDMDc()\n", - "centers = np.random.uniform(-1.5, 1.5, (4, 4))\n", - "RBF = pk.observables.RadialBasisFunction(\n", - " rbf_type=\"thinplate\",\n", - " n_centers=centers.shape[1],\n", + ":::" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4c246d5d-c0e1-426c-b769-1fca2cb60d18", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "inverted_pendulum_env = create_inverted_pendulum_environment(\n", + " max_steps=100, theta_threshold=np.inf\n", + ")\n", + "\n", + "inverted_pendulum_observations = dict()\n", + "inverted_pendulum_actions = dict()\n", + "controllers = dict()\n", + "\n", + "for i in range(5):\n", + " controllers[f\"PRBS {i}\"] = PRBSController(\n", + " np.asarray([(-1) ** i * inverted_pendulum_env.force_max / (2**i)]), seed=i\n", + " )\n", + "\n", + "for controller_name, controller in controllers.items():\n", + " result = simulate_environment(inverted_pendulum_env, controller=controller)\n", + " inverted_pendulum_observations[controller_name] = result.observations\n", + " inverted_pendulum_actions[controller_name] = result.actions" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "15f547aa-b7fb-41bc-9d76-e3f6ca5b760b", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(2, 2, sharex=True)\n", + "axes = axes.ravel()\n", + "for i, label in zip(range(4), [\"$x$\", r\"$\\dot{x}$\", r\"$\\theta$\", r\"$\\dot{\\theta}$\"]):\n", + " for j, (controller_name, observations) in enumerate(\n", + " inverted_pendulum_observations.items()\n", + " ):\n", + " t = np.arange(len(observations[:])) * cart_env.dt\n", + " axes[i].plot(t, observations[:, i], label=controller_name)\n", + " axes[i].set_xlabel(\"Time\")\n", + " axes[i].set_title(label)\n", + "plt.legend()\n", + "fig.tight_layout()\n", + "plt.show();" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "83211e32-c935-4c9f-b217-14cbed5fbcac", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "controller_name = \"PRBS 0\"\n", + "\n", + "X = inverted_pendulum_observations[controller_name][:-1].copy()\n", + "U = inverted_pendulum_actions[controller_name].copy()\n", + "t = np.arange(len(X)) * cart_env.dt\n", + "# Train with 80% of the data and test with 20%\n", + "test_size = 0.2\n", + "X_train, X_test, U_train, U_test, t_train, t_test = train_test_split(\n", + " X, U, t, test_size=test_size, shuffle=False\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "ae2bbd01-866f-4455-bc35-143f01fab3f4", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "**SINDYc**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "68ff6109-3080-41d0-937c-ad63a2d0da81", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "optimizer = ps.STLSQ(threshold=0.3, max_iter=200)\n", + "\n", + "feature_library = ps.IdentityLibrary()\n", + "\n", + "differentiation_method = ps.FiniteDifference(order=1)\n", + "model = ps.SINDy(\n", + " optimizer=optimizer,\n", + " feature_library=feature_library,\n", + " differentiation_method=differentiation_method,\n", + ")\n", + "model.fit(X_train, u=U_train, t=inverted_pendulum_env.dt)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a9fde755-05c7-4315-b674-941cf469bde9", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "model.print()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "472463d9-a095-479e-9c85-a4e8d5140744", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "print(\n", + " \"Model score: %f\"\n", + " % model.score(\n", + " X_test_list[test_index],\n", + " u=U_test_list[test_index],\n", + " t=cart_env.dt,\n", + " metric=mean_squared_error,\n", + " )\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "32705aa7-2778-46a9-ba77-63f3f6c12624", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "X_sindy = model.simulate(X_test[0], t_test, u=U_test)\n", + "X_sindy = np.vstack([X_test[0][np.newaxis, :], X_sindy])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "54fdd009-873b-45ab-a81b-b7b9e5440d23", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(2, X_test_list[test_index].shape[1] // 2, sharex=True)\n", + "axes = axes.ravel()\n", + "for i in range(X_test.shape[1]):\n", + " axes[i].plot(t_test, X_test[:, i], \"k\", label=\"Measured\")\n", + " axes[i].plot(t_test, X_sindy[:, i], \"r--\", label=\"Model\")\n", + " axes[i].set(xlabel=\"t\", title=\"$x_{}$\".format(i + 1))\n", + " axes[i].legend()\n", + "fig.tight_layout()\n", + "\n", + "fig, ax = plt.subplots()\n", + "ax.plot(X_test[:, 0], X_test[:, 2], \"k\", label=\"Measured\")\n", + "ax.plot(X_sindy[:, 0], X_sindy[:, 2], \"r--\", label=\"Model\")\n", + "ax.set(xlabel=\"$x_1$\", ylabel=\"$x_3$\")\n", + "ax.legend()\n", + "fig.tight_layout()\n", + "plt.show();" + ] + }, + { + "cell_type": "markdown", + "id": "8eaba678-2246-43d5-aa69-6470a144f768", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "**DMDc**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4c96f242-72de-4d71-a502-ebaeebd53bf3", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "regressor = pk.regression.EDMDc()\n", + "\n", + "# Compute centers from training data\n", + "rng = np.random.default_rng(seed=16)\n", + "centers = rng.random((X_train.shape[1], X_train.shape[0]))\n", + "# Scale to input features' range\n", + "x_max = np.max(X_train, axis=0)\n", + "x_min = np.min(X_train, axis=0)\n", + "centers = centers * (x_max[:, np.newaxis] - x_min[:, np.newaxis]) + x_max[:, np.newaxis]\n", + "\n", + "obsv = pk.observables.RadialBasisFunction(\n", + " rbf_type=\"polyharmonic\",\n", " centers=centers,\n", - " kernel_width=1,\n", - " polyharmonic_coeff=1.0,\n", + " n_centers=centers.shape[0],\n", + " polyharmonic_coeff=1.1,\n", " include_state=True,\n", ")\n", - "\n", - "model = pk.Koopman(observables=RBF, regressor=EDMDc)\n", - "model.fit(X, u=U, dt=env.dt)\n", - "```\n", + "model = pk.Koopman(observables=obsv, regressor=regressor)\n", + "model.fit(X_train, u=U_train, dt=cart_env.dt)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "11d47585-2e88-40be-8411-bbd537ae90f5", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "display_array(\"A\", model.A)\n", + "display_array(\"B\", model.B)\n", + "display_array(\"C\", model.C)\n", + "display_array(\"W\", model.W)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5d5a5804-e854-43c7-bcf4-9ea69d80f871", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "X_dmd = model.simulate(X_test[0], U_test, n_steps=X_test.shape[0] - 1)\n", + "X_dmd = np.vstack([X_test[0][np.newaxis, :], X_dmd])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "13dbe45f-6260-4e10-b2df-4f6d595ddd2c", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "print(\"Model score: %f\" % mean_squared_error(X_test, X_dmd))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "69731f40-ae57-4bf0-9e66-e430d7b84e32", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(2, X_test.shape[1] // 2, sharex=True)\n", + "axes = axes.ravel()\n", + "for i in range(X_test.shape[1]):\n", + " axes[i].plot(t_test, X_test[:, i], \"k\", label=\"Measured\")\n", + " axes[i].plot(t_test, X_dmd[:, i], \"r--\", label=\"Model\")\n", + " axes[i].set(xlabel=\"t\", title=\"$x_{}$\".format(i + 1))\n", + " axes[i].legend()\n", + "fig.tight_layout()\n", + "plt.show();" + ] + }, + { + "cell_type": "markdown", + "id": "a7bf847d-a810-4396-90fe-d62e71264b37", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + ":::{solution-end}\n", ":::" ] } diff --git a/notebooks/nb_80_safe_learning_control.ipynb b/notebooks/nb_80_safe_learning_control.ipynb index 0812f66..f2f6735 100644 --- a/notebooks/nb_80_safe_learning_control.ipynb +++ b/notebooks/nb_80_safe_learning_control.ipynb @@ -2,9 +2,12 @@ "cells": [ { "cell_type": "code", - "execution_count": 1, + "execution_count": null, "metadata": { "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Default Scene": true, "slideshow": { "slide_type": "skip" @@ -21,9 +24,12 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": null, "metadata": { "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Default Scene": true, "slideshow": { "slide_type": "skip" @@ -32,177 +38,7 @@ "ActiveScene" ] }, - "outputs": [ - { - "data": { - "text/html": [ - "" - ], - "text/plain": [ - "" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "%presentation_style" ] @@ -215,13 +51,11 @@ } }, "source": [ - ":::{figure} ./_static/images/aai-institute-cover.png\n", + "```{figure} ./_static/images/aai-institute-cover.png\n", ":width: 90%\n", ":align: center\n", - "---\n", - "name: aai-institute\n", - "---\n", - ":::" + ":alt: aai-institute\n", + "```" ] }, { @@ -279,6 +113,7 @@ "source": [ ":::{figure} _static/images/80_comparison_model_driven_data_driven.svg\n", ":width: 80%\n", + "\n", "A comparison of model-driven, data-driven, and combined approaches. *Taken from {cite}`brunke_safe_2022`*.\n", ":::" ] @@ -311,6 +146,7 @@ "source": [ ":::{figure} _static/images/80_safe_control_block_diagram.svg\n", ":width: 80%\n", + "\n", "Block diagram representing safe learning control approaches. *Taken from {cite}`brunke_safe_2022`*.\n", ":::" ] @@ -336,6 +172,7 @@ "source": [ ":::{figure} _static/images/80_safety_levels.svg\n", ":width: 100%\n", + "\n", "Illustration of Safety Levels. *Taken from {cite}`brunke_safe_2022`*.\n", ":::" ] @@ -504,6 +341,7 @@ "source": [ ":::{figure} _static/images/80_safety_filter.svg\n", ":width: 60%\n", + "\n", "Based on the current state $x$, a learning-based controller provides an input\n", "$u_L = \\pi_L(x) \\in \\mathbb{R}^m$, which is processed by the safety filter $u = \\pi_S(x, u_S)$ and applied to the real system. *Taken from {cite}`hewing_learningbased_2020`*.\n", ":::" @@ -560,6 +398,7 @@ "source": [ ":::{figure} _static/images/80_safe_learning_approaches.svg\n", ":width: 70%\n", + "\n", "Summary of safe learning control approaches. *Taken from {cite}`brunke_safe_2022`*.\n", ":::" ] diff --git a/notebooks/nb_90_practice.ipynb b/notebooks/nb_90_practice.ipynb index 949f536..a8d783d 100644 --- a/notebooks/nb_90_practice.ipynb +++ b/notebooks/nb_90_practice.ipynb @@ -2,11 +2,14 @@ "cells": [ { "cell_type": "code", - "execution_count": 1, + "execution_count": null, "id": "8497a4b2-a020-4fba-8539-bc1361a34023", "metadata": { "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Default Scene": true, "slideshow": { "slide_type": "skip" @@ -28,10 +31,13 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": null, "id": "f7339d3d-eae4-4e05-a7fb-050df96782e7", "metadata": { "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Default Scene": true, "slideshow": { "slide_type": "skip" @@ -41,188 +47,25 @@ "ActiveScene" ] }, - "outputs": [ - { - "data": { - "text/html": [ - "" - ], - "text/plain": [ - "" - ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "%presentation_style" ] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": null, "id": "6e892149-098a-42c7-b554-bbb83918888d", "metadata": { + "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Default Scene": true, + "slideshow": { + "slide_type": "" + }, "tags": [ "ActiveScene" ] @@ -231,15 +74,20 @@ "source": [ "import warnings\n", "\n", - "warnings.simplefilter(\"ignore\", UserWarning)" + "warnings.simplefilter(\"ignore\", UserWarning)\n", + "warnings.simplefilter(\"ignore\", FutureWarning)" ] }, { "cell_type": "code", - "execution_count": 4, + "execution_count": null, "id": "cd78d373-34af-4b57-8d16-dbf6f662331d", "metadata": { + "editable": true, "init_cell": true, + "jupyter": { + "source_hidden": true + }, "scene__Default Scene": true, "slideshow": { "slide_type": "skip" @@ -249,39 +97,43 @@ "ActiveScene" ] }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "pygame 2.5.2 (SDL 2.28.2, Python 3.10.12)\n", - "Hello from the pygame community. https://www.pygame.org/contribute.html\n" - ] - } - ], + "outputs": [], "source": [ "%autoreload\n", "\n", + "import casadi\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import seaborn as sns\n", "import pykoopman as pk\n", "import pysindy as ps\n", + "from do_mpc.simulator import Simulator\n", + "from sklearn.metrics import mean_squared_error\n", + "from scipy.signal import periodogram\n", + "from scipy.fft import rfft, rfftfreq\n", "\n", "from training_ml_control.control import (\n", " ConstantController,\n", " SineController,\n", " SumOfSineController,\n", + " PRBSController,\n", + " SchroederSweepController,\n", " RandomController,\n", ")\n", + "from training_ml_control.control import (\n", + " build_mpc_controller,\n", + ")\n", "from training_ml_control.environments import (\n", - " create_acrobot_environment,\n", + " create_pendulum_environment,\n", " simulate_environment,\n", ")\n", "from training_ml_control.nb_utils import show_video, display_array\n", + "from training_ml_control.plots import animate_pendulum_simulation\n", + "from training_ml_control.models import build_sindy_model\n", "\n", "sns.set_theme()\n", - "plt.rcParams[\"figure.figsize\"] = [12, 8]" + "plt.rcParams[\"figure.figsize\"] = [12, 8]\n", + "warnings.simplefilter(\"ignore\", ExperimentalWarning)" ] }, { @@ -297,19 +149,22 @@ ] }, "source": [ - ":::{figure} ./_static/images/aai-institute-cover.png\n", + "```{figure} ./_static/images/aai-institute-cover.png\n", ":width: 90%\n", ":align: center\n", - "---\n", - "name: aai-institute\n", - "---\n", - ":::" + ":name: aai-institute\n", + "```" ] }, { "cell_type": "markdown", "id": "77372629-782f-44b6-ad24-baedbca2fbc2", - "metadata": {}, + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, "source": [ "# Practice\n", "\n", @@ -323,45 +178,95 @@ "id": "c63180f2-e769-4c43-999e-5e46924c1fbb", "metadata": {}, "source": [ - ":::{exercise-start} Acrobot\n", - ":label: acrobot-exercise\n", + "We will be using the [Pendulum](https://gymnasium.farama.org/environments/classic_control/pendulum/) environment from [gymnasium](https://gymnasium.farama.org/).\n", "\n", - "We will be using a modified version of the [Acrobot](https://gymnasium.farama.org/environments/classic_control/acrobot/) environment from [gymnasium](https://gymnasium.farama.org/).\n", + "The system consists of a pendulum attached at one end to a fixed point, and the other end being free. The pendulum starts in a random position and we can apply torque to rotate the free end.\n", "\n", - "The system consists of two links connected linearly to form a chain, with one end of the chain fixed. The joint between the two links is actuated. \n", + "As seen below, the pendulum is represented in red and the joint is represented in black." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6fd6d0c6-1fd0-4b74-8205-330cfebb5609", + "metadata": {}, + "outputs": [], + "source": [ + "env = create_pendulum_environment()\n", + "result = simulate_environment(env)\n", + "show_video(result.frames, fps=env.metadata[\"render_fps\"])" + ] + }, + { + "cell_type": "markdown", + "id": "f8c9343e-b3fa-43d1-8551-3aa73abe2e08", + "metadata": {}, + "source": [ + "The environments allows the use of the following control (action):\n", "\n", - "As seen below, the links are represented in blue and are connected by two joints represented in green (or yellow?).\n", + "| Index | Action | Unit | Min | Max |\n", + "|-----|---------------------------------------|-----------|-----|-----|\n", + "| 0 | apply torque to the actuated joint | torque (N m) |-2 | 2 |\n", "\n", - "If you would like to use a mathematical model of the system, then please refer to [the following page](http://incompleteideas.net/book/first/ebook/node110.html) for the equations.\n", + "and the following measurements (observation):\n", + "\t\n", "\n", - "**First Goal**\n", + "| Index | Observation | Min | Max |\n", + "|-----|--------------------------------|---------------------|-------------------|\n", + "| 0 | $\\cos(\\theta)$ | $-1$ | $1$ |\n", + "| 1 | $\\sin(\\theta)$ | $-1$ | $1$ |\n", + "| 2 | $\\dot{\\theta}$ | $-8$ | $8$ |\n" + ] + }, + { + "cell_type": "markdown", + "id": "a2d7249e-fc1e-42e6-81eb-4a77775c9ac4", + "metadata": {}, + "source": [ + "## First Goal\n", "\n", - "The first goal is to apply torques on the actuated joint to swing the free end of the linear chain above a (configurable) target height (black horizontal line above system) while starting from the initial state of hanging downwards.\n", + "The first goal is to apply torques on the actuated joint to swing the pendulum into an upright position and keep it there.\n", "\n", - "**Second Goal**\n", + "## Second Goal\n", "\n", - "The second goal is to apply torques on the actuated joint to swing the free end of the linear chain as fast as possible while remaining below the target height.\n", - ":::" + "The second goal is to apply torques on the actuated joint to swing the pendulum as fast as possible." ] }, { - "cell_type": "code", - "execution_count": null, - "id": "20556f5c-61e5-46b4-9cf1-9b8097ec430e", + "cell_type": "markdown", + "id": "4f0a9489-11e2-4642-a291-393d9eb00245", "metadata": {}, - "outputs": [], "source": [ - "env = create_acrobot_environment()\n", - "result = simulate_environment(env)\n", - "show_video(result.frames, fps=env.metadata[\"render_fps\"])" + "# Exercise 1" ] }, { "cell_type": "markdown", "id": "1141f1a7-64d9-436b-bd08-45bc823232a6", + "metadata": { + "editable": true, + "slideshow": { + "slide_type": "" + } + }, + "source": [ + "::::{exercise} Pendulum Model\n", + ":label: pendulum-model-exercise\n", + "\n", + "Use one of the previously seen methods to learn a model of the system from data.\n", + "\n", + ":::{hint}\n", + "If you would like to use a mathematical model of the system either for the control or just as help when learning a model, then please refer to [the following page](https://en.wikipedia.org/wiki/Pendulum_(mechanics)) for the equations.\n", + ":::\n", + "::::" + ] + }, + { + "cell_type": "markdown", + "id": "febdaf0c-b8dc-412c-af17-0f685cd2d81e", "metadata": {}, "source": [ - ":::{exercise-end}\n", + ":::{solution} pendulum-model-exercise\n", ":::" ] }, @@ -375,17 +280,754 @@ "# Your solution here" ] }, + { + "cell_type": "markdown", + "id": "4be8b371-b823-4f74-ba3c-2b415632dfe4", + "metadata": { + "editable": true, + "jp-MarkdownHeadingCollapsed": true, + "slideshow": { + "slide_type": "" + }, + "tags": [ + "remove-cell" + ] + }, + "source": [ + "## Solution" + ] + }, { "cell_type": "markdown", "id": "d503889d-7d14-4d7b-9f55-78099eb5e442", "metadata": {}, "source": [ - ":::{solution} acrobot-exercise\n", + ":::{solution-start} pendulum-model-exercise\n", ":class: dropdown\n", "\n", "**Work in Progress**\n", ":::" ] + }, + { + "cell_type": "markdown", + "id": "ec1d822b-743b-4f2a-823d-0f4b6c59b3b5", + "metadata": {}, + "source": [ + "**Data**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "31a17c08-d300-4fe0-81df-d732c2a6ee44", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "env = create_pendulum_environment(max_steps=1000)\n", + "\n", + "observations = dict()\n", + "actions = dict()\n", + "frames = dict()\n", + "\n", + "controllers = {\n", + " \"Sinusoid\": SineController(env, np.asarray([0.5 * env.max_torque]), frequency=1),\n", + " \"Schroeder Sweep\": SchroederSweepController(\n", + " env,\n", + " n_time_steps=1000,\n", + " n_harmonics=10,\n", + " frequency=10,\n", + " ),\n", + " \"PRBS\": PRBSController(np.asarray([env.max_torque])),\n", + "}\n", + "\n", + "for controller_name, controller in controllers.items():\n", + " result = simulate_environment(env, controller=controller)\n", + " observations[controller_name] = result.observations\n", + " actions[controller_name] = result.actions\n", + " frames[controller_name] = result.frames" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f110a03b-c0a1-46bc-a34f-288d07e91dc3", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(2, 2, sharex=True)\n", + "axes = axes.ravel()\n", + "for i, label in enumerate([r\"$\\cos(\\theta)$\", r\"$\\sin(\\theta)$\", r\"$\\dot{\\theta}$\"]):\n", + " for j, (controller_name, obs) in enumerate(observations.items()):\n", + " t = np.arange(len(obs[:])) * env.dt\n", + " axes[i].plot(t, obs[:, i], label=controller_name)\n", + " axes[i].set_xlabel(\"Time\")\n", + " axes[i].set_title(label)\n", + " axes[i].legend()\n", + "fig.tight_layout()\n", + "fig.show();" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d1b7513e-5fca-4ff6-8415-717eca4ee8d6", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(2, 2, sharex=True)\n", + "axes = axes.ravel()\n", + "for i, label in enumerate([r\"$\\cos(\\theta)$\", r\"$\\sin(\\theta)$\", r\"$\\dot{\\theta}$\"]):\n", + " for j, (controller_name, obs) in enumerate(observations.items()):\n", + " f, Pxx_den = periodogram(obs[:, i], 1 / env.dt)\n", + " axes[i].semilogy(f, Pxx_den, label=controller_name)\n", + " axes[i].set_xlabel(\"frequency [Hz]\")\n", + " axes[i].set_ylabel(\"Power Spectral Density [V**2/Hz]\")\n", + " axes[i].set_title(label)\n", + " axes[i].legend()\n", + "fig.tight_layout()\n", + "plt.show();" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1449f263-edaf-4ad2-bc56-0e09af7e50ba", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "training_controller_name = \"PRBS\"\n", + "testing_controller_name = \"Sinusoid\"\n", + "\n", + "X_train = observations[training_controller_name][:-1].copy()\n", + "U_train = actions[training_controller_name].copy()\n", + "t_train = np.arange(0, len(X_train)) * env.dt\n", + "\n", + "X_test = observations[testing_controller_name][:-1].copy()\n", + "U_test = actions[testing_controller_name].copy()\n", + "t_test = np.arange(0, len(X_test)) * env.dt" + ] + }, + { + "cell_type": "markdown", + "id": "9aa6b096-eb99-4393-9494-8856129795a0", + "metadata": {}, + "source": [ + "**SINDYc**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6241990a-4daf-40c5-a7ca-00e7e6ec5dbc", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "optimizer = ps.STLSQ(threshold=0.9)\n", + "feature_library = ps.PolynomialLibrary(degree=2)\n", + "sindy_model = ps.SINDy(optimizer=optimizer, feature_library=feature_library)\n", + "sindy_model.fit(X_train, u=U_train, t=t_train)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9abd56c2-8d09-4477-8675-0830aa570553", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "sindy_model.print()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0e2f9af6-7e66-43b2-b37a-22652772110f", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "print(\n", + " \"Model score: %f\"\n", + " % sindy_model.score(X_test, u=U_test, t=env.dt, metric=mean_squared_error)\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4272a27f-d077-49f6-89bd-d01315a18a3e", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "scrolled": true, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "X_sindy = sindy_model.simulate(X_test[0], t_test, u=U_test)\n", + "X_sindy = np.vstack([X_test[0][np.newaxis, :], X_sindy])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "55a5a11d-7737-4084-af3a-c44c912133af", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(2, X_test.shape[1] // 2 + X_test.shape[1] % 2, sharex=True)\n", + "axes = axes.ravel()\n", + "for i in range(X_test.shape[1]):\n", + " axes[i].plot(t_test, X_test[:, i], \"k\", label=\"Measured\")\n", + " axes[i].plot(t_test, X_sindy[:, i], \"r--\", label=\"Model\")\n", + " axes[i].legend()\n", + " axes[i].set(xlabel=\"t\", ylabel=\"$x_{}$\".format(i + 1))\n", + "\n", + "fig.tight_layout()\n", + "fig.show();" + ] + }, + { + "cell_type": "markdown", + "id": "764fc80f-2d09-461e-b657-04601435e81b", + "metadata": {}, + "source": [ + "The results seem to be good, at least for up to a certain number of steps which should be good enough for our purpose. We could also use hyper-parameter optimization to find the best model. However, we have to be careful with overfitting." + ] + }, + { + "cell_type": "markdown", + "id": "640b876f-18ad-4552-80c3-9d5165687308", + "metadata": {}, + "source": [ + "**EDMDc**\n", + "\n", + "Let's also use DMD to fit a model in order to select the best model of the two methods." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6f7bb112-b7ed-459c-ad70-21007e1380f7", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "regressor = pk.regression.EDMDc()\n", + "observables = pk.observables.Polynomial(degree=2)\n", + "dmd_model = pk.Koopman(observables=observables, regressor=regressor)\n", + "dmd_model.fit(X_train, u=U_train, dt=env.dt)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6d8438ce-2e41-40ee-830e-f55bbd2b3a58", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "X_dmd = dmd_model.simulate(X_test[0], U_test, n_steps=X_test.shape[0] - 1)\n", + "X_dmd = np.vstack([X_test[0][np.newaxis, :], X_dmd])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f0f9f8b5-e79d-4537-9667-297f3ab7e3a9", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "print(\"Model score: %f\" % mean_squared_error(X_test, X_dmd))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4f2c9d85-a27e-4a28-a7f4-684f467676e3", + "metadata": { + "editable": true, + "jupyter": { + "source_hidden": true + }, + "slideshow": { + "slide_type": "" + } + }, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(2, X_test.shape[1] // 2 + X_test.shape[1] % 2, sharex=True)\n", + "axes = axes.ravel()\n", + "for i in range(X_test.shape[1]):\n", + " axes[i].plot(t_test, X_test[:, i], \"k\", label=\"Measured\")\n", + " axes[i].plot(t_test, X_dmd[:, i], \"r--\", label=\"Model\")\n", + " axes[i].legend()\n", + " axes[i].set(xlabel=\"t\", ylabel=\"$x_{}$\".format(i + 1))\n", + "\n", + "fig.tight_layout()\n", + "fig.show();" + ] + }, + { + "cell_type": "markdown", + "id": "249e7fb9-7d2e-4dee-bbf9-3fd329ec9a9f", + "metadata": {}, + "source": [ + "The results don't seem good enough. We could also use hyper-parameter optimization to find the best model. However, we have to be careful with overfitting." + ] + }, + { + "cell_type": "markdown", + "id": "c9a96be2-c516-48cb-8fb4-6cc09990d9ee", + "metadata": {}, + "source": [ + "We will use the sindy model for the second exercise." + ] + }, + { + "cell_type": "markdown", + "id": "988f7987-cb08-4067-a28f-9c263b4cc345", + "metadata": {}, + "source": [ + ":::{solution-end}\n", + ":::" + ] + }, + { + "cell_type": "markdown", + "id": "03590496-8456-4bee-a749-89ecef1fa891", + "metadata": {}, + "source": [ + "# Exercise 2" + ] + }, + { + "cell_type": "markdown", + "id": "6ff68598-62c6-4bfb-a1d7-eb45263b0b59", + "metadata": {}, + "source": [ + "::::{exercise} Pendulum Control\n", + ":label: pendulum-control-exercise\n", + "\n", + "Use the learned model and synthesize a controller to achieve the goals described above.\n", + "\n", + "::::" + ] + }, + { + "cell_type": "markdown", + "id": "b819b2c4-0633-4a85-88dc-31336ac876f1", + "metadata": {}, + "source": [ + ":::{solution} pendulum-control-exercise\n", + ":::" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "94bc22d1-5018-4dff-b1b1-935d546f23d5", + "metadata": {}, + "outputs": [], + "source": [ + "# Your solution here" + ] + }, + { + "cell_type": "markdown", + "id": "fe7b29e0-64c7-4383-b16f-c28dc32c76d8", + "metadata": { + "editable": true, + "jp-MarkdownHeadingCollapsed": true, + "slideshow": { + "slide_type": "" + }, + "tags": [ + "remove-cell" + ] + }, + "source": [ + "## Solution" + ] + }, + { + "cell_type": "markdown", + "id": "056c24b8-1a73-4555-a1bd-f5deeae860e6", + "metadata": {}, + "source": [ + ":::{solution-start} pendulum-control-exercise\n", + ":class: dropdown\n", + ":::" + ] + }, + { + "cell_type": "markdown", + "id": "64ee8611-1732-40de-ac31-25c5167acae5", + "metadata": {}, + "source": [ + "For this exercise, we will use the SINDYc model with along with an MPC controller to achieve our control objectives.\n", + "\n", + "For that we have to first convert the SINDYc model to a CasADi model in order to use it with do-mpc." + ] + }, + { + "cell_type": "markdown", + "id": "07c8e2d4-f0d6-46d1-9a60-93bf3d8efc8f", + "metadata": {}, + "source": [ + "**Model**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6b2243a9-1391-48e3-ba9d-855a1997795c", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "mpc_model = build_sindy_model(sindy_model)" + ] + }, + { + "cell_type": "markdown", + "id": "4f8b95d2-b704-4816-8dcb-06445d8fd299", + "metadata": {}, + "source": [ + "To make sure that our model is correct, we simulate the system using it" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b7f3ddc3-0735-4aab-b0bb-ebea13bb9ca4", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "simulator = Simulator(mpc_model)\n", + "params_simulator = {\n", + " \"integration_tool\": \"idas\",\n", + " \"abstol\": 1e-8,\n", + " \"reltol\": 1e-8,\n", + " \"t_step\": env.dt,\n", + "}\n", + "simulator.set_param(**params_simulator)\n", + "simulator.setup()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6fbc369c-4d04-4a99-ab52-f48b840d30c4", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "%%capture\n", + "x0 = X_test[0]\n", + "\n", + "simulator.reset_history()\n", + "simulator.x0 = x0\n", + "\n", + "for u in U_test:\n", + " simulator.make_step(u.reshape((-1, 1)))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dba337e7-fe82-4d66-9c5a-58274073e956", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "animate_pendulum_simulation(simulator.data)" + ] + }, + { + "cell_type": "markdown", + "id": "be622c5f-b67a-4289-811b-43127b042fe2", + "metadata": {}, + "source": [ + "**Controller**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b256a42a-cbfc-4a91-93f3-323caaf992c1", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "setpoint = np.array([1.0, 0.0, 0.0])\n", + "cost = casadi.norm_2(mpc_model.x.cat - setpoint) - 100 * mpc_model.x[\"x0\"]\n", + "\n", + "terminal_cost = cost\n", + "stage_cost = cost\n", + "print(f\"Stage Cost = {stage_cost}\")\n", + "print(f\"Terminal Cost = {terminal_cost}\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f3d05114-2831-4fcb-acaa-2ba3f2be3d59", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "u_limits = {\"u0\": np.array([-2, 2])}\n", + "u_penalty = {\"u0\": 0.00}\n", + "x_limits = {\"x0\": np.array([-1, 1]), \"x1\": np.array([-1, 1]), \"x2\": np.array([-8, 8])}" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a881458d-1bc9-4cf8-9143-ee37a4f13fee", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "mpc_controller = build_mpc_controller(\n", + " model=mpc_model,\n", + " t_step=env.dt,\n", + " n_horizon=50,\n", + " stage_cost=stage_cost,\n", + " terminal_cost=terminal_cost,\n", + " x_limits=x_limits,\n", + " u_penalty=u_penalty,\n", + " u_limits=u_limits,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "e0f2fb24-ec1b-4f87-aa37-908d220ff4e0", + "metadata": {}, + "source": [ + "**Simulation**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6c4d867c-1eb0-4859-9694-314103585c0d", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "%%capture\n", + "\n", + "mpc_controller.reset_history()\n", + "simulator.reset_history()\n", + "\n", + "x = np.zeros((3, 1))\n", + "# random angle\n", + "theta0 = np.random.uniform(low=-np.pi, high=np.pi)\n", + "# cosine and sine\n", + "x[0] = np.cos(theta0)\n", + "x[1] = np.sin(theta0)\n", + "# angular velocity\n", + "x[2] = np.random.uniform(low=-8, high=8)\n", + "\n", + "simulator.x0 = x\n", + "mpc_controller.x0 = x\n", + "mpc_controller.set_initial_guess()\n", + "\n", + "for k in range(100):\n", + " u = mpc_controller.make_step(x)\n", + " x = simulator.make_step(u)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e46bd3ed-8a22-48f2-beab-b63e8c69b3fe", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "animate_pendulum_simulation(mpc_controller.data)" + ] + }, + { + "cell_type": "markdown", + "id": "4936afad-5781-49eb-b755-4ff106e5c428", + "metadata": {}, + "source": [ + "**Environment**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a3d8f3d4-0a0d-425e-9d08-326e15de4c69", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "class MPCController:\n", + " def __init__(self, mpc: MPC) -> None:\n", + " self.mpc = mpc\n", + " self.mpc.reset_history()\n", + " x0 = np.zeros((3, 1))\n", + " # random angle\n", + " theta0 = np.random.uniform(low=-np.pi, high=np.pi)\n", + " # cosine and sine\n", + " x0[0] = np.cos(theta0)\n", + " x0[1] = np.sin(theta0)\n", + " # angular velocity\n", + " x0[2] = np.random.uniform(low=-8, high=8)\n", + " self.mpc.x0 = x0\n", + " self.mpc.set_initial_guess()\n", + "\n", + " def act(self, observation: NDArray) -> NDArray:\n", + " return self.mpc.make_step(observation.reshape(-1, 1)).ravel()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "adaab80b-75dd-48b4-8c02-e24316b9e1c4", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "%%capture\n", + "controller = MPCController(mpc_controller)\n", + "results = simulate_environment(env, max_steps=100, controller=controller)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fc0a22f0-c146-4b85-bfe4-665c3147ea9c", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "show_video(results.frames, fps=env.metadata[\"render_fps\"])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "391ff9a2-6b31-4597-b81a-26202ed62cde", + "metadata": { + "jupyter": { + "source_hidden": true + } + }, + "outputs": [], + "source": [ + "animate_pendulum_simulation(mpc_controller.data)" + ] + }, + { + "cell_type": "markdown", + "id": "9edc9cb5-4011-4faa-817c-8446b89b0010", + "metadata": {}, + "source": [ + ":::{solution-end}\n", + ":::" + ] } ], "metadata": {