From 083f391bfa1702bcdde029fa85e69561adef83b5 Mon Sep 17 00:00:00 2001
From: Marika-K <112631247+Marika-K@users.noreply.github.com>
Date: Mon, 1 Jul 2024 08:45:20 +0200
Subject: [PATCH] Line Optimization: add missing parameter documentation (#151)

* add missing parameter to docstring

* update documentation page

* remove default value parameter from function call

---------

Co-authored-by: Simon Bowly
---
 docs/source/mods/line-optimization.rst   | 171 +++++++++++++----------
 src/gurobi_optimods/line_optimization.py |  29 ++--
 2 files changed, 118 insertions(+), 82 deletions(-)

diff --git a/docs/source/mods/line-optimization.rst b/docs/source/mods/line-optimization.rst
index fc4155aa..a094f5ab 100644
--- a/docs/source/mods/line-optimization.rst
+++ b/docs/source/mods/line-optimization.rst
@@ -2,21 +2,22 @@ Line Optimization in Public Transport
 =====================================

 The line optimization problem in public transport is to choose a set of lines
-(routes in the transportation network) with associated frequencies (how often the
-lines are operated) such that a given transportation demand can be satisfied.
-This problem is an example of a classical network design problem.
-There are different approaches and models to solve the line optimization problem.
-A general overview on models and methods is given by Schoebel :footcite:p:`schoebel2012`.
-
-For this optimod we assume we are given a public transportation network:
-a set of stations and the direct links between them. We are given the
+(routes in the transportation network) with associated frequencies (how often
+the lines are operated) such that a given transportation demand can be
+satisfied. This problem is an example of a classical network design problem.
+There are different approaches and models to solve the line optimization
+problem. A general overview on models and methods is given by Schoebel
+:footcite:p:`schoebel2012`.
+
+For this optimod we assume we are given a public transportation network: a set
+of stations and the direct links between them. We are given the
 origin-destination (OD) *demand*, i.e., it is known how many passengers want to
-travel from one station to another in the network within the considered time horizon.
-We are also given a set of possible *lines*. A line is a path in a public
-transportation network. We call *line plan* a subset of these lines where each line is
-associated with a frequency. The optimod computes a line plan with
-minimum cost such that the capacity of the chosen lines is sufficient to transport
-all passengers.
+travel from one station to another in the network within the considered time
+horizon. We are also given a set of possible *lines*. A line is a path in a
+public transportation network. We call *line plan* a subset of these lines where
+each line is associated with a frequency. The optimod computes a line plan with
+minimum cost such that the capacity of the chosen lines is sufficient to
+transport all passengers.

 We provide two different strategies to find a line plan with minimum cost:

@@ -26,26 +27,27 @@ We provide two different strategies to find a line plan with minimum cost:
    allowed to route passengers along these paths.
 #. The second approach does not restrict the passenger paths, but instead
    applies a multi-objective approach. In the first objective the cost of the
-   line plan is minimized such that all passenger demand can be satisfied. In the
-   second objective the travel time for the passengers is minimized such that the
-   cost of the line plan increases by at most 20% over the minimum cost line plan.
+   line plan is minimized such that all passenger demand can be satisfied. In
+   the second objective the travel time for the passengers is minimized such
+   that the cost of the line plan increases by at most 20% over the minimum cost
+   line plan.


 Problem Specification
 ---------------------

 Let a graph :math:`G=(V,E)` represent the transportation network. The set of
-vertices :math:`V` represent the stations and the set of edges
-:math:`E` represent all possibilities to travel from one station to another without
-an intermediate station.
-A directed edge :math:`(u,v)\in E` has the attribute time :math:`\tau_{uv}\geq 0` that
-represents the amount of time needed to travel from :math:`u` to :math:`v`.
-For each pair of nodes :math:`u,v\in V` a demand :math:`d_{uv}\geq 0` is given.
-The demand represents the number of passengers who want to travel from :math:`u`
-to :math:`v` in the considered time horizon. Let :math:`D` be the set of all node pairs with
-positive demand. This set is also called OD pairs.
-Further given is a set of lines :math:`L`. A line :math:`l\in L` contains the stations it traverses
-in the given order. If a line runs in both directions, it needs to be repeated in reverse order.
+vertices :math:`V` represent the stations and the set of edges :math:`E`
+represent all possibilities to travel from one station to another without an
+intermediate station. A directed edge :math:`(u,v)\in E` has the attribute time
+:math:`\tau_{uv}\geq 0` that represents the amount of time needed to travel from
+:math:`u` to :math:`v`. For each pair of nodes :math:`u,v\in V` a demand
+:math:`d_{uv}\geq 0` is given. The demand represents the number of passengers
+who want to travel from :math:`u` to :math:`v` in the considered time horizon.
+Let :math:`D` be the set of all node pairs with positive demand. This set is
+also called OD pairs. Further given is a set of lines :math:`L`. A line
+:math:`l\in L` contains the stations it traverses in the given order. If a line
+runs in both directions, it needs to be repeated in reverse order.

 A line has the following additional attributes:

@@ -53,14 +55,14 @@ A line has the following additional attributes:
 - operating cost: :math:`c_{l}\geq 0` for operating the line once in the given time horizon
 - capacity: :math:`\kappa_{l}\geq 0` when operating the line :math:`l` once in the given time horizon

-Additionally, we are given a list of frequencies. The frequencies define the possible
-number of operations for the lines in the given time horizon.
-If a line :math:`l` is operated with frequency :math:`f` the overall cost for the line is
-:math:`C_{lf}=C_l + c_{lf}\cdot f` and the total capacity provided by the line is
-:math:`\kappa_{l,f}=\kappa_l\cdot f`.
+Additionally, we are given a list of frequencies. The frequencies define the
+possible number of operations for the lines in the given time horizon. If a line
+:math:`l` is operated with frequency :math:`f` the overall cost for the line is
+:math:`C_{lf}=C_l + c_{lf}\cdot f` and the total capacity provided by the line
+is :math:`\kappa_{l,f}=\kappa_l\cdot f`.

-The problem can be stated as finding a subset of lines and associated frequencies with minimal total cost
-such that all passengers can be transported.
+The problem can be stated as finding a subset of lines and associated
+frequencies with minimal total cost such that all passengers can be transported.

 .. dropdown:: Background: Optimization Model with Approach 1

@@ -72,8 +74,8 @@ such that all passengers can be transported.
     We have binary variables :math:`x_{l,f}` for each line frequency combination
     indicating if line :math:`l` is operated with frequency :math:`f`.

-    This Mod is implemented by formulating a Linear Program (LP) and solving it
-    using Gurobi. The formulation can be stated as follows:
+    This Mod is implemented by formulating a mixed integer linear program and
+    solving it using Gurobi. The formulation can be stated as follows:

     .. math::

@@ -97,11 +99,12 @@ such that all passengers can be transported.

 .. dropdown:: Background: Optimization Model with Approach 2

-    This Mod is implemented by formulating a Linear Program (LP) and solving it
-    using Gurobi. We have binary variables :math:`x_{l,f}` for each line frequency combination indicating if line :math:`l`
-    is operated with frequency :math:`f`.
-    We define continuous variables :math:`y_{suv}\geq 0` for the number of passengers starting their trip
-    in station :math:`s` and traveling on edge :math:`(u,v)`.
+    This Mod is implemented by formulating a mixed integer linear program and
+    solving it using Gurobi. We have binary variables :math:`x_{l,f}` for each
+    line frequency combination indicating if line :math:`l` is operated with
+    frequency :math:`f`. We define continuous variables :math:`y_{suv}\geq 0`
+    for the number of passengers starting their trip in station :math:`s` and
+    traveling on edge :math:`(u,v)`.

     The formulation can be stated as follows:

@@ -143,7 +146,9 @@ An example of the inputs with the respective requirements is shown below.
         :options: +NORMALIZE_WHITESPACE

         >>> from gurobi_optimods import datasets
-        >>> node_data, edge_data, line_data, linepath_data, demand_data = datasets.load_siouxfalls_network_data()
+        >>> node_data, edge_data, line_data, linepath_data, demand_data = (
+        ...     datasets.load_siouxfalls_network_data()
+        ... )
         >>> node_data.head(4)
            number     posx      posy
         0       1  50000.0  510000.0
@@ -176,20 +181,23 @@ An example of the inputs with the respective requirements is shown below.
         3       1       5      10
         >>> frequencies = [1,3]

-    For the example we used data of the Sioux-Falls network. It is not considered as a realistic one. However,
-    this network can be found on different websites when considering traffic problems
-    (originally by Hillel Bar-Gera http://www.bgu.ac.il/~bargera/tntp/). We added a set of line routes.
-    Note that the output shown above contains some additional information that is not required for computation, for example
-    the property length in the edge data.
-    Also, ``posx`` and ``posy`` in the ``node_data`` is not used for computation. But it can be used to visualize the
-    network as done below.
-    It is important that all data is consistant. For example, ``edge_source``, ``edge_target``
-    in the ``linepath_data`` must correspond to a ``number`` in the node_data. The same holds
-    for ``source`` and ``target`` in ``edge_data`` and ``demand_data``.
-    In the code it is checked that all tables provide the relevant columns.
-    Note that the edges are assumed to be directed and both direction need to be defined if an edge
-    can be traversed in both directions. In the same way, a line is a directed path. If a line is
-    turning at the end point and goes back the same way, the nodes need to be added again in reverse order.
+    For the example we used data of the Sioux-Falls network. It is not
+    considered as a realistic one. However, this network can be found on
+    different websites when considering traffic problems (originally by Hillel
+    Bar-Gera http://www.bgu.ac.il/~bargera/tntp/). We added a set of line
+    routes. Note that the output shown above contains some additional
+    information that is not required for computation, for example the property
+    length in the edge data. Also, ``posx`` and ``posy`` in the ``node_data``
+    is not used for computation. But it can be used to visualize the network
+    as done below. It is important that all data is consistent. For example,
+    ``edge_source``, ``edge_target`` in the ``linepath_data`` must correspond
+    to a ``number`` in the node_data. The same holds for ``source`` and
+    ``target`` in ``edge_data`` and ``demand_data``. In the code it is checked
+    that all tables provide the relevant columns. Note that the edges are
+    assumed to be directed and both direction need to be defined if an edge
+    can be traversed in both directions. In the same way, a line is a directed
+    path. If a line is turning at the end point and goes back the same way,
+    the nodes need to be added again in reverse order.

 Solution
 --------
@@ -199,10 +207,11 @@ The solution consists of two information
 - the total cost of the optimal line plan
 - the optimal line plan as a list of linename-frequency tuples.

-The strategy can be defined via a Boolean parameter **shortestPaths**. This parameter has a default value (True)
-which uses approach 1, i.e., routing passengers on shortest paths only.
-Note that strategy 1 needs the python package networkx. If this is not available, the second approach is used.
-The second approach is also used if the parameter shortestPaths is set to False.
+The strategy can be defined via a Boolean parameter ``shortest_paths``. This
+parameter has a default value (True) which uses approach 1, i.e., routing
+passengers on shortest paths only. Note that strategy 1 needs the python package
+networkx. If this is not available, the second approach is used. The second
+approach is also used if the parameter ``shortest_paths`` is set to False.

 .. tabs::

@@ -213,28 +222,48 @@ The second approach is also used if the parameter shortestPaths is set to False.

         >>> from gurobi_optimods import datasets
         >>> from gurobi_optimods.line_optimization import line_optimization
-        >>> node_data, edge_data, line_data, linepath_data, demand_data = datasets.load_siouxfalls_network_data()
+        >>> node_data, edge_data, line_data, linepath_data, demand_data = (
+        ...     datasets.load_siouxfalls_network_data()
+        ... )
         >>> frequencies = [1,3]
-        >>> obj_cost, final_lines = line_optimization(node_data, edge_data, line_data, linepath_data, demand_data, frequencies, True, verbose=False)
+        >>> obj_cost, final_lines = line_optimization(
+        ...     node_data,
+        ...     edge_data,
+        ...     line_data,
+        ...     linepath_data,
+        ...     demand_data,
+        ...     frequencies,
+        ...     verbose=False,
+        ... )
         >>> obj_cost
         211.0
         >>> final_lines
-        [('new271_B', 1), ('new31_B', 1), ('new407_B', 1), ('new415_B', 3), ('new423_B', 3), ('new535_B', 3), ('new551_B', 3), ('new71_B', 1)]
-
-We provide a basic method to plot a line plan that has at most 20 lines using networkx and matplotlib.
-In order to use this functionality, it is necessary to install both packages if not already available as follows::
+        [('new271_B', 1),
+         ('new31_B', 1),
+         ('new407_B', 1),
+         ('new415_B', 3),
+         ('new423_B', 3),
+         ('new535_B', 3),
+         ('new551_B', 3),
+         ('new71_B', 1)]
+
+We provide a basic method to plot a line plan that has at most 20 lines using
+networkx and matplotlib. In order to use this functionality, it is necessary to
+install both packages if not already available as follows::

     pip install matplotlib
     pip install networkx

-Additionally, the node_data must include coordinates for the node positions, i.e., the columns ``posx`` and ``posy``
-must be available. The plot function generates a matplot that is opened in a browser::
+Additionally, the node_data must include coordinates for the node positions,
+i.e., the columns ``posx`` and ``posy`` must be available. The plot function
+generates a matplot that is opened in a browser::

     from gurobi_optimods.line_optimization import plot_lineplan
     plot_lineplan(node_data, edge_data, linepath_data, final_lines)

-The Sioux-Falls transportation network (left) and the optimal line plan (right) for this example is shown in the figure below. The lines are shown as
-different colored paths in the network.
+The Sioux-Falls transportation network (left) and the optimal line plan (right)
+for this example is shown in the figure below. The lines are shown as different
+colored paths in the network.

 .. image:: figures/lop_siouxfalls_solution.png
   :width: 600
diff --git a/src/gurobi_optimods/line_optimization.py b/src/gurobi_optimods/line_optimization.py
index a27da731..d7be848c 100644
--- a/src/gurobi_optimods/line_optimization.py
+++ b/src/gurobi_optimods/line_optimization.py
@@ -42,22 +42,29 @@ def line_optimization(
     Parameters
     ----------
     node_data : DataFrame
-        DataFrame with information on the nodes/stations. The frame must include "source".
+        DataFrame with information on the nodes/stations. The frame must include
+        "source".
     edge_data : DataFrame
-        DataFrame with edges / connections between stations and associated attributes.
-        The frame must include "source", "target", and "time"
+        DataFrame with edges / connections between stations and associated
+        attributes. The frame must include "source", "target", and "time"
     demand_data : DataFrame
-        DataFrame with node/station demand information.
-        It must include "source", "target", and "demand". The demand value must be non-negative.
+        DataFrame with node/station demand information. It must include
+        "source", "target", and "demand". The demand value must be non-negative.
     line_data : DataFrame
-        DataFrame with general line information.
-        It must include "linename", "capacity", "fix_cost", and "operating_cost".
+        DataFrame with general line information. It must include "linename",
+        "capacity", "fix_cost", and "operating_cost".
     linepath_data : DataFrame
-        DataFrame with information on the line routes/paths.
-        It must include "linename", "edge_source", and "edge_target".
+        DataFrame with information on the line routes/paths. It must include
+        "linename", "edge_source", and "edge_target".
     frequency: List
-        List with possible frequencies: How often the line can be operated in the considered
-        time horizon.
+        List with possible frequencies: How often the line can be operated in
+        the considered time horizon.
+    shortest_paths : bool
+        Parameter to choose the strategy. Default value is true and strategy 1
+        is used, i.e., passengers travel along shortest paths. Set to False if
+        all possible paths are allowed. In that case, a multi-objective approach
+        is used to first minimize line operating cost and then minimize
+        passengers travel time.

     Returns
     -------