Add functionality to merge each isolated bus to the backbone network #903

ekatef · 2023-10-18T21:26:29Z

Closes #852

Changes proposed in this Pull Request

Added merge_into_network( ) function to merge each isolated bus into the network following an approach of drop_isolated_nodes( ) and merge_isolated_nodes( ).

Checklist

I consent to the release of this PR's code under the AGPLv3 license and non-code contributions under CC0-1.0 and CC-BY-4.0.
I tested my contribution locally and it seems to work fine.
Code and workflow changes are sufficiently documented.
Newly introduced dependencies are added to envs/environment.yaml and doc/requirements.txt.
Changes in configuration options are added in all of config.default.yaml and config.tutorial.yaml.
Add a test config or line additions to test/ (note tests are changing the config.tutorial.yaml)
Changes in configuration options are also documented in doc/configtables/*.csv and line references are adjusted in doc/configuration.rst and doc/tutorial.rst.
A note for the release notes doc/release_notes.rst is amended in the format of previous release notes, including reference to the requested PR.

ekatef · 2023-10-18T21:32:05Z

Currently, the code works if a number of isolated nodes is more than one. It would be nice to generalise on a case with a single isolated nodes, preferably without need to using conditional statements.

Functionality has not been tested yet on geometries.

What would be great is to better understand requirements towards the implementation:

do we need a voltage threshold for merge_into_network( )?
how should be this option combines with drop of the isolated nodes and merging into a single node?

…fetch_isolated_to_network

ekatef · 2023-10-26T23:39:06Z

Currently, fetching isolated nodes into the closest network in implemented. Tested on a CL and seems to work fine:

base.nc

elec_s.nc, current implementation

elec_s.nc, PR implementation

Red highlights buses which don't belong to the largest sub-network.

Fetching the isolated nodes keeps invariant values of the overall load and generation.

ekatef · 2023-10-27T00:10:12Z

What would be nice to understand, is which strategies are sensible for drop/merge/fetch isolated nodes and isolated networks and adjust configuration parameters accordingly. My feeling is that it feels like alternatives to merge isolated nodes in a single isolated node or to fetch them into network.

Another point is handling isolated fragments of the network (like those red buses on first picture above). I suspect, in this particular case they mostly get reduced to a single node and dropped or merged after. However, the may be cases when isolated networks remain as two-three "micro-networks" after simplification, and can't be merged into the "main" network with a current approach. Probably, we can find some solution for this case, as well.

@davide-f would be very grateful if you could find time for looking into it 🙂

Edudiro · 2023-10-27T08:15:36Z

Hi @ekatef, thank you for your response on google groups and for developing this PR. I have tested your approach and it works perfectly in the Chilean case, as you have proven above.

I am assuming the large amount of unconnected nodes from the base network comes from OSM data gaps? In that case I think following this solution is a valid approach. Otherwise, unfeasibilities might arise for a dispatch only problem.

ekatef · 2023-10-27T13:02:54Z

Hi @ekatef, thank you for your response on google groups and for developing this PR. I have tested your approach and it works perfectly in the Chilean case, as you have proven above.

I am assuming the large amount of unconnected nodes from the base network comes from OSM data gaps? In that case I think following this solution is a valid approach. Otherwise, unfeasibilities might arise for a dispatch only problem.

Hello @Edudiro! Great to hear that it worked for you. Thanks for giving an inspiration for pushing these developments :D

I also have a feeling that the unconnected nodes in base.nc result from some data gaps in OSM data base. At least, looking on GENI description of a national power system, it seems that there shouldn't be any reason for having these isolated power systems. However, some regional expertise may be very helpful to give a more definitive answer.

…fetch_isolated_to_network

ekatef · 2023-10-31T21:54:06Z

@davide-f I think, the PR is ready for the first review round :)

Currently, isolated nodes are merged to the closest non-isolated ones. Further improvements are obviously possible, for example to account for the subnetwork type or implement a power threshold.

Haven't added yet configuration parameters for fetching isolated nodes, as it feels a good idea to decide on a general interfacing concept. Would be very grateful for any input on that :)

davide-f

Cool PR :D I've added some possible improvements.
What do you think?

davide-f · 2023-11-01T12:18:20Z

scripts/simplify_network.py

+    points_buses = np.array(
+        list(zip(n.buses.loc[i_connected].x, n.buses.loc[i_connected].y))
+    )
+    islands_points = np.array(
+        list(zip(n.buses.loc[i_islands].x, n.buses.loc[i_islands].y))
+    )
+
+    btree = cKDTree(points_buses)
+    dist0, idx0 = btree.query(islands_points, k=1)
+
+    points_nearest = [points_buses[i] for i in idx0]
+
+    nearest_bus_list = [
+        n.buses.loc[(n.buses.x == x) & (n.buses.y == y)] for x, y in points_nearest
+    ]
+    nearest_bus_df = pd.concat(nearest_bus_list)


Probably a more efficient way to do this is with the geopandas spatial join function:
gdf_map = gpd.sjoin_nearest(gdf1, gdf2, houw="left")
where gdf1 is the selection of isolated to merge and the right gdf2 frame is the set of non-isolated buses like df_islands.query("carrier == 'DC'")[["geometry"]].
Internally, it exploits the KDTree and it saves lines of code.

The output is a geodataframe where for every row of gdf1 (isolated bus) we identify the index of gdf2 that is the closest, using the column "index_right" (default).
Therefore, the value gdf_map["index_right"] is almost the busmap we are looking for (once filled with the missing indices that shall not change and the values for the DC nodes)
Since we cannot exclude duplicated indices in gdf_map in case of equidistances, duplicated indices can be dropped.

Note that it is better to do the processing also for DC nodes: we merge isolated AC nodes with the closest non-isolated AC nodes and similarly the DC nodes.
An auxiliary function may be defined to repeat the process on AC and DC similarly, but it may be an overkill

@davide-f thanks a lot for the review!

Actually, I have completely forgotten about a possible polarity difference... I'll think about implementation.

A great idea with sjoin_nearest(). I had a (wrong) impression that it may be not perfect in terms of performance, but it looks like rather the opposite should be the case. Thanks, happy to play with that! :D

@davide-f thanks a lot for the suggestions and sorry for putting this work on hold for a while. Great to come back to it :)

Implementation with sjoin_nearest() require a couple of additional imports, but it looks cleaner to me: https://github.com/ekatef/pypsa-earth/compare/fetch_isolated_to_network..fetch_isolated_to_network2
So, I'd opt for it instead of current usage of cKDTree. What do you think?

Have added filtering by AC carrier for the network to be merged with, agree that it's generally a good idea to find a way to deal also with DC networks. However, how realistic is a case of isolated DC nodes/network? Are you aware of any examples where it could be useful?

Cool! and that's great :D I much like it :)

I'm wondering that an efficient way to account for the country-specific bus and potentially the AC/DC problem may be to exploit the groupby to do the merge, as follows:

create a geodataframe for n.buses (already done basically)

add a column in that dataframe to denote the "isolated networks", such as gdf["is_isolated"]
If we want to do the AC/DC difference, the determine_network_topology is only able to well adopt the AC ones. To account for the DC ones, an option is to drop out of the DC isolated networks identified with the determine_topology, those where a link is connected. I drop a piece of code I did to investigate the issue in a network.

exploit the groupby function, such as:
gdf.groupby(["country"{, "carrier"}]).map(lambda x: gdp.sjoin_nearest(x.query("is_isolated"), x.query("not is_isolated"), {options as needed}))

As a note, the output of the s_join_nearest will create a column, like "index_right" or "index_left" that for every index of the geodataframe, specifies the closest bus.
That column is basically the map_isolated_node_by_country series that we need for the clustering process.

Would you be curious in investigating that?

I agree that isolated dc nodes would not make sense from a physical perspective, but I feel that it would make the workflow more robust as I believe we could have those type of nodes due to missing data and alike.

I've also done a quick check on a continent with the code below and it seems that only 4 DC nodes worldwide are actually isolated with the run I did.

---- Piece of code to identify the isolated buses to be adapted to the PR ---

is_isolated_network = n.buses["sub_network"].value_counts().map(lambda x: x <= 1) DC_links = n.links.query("carrier == 'DC'") links_buses = np.unique(list(DC_links["bus0"].values) + list(DC_links["bus1"].values)) is_isolated_bus = n.buses["sub_network"].map(is_isolated_network) # all buses having is_isolated_bus.loc[links_buses] = False is_isolated_bus

@davide-f super! :D Many thanks :D Have updated the implementation.

Ahh, right! We have also to account allocation buses by countries. Of course, I'm curious to look into it. Thanks a lot for the hints!!

May you have any ideas on isolated networks which spread across two-three countries? It would be very helpful for testing.

Absolutely agree regarding the need to account for isolated DC network for the sake of the workflow stability. Not sure is it's a well-grounded suggestion, but wouldn't it be a more natural approach for isolated DC buses to build an additional link to the closest non-isolated DC bus? My impression is that DC lines are often built to connect demand with renewable sources, and it may be probably a good idea to try to maintain topology. Does it sounds any reasonable for you?

Hello @davide-f! Thanks a lot for the implementation idea :D It has been funny to play around, and it looks like the approach should be transferable to subnetwork-to-subnetwork merging.

Have drafted a possible implementation here:
https://github.com/ekatef/pypsa-earth/compare/fetch_isolated_to_network..fetch_isolated_to_network3

Have I get your idea properly? Currently, there is looping over groups, while I do agree that it may be a way to apply some functional-programming magic over the grouped dataframe. Not sure however if it makes sense to look for it. In my experience, when map-lambda is perfect when you manage to keep it simple. It things get complicated, that may look great but usually it difficult to maintain. What is your feeling about this? 🙂

Thanks for the revision :)
I'll go into it soon, expected tomorrow

Hey @davide-f, when you would have time and resources, would be perfect to have your opinion on this.

It may be a bit too early for a proper review, but would be amazing to align on the concept 🙂

davide-f · 2023-11-13T18:17:44Z

As an additional comment, for a future PR, I think that we could revise in the future the behaviour of the isolated_nodes and extend to networks: drop/merge isolated networks with demand below a given threshold

davide-f

Sorry for the little delay, here is my feedback :D
Great!

davide-f · 2023-12-18T12:07:37Z

scripts/simplify_network.py

+    points_buses = np.array(
+        list(zip(n.buses.loc[i_connected].x, n.buses.loc[i_connected].y))
+    )
+    islands_points = np.array(
+        list(zip(n.buses.loc[i_islands].x, n.buses.loc[i_islands].y))
+    )
+
+    gds_buses = gpd.GeoSeries(map(Point, points_buses))
+    gds_islands = gpd.GeoSeries(map(Point, islands_points))
+
+    gdf_buses = gpd.GeoDataFrame(geometry=gds_buses)
+    gdf_islands = gpd.GeoDataFrame(geometry=gds_islands)


to create a geodatafarme (let's say gdf_buses)of the entire buses df we may use this function, that should be quite compact

Agree that it could be implemented in a more elegant way :) more Wrapped-up into a function transformation into a geopandas dataframe.

Have I understood your idea correctly? :)

davide-f · 2023-12-18T12:11:07Z

scripts/simplify_network.py

+    i_islands = n.buses[
+        (~n.buses.duplicated(subset=["sub_network"], keep=False))
+        & (n.buses.carrier == "AC")
+    ].index


Does this account identifies isolated networks? This seems to me that nodes that are interconnected by a line but disconnected from the entire network are not well identified or am I wrong?

We could extend the working principle to "isolated networks" that may be defined as subnetworks that do not represent the larger national power system whose demand is low (below threshold).

To keep the approach simple we can also skip that

Added a function to pick isolated networks using a number of the nodes as a criterium in the first approximation. Absolutely agree that it could make sense to use other heuristics, the current implementation just generalises the merging approach for n_isolated_buses > 1.

davide-f · 2023-12-18T12:17:03Z

scripts/simplify_network.py

+    nearest_bus_list = [
+        n.buses.loc[(n.buses.x == x) & (n.buses.y == y)]
+        for x, y in zip(gdf_map["geometry"].x, gdf_map["geometry"].y)
+    ]
+    nearest_bus_df = pd.concat(nearest_bus_list)


I believe that nearest_bus_df = n.buses.loc[gdf_map.index_right.values] or something like that.

gdf_map.index_right/index_left may contain the mapping between each island and its closest bus.

Agree, it feels much more natural to use indices for filtering. Added an bus_id column to the buses dataframe, and modified filtering.

davide-f · 2023-12-18T12:18:38Z

scripts/simplify_network.py

+    # each isolated node should be mapped into the closes non-isolated node
+    map_isolated_node_by_country = (
+        n.buses.assign(bus_id=n.buses.index)
+        .loc[nearest_bus_df.index]
+        .groupby("country")["bus_id"]
+        .first()
+        .to_dict()
+    )


Something like:

gdf_buses.groupby(["country"{, "carrier"}]).map(lambda x: gdp.sjoin_nearest(x.query("is_isolated"), x.query("not is_isolated"), {options as needed}))

or

gdf_buses.groupby(["country"{, "carrier"}]).map(lambda x: gdp.sjoin_nearest(x.loc[i_islands], x.loc[i_connected], {options as needed}))

may work.

In the case of neighboring buses across countries I'm unsure the isolated network/node gets connected to the closest bus of the same country, it may remain there. Have you tested that?

ekatef · 2023-12-18T12:47:01Z

Sorry for the little delay, here is my feedback :D Great!

Hey @davide-f, no problem, I was aware that you were very busy. Thanks a lot for the review!

It looks like we are well aligned on the concept, and happy to introduce the changes.

Yeah, the idea of this PR as far was to focus on isolated nodes only, keeping in mind that same approach may be used to deal with isolated networks, as well.

@carlosfv92 May you be interested to work on implementation for isolated networks? Happy to discuss on align on the implementation.

ekatef · 2024-03-30T20:50:42Z

Revised to simplify the implementation. @davide-f thank you so much for the discussion and the ideas! The current version looks definitely cleaner. Could you please review it, when you have time?

Summary of the major changes as compared with the previous iteration:

moved all load processing into the function which creates a buses geo-dataframe;
revised the threshold from to be a share of the national load instead an absolute value;
replaced all the functions to filter the island and backbone networks with the flags in the buses geo-dataframe.

ekatef · 2024-04-07T08:32:35Z

@davide-f when you have time, it would be perfect to have your review 🙂

davide-f

Great @ekatef :D
I think the functionality is quite there and way cleaner!
In the future we may reformulate but I think it is in great shape! There may not be the need to duplicate the information by subnetwork to each row of the buses, but I think it is good to go with minor changes! :)
Great work and patience!

davide-f · 2024-04-07T09:34:13Z

scripts/simplify_network.py

+    n_buses_gdf["is_multicnt_subntw"] = n_buses_gdf.sub_network.map(
+        n_buses_gdf.groupby(["sub_network"])["country"].apply(
+            lambda x: len(pd.unique(x)) > 1
+        )
+    )


Not sure if you need the double map; probably the following should work or something like that

Suggested change

n_buses_gdf["is_multicnt_subntw"] = n_buses_gdf.sub_network.map(

n_buses_gdf.groupby(["sub_network"])["country"].apply(

lambda x: len(pd.unique(x)) > 1

)

)

n_buses_gdf["is_multicnt_subntw"] = n_buses_gdf.sub_network.map(

n_buses_gdf.groupby(["sub_network"]).country.unique().count() > 1

)

May that work?

However, I'm wondering if we need to fill that value at the gdf and we can work with a df for subnetworks, like:

df_subn = n_buses_gdf.groupby(["subnetwork"]).apply(lambda x: { "is_multicnt_subntw": x.country.unique() > 1, "total_demand": x.load.sum(), } )

However, I see that in the buses gdf you filled a column with the demand of the subnetwork, so that may not be needed actually

Agree on the revision on map().apply(), thanks! :D

Not sure I get your idea with the sub-networks dataframe. I thought, we aim to keep all the data in a single dataframe in the spirit of the tidy data concept, which actually encourages a certain degree of duplication. It gives a lot of power and flexibility, while allowing to stay in the same "variables space". If we add an additional dataframe, it means a need to switch attention between two objects, and each of them has an own set of variables. To me, it sounds a bit more complicated.

I'm also not sure that having two objects would be more effective in terms of performance, as compared with having an additional column. Have I missed anything? :)

davide-f · 2024-04-07T09:50:02Z

scripts/simplify_network.py

+    gdf_backbone_buses = (
+        n_buses_gdf.query("is_backbone_sbntw")
+        .query("carrier=='AC'")
+        .query("load_in_subnetw>0")


Mmmm I'd drop this check but I'd pick the index with the highest load; for small countries with no demand this may trigger issues

Agree, revised :)

davide-f · 2024-04-07T09:50:37Z

scripts/simplify_network.py

+    nearest_bus_df = n.buses.loc[n.buses.index.isin(gdf_map.bus_id_right)]
+
+    i_lines_islands = n.lines.loc[n.lines.bus1.isin(gdf_islands.index)].index
+    n.mremove("Line", i_lines_islands)


This remove should not be needed and performed by the clustering no?

Normally yes, but our use case is not well captured by get_clustering_from_busmap( ), as it seems to expect that the lines in a cluster belong to the same sub-network. When I try to remove the line n.mremove("Line", i_lines_islands) leads to AssertionError In Bus cluster sub_network the values of attribute sub_network do not agree.

davide-f · 2024-04-07T09:53:26Z

scripts/simplify_network.py

+from scipy.spatial import cKDTree
+from shapely.geometry import Point


Are they still needed?

Fixed, thanks!

ekatef · 2024-04-07T22:38:37Z

Great @ekatef :D I think the functionality is quite there and way cleaner! In the future we may reformulate but I think it is in great shape! There may not be the need to duplicate the information by subnetwork to each row of the buses, but I think it is good to go with minor changes! :) Great work and patience!

Thanks a lot for the great support and the review @davide-f!

I'd say this duplication is rather a feature than a bug ;) There is the "tidy data" concept which assumes that each row should contain a full information for a single variable. That makes life way easier even if it implies some duplication. That is, if I have get your idea properly. Happy to discuss more and adjust the implementation, if needed :)

ekatef · 2024-04-21T17:24:48Z

@davide-f could you please check if you have any questions on the PR, when you would have time?

It seems to be needed, and I think it's simpler to finalise it instead to explain how to test it from a fork.

davide-f

Great @ekatef <3 amazing contribution and sorry for the delay...
I approve this PR and feel free to merge it with squash merge or normal merge as you see it more fair/clean :)

ekatef · 2024-04-22T21:39:30Z

Great @ekatef <3 amazing contribution and sorry for the delay... I approve this PR and feel free to merge it with squash merge or normal merge as you see it more fair/clean :)

No worries and thanks a lot for the outstanding support :D

ekatef added 3 commits October 19, 2023 00:20

Add a function to fetch isolated nodes to network

b992acd

Apply fetch-isolated

f389d02

Add import

ea0eb88

ekatef added 5 commits October 27, 2023 02:18

Add import

a399ae1

Generalize sub-setting of an array

e8262b8

Improve a message

52eb9a9

Remove comments

1af3d41

Merge remote-tracking branch 'origin/fetch_isolated_to_network' into …

4aabc91

…fetch_isolated_to_network

Add a return on condition in case there is no isolated nodes

6391b21

ekatef force-pushed the fetch_isolated_to_network branch from 4c96824 to 6391b21 Compare October 26, 2023 23:57

Merge branch 'main' into fetch_isolated_to_network

bc8b755

ekatef added 2 commits November 1, 2023 00:43

Clarify a comment

a92a8fc

Merge remote-tracking branch 'origin/fetch_isolated_to_network' into …

2ad6fa6

…fetch_isolated_to_network

ekatef marked this pull request as ready for review October 31, 2023 21:45

davide-f reviewed Nov 1, 2023

View reviewed changes

Add a TODO comment

4d46fe5

ekatef mentioned this pull request Nov 17, 2023

Linopy transition #796

Closed

8 tasks

davide-f mentioned this pull request Dec 2, 2023

Extend isolated node options to isolated networks #931

Closed

ekatef added 3 commits December 8, 2023 01:32

Add filtering by carrier for the network to merge into

e641447

Remove a not-necessary comment

febe2a7

Improve implementation of buses merging

dc9be87

davide-f reviewed Dec 18, 2023

View reviewed changes

ekatef added 13 commits March 30, 2024 20:59

Remove an outdated function

60ae7cd

Fix formatting

b7f50c5

Keep the backbone block together

f2a7840

Add a comment

f00c04b

Fix code structure

1702c3a

Fix filtering for floats

9b725cb

Revise naming

f1ec53b

Keep id column

6cf958b

Add a safety filtering

3953b70

Revise config defaults

a3086f7

Add an entry to the configtables

10cad77

Add a release note

171bd00

Revise naming

6f20a31

ekatef changed the title ~~Add functionality to merge each isolated bus to the closest non-disconnected bus~~ Add functionality to merge each isolated bus to the backbone network Apr 4, 2024

davide-f reviewed Apr 7, 2024

View reviewed changes

ekatef added 4 commits April 8, 2024 00:07

Merge branch 'main' into fetch_isolated_to_network

75019ac

Remove not needed imports

ad8d083

Implement Davide's suggestion

278bc80

Use a safer filtering

cc97b02

ekatef force-pushed the fetch_isolated_to_network branch from eb25080 to cc97b02 Compare April 7, 2024 22:14

Add a release note

f92638e

davide-f approved these changes Apr 22, 2024

View reviewed changes

Merge branch 'main' into fetch_isolated_to_network

a9b6f59

ekatef merged commit 653e017 into pypsa-meets-earth:main Apr 23, 2024
4 checks passed

ekatef deleted the fetch_isolated_to_network branch April 23, 2024 08:15

		from scipy.spatial import cKDTree
		from shapely.geometry import Point

Add functionality to merge each isolated bus to the backbone network #903

Add functionality to merge each isolated bus to the backbone network #903

Conversation

ekatef commented Oct 18, 2023 • edited Loading

Closes #852

Changes proposed in this Pull Request

Checklist

ekatef commented Oct 18, 2023 • edited Loading

ekatef commented Oct 26, 2023 • edited Loading

base.nc

elec_s.nc, current implementation

elec_s.nc, PR implementation

ekatef commented Oct 27, 2023

Edudiro commented Oct 27, 2023

ekatef commented Oct 27, 2023

ekatef commented Oct 31, 2023

davide-f left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ekatef Dec 8, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

davide-f commented Nov 13, 2023

davide-f left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ekatef Jan 4, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ekatef commented Dec 18, 2023

ekatef commented Mar 30, 2024

ekatef commented Apr 7, 2024

davide-f left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ekatef commented Apr 7, 2024

ekatef commented Apr 21, 2024 • edited Loading

davide-f left a comment • edited Loading

Choose a reason for hiding this comment

ekatef commented Apr 22, 2024

ekatef commented Oct 18, 2023 •

edited

Loading

ekatef commented Oct 18, 2023 •

edited

Loading

ekatef commented Oct 26, 2023 •

edited

Loading

davide-f left a comment •

edited

Loading

ekatef Dec 8, 2023 •

edited

Loading

ekatef Jan 4, 2024 •

edited

Loading

davide-f left a comment •

edited

Loading

ekatef commented Apr 21, 2024 •

edited

Loading

davide-f left a comment •

edited

Loading