Add deleted_market_actors to data model #575
Now it will not crash anymore if new tables are introduced by BNetzA.
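To illustrate the idea of skipping unknown tables instead of crashing, here is a minimal sketch. It is not the code from this PR: `KNOWN_TABLES`, `resolve_tables`, and the file-name prefixes are hypothetical placeholders, not the real BNetzA file names or open-MaStR constants.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical mapping from (lower-cased) XML file-name prefixes to table names.
# The prefixes are placeholders for illustration only.
KNOWN_TABLES = {
    "deletedmarketactors": "deleted_market_actors",
    "solarunits": "solar_extended",
}


def resolve_tables(xml_file_names):
    """Return (file_name, table_name) pairs, skipping files without a known table."""
    resolved = []
    for file_name in xml_file_names:
        # "DeletedMarketActors_1.xml" -> "deletedmarketactors"
        prefix = file_name.split("_")[0].split(".")[0].lower()
        try:
            resolved.append((file_name, KNOWN_TABLES[prefix]))
        except KeyError:
            # A table we do not know yet: warn and continue instead of crashing.
            logger.warning("Skipping %s: no table is defined for it yet.", file_name)
    return resolved
```

With this approach a newly introduced file only produces a warning, and all known tables are still written.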
Thanks for tackling this so quickly! I will test your code ASAP.
Should we do this now already? The problem is that the download of [...] Or should we update the docs but mention that there is a problem with deleted_market_actors?
I created everything from scratch, and apparently it is not a duplicate error (that would raise a UNIQUE constraint error); instead, at least one MastrNummer is empty.
Have you already solved this?
Full traceback:
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) NOT NULL constraint failed: deleted_market_actors.MastrNummer
[SQL: INSERT INTO deleted_market_actors ("MarktakteurMastrNummer", "MarktakteurStatus", "DatumLetzteAktualisierung", "DatenQuelle", "DatumDownload") VALUES (?, ?, ?, ?, ?)]
[parameters: [('SNB987772021801', 'deaktiviert', '2023-01-26 05:59:19.376666', 'bulk', '2024-10-09'), ('GNB978355566796', 'deaktiviert', '2018-11-27 09:28:46.074061', 'bulk', '2024-10-09'), ('SNB970340354654', 'gelöscht', '2024-08-13 13:26:17.268495', 'bulk', '2024-10-09'), ('SNB944121551754', 'gelöscht', '2021-04-08 05:53:09.184645', 'bulk', '2024-10-09'), ('SNB980338627219', 'gelöscht', '2021-04-16 12:05:26.519289', 'bulk', '2024-10-09'), ('SNB966932879735', 'deaktiviert', '2021-10-21 05:48:47.170424', 'bulk', '2024-10-09'), ('SNB910882710327', 'gelöscht', '2022-02-10 12:41:53.557497', 'bulk', '2024-10-09'), ('GNB923972740678', 'gelöscht', '2022-05-17 14:57:43.378445', 'bulk', '2024-10-09') ... displaying 10 of 100000 total bound parameter sets ... ('SNB977935728722', 'deaktiviert', '2021-06-01 14:36:07.232856', 'bulk', '2024-10-09'), ('ABR974093584658', 'gelöscht', '2022-08-17 06:50:55.602585', 'bulk', '2024-10-09')]]
(Background on this error at: https://sqlalche.me/e/20/gkpj)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/nesnoj/git-repos/OpenEnergyPlatform/open-MaStR/open-MaStR_TESTING/open_mastr/mastr.py", line 236, in download
write_mastr_xml_to_database(
File "/home/nesnoj/git-repos/OpenEnergyPlatform/open-MaStR/open-MaStR_TESTING/open_mastr/xml_download/utils_write_to_database.py", line 61, in write_mastr_xml_to_database
add_table_to_database(
File "/home/nesnoj/git-repos/OpenEnergyPlatform/open-MaStR/open-MaStR_TESTING/open_mastr/xml_download/utils_write_to_database.py", line 223, in add_table_to_database
df = write_single_entries_until_not_unique_comes_up(
File "/home/nesnoj/git-repos/OpenEnergyPlatform/open-MaStR/open-MaStR_TESTING/open_mastr/xml_download/utils_write_to_database.py", line 286, in write_single_entries_until_not_unique_comes_up
df = df.drop_duplicates(
File "/home/nesnoj/miniconda3/envs/py310_open_mastr_TESTING/lib/python3.10/site-packages/pandas/core/frame.py", line 6818, in drop_duplicates
result = self[-self.duplicated(subset, keep=keep)]
File "/home/nesnoj/miniconda3/envs/py310_open_mastr_TESTING/lib/python3.10/site-packages/pandas/core/frame.py", line 6950, in duplicated
raise KeyError(Index(diff))
KeyError: Index(['MastrNummer'], dtype='object')
You're right. I got the same error and interpreted it as "not unique", but that was a mistake on my side. No, I did not work on it any further. If you want to implement a "catch this error and delete entries without an ID" loop, feel free :) I'm at a conference until Friday and can only contribute a small amount of time.
No need to extend the exception handling in this case: there are no empty values, but the column name is supposed to be [...]
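For anyone following along, the failure mode can be reproduced in isolation. This is only a sketch under assumptions: the three-column schema below is hypothetical and much smaller than the real table. It shows that SQLite reports a NOT NULL failure on `MastrNummer` when the inserted rows carry their ID under a different column name, even though no ID value is actually empty.

```python
import sqlite3


def reproduce_not_null_failure() -> str:
    """Insert a row that has an ID, but under a column the constraint does not cover."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the NOT NULL column is named MastrNummer,
    # while the incoming data uses MarktakteurMastrNummer.
    conn.execute(
        "CREATE TABLE deleted_market_actors ("
        "MastrNummer TEXT NOT NULL, "
        "MarktakteurMastrNummer TEXT, "
        "MarktakteurStatus TEXT)"
    )
    try:
        conn.execute(
            "INSERT INTO deleted_market_actors "
            "(MarktakteurMastrNummer, MarktakteurStatus) VALUES (?, ?)",
            ("SNB987772021801", "deaktiviert"),
        )
    except sqlite3.IntegrityError as exc:
        # MastrNummer was never supplied, so it defaults to NULL.
        return str(exc)
    finally:
        conn.close()
    return "no error"


print(reproduce_not_null_failure())
```

The same mismatch would also explain the follow-up `KeyError` in the traceback: `drop_duplicates` is asked for a `MastrNummer` column that the DataFrame does not have.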
Summary
- Add deleted_market_actors to the data model (at all relevant hardcoded constants)
- Add a try-except block so that unknown new tables no longer result in a crash
There is still the problem that deleted_market_actors has no primary key, or at least today's bulk download contains several entries with the same ID. It could be an error in today's download, but I doubt it.
However, since the download currently crashes for all users, I would still publish this bug fix as soon as possible.
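A quick way to check whether an ID column really contains repeated values is sketched below. The helper `find_duplicate_ids` and the sample rows are made up for illustration; the default column name is an assumption taken from the INSERT statement in the traceback.

```python
from collections import Counter


def find_duplicate_ids(rows, id_column="MarktakteurMastrNummer"):
    """Return a dict mapping each ID that occurs more than once to its count."""
    counts = Counter(row[id_column] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


# Made-up sample rows, not taken from the bulk download.
sample_rows = [
    {"MarktakteurMastrNummer": "ID_A", "MarktakteurStatus": "deaktiviert"},
    {"MarktakteurMastrNummer": "ID_B", "MarktakteurStatus": "gelöscht"},
    {"MarktakteurMastrNummer": "ID_A", "MarktakteurStatus": "gelöscht"},
]

print(find_duplicate_ids(sample_rows))
```

If this returns a non-empty result for the real data, the column cannot serve as a primary key on its own.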
Fixes #572
Fixes #574