Suburban Widgets - Modify calculations #2357

Closed
atalyaalon opened this issue Mar 15, 2023 · 13 comments

@atalyaalon
Collaborator

Add relevant junctions to suburban widgets calculations (using CBS's logic)

@atalyaalon
Collaborator Author

@MichalOren FYI

@ziv17
Collaborator

ziv17 commented Mar 18, 2023

My plan is to add to the road_segments table an array of sub-urban junction codes for the junctions located at the ends of the segment and between them.
The accident filter for a road_segment location will be:
accidents on this road segment, plus accidents at any sub-urban junction located in the given road_segment.
Hi @MichalOren, @atalyaalon, is this what we want?
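
A minimal sketch of the filter described above, for illustration only. It assumes accidents are plain dicts with hypothetical `road_segment_id` and `non_urban_intersection` keys; the project's real models and query layer are not shown here.

```python
def filter_segment_accidents(accidents, segment_id, segment_junction_ids):
    """Return accidents on the segment, plus accidents at its junctions.

    `accidents` is an iterable of dicts (hypothetical shape).
    `segment_junction_ids` holds the junction codes attached to the segment.
    """
    junction_ids = set(segment_junction_ids)
    return [
        accident
        for accident in accidents
        if accident.get("road_segment_id") == segment_id
        or accident.get("non_urban_intersection") in junction_ids
    ]
```

An accident that matches both conditions is returned once, because each accident is tested a single time.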

@atalyaalon
Collaborator Author

atalyaalon commented May 3, 2023

@ziv17 sounds good.
How are you going to associate a junction to a road segment? Using the junction km?
Are you going to relate multiple junctions to multiple road segments? (Makes sense, since a junction contains at least 2 roads)

@ziv17
Collaborator

ziv17 commented May 3, 2023

Hi @atalyaalon
Yes, according to the km of each of the roads in a junction.

Hi,
I do not know when the road_segments table is written, and whether the junctions data is available at that time.

I thought of creating a new table, segment_junctions, to hold the junctions that belong to each segment (determined by the km of each road in the junction):
columns:
1 - int - segment_id
2 - int - junction id
The key will be both columns (i.e. each pair appears only once).

The table will be filled from the roads structure in executor.py:716 (end of function get_files()).

On startup, we will read the segment_junctions table, and hold in memory a dict:
{segment: {set of junction ids}}. We will use the junction ids in the filter of road_segments locations.

What do you think?
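
To make the proposal concrete, here is a rough sketch of the table and the startup dict. It uses the standard-library `sqlite3` module only so that the example is self-contained; the project's actual database layer is not shown, and the function names are hypothetical.

```python
import sqlite3
from collections import defaultdict


def create_segment_junctions(conn):
    # Composite primary key: each (segment_id, junction_id) pair appears once.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS segment_junctions (
            segment_id  INTEGER NOT NULL,
            junction_id INTEGER NOT NULL,
            PRIMARY KEY (segment_id, junction_id)
        )
        """
    )


def add_pairs(conn, pairs):
    # INSERT OR IGNORE skips pairs that already exist instead of raising.
    conn.executemany(
        "INSERT OR IGNORE INTO segment_junctions (segment_id, junction_id) "
        "VALUES (?, ?)",
        pairs,
    )


def load_segment_junctions(conn):
    """Read the table once and return {segment_id: {junction_id, ...}}."""
    result = defaultdict(set)
    rows = conn.execute("SELECT segment_id, junction_id FROM segment_junctions")
    for segment_id, junction_id in rows:
        result[segment_id].add(junction_id)
    return dict(result)
```

A junction that lies on two roads simply appears in two rows, one per segment, so the many-to-many relation needs no extra structure.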

@MichalOren
Collaborator

Hi @ziv17
segment_junctions looks like a good idea.
From my previous experience, I noticed that the junction data has 2 main problems:

  1. A Junction_id can have more than one name
  2. Identical junction names can have different Junction_id values
    (I have prepared a query if you want to run some tests)
    I wonder how this can affect building the new segment_junctions table?
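
As an illustration of the two problems above (this is not the prepared query, which is not reproduced in this thread), a short sketch that finds both kinds of mismatch in a list of `(junction_id, name)` pairs. The function name and the input shape are assumptions.

```python
from collections import defaultdict


def find_junction_mismatches(rows):
    """Return (ids with several names, names with several ids).

    `rows` is an iterable of (junction_id, name) pairs.
    """
    names_by_id = defaultdict(set)
    ids_by_name = defaultdict(set)
    for junction_id, name in rows:
        names_by_id[junction_id].add(name)
        ids_by_name[name].add(junction_id)

    multi_name_ids = {j: names for j, names in names_by_id.items() if len(names) > 1}
    multi_id_names = {n: ids for n, ids in ids_by_name.items() if len(ids) > 1}
    return multi_name_ids, multi_id_names
```

Running a check like this over the junction data would show how many ids and names are affected before the new table is built.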

@ziv17
Collaborator

ziv17 commented May 7, 2023

Hi @MichalOren , thanks.
I will look at it. Can you send me the queries?

@atalyaalon
Collaborator Author

@ziv17 Thanks!
Regarding your question:
road_segments is written in anyway/parsers/road_segments.py and is triggered manually. We receive the CSV data from the CBS via email (once every 2-3 years, by request). It's indeed a good idea to request a new, updated table. I'll ask them about this, and will also ask whether it appears on data gov so that we can use an API.

Regarding junctions:
segment_junctions table is a great idea.
As you mentioned, we can use the road_segments table AND the SuburbanJunction table (this table does not contain the specific km per road, but this can be added) to create the table mentioned (just a suggestion, you can also do it in a different way).

Just note - if we want to fill the SuburbanJunction table fully I think we need to reload the entire CBS data, am I right or did I miss anything?

BTW If we find there are sub-urban junctions that don't belong to any segment that's also an important insight we should know and perhaps explore more.

@MichalOren
Collaborator

Hi @ziv17
Try this one
Qry_To_Ziv.TXT

@ziv17
Collaborator

ziv17 commented Jun 3, 2023

Hi,
I saw the data mismatches. I think it is not a big issue. Our code will work according to the codes (rather than the names). If we receive a name that is not found in the table, it is an error.

The full solution to this is to hold separate roads, streets, junctions, etc. data for each year, and to check each accident against the data of the accident's year. I do not know whether it is worth the effort.

@ziv17
Collaborator

ziv17 commented Jun 3, 2023

Hi @atalyaalon ,
Thinking about the implementation again, I want to change the implementation a little:
Adding junctions list to segment:

  • Add road_junction_km table:
    • columns:
      • road: int
      • non_urban_intersection: int
      • km: int
    • primary key: road, non_urban_intersection
    • The table will be read from the DB before processing the CBS files, updated from the file data (filled from the roads structure in executor.py:716, at the end of function get_files()), and written back at the end (like SuburbanJunction)
  • Add dict in memory that will hold the list of junctions for each segment:
    • {road_segment_id: [non_urban_intersection, non_urban_intersection, ...]}
    • Will be generated the first time infographics is used, from the road_junction_km and road_segments tables.
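
A rough sketch of how the in-memory dict could be derived from the two tables. The `RoadSegment` fields (`segment_id`, `road`, `from_km`, `to_km`) are assumptions made for this example, and the inclusive km bounds are chosen so that junctions at the ends of a segment are included.

```python
from collections import defaultdict
from typing import NamedTuple


class RoadJunctionKm(NamedTuple):
    road: int
    non_urban_intersection: int
    km: int


class RoadSegment(NamedTuple):  # hypothetical shape of a road_segments row
    segment_id: int
    road: int
    from_km: int
    to_km: int


def build_segment_junctions(junctions, segments):
    """Return {road_segment_id: [non_urban_intersection, ...]}."""
    junctions_by_road = defaultdict(list)
    for junction in junctions:
        junctions_by_road[junction.road].append(junction)

    result = {}
    for segment in segments:
        # Inclusive bounds: a junction at a segment end belongs to the segment.
        result[segment.segment_id] = sorted(
            junction.non_urban_intersection
            for junction in junctions_by_road.get(segment.road, [])
            if segment.from_km <= junction.km <= segment.to_km
        )
    return result
```

With inclusive bounds, a junction that sits exactly on the boundary between two consecutive segments is listed under both.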

@atalyaalon
Collaborator Author

atalyaalon commented Jun 14, 2023

@ziv17 looks good.
A few comments:

  • We need to make sure that we're performing it on all historical CBS data so all junctions will be present on the first load.
  • When processing new CBS data, we need to make sure that we're not deleting past data, since as you know, not all junctions will be present in new files. I see you indeed performed it well on SuburbanJunction, hence performing it in a similar way sounds like a great way to me.
  • We need to make sure, as in SuburbanJunction, that when we meet duplicates as @MichalOren mentioned, we'll save only the latest name of the junction.
  • Will the dict in memory hold the list of ids for non_urban_intersection, or the names? (I assume ids are better?)
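
The "do not delete past data" and "keep the latest name" points above can be sketched as a simple merge. This is only an illustration: it assumes junctions are held as an `{id: name}` dict and that new rows arrive ordered from oldest to newest; the function name is hypothetical.

```python
def merge_junctions(existing, new_rows):
    """Merge new (junction_id, name) rows into the existing {id: name} data.

    Junctions that are missing from the new rows are kept, and when an id
    appears more than once, the last name seen wins.
    """
    merged = dict(existing)  # copy, so the stored data is not mutated
    for junction_id, name in new_rows:
        merged[junction_id] = name
    return merged
```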

@ziv17
Collaborator

ziv17 commented Jul 1, 2023

Hi @atalyaalon, yes, it is done like SuburbanJunction, and yes, the dict in memory holds the list of junction ids for each road segment.

@atalyaalon atalyaalon linked a pull request Aug 9, 2023 that will close this issue
@ziv17
Collaborator

ziv17 commented May 9, 2024

Implemented by PRs
#2476
#2610
#2641

@ziv17 ziv17 closed this as completed May 9, 2024