Rooms within a property are causing confusion #661

Open
mortonc opened this issue May 14, 2024 · 1 comment

mortonc commented May 14, 2024

Hi!

I noticed that whenever an address contains a sub-unit, e.g.
"room 3, flat 22, 50 Downing Street, London"
the parser labels the room as the house name, which is inaccurate.

e.g.

```
[('room 3', 'house'),
 ('flat 22', 'unit'),
 ('50', 'house_number'),
 ('downing street', 'road'),
 ('london', 'city')]
```

Removing the room number does resolve this; however, I am interested in retaining this information. Is there a way to parse room numbers or alternative unit/sub-unit types?

@albarrentine
Contributor

That seems like a non-standard edge case, and that address doesn't appear to exist. We handle some types of academic addresses that might have a building and rooms, but have never seen a room within a flat in an organic data set, so that pattern is not generated in the training data. Of course anything's possible in the UK, but if it's only a few addresses, I would just strip the room out with a regex before parsing, or relabel the house name after parsing. There are legitimate venue names that can be something like "Room 3", so the model's unlikely to be able to distinguish them.
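
A minimal sketch of both workarounds, in plain Python with only the standard library. The pattern `ROOM_RE`, the helpers `split_room` and `relabel_room`, and the `"room"` label are all hypothetical names made up for illustration; none of them are part of the parser, and the pattern assumes the room always appears as a leading "room N" component.

```python
import re

# Hypothetical pattern (an assumption, not part of the parser):
# a leading "room <code>" component, optionally followed by a comma.
ROOM_RE = re.compile(r"^\s*(room\s+\w+)\s*,?\s*", re.IGNORECASE)


def split_room(address):
    """Option 1: strip a leading room before parsing.

    Returns (room, remainder); room is None when no leading room is found.
    """
    match = ROOM_RE.match(address)
    if not match:
        return None, address
    return match.group(1).lower(), address[match.end():]


def relabel_room(parsed, label="room"):
    """Option 2: relabel after parsing.

    `parsed` is a list of (value, tag) tuples like the output shown in the
    issue. Any 'house' component that consists only of "room <code>" gets
    the custom tag `label` (hypothetical; not a label the parser emits).
    """
    return [
        (value, label if tag == "house" and ROOM_RE.fullmatch(value) else tag)
        for value, tag in parsed
    ]
```

With the address from the issue, `split_room` would hand `"flat 22, 50 Downing Street, London"` to the parser and keep `"room 3"` separately. Both helpers will also catch a genuine venue called "Room 3", so they only make sense on data where that ambiguity is known not to occur.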
