Parsing Czech Republic addresses #83

sch3399 · 2020-03-06T10:32:45Z

Hi team,
I have successfully installed Pelias, but I have a problem with autocomplete.
The query in [street,city] order, /autocomplete/?text=Nerudova 20,Praha, returns the correct result.

But the reversed query in [city,street] order, /autocomplete/?text=Praha,Nerudova 20, does not return any result, and the Pelias parser decomposes the query incorrectly.

Is it possible to modify the configuration and get the same result as in the first case?

With /search?text=Praha,Nerudova 20 (without autocomplete), the result is correct in both cases, but the parser used there is libpostal rather than the Pelias parser.

Thank you for the advice
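
To reproduce the comparison described above, the four requests can be generated programmatically. This is a minimal sketch: the base URL is a hypothetical placeholder, and the endpoint paths are copied from the report, so a given deployment may need a different host or a path prefix.

```python
from urllib.parse import urlencode

# Hypothetical base URL (an assumption); point this at your own Pelias API instance.
BASE_URL = "http://localhost:4000"


def build_url(endpoint: str, text: str) -> str:
    """Return a request URL with the free-text query percent-encoded."""
    return f"{BASE_URL}{endpoint}?{urlencode({'text': text})}"


# The two orderings from the report: [street,city] and [city,street].
queries = ["Nerudova 20,Praha", "Praha,Nerudova 20"]

# Every endpoint/ordering combination, so the responses can be compared side by side.
urls = [build_url(endpoint, query)
        for endpoint in ("/autocomplete", "/search")
        for query in queries]

for url in urls:
    print(url)
```

Fetching each URL and comparing the parsed components in the responses should show whether only the [city,street] autocomplete request differs.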

missinglink · 2020-03-06T10:44:12Z

Unfortunately not, @sch3399. This is a difficult address to parse because it has a few uncommon conventions:

The street name Nerudova has no prefix/suffix; if it ended with Ave or began with Rue de, it would be easier to parse.
The city comes before the street, whereas most parsers assume that the segments are specified left-to-right in decreasing granularity: address -> city -> country.
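
Both points can be illustrated with a toy parser. The sketch below is not the Pelias or libpostal implementation; the affix lists and function names are hypothetical and exist only to show why a street name without a type word, placed after the city, defeats simple heuristics.

```python
import re

# Illustrative affix lists only (assumptions); real parsers use far larger dictionaries.
STREET_SUFFIXES = {"ave", "avenue", "st", "street", "rd", "road"}
STREET_PREFIXES = ("rue de ", "via ", "calle ")


def has_street_affix(segment: str) -> bool:
    """True if the segment starts or ends with a known street-type word."""
    lowered = segment.strip().lower()
    # Ignore tokens that start with a digit so a house number does not hide the affix.
    words = [w for w in re.split(r"\s+", lowered) if w and not w[0].isdigit()]
    if not words:
        return False
    return words[-1].rstrip(".") in STREET_SUFFIXES or lowered.startswith(STREET_PREFIXES)


def naive_parse(text: str) -> dict:
    """Label comma-separated segments purely by position: address, city, country."""
    segments = [s.strip() for s in text.split(",") if s.strip()]
    return dict(zip(("address", "city", "country"), segments))


print(has_street_affix("20 Main Ave"))     # affix present
print(has_street_affix("Nerudova 20"))     # no affix to latch onto
print(naive_parse("Nerudova 20,Praha"))    # positional labels happen to be right
print(naive_parse("Praha,Nerudova 20"))    # positional labels are swapped
```

With no affix cue and the positional assumption violated, the reversed query gives such a parser nothing reliable to go on.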

You'll notice that the libpostal result interprets Praha as a 'query', meaning that it thinks it's a venue name or similar, not a region.
Libpostal does, however, do a better job of detecting that Nerudova is the street than the Pelias native parser.

I'll move this issue to the pelias/parser repo as someone might be able to tackle this issue over there.

Some more info from you would be helpful:

How common are these street names with no prefix/suffix in the Czech Republic?
Is it a common convention for people in the Czech Republic to write their address with the city name first?
Please provide one or two examples for developers outside the Czech Republic.

sch3399 · 2020-03-06T13:11:36Z

The vast majority of streets in the Czech Republic have a one-word name without a prefix / suffix.
Korunní 810, Praha
Kájovská 68, Český Krumlov
Beethovenova 641/9, Brno
I don't know how likely a reversed [city, street] query is, but it's not unusual.
Examples from other nearby countries:
Divadelná 41/3, Trnava (Slovakia)
Szewska 6, Kraków (Poland)
Zadarska 17, Pula (Croatia)

Thank you very much for any possible solution.

missinglink · 2020-04-17T17:48:03Z

@sch3399 I had a look at this today and I was able to get the parser working for the cases you provided.
If possible, could you please provide some more test cases?

see #88

sch3399 · 2020-04-22T07:50:49Z

@missinglink Here are some more test cases for the Czech Republic:

Ostrava, U Koupaliště 1570/10
Hradec Králové, Karla Čapka 694/5
Kolín, Pražská 3
Neratovice, Jungmannova 676
Králíky, Bedřicha Smetany 561
Prachatice, Dlouhá 93
Ronov nad Doubravou, Nábřežní 180
Brno, Orlí 517/22
Nový Jičín, Dvořákova 713/11
Praha, V Šáreckém údolí 53/27
Praha, Nad Panenskou 164/4
Rožmitál pod Třemšínem, Kpt. Jaroše 403
Klatovy, Jiráskova 15
Frýdek-Místek, Radniční 1244
Zlín, Rašínova 70
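
Test cases in this shape can be converted into structured fixtures by looking for the segment that ends in a house number. The following is a minimal sketch under that assumption; the regular expression, the `to_fixture` helper, and the output field names are hypothetical and are not taken from the pelias/parser test suite.

```python
import re

# A street segment ends in a house number, optionally in the
# slash-separated form used above (for example 1570/10).
HOUSE_NUMBER = re.compile(r"^(?P<street>.+?)\s+(?P<number>\d+(?:/\d+)?[a-zA-Z]?)$")


def to_fixture(line: str) -> dict:
    """Split 'City, Street 12/3' or 'Street 12/3, City' into labelled parts."""
    segments = [s.strip() for s in line.split(",")]
    if len(segments) != 2:
        raise ValueError(f"expected two comma-separated segments: {line!r}")
    for index, segment in enumerate(segments):
        match = HOUSE_NUMBER.match(segment)
        if match:
            # The other segment is taken to be the city.
            return {
                "city": segments[1 - index],
                "street": match["street"],
                "housenumber": match["number"],
            }
    raise ValueError(f"no house number found: {line!r}")


for case in ("Ostrava, U Koupaliště 1570/10",
             "Rožmitál pod Třemšínem, Kpt. Jaroše 403",
             "Korunní 810, Praha"):
    print(to_fixture(case))
```

The heuristic works for the examples in this thread in either order, but it would mislabel a city segment that itself ends in a number, so it is only suitable for preparing fixtures, not as a parsing strategy.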

missinglink transferred this issue from pelias/pelias Mar 6, 2020

missinglink changed the title ~~Different analysis text~~ Parsing Czech Republic addresses Mar 6, 2020

missinglink mentioned this issue Apr 17, 2020

add new CentralEuropeanStreetNameClassifier #88

Merged

missinglink closed this as completed in #88 Apr 24, 2020

missinglink mentioned this issue Apr 24, 2020

add some *failing* test cases for Czech Republic #92

Merged
