السلام عليكم – The challenges of localizing addresses in Arabic. Your help is appreciated!

I need help. Do you speak and read Arabic, and do you develop Arabic applications or solutions that use addresses? Then maybe I can ask for a few minutes of your time and your local expertise.

Before I go into the details of the challenge we have, here is a bit of background.

The product I work on is the HERE Geocoder. Geocoder helps find addresses so you can show them on a map, calculate routes to them, show imagery around them, find places near them, and use them in logistics solutions, geo-marketing, business intelligence and much more. It is a web service with a RESTful API.

We recently made improvements for Saudi Arabia, where an addressing system with a 13-digit code is in place. It helps locate addresses in rural areas where streets are often unnamed. For example, “7538-65525-3802” is house number 7538 in the town of Al Bahah. (This is a randomly picked address.)
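To make the structure of such a code concrete, here is a minimal Python sketch that splits it into its three groups. The 4-5-4 grouping is read off the example above, and the last two groups are what this post later calls the postal code and add-on code. The function name `parse_saudi_code` and the field names are made up for illustration; they are not Geocoder API fields.

```python
import re
from typing import NamedTuple


class SaudiAddressCode(NamedTuple):
    # Kept as strings so that leading zeros are preserved.
    house_number: str
    postal_code: str
    add_on_code: str


# Assumed layout: 4 digits, 5 digits, 4 digits, separated by hyphens.
# Note: in Python 3, \d also matches Arabic-Indic digits, not only 0-9.
_CODE_PATTERN = re.compile(r"(\d{4})-(\d{5})-(\d{4})")


def parse_saudi_code(code: str) -> SaudiAddressCode:
    """Split a 13-digit code such as '7538-65525-3802' into its parts."""
    match = _CODE_PATTERN.fullmatch(code.strip())
    if match is None:
        raise ValueError(f"not a 4-5-4 digit code: {code!r}")
    return SaudiAddressCode(*match.groups())


print(parse_saudi_code("7538-65525-3802"))
```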

As part of the improvement we want to get the labels and address lines right in Arabic. And this is where I need help.

While Geocoder outputs the individual address parts separately, it also offers a localized output of addresses for over 90 countries. For some countries the house number goes before the street name, in others it goes after it. Some need the zip code after the city name, some before it. And in Arabic, for example, the address needs to flow from right to left. The localization is applied to two output fields.
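As a rough illustration of what such a per-country rule looks like, here is a small Python sketch. The rule table, the country keys and the function `street_line` are hypothetical and only show the idea; this is not how Geocoder implements its localization.

```python
# Hypothetical rule table: True means the house number precedes the street name.
HOUSE_NUMBER_FIRST = {
    "USA": True,   # e.g. "425 Main Street"
    "DEU": False,  # e.g. "Invalidenstraße 116"
}


def street_line(country: str, street: str, house_number: str) -> str:
    """Assemble the street part of an address in the order the country expects."""
    if HOUSE_NUMBER_FIRST[country]:
        return f"{house_number} {street}"
    return f"{street} {house_number}"


print(street_line("USA", "Main Street", "425"))
print(street_line("DEU", "Invalidenstraße", "116"))
```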

1) Label: an assembled address value built out of the parsed address components. It can be used to display the address formatted on a screen.

Here is the expected output of the label.
It’s the first bold line on the right. Below you can see the individual address components:

[Image: arabic_label]

2) Address lines: Formatted address lines built out of the parsed address components. The first line consists of the street name, including prefix, directionals and street type, and the house number. The second line consists of the city name and postal code, plus in some countries the state name or abbreviation. They can be used to put the address onto an envelope or package and send something to the location.

And here is the expected formatting of the address lines as per the Saudi Post:

[Image: addressline-formatting_saudi-arabia]

Now obviously Arabic is read from right to left, and this is where it becomes challenging, especially around the formatting of the postal code and add-on code: the 5+4 number next to the city name.

We have Arabic-speaking colleagues on the team, but our computers and systems are set up in English, and being based in Germany doesn’t help either. When we look at the label and address lines on our Western, Latin-script-based systems, whether things flow from right to left depends on the client we use to look at the JSON or XML output.
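One way to take the client out of the picture is to look at the characters in the order they are stored, together with their Unicode bidirectional class, instead of relying on how a tool draws them. The following Python sketch does this with the standard `unicodedata` module. The sample string (the city name followed by a postal code) is an assumed label fragment, not actual Geocoder output, and it is written with escape sequences so that the editor's rendering cannot reorder it.

```python
import unicodedata


def describe_logical_order(text: str) -> list[tuple[str, str, str]]:
    """Return (code point, bidi class, character name) in stored order."""
    return [
        (
            f"U+{ord(ch):04X}",
            unicodedata.bidirectional(ch),  # e.g. 'AL' Arabic letter, 'EN' European digit
            unicodedata.name(ch, "<unnamed>"),
        )
        for ch in text
    ]


# Assumed sample: Arabic city name, a space, then the postal code digits.
sample = "\u0627\u0644\u0628\u0627\u062d\u0629 65525"

for code_point, bidi_class, name in describe_logical_order(sample):
    print(code_point, bidi_class, name)
```

If the stored order is the order in which the text would be read aloud, the remaining differences between tools come from how each one applies the bidirectional algorithm when drawing the line.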

This is where I need help from anyone who has experience with developing for Arabic output and running an Arabic system, because different browsers, text editors, and tools render the XML and JSON differently. Your help with looking at this and helping us understand whether we are getting it right is greatly appreciated!

As mentioned above, Geocoder returns either JSON or XML as the response format. Here are two files with a random example address and our current state of development:
[Response sample files are no longer available as development is successfully completed. Please see example request in the comments below.]

XML response

JSON response
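When comparing how different tools show a JSON response, it can help to re-serialize it with all non-ASCII characters escaped, so that the stored order is visible as plain left-to-right ASCII. Below is a minimal Python sketch using the standard `json` module. The key name `Label` and the sample value are assumptions for illustration, not a real Geocoder response.

```python
import json


def to_ascii_json(obj) -> str:
    """Serialize with every non-ASCII character written as a \\uXXXX escape."""
    return json.dumps(obj, ensure_ascii=True)


# Assumed response fragment: an Arabic city name followed by a postal code.
raw = '{"Label": "\\u0627\\u0644\\u0628\\u0627\\u062d\\u0629 65525"}'
data = json.loads(raw)

# Escaped form: pure ASCII, shown in stored order by any viewer.
print(to_ascii_json(data))

# Unescaped form: how it looks depends on the viewer's bidirectional rendering.
print(json.dumps(data, ensure_ascii=False))
```

Both forms decode to the same string, so the escaped one is only a debugging view and not a different encoding of the address.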

If you have worked with JSON and/or XML data in Arabic, then please let me know whether the formatted output in the label and address lines attributes is correct. Can you work with this as is? Any pointers on the relevance of getting this right, i.e. correct from right to left, are welcome too.

Simply reply in the comments section below.

شكرا جزيلا
