Kelvin Jackson

Localizing (Some) Keywords in Languages without Prepositions

2020-09-14

There is a wide variety of human languages in the world, and there is no technical reason why we shouldn't be able to localize programming languages into any of them. However, many English-based programming languages have keywords, such as in, that are modeled after English words with no direct translation in some other languages: Finnish, for instance, has no single word that corresponds to the English preposition in.

In that case, there are a few options. Assuming that we are designing the programming language from scratch, we can choose a non-word-based sequence (a symbol or operator) for the function of the in keyword in other languages, or structure the programming language's syntax so that such a keyword is not needed. However, if we are localizing an existing language, such as Python, we probably do not want to add too much sugar on top of the existing syntax, or else we risk confusing the user. Furthermore, adding a bunch of unfamiliar operators whose meanings cannot be easily inferred from standard mathematical notation would also make the programming language harder to learn.

Another option is to replace in with an expression meaning something like 'belonging to' or 'in the set/collection'. Either would be easy to translate into Finnish at least, and a programmer could probably guess its meaning without much trouble. However, those expressions tend to be longer, and while the added typing required by a six- or seven-letter keyword rather than a two-letter one is minimal (especially with modern IDEs), it does increase line length and make the code seem wordier, especially if you have to do this for multiple keywords.
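As a rough illustration of this option, here is a minimal sketch of a preprocessor that rewrites a localized keyword back to Python's in before the code is run. The keyword joukossa (roughly 'in the set') and the names LOCALIZED_KEYWORDS and delocalize are placeholders I made up for the example, not part of any real localization of Python. It relies only on the standard library's tokenize module, so string literals that happen to contain the word are left alone.

```python
import io
import tokenize

# Hypothetical mapping from a localized keyword to the Python keyword it
# stands for. "joukossa" is only an illustrative choice.
LOCALIZED_KEYWORDS = {"joukossa": "in"}


def delocalize(source: str) -> str:
    """Return Python source with localized keyword tokens replaced."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in LOCALIZED_KEYWORDS:
            # Swap the localized word for the real keyword.
            tokens.append((tok.type, LOCALIZED_KEYWORDS[tok.string]))
        else:
            # Everything else, including string literals, passes through.
            tokens.append((tok.type, tok.string))
    # With (type, string) pairs, untokenize does not preserve the original
    # spacing, but the result is still valid Python.
    return tokenize.untokenize(tokens)


localized = (
    "total = 0\n"
    "for x joukossa [1, 2, 3]:\n"
    "    total += x\n"
)
namespace = {}
exec(delocalize(localized), namespace)
print(namespace["total"])
```

Note that even in this toy example the localized line is six characters longer than the original, which is the wordiness trade-off described above. A real implementation would also have to deal with the localized word being used as an ordinary identifier.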

In practice, I imagine the best solution would be a mix of all three options. In a few cases, the word-based version of a particular keyword (such as if) is so well established in the programming field that trying to replace it with something other than a literal translation would cause more trouble than it would be worth. But in other cases, such as many uses of the in keyword mentioned above, there's no reason that different syntax couldn't be created that would be more amenable to decent localization.

Designing for localization is about much more than simply making a list of the strings your project uses and replacing them with the set for another language. If you've ever built any localizable product, you probably already knew that, but it bears repeating as many times as you can stand to hear it. Designing this way requires a little more work up front, but your users will thank you in the long run.