Something about Languages

Author: Martin Nilsson <nilsson@roxen.com>
Last modified: 2000-10-04 11:31:46


This article tries to give some no-nonsense information about the multilingual features of Roxen Webserver. Although it is quite technical, it should not be too hard to follow for people with only a limited knowledge of how Roxen Webserver works internally. The article was written for Roxen Webserver 2.0.46.

A webpage is requested

A request for a web page is first handled by the HTTP protocol module (assuming we're using HTTP, of course), which can be found in server/protocols/http.pike. A request is the block of data sent from the browser to the web server. It contains information such as which page is requested, the name of the browser, defined cookies etc. In your browser you can (probably) find a setting called "languages", or similar, where you can set an order of preference among the languages in which the server may send pages to you. The HTTP protocol module finds this information in the "accept-language" part of the request.

When the HTTP protocol module gets a request it always creates a preferred language object in id->misc->pref_languages. If one or more accept-language entries are found in the request, that information is copied into the preferred language object and also added to the array id->misc["accept-language"]. The language object can then be asked for the most preferred language by calling get_language(). To get an ordered list of all the languages, call get_languages().
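The step from the raw request header to an ordered language list can be modelled as below. This is a minimal Python sketch, not Roxen's Pike code: the ordering by q-value follows the general HTTP convention and is an assumption about what http.pike does, and the PreferredLanguages class is a hypothetical stand-in that only mirrors the two method names mentioned above.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language value, most preferred first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a missing q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # assumption: an unparsable weight means "not wanted"
        if quality > 0:
            # Sort key: highest quality first, then original position in the header.
            entries.append((-quality, position, tag.strip().lower()))
    entries.sort()
    return [tag for _, _, tag in entries]


class PreferredLanguages:
    """Hypothetical stand-in for the object stored in id->misc->pref_languages."""

    def __init__(self, languages):
        self.languages = list(languages)

    def get_language(self):
        # Assumption: None when no language is known; the article does not say.
        return self.languages[0] if self.languages else None

    def get_languages(self):
        return list(self.languages)


prefs = PreferredLanguages(parse_accept_language("de;q=0.5, sv, en;q=0.8"))
```

Here prefs.get_language() gives the single most wanted code, while prefs.get_languages() gives the whole ordered list that later modules work on.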

Better language selection

The "accept-language" system has several shortcomings. It is, for example, very common to want to switch between languages on a site, or to define a preferred language per site. To help with this there is a Roxen module called "Preferred Language Analyzer", server/modules/misc/preferred_languages.pike. It prepends prestates and Roxen cookies to the list of languages in the pref_languages object, i.e. the priorities are:

  1. Prestates
  2. Roxen cookies
  3. Browser settings

The preferred language analyzer also filters the languages. It always filters the list so that only valid language codes (like "en", "de", etc.) remain. Hence, if you use the prestate "(showcounters,en)", only "en" will be taken as a language. It is also possible to enter a list of the languages present on the server to filter against. If you only have your pages in English and Swedish you can enter "en, sv" in the administration interface.
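The priority order and the two filters described above can be sketched as follows. This is a Python model under stated assumptions, not the module's actual code: VALID_CODES stands in for the codes of the loaded language files, the function name analyze is hypothetical, and removing duplicates is an assumption the article does not confirm.

```python
# Assumed for illustration: the codes known from the loaded language files.
VALID_CODES = {"en", "sv", "de"}


def analyze(prestates, cookie_languages, browser_languages, present=None):
    """Merge the three sources in priority order and filter the result."""
    # 1. prestates, 2. Roxen cookie, 3. browser settings
    candidates = list(prestates) + list(cookie_languages) + list(browser_languages)
    # Optional second filter: the languages configured as present on the server.
    allowed = VALID_CODES & set(present) if present else VALID_CODES
    result = []
    for code in candidates:
        # Assumption: a code that appears in several sources is kept once.
        if code in allowed and code not in result:
            result.append(code)
    return result


# "showcounters" is a prestate but not a language code, so it is dropped.
ordered = analyze(["showcounters", "en"], ["sv"], ["de", "en"])
```

With present=["en", "sv"] the same call would also drop "de", matching the "en, sv" example above.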

There is also an option to have the preferred language analyzer propagate the language. What this really means is that the most preferred language is copied into id->misc->defines->theme_language, which in turn is used by many RXML tags. You can try this by adding the preferred language analyzer, turning on the propagation and viewing the following page:



  <html><body>
    <number num="42">
  </body></html>

You can now change the language in which the number 42 is written out, e.g. by adding the prestate (sv), which will yield "fyrtiotvå", or by changing the preferred language in your browser settings. You can access the theme_language variable through &page.theme-language;.

Valid languages

As stated above, only "valid" language codes get through. The list of valid languages comes from the language files in the /server/languages/ folder. Every file in this directory (except for abstract.pike, which is an abstract language class) represents a language and is loaded into Roxen Webserver when it boots. These language files have the following methods:
array id() Returns an array with three elements: the language's ISO 639 language code, the name of the language in English and the name of the language in the language itself.
array aliases() Returns an array with identifiers that identifies the language. The id() array is a subset of this array.
string language(string code) Returns the name of the language that has the language code code.
mapping list_languages() Returns a mapping that maps language codes to language names.
string number(int number) Returns the number as words.
string ordered(int number) Returns the number as an order number in words.
string date(int posixtime, void|mapping options) Returns the date as a string. Valid options in the options mapping are "full", "date" or "time". E.g. date(i,(["time":1])).
string month(int month) Returns the name of the month number given.
string day(int day) Returns the name of the weekday given.

These files are handled by /server/base_server/language.pike, whose methods are accessible directly from the roxen object. You can call roxen->language to fetch a function from a language, with fallback routines that take over if you ask for a nonexistent language. If you know what you are doing you can call roxen->language_low instead to get a language object, as described above. The language methods are:
function language(string language, string function) Returns the requested function from the selected language, or from the fallback language. The fallback language can be set through the environment variable ROXEN_LANG and defaults to "en".
array list_languages() Returns a list of all language codes.
object language_low(string language) Returns the requested language object.
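The difference between language and language_low can be illustrated with a small Python model. It is a sketch under assumptions, not language.pike itself: the two toy classes and the LANGUAGES registry are hypothetical, the English wording "forty-two" is made-up sample data, and the environ parameter exists only so the fallback can be demonstrated without touching the real environment.

```python
import os


class English:
    def number(self, n):
        return {42: "forty-two"}.get(n, str(n))  # toy data, one number only


class Swedish:
    def number(self, n):
        return {42: "fyrtiotvå"}.get(n, str(n))  # toy data, one number only


# Hypothetical registry standing in for the loaded language files.
LANGUAGES = {"en": English(), "sv": Swedish()}


def language_low(code):
    """Return the language object for code, or None if it is not loaded."""
    return LANGUAGES.get(code)


def language(code, function, environ=os.environ):
    """Return the named function from the language, or from the fallback language."""
    fallback = environ.get("ROXEN_LANG", "en")  # defaults to "en", as in the text
    obj = language_low(code) or language_low(fallback)
    return getattr(obj, function)


in_swedish = language("sv", "number")(42)
# "xx" is not a loaded language, so the fallback language is used instead.
via_fallback = language("xx", "number", environ={})(42)
```

The model leaves out what happens when neither the language nor the fallback exists, since the article does not describe that case.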

Change language

The preferred language analyzer also contains some functionality for easily creating language selectors, with which you can switch between languages in different ways. Once the preferred language analyzer is loaded, the emit tag can access the source "languages". Such an emit will iterate over a set of language codes and generate extra information about each of them. The set of language codes is fetched from the following sources, in order:

  1. The list provided in the langs attribute.
  2. The list found in id->misc->defines->present_languages.
  3. The list provided as the existing languages in the admin interface.

Note that it will not concatenate these lists, but take the first one available.

Inside the emit tag the following entities will be available:
code The language code.
en The language name in English.
local The language name as written in the language itself.
preurl A URL which makes this language the used one by altering prestates.
confurl A URL which makes the language the used one by altering the roxen cookie.
localized The language name as written in the currently selected language.


  <emit source="languages">
   <a href="&_.confurl;"><img src="/img/flags/&_.code;.gif" alt="&_.localized;" /></a>
  </emit>
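The rule that the emit source takes the first available list, rather than concatenating the three, can be modelled like this. It is a Python sketch with a hypothetical function name; it assumes the langs attribute has already been split into a list and that an empty list counts as "not available", neither of which the article states.

```python
def language_codes(langs_attribute, present_languages, configured_languages):
    """Return the first non-empty source, in the priority order listed above."""
    for source in (langs_attribute, present_languages, configured_languages):
        if source:  # assumption: a missing or empty list counts as unavailable
            return list(source)
    return []


# No langs attribute given, so the languages found for the page are used.
codes = language_codes(None, ["en"], ["en", "sv"])
```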

Selecting pages

After going through so much trouble to identify what language the user is interested in, it is time to deliver the goods. This can be achieved with the "Language module II", server/modules/misc/language2.pike.

After a request has been processed by the protocol module and the first modules (such as the preferred language analyzer) it is time to fetch a page from a file system module. If no page is found, Roxen Webserver continues by calling the directory modules. Being a directory module, the language module takes the prioritized list of languages and tries to load files named file.code.extension. For example, if the language list is "en", "sv" and "de" and the requested file is index.html, the language module will try to get "index.en.html", "index.sv.html" and "index.de.html", in that order. The selected language will have its code copied to id->misc->defines->language, and a list of all found languages is copied to the multiset id->misc->defines->present_languages.
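The file.code.extension lookup can be sketched as below. This is a Python model, not language2.pike: both function names are hypothetical, existing_files stands in for the file system, files without an extension are simply skipped, and the set of present languages is limited to the preferred codes, which may differ from what the real module records.

```python
def candidate_names(filename, languages):
    """Build file.code.extension names in the order of the language list."""
    base, dot, extension = filename.rpartition(".")
    if not dot:
        return []  # assumption: the article does not cover extensionless files
    return ["%s.%s.%s" % (base, code, extension) for code in languages]


def select_page(filename, languages, existing_files):
    """Return (chosen file, chosen language, set of languages found)."""
    found = [
        (code, name)
        for code, name in zip(languages, candidate_names(filename, languages))
        if name in existing_files
    ]
    present = {code for code, _ in found}
    if not found:
        return None, None, present
    code, name = found[0]  # the most preferred language that exists wins
    return name, code, present


names = candidate_names("index.html", ["en", "sv", "de"])
chosen = select_page("index.html", ["en", "sv", "de"],
                     {"index.sv.html", "index.de.html"})
```

In the second call there is no English version, so the Swedish page is chosen while both "sv" and "de" are reported as present.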

Naturally some of the internal stuff is exposed to RXML through entities:
page.language The current language, i.e. id->misc->defines->language.
page.theme-language The current language theme, i.e. id->misc->defines->theme_language.
client.accept-language The first language in the accept-language list, i.e. id->misc["accept-language"][0].
client.accept-languages All the accept-languages, i.e. id->misc["accept-language"]*", ".
client.language The most wanted language, i.e. id->misc->pref_languages->get_language().
client.languages The complete list of wanted languages, i.e. id->misc->pref_languages->get_languages()*", ".
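In the Pike expressions above, multiplying an array of strings by ", " joins the elements with that separator. The four client entities can therefore be modelled as follows. This is a Python sketch with a hypothetical function name, and returning an empty string when a list is empty is an assumption.

```python
def client_entities(accept_language, pref_languages):
    """Model the client.* entities from the two language lists."""
    return {
        "client.accept-language": accept_language[0] if accept_language else "",
        "client.accept-languages": ", ".join(accept_language),
        "client.language": pref_languages[0] if pref_languages else "",
        "client.languages": ", ".join(pref_languages),
    }


# The browser sent "sv, en"; a prestate put "de" first in the preferred list.
entities = client_entities(["sv", "en"], ["de", "sv", "en"])
```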

The future

Since most of this language system is completely new, there will probably be bugs, missing features and features that ought to be changed. You are all very welcome to contribute your ideas, problems and patches to the community and to Roxen IS. A good place to start is to look at the language files in the language directory and make sure that they are correct and up to date. A lot is missing there since the format was expanded.



[Figure: Request flow chart]