Internationalizing awk Programs

gawk provides the following variables and functions for internationalization:

TEXTDOMAIN
This variable indicates the application's text domain. For compatibility with GNU gettext, the default value is "messages".
_"your message here"
String constants marked with a leading underscore are candidates for translation at runtime. String constants without a leading underscore are not translated.
dcgettext(string [, domain [, category]])
This built-in function returns the translation of string in text domain domain for locale category category. The default value for domain is the current value of TEXTDOMAIN. The default value for category is "LC_MESSAGES".

If you supply a value for category, it must be a string equal to one of the known locale categories described in the previous section. You must also supply a text domain. Use TEXTDOMAIN if you want to use the current domain.

Caution: The order of arguments to the awk version of the dcgettext function is purposely different from the order for the C version. The awk version's order was chosen to be simple and to allow for reasonable awk-style default arguments.

dcngettext(string1, string2, number [, domain [, category]])
This built-in function returns the translation, in the plural form appropriate for number, of the message whose variants are string1 and string2, in text domain domain for locale category category. string1 is the English singular variant of the message, and string2 is the English plural variant of the same message. The default value for domain is the current value of TEXTDOMAIN. The default value for category is "LC_MESSAGES".

The remarks made for the dcgettext function, about supplying category and domain and about the order of the arguments, apply here as well.

bindtextdomain(directory [, domain])
This built-in function allows you to specify the directory in which gettext looks for .mo files, in case they will not or cannot be placed in the standard locations (e.g., during testing). It returns the directory in which domain is "bound."

The default domain is the value of TEXTDOMAIN. If directory is the null string (""), then bindtextdomain returns the current binding for the given domain.
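
The functions described above can be tried together in a short sketch. The domain name "guide" and the directory "testdir" are placeholders chosen for illustration, and user_count is a hypothetical helper, not part of gawk. When no translation catalog is found for the domain, the gettext functions are expected to hand back their English arguments, so the sketch should still run without any .mo file:

```awk
# Sketch only: "guide" and "testdir" are placeholder names, and
# user_count() is a helper invented for this example.

# Return a count message, letting dcngettext choose between the
# singular and plural variants for n.
function user_count(n)
{
    return sprintf(dcngettext("%d user logged in", "%d users logged in", n), n)
}

BEGIN {
    TEXTDOMAIN = "guide"

    # Supplying a category requires supplying a domain too;
    # pass TEXTDOMAIN to stay in the current domain.
    print dcgettext("hello, world", TEXTDOMAIN, "LC_MESSAGES")

    print user_count(1)
    print user_count(3)

    # Bind the current domain to a directory, then query that
    # binding by passing the null string as the directory.
    bindtextdomain("testdir")
    print bindtextdomain("")
}
```

Note that the awk argument order puts the string first and the domain second, which is what lets the calls to dcngettext above omit both domain and category.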

To use these facilities in your awk program, follow the steps outlined in the previous section, like so:

  1. Set the variable TEXTDOMAIN to the text domain of your program. This is best done in a BEGIN rule (see The BEGIN and END Special Patterns), but it can also be done via the -v command-line option (see Command-Line Options):
    BEGIN {
        TEXTDOMAIN = "guide"
        ...
    }
    
  2. Mark all translatable strings with a leading underscore (_) character. It must be adjacent to the opening quote of the string. For example:
    print _"hello, world"
    x = _"you goofed"
    printf(_"Number of users is %d\n", nusers)
    
  3. If you are creating strings dynamically, you can still translate them, using the dcgettext built-in function:
    message = nusers " users logged in"
    message = dcgettext(message, "adminprog")
    print message
    

    Here, the call to dcgettext supplies a different text domain ("adminprog") in which to find the message, but it uses the default "LC_MESSAGES" category.

  4. During development, you might want to put the .mo file in a private directory for testing. This is done with the bindtextdomain built-in function:
    BEGIN {
       TEXTDOMAIN = "guide"   # our text domain
       if (Testing) {
           # where to find our files
           bindtextdomain("testdir")
           # joe is in charge of adminprog
           bindtextdomain("../joe/testdir", "adminprog")
       }
       ...
    }
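
The four steps above can be assembled into one small program. This is a minimal sketch rather than a finished tool: the domain "guide", the directory "testdir", the Testing variable, and the fixed value of nusers are illustrative, and users_message is a hypothetical helper. A string built at runtime is looked up as a whole, so a translation is found only if the catalog contains that exact message.

```awk
# Sketch only: "guide", "testdir", Testing, and the sample value of
# nusers are illustrative.
# Possible invocation:  gawk -v Testing=1 -f guide.awk

# Step 3: translate a string that is built at runtime.
function users_message(n,    message)
{
    message = n " users logged in"
    return dcgettext(message)          # looked up in TEXTDOMAIN
}

BEGIN {
    TEXTDOMAIN = "guide"               # step 1: set the text domain

    if (Testing)                       # step 4: private .mo directory
        bindtextdomain("testdir")

    nusers = 3                         # sample value for the sketch

    print _"hello, world"              # step 2: marked string constants
    printf(_"Number of users is %d\n", nusers)
    print users_message(nusers)
}
```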
    

See A Simple Internationalization Example, for an example program showing the steps to create and use translations from awk.