The ISO-639 standard lists languages and standardizes the codes used to designate them, as well as the English and French names of these languages.
Each language in the ISO-639 list has a three-letter code. The most widely used languages also have a two-letter code.
On Unix systems, the two-letter codes are used in locale files, so these codes should be used when they exist for a given language. When there is no two-letter code, the three-letter code must be used in the locale file.
On Debian systems, the iso-codes package includes a list of all ISO-639 material. This list must be used as a reference when in doubt about the code of a given language.
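The rule above (two-letter code when one exists, three-letter code otherwise) can be sketched in Python. This is only an illustration: the path /usr/share/iso-codes/json/iso_639-2.json, the top-level key "639-2" and the field names alpha_2, alpha_3 and name are assumptions about how the iso-codes package lays out its data and should be checked against the installed version. The helper names and the sample entries are hypothetical.

```python
import json
from pathlib import Path

# Assumed location and layout of the iso-codes data; verify on your system.
ISO_639_2_JSON = Path("/usr/share/iso-codes/json/iso_639-2.json")


def load_entries(path=ISO_639_2_JSON):
    """Return the list of language entries from the iso-codes JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)["639-2"]


def locale_code(entry):
    """Prefer the two-letter code; fall back to the three-letter code."""
    return entry.get("alpha_2") or entry["alpha_3"]


def find_code(entries, english_name):
    """Look up the locale code for a language by its English name."""
    wanted = english_name.lower()
    for entry in entries:
        if entry["name"].lower() == wanted:
            return locale_code(entry)
    return None


# Hand-written, simplified sample entries in the assumed layout.
SAMPLE = [
    {"alpha_2": "fr", "alpha_3": "fra", "name": "French"},
    {"alpha_3": "ast", "name": "Asturian"},
]
```

With the sample data, find_code(SAMPLE, "French") gives "fr", while find_code(SAMPLE, "Asturian") falls back to "ast" because that entry has no two-letter code. On a system with iso-codes installed, load_entries() could replace SAMPLE, provided the assumed layout holds.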
Some languages have variants, for instance depending on the region where they are spoken. The distinction must be based on the written form of the language: variants that differ only in the spoken language are out of scope here.
When variants differ in how the same language is written and no separate ISO codes exist, the country/region part of the locale should be used. Translators who are considering doing so must first contact the Debian Installer i18n coordinators. Such cases should be kept to a very small number and should only arise when the two written forms differ too much to share a single translation.
The currently accepted variants are:
zh_CN/zh_TW: Simplified Chinese and Traditional Chinese. These two ways of writing the Chinese language have their own translation teams (but only one mailing list). They should not be confused with Mandarin, Cantonese or any other spoken form of Chinese, across which the written form does not change.
pt/pt_BR: classical Portuguese is used in Portugal as well as in nearly all former Portuguese colonies. Brazilian Portuguese differs substantially from it and is, of course, used in Brazil. Two different translation sets exist for Debian packages, though both teams work closely together (there is, for instance, only one mailing list).
pa_IN/pa_PK: the Punjabi language has only one spoken form, but two written forms: one using Indic-style glyphs (Gurmukhi), mostly used in India (Punjab state), and another one, using an Arabic script, mostly used in Pakistan.
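As a rough illustration of the policy above, a locale name can be checked against the accepted variants. The set below is copied from this list and would need updating whenever the coordinators accept a new variant; the function names are hypothetical and the check is a sketch, not a tool used by the Debian Installer team.

```python
# Accepted variants listed above; everything else uses a bare language code.
ACCEPTED_VARIANTS = {"zh_CN", "zh_TW", "pt_BR", "pa_IN", "pa_PK"}


def split_locale(name):
    """Split 'll_CC' into (language, region); region is None if absent."""
    language, _, region = name.partition("_")
    return language, region or None


def is_acceptable(name):
    """True for a bare ISO-639 style code or one of the accepted variants."""
    language, region = split_locale(name)
    if region is None:
        # Bare code: two or three lowercase letters.
        return language.isalpha() and language.islower() and len(language) in (2, 3)
    return name in ACCEPTED_VARIANTS
```

Here is_acceptable("pt") and is_acceptable("pt_BR") both hold, while a name such as "es_AR" is rejected because no Spanish variant has been accepted.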
FAQ about variants:
there are no en_GB/en_US variants in Debian. The choice between British and American spelling is not strictly standardized in Debian. The installer team has, however, decided to follow US-style spelling.
the Spanish team has chosen to keep only one variant even though some differences exist in the way Spanish is spoken in Spain and Latin America, for instance.