RecAPI
Functions such as kRecGetLanguageInfo and kRecFindLanguageEx let you access or specify languages by their English names or by their codes in a range of standards. Some standards do not cover all of our languages; these cases are noted below. Chinese, Portuguese and Serbian pose a different problem: for each of these three languages we use two separate language codes, based on the country or the script used: Simplified and Traditional Chinese, Portuguese (Portugal) and Brazilian Portuguese, Cyrillic and Latin Serbian. The ISO 639-x standards do not make this distinction. The handling of these languages is described at "In case of language-pairs" below.
CSDK internal code: This is the last 3 letters of the name we use in the LANGUAGES enum, i.e. the 3 letters following "LANG_". Note that the kRecFindLanguage family of functions accepts both the 3-letter internal codes and the 8-character "LANG_"-prefixed ones. This identifier uniquely identifies any of our languages, whereas the standard ISO identifiers (described below) all lack some of them. The ISO 639-3 superset we use also identifies all of our languages uniquely, so if you want a close-to-standard identifier, consider using it instead of the internal code. BCP 47 identifiers are supported as well, and they also identify all of our languages uniquely (see below).
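Since both forms of the internal code are accepted, an application may want to store just one of them. The following is a minimal sketch of such a normalization; `to_prefixed_lang_code` is a hypothetical helper written for this illustration, not a CSDK function, and it only checks the length and the prefix, not whether the code names an existing language.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of the CSDK): writes the "LANG_"-prefixed
   form of an internal code into out, which must hold at least 9 bytes.
   Accepts either the 3-letter form ("ENG") or the prefixed form ("LANG_ENG").
   Returns 0 on success, -1 if the input has neither shape. */
int to_prefixed_lang_code(const char *code, char out[9])
{
    assert(code != NULL && out != NULL);
    size_t len = strlen(code);

    if (len == 8 && strncmp(code, "LANG_", 5) == 0) {
        memcpy(out, code, 9);          /* already prefixed: copy with the NUL */
        return 0;
    }
    if (len == 3) {
        memcpy(out, "LANG_", 5);       /* add the prefix */
        memcpy(out + 5, code, 4);      /* 3 letters plus the NUL */
        return 0;
    }
    return -1;
}
```

For example, passing "ENG" or "LANG_ENG" both leave "LANG_ENG" in the output buffer, while "en" is rejected.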
ISO 639-1: Some of our languages are not supported by this standard; in such cases kRecGetLanguageInfo returns an empty string in Name_639_1 (see the LANGUAGE_INFO structure). In case of language-pairs the ISO 639-1 standard code is returned for both languages. The kRecFindLanguage family of functions supports both the 2-letter language codes (e.g. en) and the combined BCP 47 style language and country codes (e.g. en-US). Sometimes the country code is needed for the correct mapping (e.g. pt-BR for Brazilian Portuguese, zh-TW for Traditional Chinese), while sometimes the script code is necessary (e.g. sr-Cyrl for Cyrillic Serbian and sr-Latn for Latin Serbian). The table below shows the combined codes that may be necessary, but other combinations not shown in the table are supported as well. Note that kRecGetLanguageInfo always returns only the basic 2-letter code in the Name_639_1 field. See also the Name_BCP_47 field (in the LANGUAGE_INFO structure).
ISO 639-2/B: Some of our languages are not supported by this standard; in such cases kRecGetLanguageInfo returns an empty string in Name_639_2B. In case of language-pairs the ISO 639-2/B standard code is returned for the first language of the language-pair only, while an empty string is returned for the second language. Note that some of the identifier codes in our 639-2/B implementation do not refer to a specific language, but to a language collection. These identifiers are:
ISO 639-3: This standard can map one-to-one to our languages, because it provides a way to define local language codes. The local codes used are:
In case of language-pairs the ISO 639-3 standard code is returned for the first language of the language-pair only, while a local code is returned for the second language. The Sami language is a language collection that does not have an ISO 639-3 code but has an ISO 639-2/B code. In this case we use the capitalized 639-2/B code (SMI) as an extended 639-3 code. This is not a standard 639-3 name, but adding it made the 639-3 superset able to identify all of our languages. (In previous versions MYN, NAH and WEN were also used as similar uppercase extensions for the Mayan, Nahuatl and Sorbian language collections; now we use the standard codes quc, nci and hsb instead, which are the 639-3 codes of the largest languages of those families. Simplified Chinese used to be coded with the local code qcs; now zho is returned. Visayan used to be coded as qis; now it is ceb. All the previously used names can still be used with the kRecFindLanguage family of functions to maintain backwards compatibility.)
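An application that stored the older extended names may want to compare them with the names returned now. The sketch below maps the five legacy names mentioned above to their current replacements; `current_639_3_name` and the table `kLegacy639_3` are hypothetical names used for this illustration only and are not part of the CSDK. The comparison is case-sensitive on purpose, because the uppercase extensions differ from standard codes only by case.

```c
#include <assert.h>
#include <string.h>

/* Legacy extended 639-3 names and their current replacements,
   taken from the paragraph above. */
static const struct {
    const char *legacy;
    const char *current;
} kLegacy639_3[] = {
    { "MYN", "quc" },   /* Mayan */
    { "NAH", "nci" },   /* Nahuatl */
    { "WEN", "hsb" },   /* Sorbian */
    { "qcs", "zho" },   /* Simplified Chinese */
    { "qis", "ceb" },   /* Visayan */
};

/* Hypothetical helper (not part of the CSDK): returns the current name for
   a legacy one, or the input pointer itself if the name is not a legacy one. */
const char *current_639_3_name(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof kLegacy639_3 / sizeof kLegacy639_3[0]; i++) {
        if (strcmp(name, kLegacy639_3[i].legacy) == 0)
            return kLegacy639_3[i].current;
    }
    return name;
}
```

This is only needed on the application side; the legacy names themselves remain valid input for the kRecFindLanguage family of functions.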
BCP 47: This is a newer standard that uniquely identifies all the languages supported by any of the previous 639-x standards. kRecGetLanguageInfo returns such language names (e.g. en-US or sr-Latn-RS) in Name_BCP_47. These names are mostly based on the 639-1 two-letter standard, but sometimes 3-letter language codes are used (e.g. hsb-DE). The country-code component is usually 2 letters, but in rare cases a 3-digit code is used (e.g. pap-029). For backwards compatibility, the kRecFindLanguageEx function accepts these names both as BCP 47 codes and as 639-1 codes.
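To illustrate the name shapes described above, here is a minimal sketch that splits such a name into its language, script and country components. It is not a full BCP 47 parser: it assumes only the shapes mentioned in this section (a 2- or 3-letter language, an optional 4-letter script, an optional 2-letter or 3-digit country code). `TagParts` and `split_language_tag` are hypothetical names, not CSDK definitions.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef struct {
    char language[4];  /* 2 or 3 letters, e.g. "en", "hsb" */
    char script[5];    /* 4 letters or empty, e.g. "Latn" */
    char region[4];    /* 2 letters or 3 digits or empty, e.g. "US", "029" */
} TagParts;

/* Returns 1 if the first n characters of s are all letters (letters != 0)
   or all digits (letters == 0). */
static int is_run(const char *s, size_t n, int letters)
{
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)s[i];
        if (letters ? !isalpha(c) : !isdigit(c))
            return 0;
    }
    return 1;
}

/* Hypothetical helper: splits tag into out. Returns 0 on success,
   -1 if the tag does not have one of the shapes described above. */
int split_language_tag(const char *tag, TagParts *out)
{
    assert(tag != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    /* Language subtag: everything up to the first '-' (or the end). */
    size_t len = strcspn(tag, "-");
    if ((len != 2 && len != 3) || !is_run(tag, len, 1))
        return -1;
    memcpy(out->language, tag, len);
    tag += len;

    /* Optional script first, then optional country code. */
    while (*tag == '-') {
        tag++;
        len = strcspn(tag, "-");
        if (len == 4 && is_run(tag, len, 1) && !out->script[0] && !out->region[0])
            memcpy(out->script, tag, len);
        else if (!out->region[0] &&
                 ((len == 2 && is_run(tag, len, 1)) || (len == 3 && is_run(tag, len, 0))))
            memcpy(out->region, tag, len);
        else
            return -1;
        tag += len;
    }
    return 0;
}
```

With this sketch, "sr-Latn-RS" yields the parts "sr", "Latn" and "RS", and "pap-029" yields "pap", an empty script and "029".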
Windows 3 letter code: This is the GetLocaleInfo(LOCALE_SABBREVLANGNAME) identifier that is sometimes called ISO 639x in the Microsoft documentation; however, it does not match any of the ISO standards. It is mostly (but not always) based on the ISO 639-1 two-letter codes, with a third character added. It looks similar to the ISO 639-2/B and ISO 639-3 standards, but some of its codes differ (e.g. the ISO identifier of the Swedish language is swe, while the Windows identifier is SVE). Some of our languages are not supported by the Windows 3 letter codes; in such cases kRecGetLanguageInfo returns an empty string in Name_Win_3.
Windows locale ID (LCID): The kRecGetLanguageInfo function returns this identifier as an integer. The kRecFindLanguages and kRecFindLanguageEx functions can use this numeric value directly to find a language, while the kRecFindLanguage and kRecSetUILang functions support a special string representation: "LCID_12345", i.e. the decimal value of the locale ID prefixed by the "LCID_" string. In C/C++ it can be generated with the "LCID_%d" sprintf format. Some of our languages are not supported by the LCID codes; in such cases kRecGetLanguageInfo returns the ID of English (0x0409) in LangID.
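The string representation can be produced as sketched below, using the "LCID_%d" format mentioned above. `format_lcid_name` is a hypothetical helper written for this illustration, not a CSDK function; it uses snprintf rather than sprintf so that a short buffer is reported instead of being overrun.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the CSDK): formats a Windows locale ID
   as "LCID_<decimal value>". Returns 0 on success, -1 if the buffer is
   too small to hold the whole string. */
int format_lcid_name(int lcid, char *buf, size_t buf_size)
{
    assert(buf != NULL);
    int n = snprintf(buf, buf_size, "LCID_%d", lcid);
    return (n >= 0 && (size_t)n < buf_size) ? 0 : -1;
}
```

For the English ID 0x0409 (decimal 1033) this produces "LCID_1033". A 16-byte buffer is enough for any 32-bit value.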
EnglishName field. With the simpler kRecFindLanguage function you don't have such a choice; in case of conflicting languages it returns the most relevant of the possible languages (usually based on the character case) with a warning. If you use the CSDK internal code with kRecFindLanguage, it is best to use it with the "LANG_" prefix. We suggest using kRecFindLanguageEx instead of these functions. With kRecFindLanguageEx you can stick to a single standard, and it remains the most flexible function even when the general query with LANGCODE_ALL is used. In case of conflicting languages kRecFindLanguageEx returns the most relevant of the possible languages (usually based on the character case) with a warning, and the other, conflicting language can be retrieved as well.
The table below shows all supported languages with their supported names and identifiers. Some cells list multiple identifiers; in those cases the first ID is the one returned by kRecGetLanguageInfo, while the other IDs can also be used with the kRecFindLanguage family of functions.
Name | Alternate Name | enum name | Script | Continent | ISO 639-3 | ISO 639-2/B | ISO 639-1 | BCP 47 | Windows | Microsoft ID |
English | LANG_ENG | Latin | Europe, Oceania, North America | eng | eng | en | en-US | ENU, ENG, ENA, ENC, ENZ | 0409, 0809, 0C09, 1009, 1409 | |
German | LANG_GER | Latin | Europe | deu | ger | de | de-DE | DEU, GER | 0407 | |
French | LANG_FRE | Latin | Europe, North America | fra | fre | fr | fr-FR | FRA, FRC | 040C, 0C0C | |
Dutch | Flemish | LANG_DUT | Latin | Europe | nld | dut | nl | nl-NL | NLD | 0413 |
Norwegian | Bokmal, Nynorsk | LANG_NOR | Latin | Europe | nor | nor | no, nb, nn | no-NO | NOR, NON | 0414, 0814 |
Swedish | LANG_SWE | Latin | Europe | swe | swe | sv | sv-SE | SVE | 041D | |
Finnish | LANG_FIN | Latin | Europe | fin | fin | fi | fi-FI | FIN | 040B | |
Danish | LANG_DAN | Latin | Europe | dan | dan | da | da-DK | DAN | 0406 | |
Icelandic | LANG_ICE | Latin | Europe | isl | ice | is | is-IS | ISL | 040F | |
Portuguese | LANG_POR | Latin | Europe | por | por | pt, pt-PT | pt-PT | PTG | 0816 | |
Spanish | LANG_SPA | Latin | Europe, Latin America | spa | spa | es | es-ES | ESN, ESM | 0C0A, 080A | |
Catalan | Catalonian, Valencian | LANG_CAT | Latin | Europe | cat | cat | ca | ca-ES | CAT | 0403 |
Galician | Gallegan | LANG_GAL | Latin | Europe | glg | glg | gl | gl-ES | GLC | 0456 |
Italian | LANG_ITA | Latin | Europe | ita | ita | it | it-IT | ITA | 0410 | |
Maltese | LANG_MAL | Latin | Europe | mlt | mlt | mt | mt-MT | MLT | 043A | |
Greek | LANG_GRE | Greek | Europe | ell | gre | el | el-GR | ELL | 0408 | |
Polish | LANG_POL | Latin | Europe | pol | pol | pl | pl-PL | PLK | 0415 | |
Czech | LANG_CZH | Latin | Europe | ces | cze | cs | cs-CZ | CSY | 0405 | |
Slovak | LANG_SLK | Latin | Europe | slk | slo | sk | sk-SK | SKY | 041B | |
Hungarian | LANG_HUN | Latin | Europe | hun | hun | hu | hu-HU | HUN | 040E | |
Slovenian | LANG_SLN | Latin | Europe | slv | slv | sl | sl-SI | SLV | 0424 | |
Croatian | LANG_CRO | Latin | Europe | hrv | scr | hr | hr-HR | HRV, HRB | 041A, 001A | |
Romanian | Rumanian | LANG_ROM | Latin | Europe | ron | rum | ro | ro-RO | ROM | 0418 |
Albanian | LANG_ALB | Latin | Europe | sqi | alb | sq | sq-AL | SQI | 041C | |
Turkish | LANG_TUR | Latin | Europe, Asia | tur | tur | tr | tr-TR | TRK | 041F | |
Estonian | LANG_EST | Latin | Europe | est | est | et | et-EE | ETI | 0425 | |
Latvian | LANG_LAT | Latin | Europe | lav | lav | lv | lv-LV | LVI | 0426 | |
Lithuanian | LANG_LIT | Latin | Europe | lit | lit | lt | lt-LT | LTH | 0427 | |
Esperanto | LANG_ESP | Latin | International | epo | epo | eo | eo-001 | |||
Serbian(Latin) | Bosnian | LANG_SRL | Latin | Europe | qsl | "", sr-Latn, Lt-sr, bs | sr-Latn-RS | SRL, SRS, SRM, SRP, BSB | 081A, 181A, 241A, 2C1A, 701A, 141A, 681A | |
Serbian | LANG_SRB | Cyrillic | Europe | srp | srp, scc | sr, sr-Cyrl, Cy-sr | sr-Cyrl-RS | SRB, SRN, SRO, SRQ, BSC | 0C1A, 1C1A, 281A, 301A, 6C1A, 7C1A, 201A, 641A, 781A | |
Macedonian | LANG_MAC | Cyrillic | Europe | mkd | mac | mk | mk-MK | MKI | 042F | |
Moldavian | Moldovan | LANG_MOL | Cyrillic | Europe | mol | mol | mo, ro-MO | ro-MO | 0818 | |
Bulgarian | LANG_BUL | Cyrillic | Europe | bul | bul | bg | bg-BG | BGR | 0402 | |
Byelorussian | Belarusian, Belarusan | LANG_BEL | Cyrillic | Europe | bel | bel | be | be-BY | BEL | 0423 |
Ukrainian | LANG_UKR | Cyrillic | Europe | ukr | ukr | uk | uk-UA | UKR | 0422 | |
Russian | LANG_RUS | Cyrillic | Europe, Asia | rus | rus | ru | ru-RU | RUS | 0419 | |
Chechen | LANG_CHE | Cyrillic | Asia | che | che | ce | ce-RU | |||
Kabardian | LANG_KAB | Cyrillic | Asia | kbd | kbd | kbd-RU | ||||
Afrikaans | LANG_AFR | Latin | Africa | afr | afr | af | af-ZA | AFK | 0436 | |
Aymara | LANG_AYM | Latin | Latin America | aym | aym | ay | ay-BO | |||
Basque | LANG_BAS | Latin | Europe | eus | baq | eu | eu-ES | EUQ | 042D | |
Bemba | Ichibemba | LANG_BEM | Latin | Africa | bem | bem | bem-ZM | |||
Blackfoot | Siksika | LANG_BLA | Latin | North America | bla | bla | bla-CA | |||
Breton | LANG_BRE | Latin | Europe | bre | bre | br | br-FR | BRE | 047E | |
Brazilian | LANG_BRA | Latin | Latin America | qbp | "", pt-BR | pt-BR | PTB | 0416 | ||
Bugotu | Bughotu | LANG_BUG | Latin | Oceania | bgt | bgt-SB | ||||
Chamorro | LANG_CHA | Latin | Oceania | cha | cha | ch | ch-MP | |||
Tswana(Chuana) | Chuana, Setswana | LANG_CHU | Latin | Africa | tsn | tsn | tn | tn-ZA | TSN, TNA | 0432 |
Corsican | LANG_COR | Latin | Europe | cos | cos | co | co-FR | COS | 0483 | |
Crow | LANG_CRW | Latin | North America | cro | cro-US | |||||
Eskimo | Inuit | LANG_ESK | Latin | Europe, North America | qes, esx | esx-CA | ||||
Faroese | LANG_FAR | Latin | Europe | fao | fao | fo | fo-FO | FOS | 0438 | |
Fijian | LANG_FIJ | Latin | Oceania | fij | fij | fj | fj-FJ | |||
Frisian | LANG_FRI | Latin | Europe | fry | fry | fy | fy-NL | FYN | 0462 | |
Friulian | LANG_FRU | Latin | Europe | fur | fur | fur-IT | ||||
Gaelic(Irish) | Irish | LANG_GLI | Latin | Europe | gle | gle | ga, gd-IE | ga-IE | IRE | 083C |
Gaelic(Scottish) | Scottish | LANG_GLS | Latin | Europe | gla | gla | gd | gd-GB | GLA | 0491, 043C |
Ganda(Luganda) | Luganda | LANG_GAN | Latin | Africa | lug | lug | lg | lg-UG | ||
Guarani | LANG_GUA | Latin | Latin America | grn | grn | gn | gn-PY | 0474 | ||
Hani | LANG_HAN | Latin | Asia | hni | hni-CN | |||||
Hawaiian | LANG_HAW | Latin | Oceania | haw | haw | haw-US | 0475 | |||
Ido | LANG_IDO | Latin | International | ido | ido | io | io-001 | |||
Indonesian | LANG_IND | Latin | Asia | ind | ind | id | id-ID | IND | 0421 | |
Interlingua | LANG_INT | Latin | International | ina | ina | ia | ia-001 | |||
Kasub | Kashubian | LANG_KAS | Latin | Europe | csb | csb | csb-PL | |||
Kawa | Wa, Blang | LANG_KAW | Latin | Asia | wbm | wbm-MM | ||||
Kikuyu | Gikuyu | LANG_KIK | Latin | Africa | kik | kik | ki | ki-KE | ||
Kongo | LANG_KON | Latin | Africa | kon | kon | kg | kg-CD | |||
Kpelle | LANG_KPE | Latin | Africa | kpe | kpe | kpe-LR | ||||
Kurdish | LANG_KUR | Latin | Asia | kur | kur | ku | ku-Latn-TR | |||
Latin | LANG_LTN | Latin | International | lat | lat | la | la-001 | 0476 | ||
Luba | LANG_LUB | Latin | Africa | lua | lua | lua-CD | ||||
Luxembourgish | Luxembourgian, Letzeburgesch, Luxembourgeois | LANG_LUX | Latin | Europe | ltz | ltz | lb | lb-LU | LBX | 046E |
Malagasy | LANG_MLG | Latin | Africa | mlg | mlg | mg | mg-MG | |||
Malay | LANG_MLY | Latin | Asia | msa | may | ms | ms-MY | MSL | 043E | |
Malinke | Maninkakan, Maninka | LANG_MLN | Latin | Africa | emk, mlq | emk-GN | ||||
Maori | LANG_MAO | Latin | Oceania | mri | mao | mi | mi-NZ | MRI | 0481 | |
Mayan | K'iche' | LANG_MAY | Latin | Latin America | quc, myn | myn | myn-GT | 0486 | ||
Miao | Hmong, Mong | LANG_MIA | Latin | Asia | hmn | hmn | hmn-CN | |||
Minangkabau | Minankabaw | LANG_MIN | Latin | Asia | min | min | min-ID | |||
Mohawk | LANG_MOH | Latin | North America | moh | moh | moh-CA | MWK | 047C | ||
Nahuatl | Aztec | LANG_NAH | Latin | Latin America | nci, nah | nah | nah-MX | |||
Nyanja | Chewa, Chichewa | LANG_NYA | Latin | Africa | nya | nya | ny | ny-ZW | ||
Occidental | Interlingue | LANG_OCC | Latin | International | ile, occ | ile | ie | ie-001 | ||
Ojibway | Ojibwa | LANG_OJI | Latin | North America | oji | oji | oj | oj-CA | ||
Papiamento | LANG_PAP | Latin | Latin America | pap | pap | pap-029 | 0479 | |||
PidginEnglish | Tok Pisin | LANG_PID | Latin | Oceania | tpi | tpi | tpi-PG | |||
Provencal | Occitan | LANG_PRO | Latin | Europe | oci, prv | oci | oc | oc-FR | OCI | 0482 |
Quechua | LANG_QUE | Latin | Latin America | que, quz | que | qu | qu-PE | QUE | 086B | |
Rhaetic | Romansh | LANG_RHA | Latin | Europe | roh | roh | rm | rm-CH | RMC | 0417 |
Romany | LANG_ROY | Latin | Europe | rom | rom | rom-001 | ||||
Rwanda | Ruanda, Kinyarwanda | LANG_RUA | Latin | Africa | kin | kin | rw | rw-RW | KIN | |
Rundi | LANG_RUN | Latin | Africa | run | run | rn | rn-BI | |||
Samoan | LANG_SAM | Latin | Oceania | smo | smo | sm | sm-WS | |||
Sardinian | LANG_SAR | Latin | Europe | srd | srd | sc | sc-IT | |||
Shona | LANG_SHO | Latin | Africa | sna | sna | sn | sn-ZW | |||
Sioux | Dakota | LANG_SIO | Latin | North America | dak | dak | dak-US | |||
Sami | LANG_SMI | Latin | Europe | SMI | smi | smi-NO | "", SZI | 003B | ||
Sami(Lule) | Lule Sami | LANG_SML | Latin | Europe | smj | smj | smj-NO | SMJ | 103B | |
Sami(Northern) | Northern Sami | LANG_SMN | Latin | Europe | sme | sme | se | se-NO | SME | 043B |
Sami(Southern) | Southern Sami | LANG_SMS | Latin | Europe | sma | sma | sma-NO | SMA | 183B | |
Somali | LANG_SOM | Latin | Africa | som | som | so | so-SO | 0477 | ||
Sotho | Sesotho, Sutu | LANG_SOT | Latin | Africa | sot | sot | st | st-ZA | 0430 | |
Sundanese | LANG_SUN | Latin | Asia | sun | sun | su | su-Latn-ID | |||
Swahili | Kiswahili | LANG_SWA | Latin | Africa | swa | swa | sw | sw-KE | SWK | 0441 |
Swazi | Swati | LANG_SWZ | Latin | Africa | ssw | ssw | ss | ss-SZ | ||
Tagalog | Filipino, Pilipino | LANG_TAG | Latin | Asia | tgl, fil | tgl, fil | tl | tl-PH | FPO | 0464 |
Tahitian | LANG_TAH | Latin | Oceania | tah | tah | ty | ty-PF | |||
Pirez | Tinpo | LANG_TIN | Latin | Europe, Asia, Africa | qti | qti-001 | ||||
Tongan | Tonga | LANG_TON | Latin | Oceania | ton | ton | to | to-TO | ||
Tun | Tunia | LANG_TUN | Latin | Asia | tug | tug-TD | ||||
Visayan | Cebuano | LANG_VIS | Latin | Asia | ceb, qis | ceb-PH | ||||
Welsh | LANG_WEL | Latin | Europe | cym | wel | cy | cy-GB | CYM | 0452 | |
Sorbian(Wend) | Wend | LANG_WEN | Latin | Europe | hsb, dsb, wen | wen, hsb, dsb | "", sb | wen-DE | HSB, DSB | 042E |
Wolof | LANG_WOL | Latin | Africa | wol | wol | wo | wo-SN | WOL | 0488 | |
Xhosa | LANG_XHO | Latin | Africa | xho | xho | xh | xh-ZA | XHO | 0434 | |
Zapotec | LANG_ZAP | Latin | Latin America | zap | zap | zap-MX | ||||
Zulu | LANG_ZUL | Latin | Africa | zul | zul | zu | zu-ZA | ZUL | 0435 | |
Japanese | LANG_JPN | Asian | Asia | jpn | jpn | ja | ja-JP | JPN | 0411 | |
Chinese(S) | Simplified Chinese | LANG_CHS | Asian | Asia | zho, qcs | chi | zh, zh-CHS, zh-CN, zh-SG | zh-Hans-CN | CHS, ZHI | 0804, 0004, 1004 |
Chinese(T) | Traditional Chinese | LANG_CHT | Asian | Asia | qct | "", zh-CHT, zh-HK, zh-MO, zh-TW | zh-Hant-TW | CHT, ZHH, ZHM | 0404, 7C04, 0C04, 1404 | |
Korean | LANG_KRN | Asian | Asia | kor | kor | ko | ko-KR | KOR | 0412 | |
Thai | LANG_THA | Asian | Asia | tha | tha | th | th-TH | THA | 041E | |
Arabic | LANG_ARA | Right-to-left | Asia | ara | ara | ar | ar-SA | ARA | 0401 | |
Hebrew | LANG_HEB | Right-to-left | Europe, Asia | heb | heb | he | he-IL | HEB | 040D | |
Vietnamese | LANG_VIE | Latin | Asia | vie | vie | vi | vi-VN | VIT | 042A |