RecAPI
Language identifiers

Functions such as kRecGetLanguageInfo and kRecFindLanguageEx let you read or specify a language by its English name or by its code in a range of standards. Note that some standards do not cover all of our languages; these cases are noted below. Chinese, Portuguese and Serbian pose a different problem: in each of these 3 cases we use two separate language codes, distinguished by country or by script: Simplified and Traditional Chinese, Portugal and Brazilian Portuguese, Cyrillic and Latin Serbian. The ISO 639-x standards make no such distinction. The handling of these languages is described at "In case of language-pairs" below.

CSDK internal code: This is the last 3 letters of the name used in the LANGUAGES enum, i.e. the 3 letters following "LANG_". Note that the kRecFindLanguage family of functions accepts both the 3-letter internal codes and the 8-character "LANG_"-prefixed ones. This identifier uniquely identifies any of our languages, whereas the standard ISO identifiers (described below) all lack some of our languages. The ISO 639-3 superset used by us also identifies all of our languages uniquely, so if you want a close-to-standard identifier, consider using that one instead of the internal code. BCP 47 identifiers are also supported, and they too identify all of our languages uniquely (see below).
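As an illustration of the two accepted spellings, the sketch below normalizes an internal code to its "LANG_"-prefixed form before it is handed to a lookup function. The helper name to_prefixed_lang_code is hypothetical and not part of RecAPI. It deliberately rejects lower-case input, on the assumption that a lower-case 3-letter string is more likely to be an ISO code than an internal one.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of RecAPI.
 * Accepts "ENG" or "LANG_ENG" and writes "LANG_ENG" into out.
 * Returns 0 on success, -1 on malformed input or a too-small buffer. */
int to_prefixed_lang_code(const char *code, char *out, size_t out_size)
{
    static const char prefix[] = "LANG_";
    const size_t prefix_len = sizeof prefix - 1;   /* 5 characters */
    size_t i;

    if (code == NULL || out == NULL || out_size < prefix_len + 3 + 1)
        return -1;

    /* Skip an existing prefix so both spellings are handled alike. */
    if (strncmp(code, prefix, prefix_len) == 0)
        code += prefix_len;

    if (strlen(code) != 3)
        return -1;

    /* Internal codes are upper-case; anything else is rejected. */
    for (i = 0; i < 3; i++) {
        if (!isupper((unsigned char)code[i]))
            return -1;
    }

    memcpy(out, prefix, prefix_len);
    memcpy(out + prefix_len, code, 3);
    out[prefix_len + 3] = '\0';
    return 0;
}
```

Called with "CRO" or "LANG_CRO", the helper produces "LANG_CRO" in both cases; called with "cro" it fails, leaving the decision about ISO codes to the caller.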

ISO 639-1: Some of our languages are not supported by this standard; in such cases kRecGetLanguageInfo returns an empty string in Name_639_1 (see the LANGUAGE_INFO structure). In case of language-pairs the ISO 639-1 standard code is returned for both languages. The kRecFindLanguage family of functions supports both the 2-letter language codes (e.g. en) and the combined BCP 47 style language and country codes (e.g. en-US). Sometimes the country code is needed for the correct mapping (e.g. pt-BR for Brazilian Portuguese, zh-TW for Traditional Chinese, etc.), while sometimes the script code is necessary (e.g. sr-Cyrl for Cyrillic Serbian and sr-Latn for Latin Serbian). The table below shows the combined codes that may be necessary, but other combinations not shown in the table are supported as well. Note that kRecGetLanguageInfo always returns only the basic 2-letter code in the Name_639_1 field. See also the Name_BCP_47 field (in the LANGUAGE_INFO structure).

ISO 639-2/B: Some of our languages are not supported by this standard; in such cases kRecGetLanguageInfo returns an empty string in Name_639_2B. In case of language-pairs the ISO 639-2/B standard code is returned for the first language of the language-pair only, while an empty string is returned for the second language. Note that some of the identifier codes in our 639-2/B implementation do not refer to a specific language, but to a language collection. These identifiers are:

ISO 639-3: This language identifier standard can map 1-to-1 to our languages, because it provides a mechanism for defining local language codes. The local codes used are:

In case of language-pairs the ISO 639-3 standard code is returned for the first language of the language-pair only, while a local code is returned for the second language. The Sami language is a language collection that does not have an ISO 639-3 code but has an ISO 639-2/B code. In this case we use the capitalized 639-2/B code (SMI) as an extended 639-3 code. This is not a standard 639-3 name, but adding it makes the 639-3 superset able to identify all of our languages. (In previous versions MYN, NAH and WEN were also used as similar uppercase extensions for the Mayan, Nahuatl and Sorbian language collections; now we use quc, nci and hsb instead, which are the standard 639-3 codes for the largest languages of those families. Simplified Chinese used to be coded with a local code, qcs; now zho is returned. Visayan used to be coded as qis; it is now ceb. Note that all the previously used names can still be used with the kRecFindLanguage family of functions to maintain backwards compatibility.)
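The local codes mentioned above (such as the former qcs and qis) fall in the range that ISO 639 reserves for local use, qaa through qtz. A minimal sketch of how a caller might tell a local code from a standard one follows. The function name is hypothetical, and the check relies only on that reserved range, not on any RecAPI behavior.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of RecAPI.
 * Returns true if code lies in the ISO 639 local-use range "qaa".."qtz".
 * Assumes an ASCII-compatible character set (contiguous 'a'..'z'). */
bool is_iso639_local_code(const char *code)
{
    return code != NULL
        && strlen(code) == 3
        && code[0] == 'q'
        && code[1] >= 'a' && code[1] <= 't'
        && code[2] >= 'a' && code[2] <= 'z';
}
```

With this check, qcs and qis are reported as local codes, while quc (the standard code now returned for Mayan) is not, because its second letter lies outside the reserved range.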

BCP 47: This is a newer standard that uniquely identifies all the languages supported by any of the previous 639-x standards. kRecGetLanguageInfo returns such language names (like en-US or sr-Latn-RS, etc.) in Name_BCP_47. These names are mostly based on the 639-1 two-letter standard, but sometimes 3-letter language codes are also used (like hsb-DE). The country-code component is usually 2 letters, but occasionally a 3-digit code is used (like pap-029). The kRecFindLanguageEx function supports these names both as BCP 47 codes and, for backwards compatibility, as 639-1 ones.
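To make the shape of these names concrete, here is a small sketch that splits a name such as sr-Latn-RS or pap-029 into its language, script and region parts. It covers only the forms described above (a 2- or 3-letter language, an optional 4-letter script, an optional 2-letter or 3-digit region) and is not a full BCP 47 parser. The LangTag type and parse_lang_tag function are hypothetical names, not part of RecAPI.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; unused parts are left as empty strings. */
typedef struct {
    char language[4];  /* 2 or 3 letters, e.g. "sr", "hsb" */
    char script[5];    /* 4 letters, e.g. "Latn" */
    char region[4];    /* 2 letters or 3 digits, e.g. "RS", "029" */
} LangTag;

static bool all_alpha(const char *s, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (!isalpha((unsigned char)s[i])) return false;
    return true;
}

static bool all_digit(const char *s, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (!isdigit((unsigned char)s[i])) return false;
    return true;
}

/* Parses language[-script][-region]; returns false for anything else. */
bool parse_lang_tag(const char *tag, LangTag *out)
{
    const char *p = tag;
    size_t n;

    if (tag == NULL || out == NULL) return false;
    memset(out, 0, sizeof *out);      /* also null-terminates every field */

    n = strcspn(p, "-");              /* language subtag */
    if ((n != 2 && n != 3) || !all_alpha(p, n)) return false;
    memcpy(out->language, p, n);
    p += n;

    if (*p == '-') {                  /* optional 4-letter script */
        n = strcspn(p + 1, "-");
        if (n == 4 && all_alpha(p + 1, n)) {
            memcpy(out->script, p + 1, n);
            p += 1 + n;
        }
    }

    if (*p == '-') {                  /* optional region */
        n = strcspn(p + 1, "-");
        if ((n == 2 && all_alpha(p + 1, n)) || (n == 3 && all_digit(p + 1, n))) {
            memcpy(out->region, p + 1, n);
            p += 1 + n;
        } else {
            return false;
        }
    }
    return *p == '\0';                /* reject trailing subtags */
}
```

For sr-Latn-RS this yields language sr, script Latn and region RS; for pap-029 the script stays empty and the region is 029.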

Windows 3 letter code: This is the GetLocaleInfo(LOCALE_SABBREVLANGNAME) identifier that is sometimes called ISO 639x in the Microsoft documentation; however, it does not match any of the ISO standards. It is mostly (but not always) formed from the ISO 639-1 two-letter code by adding a third character. It looks similar to the ISO 639-2/B and ISO 639-3 standards, but some of its codes differ (e.g. the ISO identifier of the Swedish language is swe while the Windows identifier is SVE). Some of our languages are not supported by the Windows 3 letter codes; in such cases kRecGetLanguageInfo returns an empty string in Name_Win_3.

Windows locale ID (LCID): The kRecGetLanguageInfo function returns this identifier as an integer. The kRecFindLanguages and kRecFindLanguageEx functions can use this numeric value directly to find a language, while the kRecFindLanguage and kRecSetUILang functions support a special string representation: "LCID_12345", i.e. the decimal value of the locale ID prefixed by the "LCID_" string. In C/C++ it can be generated with the "LCID_%d" sprintf format. Some of our languages are not supported by the LCID codes; in such cases kRecGetLanguageInfo returns the ID of English (0x0409) in LangID.
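A minimal sketch of generating this string follows. Because the table lists the IDs in hexadecimal while the string form uses decimal, the second helper converts a table entry first. Both function names are hypothetical, and the sketch uses snprintf with an unsigned long and "%lu" rather than the plain "%d" format mentioned above, purely to keep the conversion well-defined for any ID value.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of RecAPI.
 * Writes "LCID_<decimal value>" into out.
 * Returns 0 on success, -1 if the buffer is too small. */
int format_lcid_string(unsigned long lcid, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "LCID_%lu", lcid);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Same, but starting from a hexadecimal ID as printed in the table
 * (e.g. "0409"). Returns -1 if hex is empty or not valid hexadecimal. */
int format_lcid_from_hex(const char *hex, char *out, size_t out_size)
{
    char *end;
    unsigned long lcid = strtoul(hex, &end, 16);

    if (end == hex || *end != '\0')
        return -1;
    return format_lcid_string(lcid, out, out_size);
}
```

For the English ID 0x0409 this produces "LCID_1033", and for the table entry "0416" it produces "LCID_1046".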

Note:
There are some conflicting values in different language code standards, so take care when using them with the kRecFindLanguage family of functions:
  • CRO means Croatian as an internal code, while cro means Crow as a 639-3 code
  • LAT means Latvian as an internal code, while lat means Latin as a 639-3 or 639-2/B code
  • MAY means Mayan as an internal code, while may means Malay as a 639-2/B code
  • ROM means Romanian as an internal or Windows code, while rom means Romany as a 639-3 or 639-2/B code
  • SRP means LatinSerbian as a Windows code, while srp means CyrillicSerbian as a 639-3 or 639-2/B code
  • pt means both Portugal and BrazilianPortuguese as a 639-1 code
  • sr means both Cyrillic and LatinSerbian as a 639-1 code
  • zh means both Simplified and TraditionalChinese as a 639-1 code
With kRecFindLanguages, we suggest sticking to a single standard and placing the language name in the corresponding field of the LANGUAGE_INFO structure rather than in the common EnglishName field. The simpler kRecFindLanguage function offers no such choice: when a name is ambiguous, it returns the most relevant of the possible languages (usually decided by character case) together with a warning. If you use the CSDK internal code with kRecFindLanguage, it is best to include the "LANG_" prefix. We suggest using kRecFindLanguageEx instead of these functions. It lets you stick to a single standard, and it is the most flexible function even when the general query with LANGCODE_ALL is used. When a name is ambiguous, kRecFindLanguageEx also returns the most relevant of the possible languages (usually decided by character case) with a warning, and it additionally lets you learn the other, conflicting language.
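The case-based distinction in the list above can be pictured with a small case-sensitive lookup table. The sketch below only restates the three-letter conflicts from the note as data. It is an illustration of why character case matters, not a description of how the kRecFindLanguage functions resolve conflicts internally, and the type and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: one case-sensitive spelling and its meaning. */
typedef struct {
    const char *code;      /* exact spelling, case matters */
    const char *standard;  /* standard in which this spelling applies */
    const char *language;  /* language it denotes there */
} CodeMeaning;

/* The three-letter conflicts listed in the note above. */
static const CodeMeaning kConflicts[] = {
    { "CRO", "internal",         "Croatian"        },
    { "cro", "639-3",            "Crow"            },
    { "LAT", "internal",         "Latvian"         },
    { "lat", "639-3, 639-2/B",   "Latin"           },
    { "MAY", "internal",         "Mayan"           },
    { "may", "639-2/B",          "Malay"           },
    { "ROM", "internal, Windows", "Romanian"       },
    { "rom", "639-3, 639-2/B",   "Romany"          },
    { "SRP", "Windows",          "LatinSerbian"    },
    { "srp", "639-3, 639-2/B",   "CyrillicSerbian" },
};

/* Returns the entry whose spelling matches code exactly, or NULL. */
const CodeMeaning *find_meaning(const char *code)
{
    size_t i;

    if (code == NULL)
        return NULL;
    for (i = 0; i < sizeof kConflicts / sizeof kConflicts[0]; i++) {
        if (strcmp(kConflicts[i].code, code) == 0)
            return &kConflicts[i];
    }
    return NULL;
}
```

Looking up "CRO" gives Croatian and "cro" gives Crow, so a caller that preserves the case of the code it received keeps the two apart. A mixed-case spelling such as "Cro" is not in the table and yields NULL.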

The table below shows all supported languages with their supported names and identifiers. Some languages have multiple identifiers listed in a cell; in those cases the first ID is the one returned by kRecGetLanguageInfo, while the other IDs can also be used with the kRecFindLanguage family of functions.

Name Alternate Name enum name Script Continent ISO 639-3 ISO 639-2/B ISO 639-1 BCP 47 Windows Microsoft ID
English   LANG_ENG Latin Europe, Oceania, North America eng eng en en-US ENU, ENG, ENA, ENC, ENZ 0409, 0809, 0C09, 1009, 1409
German   LANG_GER Latin Europe deu ger de de-DE DEU, GER 0407
French   LANG_FRE Latin Europe, North America fra fre fr fr-FR FRA, FRC 040C, 0C0C
Dutch   LANG_DUT Latin Europe nld dut nl nl-NL NLD 0413
Norwegian   LANG_NOR Latin Europe nor nor no, nb, nn nb-NO NOR, NON 0414, 0814
Swedish   LANG_SWE Latin Europe swe swe sv sv-SE SVE 041D
Finnish   LANG_FIN Latin Europe fin fin fi fi-FI FIN 040B
Danish   LANG_DAN Latin Europe dan dan da da-DK DAN 0406
Icelandic   LANG_ICE Latin Europe isl ice is is-IS ISL 040F
Portuguese   LANG_POR Latin Europe por por pt, pt-PT pt-PT PTG 0816
Spanish   LANG_SPA Latin Europe, Latin America spa spa es es-ES ESN, ESM 0C0A, 080A
Catalan Catalonian LANG_CAT Latin Europe cat cat ca ca-ES CAT 0403
Galician Gallegan LANG_GAL Latin Europe glg glg gl gl-ES GLC 0456
Italian   LANG_ITA Latin Europe ita ita it it-IT ITA 0410
Maltese   LANG_MAL Latin Europe mlt mlt mt mt-MT MLT 043A
Greek   LANG_GRE Greek Europe ell gre el el-GR ELL 0408
Polish   LANG_POL Latin Europe pol pol pl pl-PL PLK 0415
Czech   LANG_CZH Latin Europe ces cze cs cs-CZ CSY 0405
Slovak   LANG_SLK Latin Europe slk slo sk sk-SK SKY 041B
Hungarian   LANG_HUN Latin Europe hun hun hu hu-HU HUN 040E
Slovenian   LANG_SLN Latin Europe slv slv sl sl-SI SLV 0424
Croatian   LANG_CRO Latin Europe hrv scr hr hr-HR HRV, HRB 041A, 001A
Romanian Rumanian LANG_ROM Latin Europe ron rum ro ro-RO ROM 0418
Albanian   LANG_ALB Latin Europe sqi alb sq sq-AL SQI 041C
Turkish   LANG_TUR Latin Europe, Asia tur tur tr tr-TR TRK 041F
Estonian   LANG_EST Latin Europe est est et et-EE ETI 0425
Latvian   LANG_LAT Latin Europe lav lav lv lv-LV LVI 0426
Lithuanian   LANG_LIT Latin Europe lit lit lt lt-LT LTH 0427
Esperanto   LANG_ESP Latin International epo epo eo eo-001    
Serbian(Latin) Bosnian LANG_SRL Latin Europe qsl   sr, sr-Latn, Lt-sr, bs sr-Latn-RS SRL, SRS, SRM, SRP, BSB 081A, 181A, 241A, 2C1A, 701A, 141A, 681A
Serbian   LANG_SRB Cyrillic Europe srp srp, scc sr, sr-Cyrl, Cy-sr sr-Cyrl-RS SRB, SRN, SRO, SRQ, BSC 0C1A, 1C1A, 281A, 301A, 6C1A, 7C1A, 201A, 641A, 781A
Macedonian   LANG_MAC Cyrillic Europe mkd mac mk mk-MK MKI 042F
Moldavian   LANG_MOL Cyrillic Europe mol mol mo, ro-MO ro-MO   0818
Bulgarian   LANG_BUL Cyrillic Europe bul bul bg bg-BG BGR 0402
Byelorussian Belarusian, Belarusan LANG_BEL Cyrillic Europe bel bel be be-BY BEL 0423
Ukrainian   LANG_UKR Cyrillic Europe ukr ukr uk uk-UA UKR 0422
Russian   LANG_RUS Cyrillic Europe, Asia rus rus ru ru-RU RUS 0419
Chechen   LANG_CHE Cyrillic Asia che che ce ce-RU    
Kabardian   LANG_KAB Cyrillic Asia kbd kbd   kbd-RU    
Afrikaans   LANG_AFR Latin Africa afr afr af af-ZA AFK 0436
Aymara   LANG_AYM Latin Latin America aym aym ay ay-BO    
Basque   LANG_BAS Latin Europe eus baq eu eu-ES EUQ 042D
Bemba Ichibemba LANG_BEM Latin Africa bem bem   bem-ZM    
Blackfoot Siksika LANG_BLA Latin North America bla bla   bla-CA    
Breton   LANG_BRE Latin Europe bre bre br br-FR BRE 047E
Brazilian   LANG_BRA Latin Latin America qbp   pt, pt-BR pt-BR PTB 0416
Bugotu Bughotu LANG_BUG Latin Oceania bgt     bgt-SB    
Chamorro   LANG_CHA Latin Oceania cha cha ch ch-MP    
Tswana(Chuana) Chuana, Setswana LANG_CHU Latin Africa tsn tsn tn tn-ZA TSN, TNA 0432
Corsican   LANG_COR Latin Europe cos cos co co-FR COS 0483
Crow   LANG_CRW Latin North America cro     cro-US    
Eskimo Inuit LANG_ESK Latin Europe, North America qes, esx     esx-CA    
Faroese   LANG_FAR Latin Europe fao fao fo fo-FO FOS 0438
Fijian   LANG_FIJ Latin Oceania fij fij fj fj-FJ    
Frisian   LANG_FRI Latin Europe fry fry fy fy-NL FYN 0462
Friulian   LANG_FRU Latin Europe fur fur   fur-IT    
Gaelic(Irish) Irish LANG_GLI Latin Europe gle gle ga, gd-IE ga-IE IRE 083C
Gaelic(Scottish) Scottish LANG_GLS Latin Europe gla gla gd gd-GB GLA 0491, 043C
Ganda(Luganda) Luganda LANG_GAN Latin Africa lug lug lg lg-UG    
Guarani   LANG_GUA Latin Latin America grn grn gn gn-PY   0474
Hani   LANG_HAN Latin Asia hni     hni-CN    
Hawaiian   LANG_HAW Latin Oceania haw haw   haw-US   0475
Ido   LANG_IDO Latin International ido ido io io-001    
Indonesian   LANG_IND Latin Asia ind ind id id-ID IND 0421
Interlingua   LANG_INT Latin International ina ina ia ia-001    
Kasub Kashubian LANG_KAS Latin Europe csb csb   csb-PL    
Kawa Wa, Blang LANG_KAW Latin Asia wbm     wbm-MM    
Kikuyu Gikuyu LANG_KIK Latin Africa kik kik ki ki-KE    
Kongo   LANG_KON Latin Africa kon kon kg kg-CD    
Kpelle   LANG_KPE Latin Africa kpe kpe   kpe-LR    
Kurdish   LANG_KUR Latin Asia kur kur ku ku-Latn-TR    
Latin   LANG_LTN Latin International lat lat la la-001   0476
Luba   LANG_LUB Latin Africa lua lua   lua-CD    
Luxembourgish Luxembourgian, Letzeburgesch, Luxembourgeois LANG_LUX Latin Europe ltz ltz lb lb-LU LBX 046E
Malagasy   LANG_MLG Latin Africa mlg mlg mg mg-MG    
Malay   LANG_MLY Latin Asia msa may ms ms-MY MSL 043E
Malinke Maninkakan, Maninka LANG_MLN Latin Africa emk, mlq     emk-GN    
Maori   LANG_MAO Latin Oceania mri mao mi mi-NZ MRI 0481
Mayan K'iche' LANG_MAY Latin Latin America quc myn   quc-GT   0486
Miao Hmong LANG_MIA Latin Asia hmn hmn   hmn-CN    
Minangkabau Minankabaw LANG_MIN Latin Asia min min   min-ID    
Mohawk   LANG_MOH Latin North America moh moh   moh-CA MWK 047C
Nahuatl Aztec LANG_NAH Latin Latin America nci nah   nci-MX    
Nyanja Chewa, Chichewa LANG_NYA Latin Africa nya nya ny ny-ZW    
Occidental   LANG_OCC Latin International ile, occ ile ie ie-001    
Ojibway Ojibwa LANG_OJI Latin North America oji oji oj oj-CA    
Papiamento   LANG_PAP Latin Latin America pap pap   pap-029   0479
PidginEnglish Tok Pisin LANG_PID Latin Oceania tpi tpi   tpi-PG    
Provencal Occitan LANG_PRO Latin Europe oci, prv oci oc oc-FR OCI 0482
Quechua   LANG_QUE Latin Latin America que, quz que qu qu-PE QUE 086B
Rhaetic Romansh LANG_RHA Latin Europe roh roh rm rm-CH RMC 0417
Romany   LANG_ROY Latin Europe rom rom   rom-001    
Rwanda Ruanda, Kinyarwanda LANG_RUA Latin Africa kin kin rw rw-RW KIN  
Rundi   LANG_RUN Latin Africa run run rn rn-BI    
Samoan   LANG_SAM Latin Oceania smo smo sm sm-WS    
Sardinian   LANG_SAR Latin Europe srd srd sc sc-IT    
Shona   LANG_SHO Latin Africa sna sna sn sn-ZW    
Sioux Dakota LANG_SIO Latin North America dak dak   dak-US    
Sami   LANG_SMI Latin Europe SMI smi   smi-NO "", SZI 003B
Sami(Lule) Lule Sami LANG_SML Latin Europe smj smj   smj-NO SMJ 103B
Sami(Northern) Northern Sami LANG_SMN Latin Europe sme sme se se-NO SME 043B
Sami(Southern) Southern Sami LANG_SMS Latin Europe sma sma   sma-NO SMA 183B
Somali   LANG_SOM Latin Africa som som so so-SO   0477
Sotho Sesotho, Sutu LANG_SOT Latin Africa sot sot st st-ZA   0430
Sundanese   LANG_SUN Latin Asia sun sun su su-Latn-ID    
Swahili Kiswahili LANG_SWA Latin Africa swa swa sw sw-KE SWK 0441
Swazi Swati LANG_SWZ Latin Africa ssw ssw ss ss-SZ    
Tagalog Filipino LANG_TAG Latin Asia tgl, fil tgl, fil tl tl-PH FPO 0464
Tahitian   LANG_TAH Latin Oceania tah tah ty ty-PF    
Pirez Tinpo LANG_TIN Latin Europe, Asia, Africa qti     qti-001    
Tongan   LANG_TON Latin Oceania ton ton to to-TO    
Tun Tunia LANG_TUN Latin Asia tug     tug-TD    
Visayan Cebuano LANG_VIS Latin Asia ceb, qis     ceb-PH    
Welsh   LANG_WEL Latin Europe cym wel cy cy-GB CYM 0452
Sorbian(Wend) Wend LANG_WEN Latin Europe hsb, dsb wen, hsb, dsb "", sb hsb-DE HSB, DSB 042E
Wolof   LANG_WOL Latin Africa wol wol wo wo-SN WOL 0488
Xhosa   LANG_XHO Latin Africa xho xho xh xh-ZA XHO 0434
Zapotec   LANG_ZAP Latin Latin America zap zap   zap-MX    
Zulu   LANG_ZUL Latin Africa zul zul zu zu-ZA ZUL 0435
Japanese   LANG_JPN Asian Asia jpn jpn ja ja-JP JPN 0411
Chinese(S) Simplified Chinese LANG_CHS Asian Asia zho, qcs chi zh, zh-Hans, zh-CHS, zh-CN, zh-SG zh-CN CHS, ZHI 0804, 0004, 1004
Chinese(T) Traditional Chinese LANG_CHT Asian Asia qct   zh, zh-Hant, zh-CHT, zh-HK, zh-MO, zh-TW zh-TW CHT, ZHH, ZHM 0404, 7C04, 0C04, 1404
Korean   LANG_KRN Asian Asia kor kor ko ko-KR KOR 0412
Thai   LANG_THA Asian Asia tha tha th th-TH THA 041E
Arabic   LANG_ARA Right-to-left Asia ara ara ar ar-SA ARA 0401
Hebrew   LANG_HEB Right-to-left Europe, Asia heb heb he he-IL HEB 040D
Vietnamese   LANG_VIE Latin Asia vie vie vi vi-VN VIT 042A