Abron [abr] Ghana

No Statistics available for this language.

Aceh [ace] Indonesia (Sumatra)

Corpus: ace_community_2017
Sentences: 4539, Types: 12771, Tokens: 84860
URLs: 1133

Acholi [ach] Uganda

Corpus: ach_community_2017
Sentences: 186, Types: 1713, Tokens: 5292
URLs: 49

Afar [aar] Ethiopia

Corpus: aar_community_2017
Sentences: 56, Types: 586, Tokens: 899
URLs: 3

Ahirani [ahr] India

No Statistics available for this language.

Akan [aka] Ghana

No Statistics available for this language.

Alur [alz] Dem. Rep. of Congo

No Statistics available for this language.

Amharic [amh] Ethiopia

Corpus: amh_community_2017
Sentences: 45092, Types: 139904, Tokens: 671748
URLs: 7755

Anaang [anw] Nigeria

Corpus: anw_community_2017
Sentences: 30362, Types: 67236, Tokens: 479884
URLs: 14860

Assamese [asm] India

Corpus: asm_community_2017
Sentences: 63627, Types: 121419, Tokens: 905966
URLs: 2651

Awadhi [awa] India

No Statistics available for this language.

Aymara [aym] Bolivia

Corpus: aym_community_2017
Sentences: 3264, Types: 9600, Tokens: 49740
URLs: 1836

Bagheli [bfy] India

No Statistics available for this language.

Bakhtiari [bqi] Iran

No Statistics available for this language.

Bali [ban] Indonesia (Java and Bali)

Corpus: ban_community_2017
Sentences: 99, Types: 421, Tokens: 1490
URLs: 46

Baluchi [bal] Pakistan

No Statistics available for this language.

Bamanankan [bam] Mali

Corpus: bam_community_2017
Sentences: 1266, Types: 5212, Tokens: 24535
URLs: 304

Banjar [bjn] Indonesia (Kalimantan)

Corpus: bjn_community_2017
Sentences: 7240, Types: 22439, Tokens: 126268
URLs: 1119

Baoulé [bci] Côte d’Ivoire

No Statistics available for this language.

Bashkort [bak] Russian Federation

Corpus: bak_community_2019
Sentences: 3332, Types: 11450, Tokens: 58078
URLs: 296

Batak Dairi [btd] Indonesia (Sumatra)

No Statistics available for this language.

Batak Mandailing [btm] Indonesia (Sumatra)

No Statistics available for this language.

Batak Simalungun [bts] Indonesia (Sumatra)

No Statistics available for this language.

Batak Toba [bbc] Indonesia (Sumatra)

No Statistics available for this language.

Bedawiyet [bej] Sudan

No Statistics available for this language.

Bemba [bem] Zambia

Corpus: bem_community_2017
Sentences: 81, Types: 422, Tokens: 606
URLs: 32

Bengali [ben] Bangladesh

Corpus: ben_community_2019
Sentences: 7030, Types: 21050, Tokens: 108625
URLs: 506

Berom [bom] Nigeria

No Statistics available for this language.

Betawi [bew] Indonesia (Java and Bali)

Corpus: bew_community_2017
Sentences: 311828, Types: 425386, Tokens: 4328236
URLs: 62635

Bhili [bhb] India

No Statistics available for this language.

Bhojpuri [bho] India

No Statistics available for this language.

Bikol [bik] Philippines

Corpus: bik_community_2017
Sentences: 12023, Types: 26115, Tokens: 265360
URLs: 1060

Bodo [brx] India

No Statistics available for this language.

Bosnian [bos] Bosnia and Herzegovina

Corpus: bos_community_2017
Sentences: 659405, Types: 492801, Tokens: 14318352
URLs: 128753

Bouyei [pcc] China

No Statistics available for this language.

Brahui [brh] Pakistan

No Statistics available for this language.

Bugis [bug] Indonesia (Sulawesi)

Corpus: bug_community_2017
Sentences: 583, Types: 1901, Tokens: 7452
URLs: 466

Bundeli [bns] India

No Statistics available for this language.

Burmese [mya] Myanmar

Corpus: mya_community_2017
Sentences: 1883, Types: 13700, Tokens: 27543
URLs: 1236

Cebuano [ceb] Philippines

Corpus: ceb_community_2017
Sentences: 729461, Types: 387017, Tokens: 11495546
URLs: 702495

Central Atlas Tamazight [tzm] Morocco

No Statistics available for this language.

Central Bikol [bcl] Philippines

Corpus: bcl_community_2017
Sentences: 15726, Types: 38027, Tokens: 318931
URLs: 3430

Kalanga [kck] Zimbabwe

Corpus: kck_community_2019
Sentences: 996, Types: 4156, Tokens: 16798
URLs: 26

Central Kanuri [knc] Nigeria

No Statistics available for this language.

Central Khmer [khm] Cambodia

Corpus: khm_community_2017
Sentences: 1773, Types: 10143, Tokens: 19124
URLs: 895

Central Kurdish [ckb] Iraq

Corpus: ckb_community_2017
Sentences: 4978, Types: 19063, Tokens: 101004
URLs: 427

Chechen [che] Russian Federation

Corpus: che_community_2017
Sentences: 35293, Types: 42830, Tokens: 439749
URLs: 26189

Chhattisgarhi [hne] India

No Statistics available for this language.

Chiga [cgg] Uganda

No Statistics available for this language.

Chittagonian [ctg] Bangladesh

No Statistics available for this language.

Chokwe [cjk] Dem. Rep. of Congo

No Statistics available for this language.

Chuanqiandian Cluster Miao [cqd] China

No Statistics available for this language.

Chuvash [chv] Russian Federation

Corpus: chv_community_2017
Sentences: 71333, Types: 119208, Tokens: 917180
URLs: 11649

Dan [dnj] Côte d’Ivoire

No Statistics available for this language.

Dari [prs] Afghanistan

Corpus: prs_community_2017
Sentences: 71489, Types: 127345, Tokens: 1809533
URLs: 6408

Deccan [dcc] India

No Statistics available for this language.

Dholuo [luo] Kenya

No Statistics available for this language.

Dhundari [dhd] India

No Statistics available for this language.

Dimli [diq] Turkey

Corpus: diq_community_2017
Sentences: 31099, Types: 67631, Tokens: 442831
URLs: 5337

Dinka [din] Sudan

No Statistics available for this language.

Dogri [doi] India

No Statistics available for this language.

Domari [rmt] Iran

No Statistics available for this language.

Eastern Balochi [bgp] Pakistan

No Statistics available for this language.

Eastern Maninkakan [emk] Guinea

Corpus: emk_community_2017
Sentences: 30, Types: 147, Tokens: 233
URLs: 5

Eastern Tamang [taj] Nepal

No Statistics available for this language.

Eastern Yiddish [ydd] Israel

Corpus: ydd_community_2017
Sentences: 21819, Types: 32472, Tokens: 489592
URLs: 9

Ebira [igb] Nigeria

No Statistics available for this language.

Edo [bin] Nigeria

No Statistics available for this language.

Ekegusii [guz] Kenya

No Statistics available for this language.

Éwé [ewe] Ghana

Corpus: ewe_community_2017
Sentences: 10173, Types: 15532, Tokens: 198511
URLs: 988

Fang [fan] Guinea

No Statistics available for this language.

Filipino [fil] Philippines

No Statistics available for this language.

Fon [fon] Benin

Corpus: fon_community_2017
Sentences: 19, Types: 174, Tokens: 445
URLs: 1

Fulah [ful] Cameroon

Corpus: ful_community_2017
Sentences: 1002, Types: 5337, Tokens: 18654
URLs: 94

Galician [glg] Spain

Corpus: glg_community_2017
Sentences: 221512, Types: 235244, Tokens: 5216089
URLs: 39196

Gamo [gmv] Ethiopia

No Statistics available for this language.

Gan Chinese [gan] China

Corpus: gan_community_2017
Sentences: 9897, Types: 12707, Tokens: 45618
URLs: 1027

Ganda [lug] Uganda

Corpus: lug_community_2017
Sentences: 78609, Types: 178380, Tokens: 1382770
URLs: 4842

Garhwali [gbm] India

No Statistics available for this language.

Garo [grt] India

No Statistics available for this language.

Gikuyu [kik] Kenya

Corpus: kik_community_2017
Sentences: 815, Types: 3586, Tokens: 9848
URLs: 326

Goan Konkani [gom] India

Corpus: gom_community_2017
Sentences: 39645, Types: 71960, Tokens: 583042
URLs: 126

Godwari [gdx] India

No Statistics available for this language.

Gogo [gog] Tanzania

No Statistics available for this language.

Gondi [gon] India

No Statistics available for this language.

Gorontalo [gor] Indonesia (Sulawesi)

No Statistics available for this language.

Guarani [grn] Paraguay, Bolivia

Corpus: grn_community_2017
Sentences: 14612, Types: 39627, Tokens: 214276
URLs: 2253

Hadiyya [hdy] Ethiopia

No Statistics available for this language.

Haitian [hat] Haiti

Corpus: hat_community_2017
Sentences: 23384, Types: 32792, Tokens: 335466
URLs: 14585

Halh Mongolian [khk] Mongolia

Corpus: khk_community_2017
Sentences: 19593, Types: 43349, Tokens: 351010
URLs: 2182

Haryanvi [bgc] India

No Statistics available for this language.

Hassaniyya [mey] Mauritania

No Statistics available for this language.

Hausa [hau] Nigeria

Corpus: hau_community_2017
Sentences: 41589, Types: 40232, Tokens: 1017657
URLs: 2492

Haya [hay] Tanzania

No Statistics available for this language.

Hazaragi [haz] Afghanistan

No Statistics available for this language.

Hiligaynon [hil] Philippines

Corpus: hil_community_2017
Sentences: 69, Types: 399, Tokens: 1390
URLs: 3

Hmong Daw [mww] China

No Statistics available for this language.

Hmong [hmn] China

No Statistics available for this language.

Ho [hoc] India

No Statistics available for this language.

Hunsrik [hrx] Brazil

No Statistics available for this language.

Ibibio [ibb] Nigeria

Corpus: ibb_community_2017
Sentences: 66, Types: 217, Tokens: 383
URLs: 17

Igbo [ibo] Nigeria

Corpus: ibo_community_2017
Sentences: 7742, Types: 14660, Tokens: 158694
URLs: 1429

Ilocano [ilo] Philippines

Corpus: ilo_community_2017
Sentences: 18026, Types: 43542, Tokens: 438419
URLs: 3402

Izon [ijc] Nigeria

No Statistics available for this language.

Jambi Malay [jax] Indonesia

No Statistics available for this language.

Javanese [jav] Indonesia (Java and Bali)

Corpus: jav_community_2017
Sentences: 83798, Types: 134699, Tokens: 1506187
URLs: 18815

Jula [dyu] Burkina Faso

Corpus: dyu_community_2017
Sentences: 8, Types: 17, Tokens: 75
URLs: 1

Kabardian [kbd] Russian Federation

Corpus: kbd_community_2017
Sentences: 9242, Types: 34481, Tokens: 116016
URLs: 1170

Kabuverdianu [kea] Cape Verde Islands

Corpus: kea_community_2017
Sentences: 258, Types: 1108, Tokens: 2269
URLs: 99

Kabyle [kab] Algeria

Corpus: kab_community_2017
Sentences: 4778, Types: 21054, Tokens: 103625
URLs: 945

Kalenjin [kln] Kenya

No Statistics available for this language.

Kamba [kam] Kenya

No Statistics available for this language.

Kanauji [bjj] India

No Statistics available for this language.

Kangri [xnr] India

No Statistics available for this language.

Kannada [kan] India

Corpus: kan_community_2017
Sentences: 102397, Types: 273390, Tokens: 1590451
URLs: 10550

Kanuri [kau] Nigeria

No Statistics available for this language.

Kashkay [qxq] Iran

No Statistics available for this language.

Kashmiri [kas] India

Corpus: kas_community_2017
Sentences: 448, Types: 2439, Tokens: 4749
URLs: 230

Khams Tibetan [khg] China

No Statistics available for this language.

Kimbundu [kmb] Angola

No Statistics available for this language.

Kimîîru [mer] Kenya

No Statistics available for this language.

Kipsigis [sgc] Kenya

No Statistics available for this language.

Kituba [ktu] Dem. Rep. of Congo

No Statistics available for this language.

Kituba [mkw] Congo

Corpus: mkw_community_2017
Sentences: 143476, Types: 218126, Tokens: 2025305
URLs: 39683

Kongo [kon] Dem. Rep. of Congo

Corpus: kon_community_2017
Sentences: 337, Types: 1448, Tokens: 3740
URLs: 179

Konkani [knn] India

Corpus: knn_community_2017
Sentences: 14111, Types: 45989, Tokens: 319022
URLs: 2686

Koongo [kng] Dem. Rep. of Congo

Corpus: kng_community_2017
Sentences: 41, Types: 178, Tokens: 292
URLs: 24

Kpelle [kpe] Guinea

No Statistics available for this language.

Kumaoni [kfy] India

No Statistics available for this language.

Kurdish [kur] Kurdistan, Iraq, Turkey

Corpus: kur_community_2017
Sentences: 36729, Types: 141732, Tokens: 789948
URLs: 4912

Kurux [kru] India

No Statistics available for this language.

Kyrgyz [kir] Kyrgyzstan

Corpus: kir_community_2017
Sentences: 251608, Types: 281359, Tokens: 4382407
URLs: 33270

Lahnda [lah] Pakistan

No Statistics available for this language.

Laki [lki] Iran

No Statistics available for this language.

Lambadi [lmn] India

No Statistics available for this language.

Lango [laj] Uganda

No Statistics available for this language.

Lao [lao] Laos

Corpus: lao_community_2017
Sentences: 9715, Types: 30833, Tokens: 223430
URLs: 4040

Limburgish [lim] Netherlands

Corpus: lim_community_2019
Sentences: 0, Types: 0, Tokens: 0
URLs: 0

Lingala [lin] Dem. Rep. of Congo

Corpus: lin_community_2017
Sentences: 1300, Types: 5060, Tokens: 24488
URLs: 214

Lomwe [ngl] Mozambique

Corpus: ngl_community_2017
Sentences: 76, Types: 270, Tokens: 470
URLs: 36

Luba-Kasai [lua] Dem. Rep. of Congo

No Statistics available for this language.

Luba-Katanga [lub] Dem. Rep. of Congo

No Statistics available for this language.

Lubukusu [bxk] Kenya

No Statistics available for this language.

Lugbara [lgg] Uganda

Corpus: lgg_community_2017
Sentences: 95, Types: 368, Tokens: 665
URLs: 40

Maasai [mas] Kenya

No Statistics available for this language.

Maasina Fulfulde [ffm] Mali

No Statistics available for this language.

Madura [mad] Indonesia (Java and Bali)

Corpus: mad_community_2017
Sentences: 32, Types: 94, Tokens: 280
URLs: 7

Magahi [mag] India

No Statistics available for this language.

Maguindanao [mdh] Philippines

No Statistics available for this language.

Mahasu Pahari [bfz] India

No Statistics available for this language.

Maithili [mai] India

No Statistics available for this language.

Makasar [mak] Indonesia (Sulawesi)

No Statistics available for this language.

Makhuwa [vmw] Mozambique

No Statistics available for this language.

Makhuwa-Meetto [mgh] Mozambique

No Statistics available for this language.

Makonde [kde] Tanzania

Corpus: kde_community_2017
Sentences: 96, Types: 340, Tokens: 705
URLs: 39

Malagasy [mlg] Madagascar

Corpus: mlg_community_2017
Sentences: 92762, Types: 126134, Tokens: 1191086
URLs: 37324

Malay [msa] Thailand, Malaysia

Corpus: msa_community_2017
Sentences: 756962, Types: 341656, Tokens: 15392221
URLs: 89140

Malay [zlm] Malaysia (Peninsular)

No Statistics available for this language.

Malayalam [mal] India

Corpus: mal_community_2017
Sentences: 602268, Types: 1004168, Tokens: 7046155
URLs: 62070

Mandingo [man] Senegal

No Statistics available for this language.

Mandinka [mnk] Senegal

No Statistics available for this language.

Manyika [mxc] Zimbabwe

No Statistics available for this language.

Marwari [mwr] India

No Statistics available for this language.

Masaaba [myx] Uganda

No Statistics available for this language.

Mazanderani [mzn] Iran

Corpus: mzn_community_2017
Sentences: 26727, Types: 46533, Tokens: 459204
URLs: 10350

Meitei [mni] India

No Statistics available for this language.

Mende [men] Sierra Leone

No Statistics available for this language.

Min Dong Chinese [cdo] China

Corpus: cdo_community_2017
Sentences: 3280, Types: 10200, Tokens: 55617
URLs: 584

Min Nan Chinese [nan] China

Corpus: nan_community_2017
Sentences: 1777, Types: 7500, Tokens: 36430
URLs: 1145

Mina [myi] India

No Statistics available for this language.

Minangkabau [min] Indonesia (Sumatra)

Corpus: min_community_2017
Sentences: 186423, Types: 93939, Tokens: 1927011
URLs: 170993

Mundari [unr] India

No Statistics available for this language.

Muong [mtq] Viet Nam

No Statistics available for this language.

Musi [mui] Indonesia (Sumatra)

No Statistics available for this language.

Mòoré [mos] Burkina Faso

Corpus: mos_community_2017
Sentences: 107, Types: 382, Tokens: 693
URLs: 43

Ndau [ndc] Zimbabwe

No Statistics available for this language.

Ndebele [nde] Zimbabwe

No Statistics available for this language.

Ndonga [ndo] Namibia

Corpus: ndo_community_2017
Sentences: 13495, Types: 37191, Tokens: 308836
URLs: 1253

Ngbaka [nga] Dem. Rep. of Congo

No Statistics available for this language.

Nigerian Fulfulde [fuv] Nigeria

No Statistics available for this language.

Nigerian Pidgin [pcm] Nigeria

No Statistics available for this language.

Nimadi [noe] India

No Statistics available for this language.

Northern Hindko [hno] Pakistan

No Statistics available for this language.

Northern Khmer [kxm] Thailand

No Statistics available for this language.

Northern Luri [lrc] Iran

No Statistics available for this language.

Northern Qiandong Miao [hea] China

No Statistics available for this language.

Northern Sotho [nso] South Africa

Corpus: nso_community_2017
Sentences: 4746, Types: 10166, Tokens: 108522
URLs: 430

Norwegian [nor] Norway

Corpus: nor_community_2017
Sentences: 69299, Types: 95606, Tokens: 984898
URLs: 28907

Nuosu [iii] China

No Statistics available for this language.

Nyakyusa-Ngonde [nyy] Tanzania

No Statistics available for this language.

Nyanja [nya] Malawi

Corpus: nya_community_2017
Sentences: 896, Types: 4343, Tokens: 13712
URLs: 162

Nyankore [nyn] Uganda

Corpus: nyn_community_2017
Sentences: 5, Types: 15, Tokens: 36
URLs: 1

Occitan [oci] France

Corpus: oci_community_2017
Sentences: 166147, Types: 227980, Tokens: 3515311
URLs: 33483

Oluluyia [luy] Kenya

No Statistics available for this language.

Oriya [ori] India

Corpus: ori_community_2017
Sentences: 39314, Types: 71426, Tokens: 504296
URLs: 5127

Oromo [orm] Ethiopia

Corpus: orm_community_2017
Sentences: 2466, Types: 11758, Tokens: 41872
URLs: 397

Pahari-Potwari [phr] Pakistan

No Statistics available for this language.

Pampangan [pam] Philippines

Corpus: pam_community_2017
Sentences: 17250, Types: 41952, Tokens: 347606
URLs: 5593

Pangasinan [pag] Philippines

Corpus: pag_community_2017
Sentences: 6012, Types: 11854, Tokens: 79080
URLs: 3784

Pontic [pnt] Greece

Corpus: pnt_community_2017
Sentences: 1564, Types: 6290, Tokens: 25581
URLs: 398

Pulaar [fuc] Senegal

Corpus: fuc_community_2017
Sentences: 124, Types: 1022, Tokens: 2642
URLs: 63

Pular [fuf] Guinea

No Statistics available for this language.

Pwo Eastern Karen [kjp] Myanmar

No Statistics available for this language.

Quechua [que] Bolivia, Peru

Corpus: que_community_2017
Sentences: 21139, Types: 38495, Tokens: 266643
URLs: 13583

Quiché [quc] Guatemala

No Statistics available for this language.

Rajasthani [raj] India

No Statistics available for this language.

Rangpuri [rkt] Bangladesh

No Statistics available for this language.

Rohingya [rhg] Myanmar

No Statistics available for this language.

Romany [rom] Romania

Corpus: rom_community_2019
Sentences: 1435, Types: 6646, Tokens: 28320
URLs: 237

Rundi [run] Burundi

Corpus: run_community_2017
Sentences: 17361, Types: 56856, Tokens: 363797
URLs: 1812

Rwanda [kin] Rwanda

Corpus: kin_community_2017
Sentences: 54359, Types: 127640, Tokens: 1156170
URLs: 4555

S'gaw Karen [ksw] Myanmar

Corpus: ksw_community_2017
Sentences: 448, Types: 2415, Tokens: 4749
URLs: 230

Sadri [sck] India

No Statistics available for this language.

Santali [sat] India

No Statistics available for this language.

Sasak [sas] Indonesia (Nusa Tenggara)

No Statistics available for this language.

Sena [seh] Mozambique

Corpus: seh_community_2017
Sentences: 27, Types: 118, Tokens: 167
URLs: 10

Seraiki [skr] Pakistan

Corpus: skr_community_2017
Sentences: 87, Types: 885, Tokens: 1897
URLs: 42

Serer-Sine [srr] Senegal

No Statistics available for this language.

Shan [shn] Myanmar

No Statistics available for this language.

Shekhawati [swv] India

No Statistics available for this language.

Shona [sna] Zimbabwe

Corpus: sna_community_2017
Sentences: 48339, Types: 122437, Tokens: 792698
URLs: 4881

Sidamo [sid] Ethiopia

No Statistics available for this language.

Sindhi [snd] Pakistan

Corpus: snd_community_2017
Sentences: 7431, Types: 22097, Tokens: 140754
URLs: 588

Soga [xog] Uganda

No Statistics available for this language.

Somali [som] Somalia

Corpus: som_community_2017
Sentences: 170575, Types: 216881, Tokens: 4320798
URLs: 22918

Songe [sop] Dem. Rep. of Congo

No Statistics available for this language.

Soninke [snk] Mali

Corpus: snk_community_2017
Sentences: 124, Types: 454, Tokens: 866
URLs: 40

Southern Balochi [bcc] Pakistan

No Statistics available for this language.

Southern Dong [kmc] China

No Statistics available for this language.

Southern Kurdish [sdh] Iran

No Statistics available for this language.

Southern Ndebele [nbl] South Africa

Corpus: nbl_community_2017
Sentences: 318, Types: 2643, Tokens: 5424
URLs: 29

Southern Sotho [sot] South Africa, Lesotho

Corpus: sot_community_2017
Sentences: 9773, Types: 17421, Tokens: 238709
URLs: 542

Sukuma [suk] Tanzania

Corpus: suk_community_2017
Sentences: 47, Types: 163, Tokens: 391
URLs: 13

Sunda [sun] Indonesia (Java and Bali)

Corpus: sun_community_2017
Sentences: 50340, Types: 84722, Tokens: 913176
URLs: 6187

Surgujia [sgj] India

No Statistics available for this language.

Surjapuri [sjp] India

No Statistics available for this language.

Susu [sus] Guinea

Corpus: sus_community_2017
Sentences: 9, Types: 79, Tokens: 108
URLs: 5

Swahili [swa] Tanzania

Corpus: swa_community_2019
Sentences: 1116, Types: 3417, Tokens: 19168
URLs: 72

Swati [ssw] South Africa, Swaziland

Corpus: ssw_community_2017
Sentences: 380, Types: 3132, Tokens: 5895
URLs: 60

Sylheti [syl] Bangladesh

No Statistics available for this language.

Tachawit [shy] Algeria

No Statistics available for this language.

Tachelhit [shi] Morocco

No Statistics available for this language.

Tagalog [tgl] Philippines

Corpus: tgl_community_2017
Sentences: 979689, Types: 472209, Tokens: 20664580
URLs: 106820

Tajiki [tgk] Tajikistan

Corpus: tgk_community_2017
Sentences: 707117, Types: 514746, Tokens: 14147320
URLs: 78474

Tamashek [tmh] Niger

No Statistics available for this language.

Tarifit [rif] Morocco

No Statistics available for this language.

Tausug [tsg] Philippines

No Statistics available for this language.

Teso [teo] Uganda

No Statistics available for this language.

Thai [tha] Thailand

Corpus: tha_community_2017
Sentences: 57013, Types: 196199, Tokens: 793101
URLs: 21460

Themne [tem] Sierra Leone

Corpus: tem_community_2017
Sentences: 8, Types: 15, Tokens: 40
URLs: 1

Tibetan [bod] China

Corpus: bod_community_2017
Sentences: 7525, Types: 22081, Tokens: 32178
URLs: 4379

Tigrigna [tir] Ethiopia, Eritrea

Corpus: tir_community_2017
Sentences: 1379, Types: 5700, Tokens: 17837
URLs: 181

Tigré [tig] Eritrea

No Statistics available for this language.

Tiv [tiv] Nigeria

Corpus: tiv_community_2017
Sentences: 3, Types: 20, Tokens: 60
URLs: 1

Tonga [toi] Zambia, Zimbabwe

No Statistics available for this language.

Tsonga [tso] South Africa

Corpus: tso_community_2017
Sentences: 10571, Types: 17493, Tokens: 238504
URLs: 446

Tswa [tsc] Mozambique

No Statistics available for this language.

Tswana [tsn] South Africa, Botswana

Corpus: tsn_community_2017
Sentences: 28276, Types: 34977, Tokens: 687676
URLs: 2772

Tulu [tcy] India

No Statistics available for this language.

Tumbuka [tum] Malawi

Corpus: tum_community_2017
Sentences: 240, Types: 1645, Tokens: 4989
URLs: 31

Turkmen [tuk] Turkmenistan

Corpus: tuk_community_2017
Sentences: 121, Types: 998, Tokens: 1895
URLs: 97

Tày [tyz] Viet Nam

No Statistics available for this language.

Umbundu [umb] Angola

No Statistics available for this language.

Uyghur [uig] China

Corpus: uig_community_2017
Sentences: 61043, Types: 138389, Tokens: 1018481
URLs: 4259

Uzbek [uzb] Uzbekistan

Corpus: uzb_community_2017
Sentences: 663119, Types: 706425, Tokens: 10900014
URLs: 65787

Varhadi-Nagpuri [vah] India

No Statistics available for this language.

Vasavi [vas] India

No Statistics available for this language.

Venda [ven] South Africa

Corpus: ven_community_2017
Sentences: 9279, Types: 14412, Tokens: 179877
URLs: 375

Vlaams [vls] Belgium

Corpus: vls_community_2017
Sentences: 36393, Types: 75458, Tokens: 693658
URLs: 4740

Waray-Waray [war] Philippines

Corpus: war_community_2017
Sentences: 808036, Types: 399359, Tokens: 13358684
URLs: 793771

Western Balochi [bgn] Pakistan

No Statistics available for this language.

Western Panjabi [pnb] Pakistan

Corpus: pnb_community_2017
Sentences: 63683, Types: 64365, Tokens: 1052347
URLs: 26859

Wolaytta [wal] Ethiopia

No Statistics available for this language.

Wolof [wol] Senegal

Corpus: wol_community_2017
Sentences: 9988, Types: 22011, Tokens: 254548
URLs: 1628

Xhosa [xho] South Africa

Corpus: xho_community_2019
Sentences: 63387, Types: 172520, Tokens: 972301
URLs: 4227

Yao [yao] Malawi

No Statistics available for this language.

Yilumbu [lup] Gabon

Corpus: lup_community_2017
Sentences: 12246, Types: 16227, Tokens: 87930
URLs: 1

Yombe [yom] Dem. Rep. of Congo

No Statistics available for this language.

Yoruba [yor] Nigeria

Corpus: yor_community_2017
Sentences: 10703, Types: 28265, Tokens: 210852
URLs: 1961

Zande [zne] Dem. Rep. of Congo

No Statistics available for this language.

Zarma [dje] Niger

No Statistics available for this language.

Zaza [zza] Turkey

No Statistics available for this language.

Zhuang [zha] China

Corpus: zha_community_2017
Sentences: 2306, Types: 6395, Tokens: 20110
URLs: 642

Zulu [zul] South Africa

Corpus: zul_community_2017
Sentences: 146216, Types: 350500, Tokens: 2361635
URLs: 4415