Neural models for morphological generation, analysis and lemmatization in 22 languages
dc.contributor.affiliation | University of Helsinki-Alnajjar, Khalid | |
dc.contributor.author | Alnajjar, Khalid | |
dc.date.accessioned | 2025-04-29T13:59:20Z | |
dc.date.issued | 2020-07-01 | |
dc.description | Morphological models for generation, lemmatization and analysis in 22 languages. The models are trained with OpenNMT-py (https://github.com/OpenNMT/OpenNMT-py). Feed the models one word at a time, split into space-separated characters (kissa -> k i s s a). Supported languages: German (deu), Kven (fkv), Komi-Zyrian (kpv), Moksha (mdf), Mansi (mns), Erzya (myv), Norwegian Bokmål (nob), Russian (rus), South Sami (sma), Lule Sami (smj), Skolt Sami (sms), Võro (vro), Finnish (fin), Komi-Permyak (koi), Latvian (lav), Eastern Mari (mhr), Western Mari (mrj), Namonuito (nmt), Olonets-Karelian (olo), Pite Sami (sje), Northern Sami (sme), Inari Sami (smn) and Udmurt (udm). Cite: Hämäläinen, M., Partanen, N., Rueter, J., & Alnajjar, K. (2021). Neural Morphology Dataset and Models for Multiple Languages, from the Large to the Endangered. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021). | |
dc.identifier | https://doi.org/10.5281/zenodo.3926769 | |
dc.identifier.uri | https://datakatalogi.helsinki.fi/handle/123456789/4178 | |
dc.rights.license | cc-by-4.0 | |
dc.subject | morphology | |
dc.subject | fst | |
dc.subject | endangered languages | |
dc.subject | neural models | |
dc.title | Neural models for morphological generation, analysis and lemmatization in 22 languages | |
dc.type | dataset |
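
The description asks for input to be given one word at a time, split into characters (kissa -> k i s s a). A minimal sketch of that preprocessing step in Python follows. The function names `to_model_input` and `from_model_output` are hypothetical and not part of the dataset or of OpenNMT-py, and the assumption that model output can be rejoined by simply removing the spaces may not hold for analysis output that contains tag tokens.

```python
def to_model_input(word: str) -> str:
    """Split a single word into space-separated characters.

    Hypothetical helper: mirrors the 'kissa -> k i s s a' format
    given in the record description.
    """
    return " ".join(word.strip())


def from_model_output(line: str) -> str:
    """Rejoin a space-separated character sequence into one string.

    Assumption: the output contains only single-character tokens.
    Output with multi-character tokens (e.g. tags) needs other handling.
    """
    return "".join(line.split())


if __name__ == "__main__":
    # One word per line, as the description requires.
    for word in ["kissa", "talo"]:
        print(to_model_input(word))
```

Each resulting line can be written to a plain-text file, one word per line, and passed to the OpenNMT-py translation tool together with the model for the chosen language and task.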