The International Colloquium on Grammatical Inference is hosting a shared task on morphological inflection alongside its 2020 meeting. An example of English inflection is the conversion of the lemma run to its present participle, running.
To participate in the shared task, you will build a system that can solve inflection problems. All submitted systems will be compared on a held-out test set.
A word’s form reflects syntactic and semantic features that the word expresses. For example, each English count noun has both singular and plural forms (robot/robots, process/processes), known as the inflected forms of the noun. A Polish verb may have nearly 100 inflected forms.
The data sparsity inherent in morphologically rich languages creates a challenge for computational modeling. Fortunately, inflected forms tend to be systematically related to one another. This is why English speakers can usually predict the singular form from the plural and vice versa, even for words they have never seen before.
Given a lemma (the dictionary form of a word) and morphological features specifying a form, generate the target inflected form. The task will be familiar to participants in the SIGMORPHON shared tasks on inflection, with some modifications and a focus on diversity in languages.
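To make the input and output of the task concrete, here is a minimal sketch of a suffix-rule baseline in Python. It is not the organizers' baseline or any official system. The function names (suffix_rule, train, inflect), the toy training triples, and the feature string "V;V.PTCP;PRS" are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def suffix_rule(lemma, form):
    """Strip the longest common prefix and return (lemma_suffix, form_suffix)."""
    i = 0
    while i < min(len(lemma), len(form)) and lemma[i] == form[i]:
        i += 1
    return lemma[i:], form[i:]


def train(triples):
    """Count how often each suffix rule occurs for each feature bundle."""
    rules = defaultdict(Counter)
    for lemma, form, feats in triples:
        rules[feats][suffix_rule(lemma, form)] += 1
    return rules


def inflect(rules, lemma, feats):
    """Apply the rule with the longest matching lemma suffix (ties broken by
    frequency); return the lemma unchanged if no rule is known."""
    candidates = [
        (len(old), count, old, new)
        for (old, new), count in rules.get(feats, {}).items()
        if lemma.endswith(old)
    ]
    if not candidates:
        return lemma
    _, _, old, new = max(candidates)
    return lemma[: len(lemma) - len(old)] + new


# Toy training data (hypothetical): (lemma, inflected form, features).
PTCP = "V;V.PTCP;PRS"  # assumed tag for the English present participle
toy_rules = train([
    ("walk", "walking", PTCP),
    ("jump", "jumping", PTCP),
    ("bake", "baking", PTCP),
])

print(inflect(toy_rules, "talk", PTCP))
print(inflect(toy_rules, "make", PTCP))
print(inflect(toy_rules, "run", PTCP))
```

On this toy data the sketch generalizes to the unseen lemmas talk and make, but it produces "runing" for run because it has never observed consonant doubling. A competitive system needs to model such alternations rather than memorize suffix swaps.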
We have chosen a diverse set of 10 languages with rich inflection. All of the datasets were scraped from Wiktionary or from language-specific digitized texts and then underwent additional processing at the Center for Language and Speech Processing at Johns Hopkins University. The data are formatted according to the schema described in Sylak-Glassman et al. (2015).
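The exact file layout is not spelled out here, so the following Python sketch rests on an assumption: that each data line holds a lemma, an inflected form, and a semicolon-separated feature bundle, separated by tabs. The Example type, the parse_line function, and the sample line are hypothetical; check the released data files for the actual column order before relying on this.

```python
from typing import FrozenSet, NamedTuple


class Example(NamedTuple):
    lemma: str
    form: str
    features: FrozenSet[str]


def parse_line(line: str) -> Example:
    """Parse one line, assuming the layout 'lemma<TAB>form<TAB>feat1;feat2;...'."""
    lemma, form, feats = line.rstrip("\n").split("\t")
    # Feature order is treated as irrelevant, so store the bundle as a set.
    return Example(lemma, form, frozenset(feats.split(";")))


sample = parse_line("run\trunning\tV;V.PTCP;PRS\n")
print(sample.lemma, sample.form, sorted(sample.features))
```

Storing the features as a frozenset makes two bundles compare equal regardless of the order in which their features are listed, which is convenient when grouping examples by the form they specify.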