7 August 2013
info:eu-repo/semantics/OpenAccess
Yves Scherrer et al., "Modernizing historical Slovene words with character-based SMT", HAL-SHS : linguistique, ID: 10670/1.7w8lqf
We propose a language-independent word normalization method and demonstrate it by modernizing historical Slovene words. The method relies on character-based statistical machine translation (SMT) and uses only shallow knowledge. We present the relevant lexicons and two experiments. The first uses a lexicon of historical and contemporary word pairs together with a list of contemporary words; the second uses only a list of historical words and a list of contemporary ones. We show that both methods produce significantly better results than the baseline.
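
As a rough illustration of what "character-based" can mean in practice, the sketch below turns word pairs into space-separated character sequences, so that a standard SMT toolkit would treat each character as a token. The function names and the sample pair are illustrative assumptions and are not taken from the paper or its lexicons; the abstract does not specify the actual preprocessing.

```python
from typing import Iterable, List, Tuple


def to_char_tokens(word: str) -> str:
    """Split a word into space-separated characters (one token per character)."""
    return " ".join(word)


def from_char_tokens(line: str) -> str:
    """Rejoin a space-separated character sequence into a single word."""
    return line.replace(" ", "")


def build_parallel_corpus(
    pairs: Iterable[Tuple[str, str]],
) -> Tuple[List[str], List[str]]:
    """Build aligned source (historical) and target (contemporary) lines."""
    source, target = [], []
    for historical, contemporary in pairs:
        source.append(to_char_tokens(historical))
        target.append(to_char_tokens(contemporary))
    return source, target


# Hypothetical sample pair, used only to show the data format.
pairs = [("inu", "in")]
src_lines, tgt_lines = build_parallel_corpus(pairs)
```

The two lists could then be written to a pair of line-aligned files, the usual input format for phrase-based SMT training, and the system's character-level output mapped back to words with `from_char_tokens`.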