Dialectal to Standard Arabic Paraphrasing to Improve Arabic-English Statistical Machine Translation

Wael Sameer Salloum & Nizar Y. Habash
This paper aims to improve the quality of Arabic-English statistical machine translation (SMT) on highly dialectal Arabic text using morphological knowledge. We present a lightweight rule-based approach to producing Modern Standard Arabic (MSA) paraphrases of dialectal Arabic out-of-vocabulary words and low-frequency words. Our approach extends an existing MSA analyzer with a small number of morphological clitics and transfer rules. The generated paraphrase lattices are input to a state-of-the-art phrase-based SMT system, resulting in...