6, 2007 and the Simple Wikipedia dump from July 24, 2008. The Simple English Wikipedia is an English Wikipedia targeted at non-native speakers of English which uses simpler words than the English Wikipedia.

described above. As is common practice in translation-based retrieval, we utilised the IBM translation model 1. The only pre-processing steps performed for all parallel datasets were tokenisation and stop word removal.
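To make the setup concrete, the following is a minimal sketch of IBM Model 1 training with expectation-maximisation, preceded by the two pre-processing steps mentioned above. It is an illustration under simplifying assumptions, not the implementation used in the experiments: the stop word list `STOP_WORDS`, the whitespace tokeniser in `preprocess`, the function name `train_ibm_model1`, and the number of iterations are all hypothetical, and the NULL source word of the full model is omitted.

```python
from collections import defaultdict

# Hypothetical stop word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "of", "in", "and", "to"}


def preprocess(text):
    """Tokenise on whitespace (after lowercasing) and drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(target_word | source_word) with EM.

    `pairs` is a list of (source_tokens, target_tokens) tuples.
    Returns a dict-like table keyed by (target_word, source_word).
    Simplification: no NULL source word is added.
    """
    target_vocab = {w for _, tgt in pairs for w in tgt}
    uniform = 1.0 / len(target_vocab)
    # Start from a uniform translation table.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected counts for (target, source)
        total = defaultdict(float)  # expected counts for each source word

        # E-step: distribute each target word over the source words.
        for src, tgt in pairs:
            for w in tgt:
                norm = sum(t[(w, s)] for s in src)
                for s in src:
                    delta = t[(w, s)] / norm
                    count[(w, s)] += delta
                    total[s] += delta

        # M-step: renormalise the expected counts per source word.
        t = defaultdict(
            float, {(w, s): c / total[s] for (w, s), c in count.items()}
        )
    return t
```

In a retrieval setting, the resulting table `t` would supply the word-to-word translation probabilities that let a query term match a related but different term in a candidate document. Each side of a parallel pair would first be passed through `preprocess`.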
5