in the same answer sentence), while about a half (52%) are a weak match (only one query term matched in the answer sentence) and 16% are indirect answers which do not explicitly contain the answer but provide enough information for deducing it. Moreover, the Microsoft QA corpus is not limited to a specific topic and entirely independent from the datasets used to build our translation models.

The original corpus contained some inconsistencies due to duplicated data and non-labelled entries. After cleaning, we obtained a corpus of

5.1 Retrieval based on Translation Models

The second experiment aims at providing an extrinsic evaluation of the translation probabilities by employing them in an answer finding task.

In order to perform retrieval, we use a ranking function similar to the one proposed by Xue et al. (2008), which builds upon previous work on translation-based retrieval models and tries to overcome some of their flaws:

\[ P(q|D) = \prod_{w \in q} P(w|D) \tag{2} \]
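As a minimal sketch of how the product in Equation (2) could be used to rank candidate answers, the following Python snippet scores each answer D by summing log P(w|D) over the query terms, which is equivalent to the product but avoids numerical underflow. The function names, the dictionary-based representation of P(w|D), and the toy probabilities are illustrative assumptions, not part of the original system; the word probabilities are assumed to be already smoothed.

```python
import math
from typing import Callable, Dict, Iterable, List


def query_log_likelihood(query_terms: Iterable[str],
                         word_prob: Callable[[str], float]) -> float:
    """Return log P(q|D) = sum over w in q of log P(w|D), as in Equation (2).

    `word_prob` is a hypothetical callable returning P(w|D) for one answer D.
    """
    total = 0.0
    for w in query_terms:
        p = word_prob(w)
        if p <= 0.0:
            # A zero probability makes the whole product zero.
            return float("-inf")
        total += math.log(p)
    return total


def rank(query_terms: List[str],
         doc_models: Dict[str, Dict[str, float]]) -> List[str]:
    """Sort answer ids by decreasing log P(q|D).

    `doc_models` maps an answer id to an (assumed, pre-smoothed) table of P(w|D).
    """
    scores = {
        doc_id: query_log_likelihood(query_terms,
                                     lambda w, m=model: m.get(w, 0.0))
        for doc_id, model in doc_models.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy, made-up probabilities for illustration only.
docs = {
    "a1": {"how": 0.2, "cook": 0.1},
    "a2": {"how": 0.1, "cook": 0.4},
}
ranking = rank(["how", "cook"], docs)
```

Working in log space keeps the ranking identical to that of the raw product, since the logarithm is monotonic.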
P (w|D) = (1 − λ)P
mx