Scientific report: Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding

2.1 Statistical Translation Models for Retrieval

Statistical translation models for retrieval were first introduced by Berger and Lafferty (1999). These models attempt to address synonymy and polysemy problems by encoding statistical word associations trained on monolingual parallel corpora. This method offers several advantages. First, it is based on a sound mathematical formulation of the retrieval model. Second, it is not as computationally expensive as other semantic retrieval models, since it relies only on a word translation table, which can easily be computed before retrieval. The main drawback lies in the availability of suitable training data [...]

[...] have used parallel data extracted from the retrieval corpus itself. The translation models obtained are therefore domain- and collection-specific, which introduces a bias in the evaluation and makes it difficult to assess to what extent the translation model may be reused for other tasks and document collections. We therefore propose a new approach for building monolingual translation models relying on domain-independent lexical semantic resources. Moreover, we extensively compare the results obtained by these models with models obtained from a different type of dataset, namely Question & Answer archives.
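To illustrate why a precomputed word translation table keeps retrieval cheap, here is a minimal sketch of translation-based scoring. It assumes the commonly cited simplified form of a translation-based language model, P(q|D) = ∏_w Σ_t P(w|t) P(t|D), which this section does not spell out. The table values, the `floor` constant, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy translation table: TRANSLATION[t][w] = P(w | t),
# the probability that document term t "translates" to query term w.
# In practice such a table is estimated offline, before retrieval.
TRANSLATION = {
    "car": {"car": 0.7, "automobile": 0.2, "vehicle": 0.1},
    "automobile": {"automobile": 0.7, "car": 0.3},
    "engine": {"engine": 0.8, "motor": 0.2},
}


def doc_language_model(doc_tokens):
    """Maximum-likelihood unigram model P(t | D) of a document."""
    counts = Counter(doc_tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def translation_log_score(query_tokens, doc_tokens, table=TRANSLATION, floor=1e-6):
    """Log of prod_w sum_t P(w | t) * P(t | D).

    `floor` is an assumed constant that avoids log(0) for query terms
    with no translation mass; a real system would smooth against a
    collection model instead.
    """
    doc_lm = doc_language_model(doc_tokens)
    log_score = 0.0
    for w in query_tokens:
        # Only table lookups are needed at query time.
        p = sum(table.get(t, {}).get(w, 0.0) * p_t for t, p_t in doc_lm.items())
        log_score += math.log(max(p, floor))
    return log_score


matching = translation_log_score(["automobile"], ["car", "engine"])
unrelated = translation_log_score(["automobile"], ["engine"])
print(matching > unrelated)
```

In this sketch, a document containing "car" receives a higher score for the query "automobile" than a document without any related term, even though neither document contains the query word itself. This is the synonymy effect the translation table is meant to capture.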