
cally by automatically detecting sentence boundaries based on Japanese punctuation marks, but we also used regular-expression-based heuristics to detect glossaries of terms in articles. As the descriptions of these glossaries are usually very useful for answering BIOGRAPHY and DEFINITION questions, we treated each term description (generally multiple sentences) as a single sentence.
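The exact splitting rules are not given in the text, so the following Python sketch is only our own illustration of the punctuation-based step; the function name and the regular expression are our assumptions, and the paper's additional glossary heuristics are not reproduced here.

```python
import re

# Rough illustration (not the paper's exact heuristics): split on the
# Japanese sentence-final marks 。！？, keeping each mark with its sentence.
def split_sentences(text: str) -> list[str]:
    sentences = re.findall(r"[^。！？]+[。！？]?", text)
    return [s.strip() for s in sentences if s.strip()]

print(split_sentences("これはペンです。それは何ですか？すごい！"))
# -> ['これはペンです。', 'それは何ですか？', 'すごい！']
```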

Our baseline is MMR (2010):

f_{MMR}(S) = \gamma \Bigl( \sum_{u \in S} \mathrm{Sim}(u, v_D) + \sum_{u \in S} \mathrm{Sim}(u, v_Q) \Bigr) - (1 - \gamma) \sum_{\{(u_i, u_j) \mid i \neq j \text{ and } u_i, u_j \in S\}} \mathrm{Sim}(u_i, u_j)    (4)

where v_D is the vector representing the source documents, v_Q is the vector representing the query terms, Sim is the cosine similarity, and γ is a parameter. Thus, the first term of this function reflects how well the sentences represent the entire documents; the second term reflects the relevance of the sentences to the query; and the final term penalizes redundant sentences. We set γ to 0.8 and the scaling factor used in the algorithm to 0.3, based on a preliminary experiment with a part of the ACLIA1 development data. We also tried incorporating sentence position information (Radev, 2001) into our MMR baseline, but this actually hurt performance in our preliminary experiments.
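Equation (4) translates to code directly. The sketch below is our own illustration, not the authors' implementation: it assumes sentences have already been mapped to term vectors, uses the γ = 0.8 reported above, and pairs the objective with a naive greedy selector; the paper's actual selection algorithm, including its 0.3 scaling factor, is not reproduced.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two term vectors."""
    denom = float(np.linalg.norm(a) * np.linalg.norm(b))
    return float(a @ b) / denom if denom else 0.0

def f_mmr(S, v_d, v_q, gamma=0.8):
    """Objective of equation (4): gamma-weighted document coverage and
    query relevance, minus a pairwise redundancy penalty."""
    coverage = sum(cosine(u, v_d) for u in S)
    relevance = sum(cosine(u, v_q) for u in S)
    redundancy = sum(cosine(S[i], S[j])
                     for i in range(len(S)) for j in range(len(S)) if i != j)
    return gamma * (coverage + relevance) - (1.0 - gamma) * redundancy

def greedy_select(sent_vecs, v_d, v_q, k, gamma=0.8):
    """Naive greedy sketch (our assumption, not the paper's algorithm):
    repeatedly add the sentence whose inclusion maximizes f_mmr."""
    chosen: list[int] = []
    candidates = set(range(len(sent_vecs)))
    while candidates and len(chosen) < k:
        best = max(candidates,
                   key=lambda i: f_mmr([sent_vecs[j] for j in chosen + [i]],
                                       v_d, v_q, gamma))
        chosen.append(best)
        candidates.remove(best)
    return chosen  # indices of selected sentences
```

Because the redundancy term grows with every selected pair, the marginal gain of a candidate shrinks as the summary grows, which is what drives the MMR-style diversification.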

pairs all contribute significantly to the performance of QSBP. Note that we are using the ACLIA data as summarization test collections and that the official QA results of ACLIA should not be compared with ours.
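The F3 scores reported next are, as we understand ACLIA's nugget-based evaluation, F-measures with β = 3, so recall is weighted nine times as heavily as precision; this reading of the metric is our assumption. A minimal computation:

```python
def f_beta(precision: float, recall: float, beta: float = 3.0) -> float:
    """F-measure with recall weighted beta^2 times as much as precision;
    F3 corresponds to beta = 3 (our assumption for ACLIA's metric)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

print(round(f_beta(0.2, 0.35), 3))  # -> 0.326, dominated by recall
```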

QSBP and QSBP(idf) achieve 0.312 and 0.313 in F3 score, and the differences between the two are not statistically significant. Table 3 shows the F3 scores for each question type. It can be observed that QSBP is the top performer for BIO, DEF and REL questions on average, while QSBP(idf) is the top performer for EVENT and WHY questions on