

Test set   Q number   Qs with patterns   Min one correct   Overall correct   Accuracy overall   Acc. if pattern
2002       429        321                43                14                0.033              0.044
2003       354        237                28                10                0.028              0.042
2004       204        142                19                6                 0.029              0.042
2005       319        214                21                7                 0.022              0.033
2006       352        208                20                7                 0.020              0.034
Sum        1658       1122               131               44                0.027              0.039

Table 7: Baseline performance based on evaluation set
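The two accuracy columns of Table 7 can be checked against the count columns. The sketch below assumes that "Accuracy overall" is the overall-correct count divided by the number of questions, and that "Acc. if pattern" is the same count divided by the number of questions with at least one pattern. The text does not state these formulas explicitly, but the ratios reproduce the reported values after rounding to three decimals. The names `TABLE_7` and `accuracy_columns` are illustrative only.

```python
# Count columns of Table 7:
# (test set, questions, questions with patterns, min one correct, overall correct)
TABLE_7 = [
    ("2002", 429, 321, 43, 14),
    ("2003", 354, 237, 28, 10),
    ("2004", 204, 142, 19, 6),
    ("2005", 319, 214, 21, 7),
    ("2006", 352, 208, 20, 7),
]


def accuracy_columns(questions, with_patterns, overall_correct):
    """Return (accuracy overall, accuracy if a pattern exists), rounded to 3 decimals.

    Assumption: both values are plain ratios of the overall-correct count.
    """
    overall = overall_correct / questions
    if_pattern = overall_correct / with_patterns
    return round(overall, 3), round(if_pattern, 3)


def sum_row(rows):
    """Add up the count columns to rebuild the 'Sum' row."""
    questions = sum(r[1] for r in rows)
    with_patterns = sum(r[2] for r in rows)
    min_one = sum(r[3] for r in rows)
    overall_correct = sum(r[4] for r in rows)
    return questions, with_patterns, min_one, overall_correct


if __name__ == "__main__":
    for year, q, p, _, c in TABLE_7:
        print(year, accuracy_columns(q, p, c))
    q, p, m, c = sum_row(TABLE_7)
    print("Sum", q, p, m, c, accuracy_columns(q, p, c))
```

The summed counts also match the 1658 and 1122 questions quoted in Section 7.3.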


columns list the year of the TREC test set used, the number of questions in the set (we only use questions for which we know that there is an answer in the corpus), the number of questions for which one or more patterns exist, how often at least one pattern returned the correct answer, how often we get an overall correct result by taking all patterns and their confidence values into account, accuracy@1 of the overall system, and accuracy@1 computed only for those questions for which we have at least one pattern available (for all other questions the system returns no result.) As can be seen, on evaluation set 1 our method outperforms the baseline by 300%, on evaluation set 2 by 311%, taking accuracy if a pattern exists as a basis.

7.3 Impact on an existing QA System

Tables 9 and 10 show how our algorithm increases performance of our QuALiM system, see e.g. (Kaisser et al., 2006). Section 6 in this paper describes via formulas 2 and 3 how answer candidates are ranked. This ranking is combined with the existing QA system’s candidate ranking by simply using it as an additional feature that boosts candidates proportionally to their confidence score. The difference between both tables is that the first uses all 1658 questions in our test sets for the evaluation, whereas the second considers only those 1122 questions for which our system was able to learn a pattern. Thus for Table 10 questions which the system had no chance of answering due to limited training data are omitted. As can be seen, accuracy@1 increases by 4.9% on the complete test set and by 11.5% on the partial set.

Note that the QA system used as a baseline is at an advantage in at least two respects: a) It has important web-based components and as such has access to a much larger body of textual informa-

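Section 7.3 says that the pattern-based ranking is folded into the existing system's ranking as an additional feature that boosts candidates proportionally to their confidence score. The exact combination formula is not given in this section, so the following is only a minimal sketch under an assumed multiplicative boost. The function `combine_rankings`, the `weight` parameter, and the example candidates and scores are all hypothetical and do not come from QuALiM.

```python
def combine_rankings(baseline_scores, pattern_confidence, weight=1.0):
    """Re-rank baseline candidates, boosting each by its pattern confidence.

    Assumed form (not taken from the paper): score * (1 + weight * confidence).
    Candidates with no pattern-based confidence keep their baseline score.
    """
    combined = {}
    for candidate, score in baseline_scores.items():
        confidence = pattern_confidence.get(candidate, 0.0)
        combined[candidate] = score * (1.0 + weight * confidence)
    # Highest combined score first
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical scores for illustration only
    baseline = {"Paris": 0.5, "Lyon": 0.6}
    confidence = {"Paris": 0.5}
    for candidate, score in combine_rankings(baseline, confidence):
        print(candidate, score)
```

With this form, a candidate that the patterns never proposed is neither penalised nor removed, which fits the description of the pattern score as an added feature, not a filter.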