1 EVALUATION METRICS THE EVALUATION OF QA SYSTEMS IS DETERMINED ACC...

QUESTION ANSWERING SYSTEM AND ITS APPLICATION


4.1 Evaluation Metrics

The evaluation of QA systems is determined according to the criteria for judging an answer. The following list captures some possible criteria for answer evaluation [1]:

(1) Relevance: the answer should be a response to the question.
(2) Correctness: the answer should be factually correct.
(3) Conciseness: the answer should not contain extraneous or irrelevant information.
(4) Completeness: the answer should be complete (not a part of the answer).
(5) Justification: the answer should be supplied with sufficient context to allow a user to determine why this was chosen as an answer to the question.

Based on the aforementioned criteria, there are three different judgments for an answer extracted from a document:

- “Correct”: if the answer is responsive to a question in a correct way (criteria 1 & 2).

• Mean Reciprocal Rank (MRR): The Mean Reciprocal Rank (MRR), which was first used for TREC8, is used to calculate the answer rank (relevance):

    MRR = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{r_i}

  where n is the number of test questions and r_i is the rank of the first correct answer for the i-th test question.

• Confidence Weighted Score (CWS): The confidence about the correctness of an answer is evaluated using another metric called Confidence Weighted Score (CWS), which was defined for TREC11:

    CWS = \frac{1}{n} \sum_{i=1}^{n} p_i

  where n is the number of test questions and p_i is the precision of the answers at positions from 1 to i in the ordered list of answers.
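The two metrics, MRR and CWS, can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not the evaluation code used at TREC: the function names and input formats are hypothetical, both scores are averaged over the n test questions, and a question with no correct answer is assumed to contribute 0 to MRR.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_correct_ranks: Sequence[Optional[int]]) -> float:
    """MRR = (1/n) * sum of 1/r_i over all n test questions.

    first_correct_ranks[i] is the 1-based rank of the first correct answer
    for question i, or None when no correct answer was returned
    (assumption: such a question contributes 0 to the sum).
    """
    n = len(first_correct_ranks)
    if n == 0:
        raise ValueError("need at least one test question")
    return sum(1.0 / r for r in first_correct_ranks if r is not None) / n


def confidence_weighted_score(judgments: Sequence[bool]) -> float:
    """CWS = (1/n) * sum of p_i, where p_i is the precision over positions 1..i.

    judgments must already be ordered from the most to the least confident
    answer; judgments[i] is True when that answer was judged correct.
    """
    n = len(judgments)
    if n == 0:
        raise ValueError("need at least one test question")
    correct_so_far = 0
    total = 0.0
    for i, is_correct in enumerate(judgments, start=1):
        correct_so_far += int(is_correct)
        total += correct_so_far / i  # precision of the first i answers
    return total / n


# Four questions: first correct answer at ranks 1, 2, none found, and 4.
print(mean_reciprocal_rank([1, 2, None, 4]))  # 0.4375

# Four answers sorted by decreasing confidence: correct, wrong, correct, correct.
print(round(confidence_weighted_score([True, False, True, True]), 4))  # 0.7292
```

Because CWS averages the running precision, a correct answer placed early in the confidence ordering raises every later p_i, so a system is rewarded for ranking the answers it is surest of first.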
