
a subject of debate. These are some top results:

— $UM_{good}$: “Most Classicists would agree that, whether there was ever such a composer as "Homer" or not, the Homeric poems are the product of an oral tradition [...] Could the Iliad and Odyssey have been oral-formulaic poems, composed on the spot by the poet using a collection of memorized traditional verses and phases?”

— $UM_{med}$: “No reliable ancient evidence for Homer – [...] General ancient assumption that same poet wrote Iliad and Odyssey (and possibly other poems) questioned by many modern scholars: differences explained biographically in ancient world (e g wrote Od. in old age); but similarities could be due to imitation.”

— $UM_{poor}$: “Homer wrote The Iliad and The Odyssey (at least, supposedly a blind bard named "Homer" did).”

In the three results, the problem of attribution of the Iliad is made clearly visible: document passages provide a context which helps to explain the controversy at different levels of difficulty.

4.3 Answer extraction

In this phase, the clustered documents are filtered based on the user model and answer sentences are located and formatted for presentation.

UM-based filtering. The documents in the cluster tree are filtered according to their reading difficulty: only those compatible with the UM’s reading level are retained for further analysis⁶.

Semantic similarity. Within each of the retained documents, we seek the sentences which are semantically most relevant to the query by applying the metric in (Alfonseca et al., 2001): we represent each document sentence p and the query q as word sets $P = \{pw_1, \ldots, pw_m\}$ and $Q = \{qw_1, \ldots, qw_n\}$. The distance from p to q is then $dist_q(p) = \sum_{1 \le i \le m} \min_j [d(pw_i, qw_j)]$, where $d(pw_i, qw_j)$ is the word-level distance between $pw_i$ and $qw_j$ based on (Jiang and Conrath, 1997).
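As a rough illustration, the sentence-to-query distance described above can be sketched in Python. This is not the YourQA implementation: the function names are hypothetical, and `toy_distance` is a placeholder standing in for the Jiang and Conrath word-level distance, which needs WordNet and corpus statistics not shown here.

```python
from typing import Callable, Iterable

# A word-level distance d(pw, qw); the paper uses Jiang-Conrath here.
WordDistance = Callable[[str, str], float]


def sentence_distance(p_words: Iterable[str],
                      q_words: Iterable[str],
                      d: WordDistance) -> float:
    """dist_q(p): for each word of sentence p, take the distance to its
    closest query word, then sum these minima over the words of p."""
    q = set(q_words)
    if not q:
        raise ValueError("the query must contain at least one word")
    # Sentences and queries are treated as word sets, as in the text.
    return sum(min(d(pw, qw) for qw in q) for pw in set(p_words))


def toy_distance(a: str, b: str) -> float:
    # Assumption for illustration only: 0 for identical words, 1 otherwise.
    return 0.0 if a == b else 1.0


query = "who wrote the iliad".split()
sentence = "homer wrote the iliad and the odyssey".split()
print(sentence_distance(sentence, query, toy_distance))  # prints 3.0
```

With the placeholder distance, the three sentence words absent from the query ("homer", "and", "odyssey") each contribute 1. Note that the sum, as written in the formula, is not normalised by sentence length, so longer sentences tend to accumulate larger distances.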

Ranking. Given the query q, we thus locate in each document D the sentence $p^*$ such that $p^* = \arg\min_{p \in D} [dist_q(p)]$; then, $dist_q(p^*)$ becomes the document score. Moreover, each clus-

⁵ The likelihood is estimated using the formula: $L_{i,D} = \sum_{w \in D} C(w, D) \cdot \log(P(w \mid lm_i))$, where w is a word in the document, $C(w, D)$ is the number of occurrences of w in D and $P(w \mid lm_i)$ is the probability with which w occurs in $lm_i$.

⁶ However, if their number does not exceed a given thresh-

6 Evaluation

Since YourQA does not single out one correct answer phrase, TREC evaluation metrics are not suitable for it. A user-centred methodology to assess how individual information needs are met is more appropriate. We base our evaluation on (Su, 2003), which proposes a comprehensive search engine evaluation model, defining the following metrics:

