4. Overall utility (U): the search session as a
whole is assessed via a 7-point Likert scale.

We performed our evaluation by running 24
queries (some of which are shown in Tab. 2) on Google and
YourQA and submitting the results (i.e. Google
result page snippets and YourQA passages) of
both to 20 evaluators, along with a questionnaire.
The relevance results (P1 and P2) in Tab. 1 show a

          P1     P2     S1     S2     U
Google    0.39   0.63   4.70   4.61   4.59
YourQA    0.51   0.79   5.39   5.39   5.57

Table 1: Evaluation results

extensive evaluation and implementing a dialogue
interface to improve the system’s interactivity.

References

E. Alfonseca, M. DeBoni, J.-L. Jara-Valencia, and
S. Manandhar. 2001. A prototype question answering
system using syntactic and semantic information
for answer retrieval. In Text REtrieval Conference.

L. Ardissono, L. Console, and I. Torre. 2001. An adaptive
system for the personalized access to news. AI
Commun., 14(3):129–147.

K. Collins-Thompson and J. P. Callan. 2004. A language
modeling approach to predicting reading difficulty.
In Proceedings of HLT/NAACL.