OVERALL UTILITY (U)

4. Overall utility (U): the search session as a whole is assessed via a 7-point Likert scale.

We performed our evaluation by running 24 queries (some of which are listed in Tab. 2) on Google and YourQA and submitting the results of both, i.e. Google result page snippets and YourQA passages, to 20 evaluators, along with a questionnaire. The relevance results (P1 and P2) in Tab. 1 show a

          P1     P2     S1     S2     U
Google    0.39   0.63   4.70   4.61   4.59
YourQA    0.51   0.79   5.39   5.39   5.57

Table 1: Evaluation results

extensive evaluation and implementing a dialogue interface to improve the system’s interactivity.

References

E. Alfonseca, M. DeBoni, J.-L. Jara-Valencia, and S. Manandhar. 2001. A prototype question answering system using syntactic and semantic information for answer retrieval. In Text REtrieval Conference.

L. Ardissono, L. Console, and I. Torre. 2001. An adaptive system for the personalized access to news. AI Commun., 14(3):129–147.

K. Collins-Thompson and J. P. Callan. 2004. A language modeling approach to predicting reading difficulty. In Proceedings of HLT/NAACL.