Lin et al. (2003) performed a study with 32 computer science students comparing four types of answer context: exact answer, answer-in-sentence, answer-in-paragraph, and answer-in-document. Since they were interested in interface design, they worked with a system that answered all questions correctly. They found that 53% of all participants preferred paragraph-sized chunks, 23% preferred full documents, 20% preferred sentences, and one participant preferred the exact answer.

Web search engines typically show results as a list of titles and short snippets that summarize how the retrieved document is related to the query terms, often called query-biased summaries (Tombros and Sanderson, 1998). Recently, Kaisser et al. (2008) conducted a study to test whether users would prefer search engine results of different lengths (phrase, sentence, paragraph, section or article) and whether the optimal response length could be predicted by human judges. They found that judges indeed prefer different response lengths for different types of queries and that these preferences can be predicted by other judges.

In this demo, we opted for a slightly different, yet related approach: The system does not decide on

1): The answer itself (usually a phrase) is presented in bold. Additionally, a paragraph relating the answer to the question is shown, and in this paragraph one sentence containing the answer is highlighted. Note also that each paragraph contains a link that takes the user to the Wikipedia article, should he/she want to know more about the subject. The intention behind this mode of presentation is to prominently display the piece of information the user is most interested in, but also to present context information and to give the user options for finding out more about the topic.

3 Finding Supportive Wikipedia Paragraphs

We use Lucene (Hatcher and Gospodnetić, 2004) to index the publicly available Wikipedia dumps (see https://traloihay.net). The text inside the dump is broken down into paragraphs, and each paragraph functions as a Lucene document. The data of each paragraph is stored in three fields: Title, which contains the title of the Wikipedia article the paragraph is from; Headers, which lists the title and all section and subsection headings indicating the position of the paragraph in the article; and Text, which stores the text of the article. An example can be seen in Table 1.

Title    “Tom Cruise”
Headers  “Tom Cruise/Relationships and personal life/Katie Holmes”

Additionally, during question analysis, certain question constituents are marked as either Topic or Focus (see Moldovan et al. (1999)). For the earlier example question, “Tom Cruise” becomes the Topic while “married” is marked Focus
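The paragraph-level records described in this section (Title, Headers, Text) can be illustrated with a minimal sketch. It is not the preprocessing used for the demo and it does not call Lucene. It assumes a simplified wikitext input in which headings look like "== Name ==" and paragraphs are separated by blank lines, and it assumes that Text holds the paragraph's own text. The names ParagraphDoc and split_article, as well as the placeholder sample text, are hypothetical.

```python
import re
from dataclasses import dataclass

# Assumption: headings follow the simplified form "== Name ==", "=== Name ===", ...
HEADING = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")


@dataclass
class ParagraphDoc:
    title: str    # title of the Wikipedia article
    headers: str  # article title plus enclosing section headings, "/"-joined
    text: str     # text of this paragraph (assumed reading of the Text field)


def split_article(title, wikitext):
    """Turn one article into a list of paragraph-level records."""
    docs = []
    path = []  # stack of (level, heading) for the current position
    buf = []   # lines of the paragraph currently being collected

    def flush():
        # Emit the collected paragraph, if any, with its heading path.
        if buf:
            headers = "/".join([title] + [h for _, h in path])
            docs.append(ParagraphDoc(title, headers, " ".join(buf)))
            buf.clear()

    for line in wikitext.splitlines():
        line = line.strip()
        m = HEADING.match(line)
        if m:
            flush()
            level = len(m.group(1))
            # Leave any section at the same or a deeper level.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, m.group(2)))
        elif not line:
            flush()  # a blank line ends the current paragraph
        else:
            buf.append(line)
    flush()
    return docs


sample = """Intro paragraph (placeholder text).

== Relationships and personal life ==
=== Katie Holmes ===
Placeholder paragraph text,
continued on a second line.
"""

docs = split_article("Tom Cruise", sample)
for d in docs:
    print(d.headers, "|", d.text)
```

In an actual index, each such record would presumably become one document with three fields, so that a retrieved paragraph can be shown together with its position in the source article.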
2