
module, all named entities of the expected answer types are treated as answer candidates. For questions with an unknown answer type, all NPs in the candidate sentence are considered. Then those paths in the answer sentence that are connected to an answer candidate are compared against the corresponding paths in the question, in a fashion similar to (Cui et al., 2005). The candidate whose paths show the highest matching score is selected. (Shen and Klakow, 2006) also describe a method that is primarily based on similarity scores between dependency relation pairs. However, their algorithm computes the similarity of paths between key phrases, not between words. Furthermore, it does not treat the relations in a path as independent of each other, but acknowledges that they form a sequence, by comparing two paths with the help of an adaptation of the Dynamic Time Warping algorithm (Rabiner et al., 1991). (Molla, 2006) presents an approach for the acquisition of question answering rules by applying graph manipulation methods. Questions are represented as dependency graphs, which are extended with information from answer sentences. These combined graphs can then be used to identify answers. Finally, in (Wang et al., 2007), a quasi-synchronous grammar (Smith and Eisner,

4. The acquisition of Alaska by the United States of America from Russia in 1867 is known as “Seward’s Folly”.

The remaining three sentences introduce various forms of syntactic and semantic transformations. In order to capture a wide range of possible ways in which answer sentences can be formulated, in our model a candidate sentence is not evaluated according to its similarity with the question. Instead, its similarity to known answer sentences (which were presented to the system during training) is evaluated. This allows us to capture a much wider range of syntactic and semantic transformations.

3 Overview of the Algorithm

Our algorithm uses input data containing pairs of the following:

NLQs/Questions: NLQs that describe the users’ information need. For the experiments carried out in this paper we use questions from the TREC QA track 2002-2006.

Relevant textual content: This is a piece of text that is relevant to the user query in that it contains the information the user is searching for. In this paper, we use sentences ex-