2009). The processing steps described in the next sections build on its output. For reasons of brevity, we skip a detailed explanation in this paper and focus only on its key part: the alignment of words with very different surface structures. For more details, we refer the reader to the aforementioned work.

Query phrase          Answer sentence phrase
“Alaska territory”    “territory of Alaska”
“purchased”           “acquisition”
ANSWER                “1867”

In our approach, this is a two-step process. First we align on a word level, then the output of the word alignment process is used to iden-

In the above example, the alignment of “purchased” and “acquisition” is the most problematic, because the surface structures of the two words are clearly very different. For such cases we experimented with a number of alignment strategies based on WordNet. These approaches are similar in that each picks one question word to be aligned at a time and compares it to all of the non-stop words in the answer sentence. Each of the answer sentence words is assigned a value between zero and one expressing its relatedness to the question word. The highest scoring word, if above a certain threshold, is selected as the closest semantic match.

Most of these approaches make use of WordNet::Similarity, a Perl software package that measures semantic similarity (or relatedness) between a pair of word senses by returning a numeric value that represents the degree to which they are similar or related (Pedersen et al., 2004). Additionally, we developed a custom-built method that assumes that two words are semantically related if any kind of pointer exists between any occurrence of the words' root forms in WordNet. For details of these experiments, please refer to Kaisser (2009). In our experiments the custom-built method performed best, and was therefore used for the experiments described in this paper. The main reasons for this are:

Klein and Manning, 2003a), so at this point they are simply loaded from file. Step 4 is the key step in our algorithm. From the previous steps, we know where the key constituents from the question as well as the answer are located in the answer sentence. This enables us to compute the dependency paths in the answer sentence's parse tree that connect the answer with the key constituents. In our example the answer is “1867” and the key constituents are “acquisition” and “Alaska.” Knowing the syntactic relationships (captured by their dependency paths) between the answer and the key phrases enables us to capture one syntactic possibility of how answer sentences to queries of the form When+was+NP+VERB can be formulated.

As can be seen in Step 5, a flat syntactic question representation is stored, together with numbers assigned to each constituent. The numbers for those constituents for which alignments in the answer sentence were sought and found are listed together with the resulting dependency paths. Path 3, for example, denotes the path from constituent 3 (the NP “Alaska”) to the answer. If no alignment could be found for a constituent, null is stored instead of a path. Should two or more alternative constituents be identified for one question constituent, additional patterns are created, so that each contains one of the possibilities.
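
To make the word alignment and pattern creation steps more concrete, the following is a minimal Python sketch; it is an illustration under stated assumptions, not the implementation used in our experiments. The relatedness lookup is a hand-written stand-in for a WordNet pointer check, and the dependency tree, the relation labels, the "up:"/"down:" path notation, the threshold value of 0.5 and the constituent number 4 for the verb are all hypothetical. Combining alternatives for several constituents by cross product is likewise an assumption.

```python
from itertools import product

# Hypothetical stand-ins for WordNet: a toy lemmatizer and a set of root-form
# pairs assumed to be linked by some pointer. A real system would query WordNet.
ROOT_FORMS = {"purchased": "purchase"}
RELATED_PAIRS = {frozenset(("purchase", "acquisition"))}
STOP_WORDS = {"the", "of", "in", "was", "when"}


def relatedness(word_a, word_b):
    """Return a score in [0, 1]: 1.0 if the root forms match or are linked."""
    root_a = ROOT_FORMS.get(word_a.lower(), word_a.lower())
    root_b = ROOT_FORMS.get(word_b.lower(), word_b.lower())
    if root_a == root_b or frozenset((root_a, root_b)) in RELATED_PAIRS:
        return 1.0
    return 0.0


def align_word(question_word, answer_words, threshold=0.5):
    """Return the highest-scoring non-stop answer word, or None if its score
    does not exceed the threshold."""
    candidates = [w for w in answer_words if w.lower() not in STOP_WORDS]
    if not candidates:
        return None
    best = max(candidates, key=lambda w: relatedness(question_word, w))
    return best if relatedness(question_word, best) > threshold else None


# Hypothetical dependency tree for "the acquisition of Alaska in 1867",
# stored as child -> (head, relation). The labels are illustrative only.
HEADS = {
    "Alaska": ("acquisition", "prep_of"),
    "1867": ("acquisition", "prep_in"),
}


def ancestors(node, heads):
    """Return [node, head, head of head, ...] up to the root."""
    chain = [node]
    while chain[-1] in heads:
        chain.append(heads[chain[-1]][0])
    return chain


def dependency_path(start, end, heads):
    """Relations from start up to the lowest common ancestor, then down to end."""
    up_chain = ancestors(start, heads)
    down_chain = ancestors(end, heads)
    common = next(n for n in up_chain if n in down_chain)
    up = ["up:" + heads[n][1] for n in up_chain[:up_chain.index(common)]]
    down = ["down:" + heads[n][1] for n in down_chain[:down_chain.index(common)]]
    return up + down[::-1]


def build_patterns(alignments, answer, heads):
    """Map constituent numbers to dependency paths leading to the answer.

    alignments maps a constituent number to the list of aligned answer words.
    An empty list means no alignment was found, so None is stored. Several
    alternatives produce one pattern per combination.
    """
    numbers = sorted(alignments)
    options = [alignments[n] or [None] for n in numbers]
    patterns = []
    for choice in product(*options):
        patterns.append({
            n: None if word is None else dependency_path(word, answer, heads)
            for n, word in zip(numbers, choice)
        })
    return patterns


answer_words = "the acquisition of Alaska in 1867".split()
aligned_np = align_word("Alaska", answer_words)
aligned_verb = align_word("purchased", answer_words)
patterns = build_patterns({3: [aligned_np], 4: [aligned_verb]}, "1867", HEADS)
```

In this toy setting the single resulting pattern stores one path for constituent 3 and one for constituent 4. Passing an empty list for a constituent stores None in place of its path, and passing two candidate words for the same constituent yields two patterns, one per possibility.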