
While the overall percentage improvement was small, note that only second-place answers were candidates for re-ranking, and 43% of these were

5.1 Probability-based Scores

Our Answer Selection component assigns scores to candidate answers on the basis of the number of terms and term-term syntactic relationships from the original question found in the answer passage (where the candidate answer and wh-word(s) in the question are identified terms). The resulting numbers are in the range 0-1, but are not true probabilities (e.g. where answers with a score of 0.7 would be correct 70% of the time). While the generated scores work well to rank candidates for a given question, inter-question comparisons are not generally meaningful. This made the learning of a decision tree (Algorithm A) quite difficult; we expect that addressing it will give better performance to the Constraints process (and maybe allow a simpler algorithm). This in turn will make it more feasible to re-rank the top 10 (say) original answers, instead of the current 2.

systems – the more occurrences of a candidate answer in retrieved passages, the higher the answer's score is made to be. Consequently, at the very least, a string-matching operation is needed for checking equivalence, but other techniques are used to varying degrees.

It has long been known in IR that stemming or lemmatization is required for successful term matching, and in NLP applications such as QA, resources such as WordNet (Miller, 1995) are employed for checking synonym and hypernym relationships; Extended WordNet (Moldovan & Novischi, 2002) has been used to establish lexical chains between terms.
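The paper does not give the Answer Selection scoring formula; purely as an illustration of a score that lies in the 0-1 range yet is not a calibrated probability, a term-overlap score might look like the following sketch (the function name and the normalization by question length are assumptions, not the authors' method):

```python
def overlap_score(question_terms, passage_terms):
    """Hypothetical 0-1 score: the fraction of question terms found in the
    answer passage. Such a score ranks candidates for a single question,
    but is not comparable across questions and is not a probability."""
    q = {t.lower() for t in question_terms}
    p = {t.lower() for t in passage_terms}
    return len(q & p) / len(q) if q else 0.0

score = overlap_score(
    ["Who", "invented", "the", "telephone"],
    ["Bell", "invented", "the", "telephone", "in", "1876"],
)
print(score)  # 0.75
```

A score of 0.75 here means three of the four question terms appeared in the passage; it does not mean the answer is correct 75% of the time, which is exactly the calibration gap discussed above.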
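The equivalence checking described above (string matching after stemming or lemmatization, plus synonym lookup in a resource like WordNet) can be sketched as follows; the crude suffix-stripping normalizer and the tiny in-memory synonym table are illustrative stand-ins for a real stemmer and for WordNet, not the authors' implementation:

```python
def normalize(term: str) -> str:
    """Crude stand-in for stemming/lemmatization: lowercase the term and
    strip a plural 's'. A real system would use a proper stemmer."""
    term = term.lower().strip()
    return term[:-1] if term.endswith("s") and len(term) > 3 else term

# Stand-in for WordNet: maps a normalized term to its synonym set.
SYNONYMS = {
    "car": {"car", "automobile"},
    "automobile": {"car", "automobile"},
}

def equivalent(a: str, b: str) -> bool:
    """Two candidate answers are treated as equivalent if they match after
    normalization (the string-matching check), or if one appears in the
    other's synonym set (the WordNet-style check)."""
    na, nb = normalize(a), normalize(b)
    if na == nb:
        return True
    return nb in SYNONYMS.get(na, set())

print(equivalent("Cars", "car"))        # matches after normalization
print(equivalent("car", "automobile"))  # matches via synonym lookup
print(equivalent("car", "boat"))        # no match
```

Hypernym checks and lexical chains over Extended WordNet extend the same idea: instead of a flat synonym set, the lookup walks relations between synsets to varying depths.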

However, the Constraints work reported here has highlighted the need for more extensive equivalence