SECTION A BENEFIT, SINCE IT GIVES MORE OPPORTUNITY FOR ENFORC-

As shown in the previous section, not all questions have easily generated inverted forms (even by a human). However, we do not need to explicate the inverted form in natural language in order to process the inverted question.

In our system, a question is processed by the QUESTION PROCESSING module, which produces a structure called a QFrame, which is used by the subsequent SEARCH and ANSWER SELECTION modules. The QFrame contains the list of terms and phrases in the question, along with their properties, such as POS and NE-type (if it exists), and a list of syntactic relationship tuples. When we have a candidate answer in hand, we do not need to produce the inverted English question, but merely the QFrame that would have been generated from it. Figure 1 shows that the CONSTRAINTS MODULE takes the QFrame as one of its inputs, as shown by the link from QP in QS1 to CM. This inverted QFrame can be generated by a set of simple transformations, substituting the pivot term in the bag of words with a candidate answer <CANDANS>, the original answer type with the type of the pivot term, and in the relationships the pivot term with its type and the original answer type with <CANDANS>. When relationships are evaluated, a type token will match any instance of that type. Figure 2 shows a simplified view of the original QFrame for “What was the capital of Germany in 1945?”, and Figure 3 shows the corresponding Inverted QFrame. COUNTRY is determined to be a better type to invert than YEAR, so “Germany” becomes the pivot. In Figure 3, the token <CANDANS> might take in turn “Berlin”, “Moscow”, “Prague” etc.

Keywords: {1945, Germany, capital}
AnswerType: CAPITAL
Relationships: {(Germany, capital), (capital, CAPITAL), (capital, 1945)}

Figure 2. Simplified QFrame

Keywords: {1945, <CANDANS>, capital}
AnswerType: COUNTRY
Relationships: {(COUNTRY, capital), (capital, <CANDANS>), (capital, 1945)}

Figure 3. Simplified Inverted QFrame.

The output of QS2 after processing the inverted QFrame is a list of answers to the inverted question, which by extension of the nomenclature we call “inverted answers.” If no term in the question has an identifiable type, inversion is not possible.

3.4 Profiting From Inversions

Broadly speaking, our goal is to keep or re-rank the candidate answer hit-list on account of inversion results. Suppose that a question Q is inverted around pivot term T, and for each candidate answer C_i, a list of “inverted” answers {C_ij} is generated as described in the previous section. If T is on one of the {C_ij}, then we say that C_i is validated. Validation is not a guarantee of keeping or improving C_i’s position or score, but it helps. Most cases of failure to validate are called refutation; similarly, refutation of C_i is not a guarantee of lowering its score or position.

It is an open question how to adjust the results of the initial candidate answer list in light of the results of the inversion. If the scores associated with candidate answers (in both directions) were true probabilities, then a Bayesian approach would be easy to develop. However, they are not in our system. In addition, there are quite a few parameters that describe the inversion scenario.

Suppose Q generates a list of the top-N candidates {C_i}, with scores {S_i}. If this inversion method were not to be used, the top candidate on this list, C_1, would be the emitted answer. The question generated by inverting about T and substituting C_i is QT_i. The system is fixed to find the top 10 passages responsive to QT_i, and generates an ordered list C_ij of candidate answers found in this set.

Each inverted question QT_i is run through our system, generating inverted answers {C_ij}, with scores {S_ij}, and whether and where the pivot term T shows up on this list, represented by a list of positions {P_i}, where P_i is defined as:

    P_i = j if C_ij = T, for some j
    P_i = -1 otherwise

We added to the candidate list the special answer nil, representing “no answer exists in the corpus.” As described earlier, we had observed from training data that failure to validate candidates of certain types (such as Person) would not necessarily be a real refutation, so we established a set of types SOFTREFUTATION which would contain the broadest of our types. At the other end of the spectrum, we observed that certain narrow candidate types such as UsState would definitely be refuted if validation didn’t occur. These are put in set MUSTCONSTRAIN.

Our goal was to develop an algorithm for recomputing all the original scores {S_i} from some combination (based on either arithmetic or decision-trees) of {S_i} and {S_ij} and membership of SOFTREFUTATION and MUSTCONSTRAIN. Reliably learning all those weights, along with set membership, was not possible given only several hundred questions of training data. We therefore focused on a reduced problem.

o P_i was the rank of the validating answer to question QT_i
o A_i was the score of the validating answer to QT_i.

Algorithm A. Answer re-ranking using constraints validation data.
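The QFrame inversion and the position P_i described above can be illustrated with a minimal Python sketch. It is not the system's implementation: the QFrame class, the function names invert_qframe and validation_position, and the CAND_ANS constant are hypothetical names introduced here for illustration. The sketch also assumes that terms are compared by exact string match and that ranks are 1-based, neither of which is stated in the text. The inverted answer lists at the end are made-up inputs, not system output.

```python
from dataclasses import dataclass

CAND_ANS = "<CANDANS>"  # placeholder token that stands for the candidate answer


@dataclass(frozen=True)
class QFrame:
    """Simplified QFrame as in Figures 2 and 3 (hypothetical structure)."""
    keywords: frozenset
    answer_type: str
    relationships: tuple  # tuple of (term, term) pairs


def invert_qframe(qf, pivot, pivot_type):
    """Build the inverted QFrame by the substitutions described in the text.

    - keywords: the pivot term is replaced by <CANDANS>
    - answer type: the original answer type is replaced by the pivot's type
    - relationships: the pivot is replaced by its type, and the original
      answer type is replaced by <CANDANS>
    """
    swap = {pivot: pivot_type, qf.answer_type: CAND_ANS}
    keywords = frozenset(CAND_ANS if k == pivot else k for k in qf.keywords)
    relationships = tuple(
        tuple(swap.get(term, term) for term in rel) for rel in qf.relationships
    )
    return QFrame(keywords, pivot_type, relationships)


def validation_position(inverted_answers, pivot):
    """Return P_i: the rank j at which pivot T appears in the ordered list
    C_ij, or -1 if it does not appear. Ranks are 1-based (an assumption)."""
    for j, answer in enumerate(inverted_answers, start=1):
        if answer == pivot:
            return j
    return -1


# Figure 2: "What was the capital of Germany in 1945?"
figure2 = QFrame(
    keywords=frozenset({"1945", "Germany", "capital"}),
    answer_type="CAPITAL",
    relationships=(("Germany", "capital"), ("capital", "CAPITAL"), ("capital", "1945")),
)

# Inverting around the pivot "Germany" (type COUNTRY) yields Figure 3.
figure3 = invert_qframe(figure2, pivot="Germany", pivot_type="COUNTRY")

# Made-up inverted answer lists for two candidates, for illustration only.
p_validated = validation_position(["Germany", "Prussia"], "Germany")
p_not_validated = validation_position(["Russia", "France"], "Germany")
```

Under these assumptions, a candidate C_i counts as validated exactly when validation_position returns a value other than -1.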

We observed that when run on TREC question sets,