3.2.4 Kinds of constraint network

There are an unlimited number of possible constraint networks that can be constructed. We have experimented with the following:

Timelines. People and even artifacts have life-cycles. The examples in this paper exploit these.

Therefore we collected all of the “creative” people in the TREC9 question set, and divided them up into classes by profession, so we had, for example, male singers Bob Marley, Ray Charles, Billy Joel and Alice Cooper; poets William Wordsworth and Langston Hughes; painters Picasso, Jackson Pollock and Vincent Van Gogh, etc. – twelve such groupings in all. For each set, we entered the individuals in the “Google Sets” interface (https://traloihay.net), which finds “similar” entities to the ones entered. For example, from our set of male singers it found: Elton John, Sting, Garth Brooks, James Taylor, Phil Collins, Melissa Etheridge, Alanis Morissette, Annie Lennox, Jackson Browne, Bryan Adams, Frank Sinatra and Whitney Houston.

Altogether, we gathered 276 names of creative individuals this way, after removing duplicates, items that were not names of individuals, and names that did not occur in our test corpus (the AQUAINT corpus). We then used our system manually to help us develop “ground truth” for a randomly selected subset of 109 names. This ground truth served both as training material and as an evaluation key. We split the 109 names randomly into a set of 52 for training and 57 for testing. The training process used a hill-climbing method to find optimal values for three internal rejection thresholds. In developing the ground truth we might have missed some instances of assertions we were looking for, so the reported recall (and hence F-measure) figures should be considered to be upper bounds, but we believe the calculated figures are not far from the truth.

1 Painting is only an example of an activity in these constraints. Any other achievement that is usually associated with adulthood can be used.

2 This set did not contain definition questions, which, by our inspection, lend themselves readily to reciprocation.

What year did X have W_i?

Who had W_i?

The top 5 answers to each of these are returned, again as long as they pass a confidence threshold. We added a sixth answer “NIL” to each of the date sets, with a confidence equal to the rejection threshold. (NIL is the code used in TREC ever since TREC10 to indicate the assertion that there is no answer in the corpus.) We used a two stage constraint-satisfaction process:

Stage 1: For each work W_i for subject X, we added together its original confidence to the confidence of the answer X in the answer set of the reciprocal question (if it existed – otherwise we added zero). If the total did not exceed a learned threshold (.50) the work was rejected.

Stage 2: For each subject, with the remaining candidate works we generated all possible combinations of the date answers. We rejected any combination that did not satisfy the following constraints:

DIED >= BORN + 7
DIED <= BORN + 100
WORK >= BORN + 7
WORK <= BORN + 100
WORK <= DIED
DIED <= WORK + 100
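
The two-stage process and the timeline constraints above can be sketched in a few lines of Python. This is only an illustrative sketch, not the system's actual implementation: the function names, the dictionary-based data layout, and the use of None to stand for NIL are all assumptions made here. In particular, the text does not say how a NIL date interacts with the constraints, so the sketch assumes that any constraint with a NIL operand is treated as satisfied.

```python
from itertools import product

WORK_THRESHOLD = 0.50  # learned Stage 1 threshold reported in the text


def stage1_filter(works, reciprocal_conf, threshold=WORK_THRESHOLD):
    """Stage 1: keep works whose combined confidence exceeds the threshold.

    works: dict mapping a work title to its original confidence.
    reciprocal_conf: dict mapping a work title to the confidence of the
        subject X among the answers to "Who had W_i?"; a missing entry
        means X was not returned, so zero is added.
    """
    kept = {}
    for work, conf in works.items():
        total = conf + reciprocal_conf.get(work, 0.0)
        if total > threshold:
            kept[work] = total
    return kept


def satisfies(born, died, work_years):
    """Check the six timeline constraints for one combination of dates.

    None stands for NIL. Assumption (not from the text): a constraint
    with a NIL operand counts as satisfied.
    """
    def ok(pred, *years):
        return any(y is None for y in years) or pred(*years)

    # DIED >= BORN + 7 and DIED <= BORN + 100
    if not ok(lambda d, b: b + 7 <= d <= b + 100, died, born):
        return False
    for w in work_years:
        # WORK >= BORN + 7 and WORK <= BORN + 100
        if not ok(lambda w_, b: b + 7 <= w_ <= b + 100, w, born):
            return False
        # WORK <= DIED and DIED <= WORK + 100
        if not ok(lambda w_, d: w_ <= d <= w_ + 100, w, died):
            return False
    return True


def stage2_combinations(born_cands, died_cands, work_cands):
    """Stage 2: enumerate all date combinations and keep the consistent ones.

    born_cands, died_cands: lists of candidate years (None for NIL).
    work_cands: dict mapping each surviving work to its candidate years.
    """
    titles = list(work_cands)
    survivors = []
    for combo in product(born_cands, died_cands,
                         *(work_cands[t] for t in titles)):
        born, died, *work_years = combo
        if satisfies(born, died, work_years):
            survivors.append(
                {"BORN": born, "DIED": died, **dict(zip(titles, work_years))}
            )
    return survivors
```

As a usage example with made-up candidates, calling stage1_filter({"Work A": 0.4, "Work B": 0.3}, {"Work A": 0.3}) keeps only "Work A", and stage2_combinations([1853, 1890], [1890, None], {"Work A": [1888, 1840]}) keeps only the combinations in which the birth year is 1853 and the work year is 1888. How the surviving combinations are ranked is not covered by this sketch.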