SECTION 4 HIGHLIGHTS THE MANAGEMENT OF THE INTER-FORMANCE OF A SYSTEM...

15) Weapons: Chemical, Biological, Materials, Stockpiles, Facilities, Access

Figure 2: Example of a Dialogue Scenario.

3 Modeling the Dialogue Topic

Our experiments in interactive Q/A were based on several scenarios that were presented to us as part of the ARDA Metrics Challenge Dialogue Workshop. Figure 2 illustrates one of these scenarios. It is to be noted that the general background consists of a list of subject areas, whereas the scenario is a narration in which several sub-topics are identified (e.g. production of toxins or exportation of materials).

The creation of scenarios for interactive Q/A requires several different types of domain-specific knowledge and a level of operational expertise not available to most system developers. In addition to identifying a particular domain of interest, scenarios must specify the set of relevant actors, outcomes, and related topics that are expected to operate within the domain of interest, the salient associations that may exist between entities and events in the scenario, and the specific timeframe and location that bound the scenario in space and time. In addition, real-world scenarios also need to identify certain operational parameters as well, such as the identity of the scenario’s sponsor (i.e. the organization sponsoring the research) and audience (i.e. the organization receiving the information), as well as a series of evidence conditions which specify how much verification information must be subject to before it can be accepted as fact.

We assume the set of sub-topics mentioned in the general background and the scenario can be used together to define a topic structure that will govern future interactions with the Q/A system. In order to model this structure, the topic rep-

The notion of topic signatures was first introduced in (Lin and Hovy, 2000). For each subtopic in a scenario, given (a) documents relevant to the sub-topic and (b) documents not relevant to the subtopic, a statistical method based on the likelihood ratio is used to discover a weighted list of the most topic-specific concepts, known as the topic signature. Later work by (Harabagiu, 2004) demonstrated that topic signatures can be further enhanced by discovering the most relevant relations that exist between pairs of concepts. However, both of these types of topic representations are limited by the fact that they require the identification of topic-relevant documents prior to the discovery of the topic signatures. In our experiments, we were only presented with a set of documents relevant to a particular scenario; no further relevance information was provided for individual subject areas or sub-topics.

In order to solve the problem of finding relevant documents for each subtopic, we considered four different approaches:

Approach 1: All documents in the CNS collection were initially clustered using K-Nearest Neighbor (KNN) clustering (Dudani, 1976). Each cluster that contained at least one keyword that described the sub-topic was deemed relevant to the topic.

Approach 2: Since individual documents may contain discourse segments pertaining to different sub-topics, we first used TextTiling (Hearst,