
2-4, Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan

Abstract

We have been investigating an interactive approach to open-domain QA (ODQA) and have constructed a spoken interactive ODQA system, SPIQA. The system derives disambiguating queries (DQs) that draw out additional information. To test the effectiveness of the additional information requested by the DQs, the system reconstructs the user's initial question by combining the additional information with that question. The combination is then used for answer extraction. Experimental results revealed the potential of the generated DQs.

1 Introduction

Open-domain QA (ODQA), which extracts answers from large text corpora, such as newspaper texts, has been intensively investigated in the Text REtrieval Conference (TREC). ODQA systems return an actual answer in response to a question written in natural language. However, the information in the first question input by a user is usually not sufficient to yield the desired answer. Interactions for collecting additional information to accomplish QA are therefore needed. To construct more precise and user-friendly ODQA systems, a speech interface is used for the interaction between human beings and machines.

Our goal is to construct a spoken interactive ODQA system that includes an automatic speech recognition (ASR) system and an ODQA system. To clarify the problems presented in building such a system, the QA systems constructed so far can be classified into a number of groups, depending on their target domains, interfaces, and interactions to draw out additional information from users to accomplish set tasks, as shown in Table 1. In this table, "text" and "speech" denote text input and speech input, respectively. The term "addition" represents additional information queried by the QA system; this additional information is separate from that derived from the user's initial question.

Table 1: Domain and data structure for QA systems

  target domain              specific       open
  data structure             knowledge DB   unstructured text
  text, without addition     CHAT-80        SAIQA
  text, with addition        MYCIN          (SPIQA)
  speech, without addition   Harpy          VAQA
  speech, with addition      JUPITER        (SPIQA)

(SPIQA is our system.)

To construct spoken interactive ODQA systems, the following problems must be overcome: (1) system queries for additional information to extract answers, and effective interaction strategies using such queries, cannot be prepared before the user inputs the question; (2) recognition errors degrade the performance of QA systems, since some information indispensable for extracting answers is deleted or substituted with other words.

Our spoken interactive ODQA system, SPIQA, copes with the first problem by disambiguating users' questions using system queries. In addition, a speech summarization technique is applied to handle recognition errors.
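The overall interaction strategy described here (answer when possible; otherwise query the user for additional information and reconstruct the question) can be sketched as a simple control loop. This is a minimal sketch, not SPIQA's actual implementation; all function and parameter names below are illustrative placeholders:

```python
def interactive_odqa(question, ask_user, extract_answer, derive_dq, reconstruct,
                     max_turns=3):
    """Sketch of the interaction loop described in the text.

    extract_answer(q) -> an answer string, or None when q is "ambiguous"
    derive_dq(q)      -> a disambiguating query (DQ) to present to the user
    reconstruct(q, a) -> the initial question combined with the user's
                         additional information a

    All of these are hypothetical placeholders, not SPIQA's real API.
    """
    for _ in range(max_turns):
        answer = extract_answer(question)
        if answer is not None:
            return answer                  # unambiguous: answer directly
        dq = derive_dq(question)           # e.g. "What kind of World Cup?"
        additional = ask_user(dq)          # user supplies missing information
        question = reconstruct(question, additional)
    return None                            # give up after max_turns
```

The loop terminates either when the ODQA engine extracts an answer or after a fixed number of clarification turns.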

2 Spoken Interactive QA system: SPIQA

Figure 1 shows the components of our system and the data that flow through it. The system comprises an ASR system (SOLON), a screening filter that uses a summarization method, an ODQA engine (SAIQA) for a Japanese newspaper text corpus, a Deriving Disambiguating Queries (DDQ) module, and a Text-to-Speech Synthesis (TTS) engine (FinalFluet).

[Figure 1: Components and data flow in SPIQA. The user's first question is transcribed by the ASR system, passed through the screening filter, and input to the ODQA engine (SAIQA). If an answer sentence is derived, it is returned to the user via the TTS engine; if not, the DDQ module and DQ sentence generator produce a disambiguating query, and the question reconstructor combines the user's additional information with the question into a new question.]

Figure 1: Components and data flow in SPIQA.

ASR system

Our ASR system is based on the Weighted Finite-State Transducer (WFST) approach, which is becoming a promising alternative to the traditional decoding approach. The WFST approach offers a unified framework for representing various knowledge sources, in addition to producing an optimized search network of HMM states. We combined cross-word triphones and trigrams into a single WFST and applied a one-pass search algorithm to it.

Screening filter

To alleviate the degradation of QA performance caused by recognition errors, fillers, word fragments, and other distractors in the transcribed question, a screening filter is required that removes this redundant and irrelevant information and extracts the meaningful information. The speech summarization approach (C. Hori et al., 2003) is applied to the screening process: a set of words maximizing a summarization score, which indicates the appropriateness of the summarization, is extracted automatically from the transcribed question, and these words are then concatenated. The extraction process is performed using a Dynamic Programming (DP) technique.

ODQA engine

The ODQA engine, SAIQA, has four components: question analysis, text retrieval, answer hypothesis extraction, and answer selection.

DDQ module

When the ODQA engine cannot extract an appropriate answer to a user's question, the question is considered to be "ambiguous." To disambiguate the initial question, the DDQ module automatically derives disambiguating queries (DQs) that request information indispensable for answer extraction. A question is considered ambiguous when it excludes indispensable information or when indispensable information is lost through ASR errors. These instances of missing information should be compensated for by the user.

To disambiguate a question, the ambiguous phrases within it should be identified. The ambiguity of each phrase can be measured using the structural ambiguity and the generality score of the phrase. The structural ambiguity is based on the dependency structure of the sentence; a phrase that is not modified by other phrases is considered to be highly ambiguous. Figure 2 shows an example of a dependency structure, where the question is separated into phrases and each arrow represents the dependency between two phrases.

Which country in Southeast Asia won the World Cup?

Figure 2: Example of dependency structure.

In this example, "the World Cup" has no modifiers and needs more information to be identified. "Southeast Asia" also has no modifiers; however, since "the World Cup" appears more frequently than "Southeast Asia" in the retrieved corpus, "the World Cup" is more difficult to identify. In other words, words that frequently occur in a corpus rarely help to extract answers in ODQA systems. Therefore, it is adequate for the DDQ module to generate questions relating to "the World Cup" in this example, such as "What kind of World Cup?" or "What year was the World Cup held?".

The structural ambiguity of the n-th phrase is defined as

  A_D(P_n) = log [ (1/N) Σ_{i=1, i≠n}^{N} D(P_i, P_n) ],

where the complete question is separated into N phrases, and D(P_i, P_n) is the probability that phrase P_n will be modified by phrase P_i, which can be calculated using a Stochastic Dependency Context-Free Grammar (SDCFG) (C. Hori et al., 2003). Using this SDCFG, only the number of non-terminal symbols is determined, and all combinations of rules are applied recursively. The non-terminal symbol has no specific function, such as

... system. The question transcriptions were processed with a screening filter and input into the ODQA engine. Each question consisted of about 19 morphemes on average. The sentences were grammatically correct, formally structured, and had enough information for the ODQA engine to extract the correct answers. The mean word recognition accuracy obtained by the ASR system was 76%.
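Once the dependency probabilities D(P_i, P_n) are available, the structural-ambiguity score A_D(P_n) can be computed directly. The following is a minimal sketch, assuming the SDCFG probabilities have already been collected into an N x N matrix; the function name and matrix layout are illustrative, not part of SPIQA:

```python
import math

def structural_ambiguity(D, n):
    """Compute A_D(P_n) = log[(1/N) * sum_{i != n} D(P_i, P_n)].

    D is an N x N matrix (list of lists) where D[i][j] is the probability
    that phrase P_j is modified by phrase P_i. In SPIQA these values come
    from an SDCFG; here they are simply given as input.
    """
    N = len(D)
    # Sum the dependency mass flowing into phrase P_n from all other phrases.
    total = sum(D[i][n] for i in range(N) if i != n)
    return math.log(total / N)
```

A phrase that receives little dependency mass from the other phrases yields a small sum and hence a strongly negative score.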